
This tutorial shows how to profile PICSAR with Intel Advisor on KNL, using the Cori supercomputer at NERSC as an example. The procedure can be transposed to other systems with small modifications.
Presentation of Intel Advisor
Intel Advisor is composed of the Vectorization Advisor and the Threading Advisor, two tools that help your code reach its full performance potential.
In this tutorial, we focus on the Vectorization Advisor, a vectorization optimization tool that:
- Helps identify time-consuming loops that are already vectorized or could benefit from vectorization
- Helps identify vectorization and efficiency issues (dependencies, spilling, memory access…) and proposes solutions
- Helps ensure that vectorization is safe and quantifies the effects of vectorization (vectorization efficiency, roofline performance model)
On the command line, Advisor is called via advixe-cl:
advixe-cl -collect analysis -trace-mpi -- ./application
Here, analysis refers to the type of collection you want to run. It can be:
- survey: general overview of the performance and the vectorization state of the code.
- tripcounts: improves the survey by dynamically measuring loop iteration counts, which helps you make better decisions about your vectorization strategy. It also measures the FLOP count and the cumulative data traffic needed for the Roofline performance model.
- dependencies: refines the analysis by checking for real data dependencies in loops that the compiler did not vectorize because of assumed dependencies.
- map (Memory Access Patterns): refines the analysis by checking for various memory issues, such as non-contiguous memory accesses and unit-stride vs. non-unit-stride accesses.
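The collection types listed above can be chained on the same project directory, with the survey first since the other collections refine it. The following is a minimal dry-run sketch: it only prints the commands it would run, so it can be checked on a machine without Advisor. The project directory name and the ./application binary are placeholders chosen for illustration.

```shell
#!/bin/bash
# Dry-run sketch: print one advixe-cl command per collection type.
# PROJECT_DIR and APP are placeholder names, not fixed by Advisor.
PROJECT_DIR="advisor_analysis"
APP="./application"

advisor_command() {
    # $1 is the collection type: survey, tripcounts, dependencies or map
    echo "advixe-cl -collect $1 -project-dir $PROJECT_DIR -trace-mpi -- $APP"
}

# The survey goes first; the other collections refine its results.
for analysis in survey tripcounts dependencies map; do
    advisor_command "$analysis"
done
```

To actually launch a collection, the printed line would be run as is, or prefixed with the job launcher of your system.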
Then, a series of additional arguments can be used. You can choose the name of the analysis directory using the following argument:
--project-dir
You can specify the location of the sources using this argument, followed by the path to the source directory:
--search-dir src:r=
In order to take into account the masks in the vector operations on KNL for specific collections such as tripcounts, you have to specify:
-flops-and-masks
The finalization of the results on KNL can take a long time. You can skip the finalization at the end of the analysis by specifying:
-no-auto-finalize
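The optional arguments above can be combined in a single call. The sketch below assembles them in a small helper and prints the resulting command instead of running it. The helper name build_advisor_args and the SRC_DIR variable are hypothetical, and the default source path is a placeholder to replace with your own; adding -flops-and-masks only for tripcounts follows the description above.

```shell
#!/bin/bash
# Sketch: assemble the optional advixe-cl arguments for one collection.
# build_advisor_args and SRC_DIR are illustrative names, not part of Advisor.
build_advisor_args() {
    # $1: collection type, $2: project directory, $3: source directory
    args="-collect $1 --project-dir $2"
    # Tell Advisor where the sources are located
    args="$args --search-dir src:r=$3"
    # Mask accounting is requested only for the tripcounts collection
    if [ "$1" = "tripcounts" ]; then
        args="$args -flops-and-masks"
    fi
    # Skip the slow finalization step on KNL
    args="$args -no-auto-finalize -trace-mpi"
    echo "$args"
}

# Placeholder path: set SRC_DIR to your own source location.
SRC_DIR="${SRC_DIR:-/path/to/sources}"
echo "advixe-cl $(build_advisor_args tripcounts advisor_analysis "$SRC_DIR") -- ./application"
```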
Here is a list of useful links to the Intel Advisor documentation:
- Intel Advisor Linux presentation (description of the different surveys, links toward specific command documentation)
- Information and tutorials about the collections
- Intel Advisor Getting Started
- Intel Advisor FAQ
- The NERSC website
Compiling PICSAR for Intel Advisor on KNL at NERSC
First of all, you have to install PICSAR with the right flags for Advisor. We recommend creating a new PICSAR installation in your SCRATCH directory. We also recommend using the Intel compiler and the corresponding libraries for KNL:
module unload craype-haswell
module load craype-mic-knl
module unload PrgEnv-gnu
module load PrgEnv-intel
To set up your environment quickly, you can create a file and source it, for instance with source ~/.knl_config_advisor. This file can look like this:
if [ "$NERSC_HOST" == "cori" ]
then
    # Modules
    module unload craype-haswell
    module unload PrgEnv-gnu
    module unload darshan
    module load PrgEnv-intel
    module load craype-mic-knl
    # not required for compilation but at execution
    module load advisor

    # This activates the Cache-Aware Roofline feature
    export ADVIXE_EXPERIMENTAL=roofline

    # Path to PICSAR compiled with Intel for KNL and Advisor
    PICSAR=$SCRATCH/Codes/install_intel_knl_advisor/picsar/

    # PICSAR paths
    export PYTHONPATH="$PICSAR/python_bin/:$PYTHONPATH"
    export PYTHONPATH="$PICSAR/example_scripts_python/:$PYTHONPATH"
    export PYTHONPATH="$PICSAR/python_libs/:$PYTHONPATH"
    export PYTHONPATH="$PICSAR/postproc_python_script/:$PYTHONPATH"
fi
In this script, PICSAR is installed in $SCRATCH/Codes/install_intel_knl_advisor/picsar/. Change this path according to your installation. Instead of sourcing a file, you can also set up your environment manually.
You can easily compile PICSAR for Advisor using our Makefile:
make SYS=cori2 MODE=advisor
The compilation generates a binary file called picsar_cori2_advisor.
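Before submitting a job, it can be worth confirming that the Advisor build exists. The sketch below assumes the binary ends up in the fortran_bin/ directory of the PICSAR installation, which is where the batch script of this tutorial copies it from. The function name check_advisor_binary is hypothetical.

```shell
#!/bin/bash
# Sketch: confirm the Advisor build of PICSAR exists before submitting a job.
# Assumes the binary is located in <PICSAR>/fortran_bin/.
check_advisor_binary() {
    # $1 is the root of the PICSAR installation
    binary="$1/fortran_bin/picsar_cori2_advisor"
    if [ -x "$binary" ]; then
        echo "found: $binary"
    else
        echo "missing: $binary (run 'make SYS=cori2 MODE=advisor' first)"
        return 1
    fi
}

# Use $PICSAR if it is set, otherwise look in the current directory.
check_advisor_binary "${PICSAR:-.}" || true
```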
The compilation is made with the following flags:
- -g: Produces symbolic debug information in the object file (required).
- -dynamic: links the binary dynamically.
- -O3 -xMIC-AVX512 -qopenmp: enable optimization, AVX-512 vectorization for KNL, and OpenMP.
- -debug inline-debug-info: more debug information
- -align array64byte: data alignment for better vectorization efficiency.
Batch script for Intel Advisor
To use Advisor, we have to load the corresponding module on Cori: module load advisor. In the following example, we simply source the setup file with source ~/.knl_config_advisor.
In the following batch script (List. 2), the survey and then the tripcounts collections are run on PICSAR on a KNL node configured in quadrant cache mode, with 1 MPI rank and 64 OpenMP threads.
#!/bin/bash -l
#SBATCH --job-name=advisor_analysis
#SBATCH --time=00:30:00
#SBATCH -N 1
##SBATCH -S 4
#SBATCH -p knl
#SBATCH -C knl,quad,cache
#SBATCH -e advisor_analysis.err
#SBATCH -o advisor_analysis.out

export OMP_NUM_THREADS=64
export OMP_STACKSIZE=128M
export OMP_DISPLAY_ENV=true
export OMP_PROC_BIND=spread
export OMP_PLACES=cores"(64)"
export OMP_SCHEDULE=dynamic

source ~/.knl_config_advisor

cp $PICSAR/fortran_bin/picsar_cori2_advisor .

# Add the following "sbcast" line here for jobs larger than 1500 MPI tasks:
# sbcast ./mycode.exe /tmp/mycode.exe

numactl -H

# Survey
srun -n 1 -c 272 --cpu_bind=cores advixe-cl -collect survey -project-dir advisor_analysis -trace-mpi -- ./picsar_cori2_advisor

# Tripcounts
srun -n 1 -c 272 --cpu_bind=cores advixe-cl -collect tripcounts -project-dir advisor_analysis -flops-and-masks -trace-mpi -- ./picsar_cori2_advisor
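In the batch script, -c 272 gives the single MPI rank all logical CPUs of the node. The sketch below shows where that number comes from and how it could be adapted to more ranks per node. It assumes a Cori KNL node with 68 physical cores and 4 hardware threads per core, and uses the rule of thumb floor(68 / ranks) * 4, which is stated here as an assumption rather than taken from this tutorial. The function name cpus_per_task is hypothetical.

```shell
#!/bin/bash
# Sketch: derive the srun "-c" value for one KNL node.
# Assumes 68 physical cores with 4 hardware threads each (272 logical CPUs).
CORES_PER_NODE=68
THREADS_PER_CORE=4

cpus_per_task() {
    # $1 is the number of MPI ranks per node.
    # Integer division keeps every rank on whole physical cores.
    echo $(( (CORES_PER_NODE / $1) * THREADS_PER_CORE ))
}

ranks=1
echo "srun -n $ranks -c $(cpus_per_task $ranks) --cpu_bind=cores advixe-cl -collect survey -project-dir advisor_analysis -trace-mpi -- ./picsar_cori2_advisor"
```

With one rank, this reproduces the -c 272 used in the batch script.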
Visualization of the results
We use the Advisor GUI to visualize the data. You can open the GUI on a login node by first loading the Advisor module with module load advisor and then using the following command:
advixe-gui
You can pass the path to your data directly on the command line or open it from within the GUI.
If you wish to use the Cache-Aware Roofline feature with Advisor, do not forget to set:
export ADVIXE_EXPERIMENTAL=roofline
At NERSC, it is recommended to run the Advisor GUI through NX for better network performance.

Mathieu Lobet, last update: January 31, 2017