Using PICSAR with Intel SDE

This section explains how to use PICSAR with the Intel® Software Development Emulator (SDE). Intel SDE emulates AVX-512 architectures (available on Knights Landing) and future instruction sets. Refer to the Intel SDE presentation page for a detailed software description and user instructions. SDE runs on any platform, including Ivy Bridge and Haswell, and reports how the code behaves on new architectures. PICSAR currently uses Intel SDE mainly to count the floating-point operations (flops) performed by the whole code or by specific subroutines.

To use SDE with your application, run the sde command followed by its arguments, a double dash (--), and your application:

sde -arch arguments -- ./application

The argument -arch selects the processor architecture that SDE emulates. For instance, use -hsw for Haswell and -knl for KNL.
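As an illustration of the command pattern, here is a minimal sketch of a wrapper that maps an architecture name to its flag and prints the resulting command line instead of running it. The function name sde_command and the architecture names it accepts are hypothetical and belong to neither SDE nor PICSAR; only the -hsw and -knl flags come from the text above.

```shell
#!/bin/bash
# Hypothetical helper (not part of SDE or PICSAR): print the sde command
# line for a given architecture instead of executing it (dry run).
sde_command() {
    local arch="$1"
    shift
    local flag
    case "$arch" in
        haswell) flag="-hsw" ;;   # Haswell
        knl)     flag="-knl" ;;   # Knights Landing
        *)
            echo "unknown architecture: $arch" >&2
            return 1
            ;;
    esac
    # Everything after the architecture name is the application command.
    echo "sde $flag -- $*"
}

sde_command haswell ./application
sde_command knl ./application
```

Printing the command first makes it easy to check the flag and the position of the double dash before submitting a job.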

This is a list of references for SDE:

  • NERSC: This page contains a section on how to use SDE with Edison.

Compiling PICSAR for SDE

First of all, you have to compile PICSAR with the right flags for SDE. We recommend creating a new PICSAR installation in your SCRATCH directory. We also recommend using the Intel compiler and the corresponding libraries for KNL:

module unload craype-haswell
module load craype-mic-knl
module unload PrgEnv-gnu
module load PrgEnv-intel

To set up your environment quickly, you can put these commands in a file and source it, for instance with source ~/.knl_sde_config. The file can look like this:

if [ "$NERSC_HOST" == "cori" ]
then

   # Modules
   module unload craype-haswell
   module unload PrgEnv-gnu

   module load PrgEnv-intel
   module load craype-mic-knl

   # Required at execution
   module load SDE

   # Path to PICSAR compiled with Intel for KNL and SDE
   PICSAR=$SCRATCH/Codes/install_intel_knl_sde/picsar/

   # PICSAR paths
   export PYTHONPATH="$PICSAR/python_bin/:$PYTHONPATH"
   export PYTHONPATH="$PICSAR/example_scripts_python/:$PYTHONPATH"
   export PYTHONPATH="$PICSAR/python_libs/:$PYTHONPATH"
   export PYTHONPATH="$PICSAR/postproc_python_script/:$PYTHONPATH"

fi

In this script, PICSAR is installed in

 $SCRATCH/Codes/install_intel_knl_sde/picsar/

Change the path to match your installation directory. Instead of sourcing this file, you can also set up your environment manually.
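Whichever way the environment is configured, a quick check before compiling or submitting a job can catch a wrong path or a missing module. The following is a minimal sketch; the function name check_sde_env is hypothetical, and it assumes that the PICSAR variable is set as in the script above and that loading the SDE module puts an sde command on the PATH.

```shell
#!/bin/bash
# Hypothetical sanity check (not part of PICSAR): run it after sourcing
# the configuration file or after setting up the environment manually.
check_sde_env() {
    local status=0
    # PICSAR must point to an existing installation directory.
    if [ ! -d "$PICSAR" ]; then
        echo "PICSAR directory not found: '$PICSAR'" >&2
        status=1
    fi
    # Assumption: the SDE module makes an 'sde' command available.
    if ! command -v sde >/dev/null 2>&1; then
        echo "sde command not found (is the SDE module loaded?)" >&2
        status=1
    fi
    return "$status"
}
```

A typical use would be source ~/.knl_sde_config && check_sde_env, which returns a non-zero status and prints the reason when something is missing.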

You can easily compile PICSAR for SDE using our Makefile:

make SYS=cori2 MODE=sde

The compilation will generate a binary file called picsar_cori2_sde.

The compilation uses the following flags:

  • -D SDE=1: tells PICSAR to activate the subroutines specific to SDE. SDE relies on markers placed in the code to profile specific sections. PICSAR has subroutines containing these markers, for instance to profile only the kernel without the initialization.
  • -g: produces symbolic debug information in the object file (required).
  • -dynamic
  • -O3 -xMIC-AVX512 -qopenmp: enables optimization, vectorization for KNL, and OpenMP.
  • -debug inline-debug-info: adds more debug information.
  • -align array64byte: aligns data for better vectorization efficiency.

 

The steps above compile PICSAR for SDE on KNL. Running SDE on KNL is not the most efficient choice, particularly for big runs: SDE will run faster on Ivy Bridge or Haswell. In that case, you can compile for Haswell with:

make SYS=cori1 MODE=sde

Or for Edison:

make SYS=edison MODE=sde
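The job script needs the name of the binary produced by each of these builds. The sketch below derives it from SYS and MODE, assuming the picsar_<SYS>_<MODE> pattern holds for every system. The text confirms it only for cori2 (picsar_cori2_sde), so check the names for cori1 and edison against your build output. The function name picsar_binary_name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: expected binary name for a given SYS and MODE.
# Assumption: the picsar_<SYS>_<MODE> pattern documented for cori2
# also applies to the other systems.
picsar_binary_name() {
    local sys="$1"
    local mode="$2"
    echo "picsar_${sys}_${mode}"
}

for sys in cori2 cori1 edison; do
    echo "make SYS=$sys MODE=sde  ->  $(picsar_binary_name "$sys" sde)"
done
```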

Running PICSAR with SDE

In the following script, PICSAR is run with SDE on a single KNL node with 4 MPI ranks and 32 OpenMP threads per MPI rank.

#!/bin/bash -l
#SBATCH --job-name=sde_analysis
#SBATCH --time=01:00:00
#SBATCH -N 1
##SBATCH -S 4
#SBATCH -p knl
#SBATCH -C knl,quad,cache
#SBATCH -e sde_analysis.err
#SBATCH -o sde_analysis.out

export OMP_NUM_THREADS=32
export OMP_STACKSIZE=128M
export OMP_DISPLAY_ENV=true
export OMP_PROC_BIND=spread
export OMP_PLACES=cores"(16)"
export OMP_SCHEDULE=dynamic

source ~/.knl_sde_config

cp $PICSAR/fortran_bin/picsar_cori2_sde .

# Add the following "sbcast" line here for jobs larger than 1500 MPI tasks:
# sbcast ./mycode.exe /tmp/mycode.exe

numactl -H

srun -n 4 -c 64 sde -knl -d -iform 1 -omix my_mix.out -i -global_region -start_ssc_mark 111:repeat -stop_ssc_mark 222:repeat -- ./picsar_cori2_sde

The sde command in the script above uses the following arguments:

  • -start_ssc_mark 111:repeat: profiling starts each time the code reaches marker 111.
  • -stop_ssc_mark 222:repeat: profiling stops each time the code reaches marker 222.
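Since the purpose of the run is to count flops, the mix file requested with -omix has to be post-processed afterwards. The sketch below is one possible way to do it, under several assumptions that you should check against your own output: that the mix file contains counter lines of the form "*elements_fp_<precision>_<width>  <count>", that each counter appears once per file, and that a rough estimate is acceptable (fused multiply-add instructions are counted once here and masked vector lanes are ignored). The function name sum_fp_elements is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: estimate floating-point element operations from
# one or more SDE mix files.
# Assumption: counters look like "*elements_fp_double_8   12345", where
# the trailing number of the name is the vector width in elements.
# Usage: sum_fp_elements <single|double> <mix file> [<mix file> ...]
sum_fp_elements() {
    local precision="$1"
    shift
    awk -v prec="$precision" '
        $1 ~ ("^\\*elements_fp_" prec "_[0-9]+$") {
            n = split($1, parts, "_")      # last field is the width
            total += parts[n] * $2         # width x instruction count
        }
        END { printf "%.0f\n", total }     # prints 0 when nothing matches
    ' "$@"
}
```

For example, sum_fp_elements double followed by the mix file or files written by your run prints a single number for double precision. Treat it as an estimate only, for the reasons listed above.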

 

Mathieu Lobet, last update: January 26, 2017