dev:profiling

Differences

This shows you the differences between two versions of the page.

profiling [2013/10/16 09:27] – [nvprof] oschuett
dev:profiling [2020/08/21 10:15] (current) – external edit 127.0.0.1
Line 70: Line 70:
   - TOTAL TIME: How much time is spent in this subroutine, including time spent in timed subroutines. AVERAGE and MAXIMUM as defined above.
 
-Note that, for the threaded code, only the master thread is instrumented.
+By default, only routines contributing more than 2% of the total runtime are included in the timing report. To see smaller routines, set a smaller cut-off with the [[http://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL/TIMINGS.html#desc_THRESHOLD|GLOBAL%TIMINGS%THRESHOLD]] keyword.
 
+Note that, for the threaded code, only the master thread is instrumented.
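
As a minimal sketch of the cut-off mentioned above, the following shell snippet writes a CP2K input fragment that lowers GLOBAL%TIMINGS%THRESHOLD. The file name ''timings_fragment.inp'' and the value 0.001 are illustrative assumptions. The sketch also assumes that the keyword takes a fraction of the total runtime (so the 2% default would correspond to 0.02); please check the linked manual page before relying on it.

```shell
#!/bin/sh
# Sketch only: write a CP2K input fragment that lowers the timing cut-off.
# Assumptions (not from the original page): the file name and the value 0.001,
# read here as a fraction of the total runtime, i.e. 0.1%.
cat > timings_fragment.inp <<'EOF'
&GLOBAL
  &TIMINGS
    THRESHOLD 0.001
  &END TIMINGS
&END GLOBAL
EOF
```

The TIMINGS subsection would normally be merged into the existing &GLOBAL section of the input file rather than kept as a separate file.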
 ==== Modifying the timing report ====
  
Line 227: Line 228:
 export PMI_NO_FORK=1
 # no cuda proxy
-# export CRAY_CUDA_PROXY=1
+# export CRAY_CUDA_MPS=1
 # use all cores with OMP
 export OMP_NUM_THREADS=8
 # use aprun in MPMD mode to have only the output from the master rank (here 169 nodes are used)
-COMMAND="/cp2k.psmp -i test.inp -o test.out-profile"
+COMMAND="./cp2k.psmp -i test.inp -o test.out-profile"
 PART1="-N 1  -n 1 -d ${OMP_NUM_THREADS} nvprof -o log.nvprof ${COMMAND}"
 PART2="-N 1  -n 168 -d ${OMP_NUM_THREADS} ${COMMAND}"
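
The hunk above defines the two MPMD segments but does not show how they are combined. A possible sketch, assuming the usual aprun convention of separating MPMD segments with a colon, is given below. The ''RUN'' switch and the ''APRUN_LINE'' variable are hypothetical names introduced here so that the command is only printed by default.

```shell
#!/bin/sh
# Sketch only: combine the profiled rank (PART1) and the remaining ranks (PART2)
# into a single aprun MPMD invocation. RUN and APRUN_LINE are hypothetical names.
export OMP_NUM_THREADS=8
COMMAND="./cp2k.psmp -i test.inp -o test.out-profile"
# rank 0 runs under nvprof, so only one profile (log.nvprof) is written
PART1="-N 1 -n 1 -d ${OMP_NUM_THREADS} nvprof -o log.nvprof ${COMMAND}"
# the other 168 ranks run the plain executable
PART2="-N 1 -n 168 -d ${OMP_NUM_THREADS} ${COMMAND}"
# assumption: aprun separates MPMD segments with " : "
APRUN_LINE="aprun ${PART1} : ${PART2}"
if [ "${RUN:-0}" = "1" ]; then
  # intentionally unquoted so the shell splits the line into arguments
  ${APRUN_LINE}
else
  # dry run: show the command that would be launched
  echo "${APRUN_LINE}"
fi
```

Printing the line first makes it easy to check the rank counts (1 + 168 = 169) before submitting the job.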
dev/profiling.1381915673.txt.gz · Last modified: 2020/08/21 10:14 (external edit)