dev:profiling
Differences
This shows you the differences between two versions of the page.
profiling [2013/08/13 10:53] – [Valgrind] 129.132.169.21 | dev:profiling [2020/08/21 10:15] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 70: | Line 70: | ||
- TOTAL TIME: How much time is spent in this subroutine, including time spent in timed subroutines. AVERAGE and MAXIMUM as defined above | - TOTAL TIME: How much time is spent in this subroutine, including time spent in timed subroutines. AVERAGE and MAXIMUM as defined above | ||
- | Note that, for the threaded code, only the master thread is instrumented. | + | By default, only routines contributing at least 2% of the total runtime are included in the timing report. |
+ | Note that, for the threaded code, only the master thread is instrumented. | ||
==== Modifying the timing report ==== | ==== Modifying the timing report ==== | ||
Line 212: | Line 213: | ||
The result, a file named callgrind.out.XXX, | The result, a file named callgrind.out.XXX, | ||
+ | ===== nvprof ===== | ||
+ | |||
+ | CUDA code can be profiled conveniently with the nvprof tool. It helps to enable user events, which requires compiling cp2k with < |
+ | < | ||
+ | nvprof -o log.nvprof ./cp2k.sopt -i test.inp -o test.out | ||
+ | </ | ||
+ | and visualize log.nvprof with the nvvp tool, which might take several minutes to load the data. |
+ | |||
+ | An example profile for a linear scaling benchmark (TiO2) is shown here | ||
+ | {{ :: | ||
+ | |||
+ | To run in parallel on Cray architectures, the following additional settings are needed: |
+ | < | ||
+ | export PMI_NO_FORK=1 | ||
+ | # no cuda proxy | ||
+ | # export CRAY_CUDA_MPS=1 | ||
+ | # use all cores with OMP | ||
+ | export OMP_NUM_THREADS=8 | ||
+ | # use aprun in MPMD mode to have only the output from the master rank (here 169 nodes are used) | ||
+ | COMMAND=" | ||
+ | PART1=" | ||
+ | PART2=" | ||
+ | aprun ${PART1} : ${PART2} | ||
+ | </ | ||
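The MPMD launch above restricts profiling to the master rank. A different way to get the same effect is a small per-rank wrapper, sketched below as a dry run that only prints the command each rank would execute. This is not the method used on this page: the function name `rank_command` is hypothetical, and the use of `ALPS_APP_PE` as the rank variable under aprun is an assumption that should be checked on the target system. The cp2k and nvprof command lines are the ones from the single-process example.

```shell
#!/bin/sh
# Dry-run sketch (hypothetical helper): print the command a given rank
# would run. Only rank 0 is wrapped in nvprof; every other rank runs
# cp2k directly, so a single profile file is written.
rank_command() {
    rank="$1"
    base="./cp2k.sopt -i test.inp -o test.out"
    if [ "$rank" -eq 0 ]; then
        echo "nvprof -o log.nvprof $base"
    else
        echo "$base"
    fi
}

# Assumption: ALPS_APP_PE holds the rank when launched through aprun.
# Outside of aprun it is unset, so the sketch falls back to rank 0.
rank_command "${ALPS_APP_PE:-0}"
```

To turn the dry run into a real wrapper, the `echo` calls would be replaced by `exec` of the same command lines; the launcher options themselves are left out here because they depend on the site setup.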
dev/profiling.1376391214.txt.gz · Last modified: 2020/08/21 10:14 (external edit)