howto:compile_with_cuda

Differences

This shows you the differences between two versions of the page.

howto:compile_with_cuda [2019/04/09 10:19] alazzaro
howto:compile_with_cuda [2019/12/18 11:12] alazzaro
Line 10:
 NVCC    = /path_to_cuda/bin/nvcc
 DFLAGS += -D__ACC -D__DBCSR_ACC -D__PW_CUDA
-LIBS   += -lcudart -lcublas -lcufft -lrt
+LIBS   += -lcudart -lcublas -lcufft -lnvrtc
 </code>
  
Line 18:
 
 ===== Libcusmm =====
-The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}≤80, see [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here ]] for more details. The //DBCSR Statistics// are printed at the end of every CP2K-run, example
+The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}≤80, see [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here]] for more details. The //DBCSR Statistics// are printed at the end of every CP2K-run, example
 
 <code>
Line 38:
 </code>
 
-More supported GPUs can be added, please refer to the description [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here ]].
-
-New kernel parameters have to be optimized, which [[howto:libcusmm | this howto]] explains in detail.
+More supported GPUs can be added, please refer to [[https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/tune.md | this howto]].
 
 ===== Profiling =====
-If you are interested in profiling CP2K with nvprof have a look at [[dev:profiling#nvprof | these remarks ]].
+If you are interested in profiling CP2K with nvprof have a look at [[dev:profiling#nvprof | these remarks]].
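
Note on the Libcusmm hunk: the page text says that, as of DBCSR 1.1, libcusmm can by default generate any kernel for {m,n,k}≤80. The following is a minimal sketch of how one might check a list of block sizes against that limit. The helper name `within_default_range` and the constant `DEFAULT_MAX_DIM` are hypothetical and not part of DBCSR or CP2K; the value 80 is taken only from the sentence above, and the sketch says nothing about what happens to block sizes outside that range.

```python
# Hypothetical helper for illustration only; not part of DBCSR or CP2K.
DEFAULT_MAX_DIM = 80  # limit quoted in the page text for DBCSR 1.1 (assumption: applies to m, n and k alike)


def within_default_range(m, n, k, max_dim=DEFAULT_MAX_DIM):
    """Return True if all three block dimensions are positive and at most max_dim."""
    return all(0 < d <= max_dim for d in (m, n, k))


if __name__ == "__main__":
    # Example block sizes; in practice they depend on the employed basis set.
    for dims in [(23, 23, 23), (5, 13, 80), (32, 96, 32)]:
        print(dims, within_default_range(*dims))
```

The actual block sizes of a run can be read from the //DBCSR Statistics// printed at the end of the CP2K output.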
howto/compile_with_cuda.txt · Last modified: 2020/08/21 10:15 by 127.0.0.1