howto:compile_with_cuda

Differences

This shows you the differences between two versions of the page.

Old: howto:compile_with_cuda [2019/04/09 10:06] alazzaro
New: howto:compile_with_cuda [2020/08/21 10:15] (current) - external edit 127.0.0.1
Line 10:
 NVCC    = /path_to_cuda/bin/nvcc
 DFLAGS += -D__ACC -D__DBCSR_ACC -D__PW_CUDA
-LIBS   += -lcudart -lcublas -lcufft -lrt
+LIBS   += -lcudart -lcublas -lcufft -lnvrtc
 </code> </code>
  
-See [[https://github.com/cp2k/cp2k/blob/master/INSTALL.md#2j-cuda-optional-improved-performance-on-gpu-systems | here]] for more details.
+See [[https://github.com/cp2k/cp2k/blob/master/INSTALL.md#2j-cuda-optional-improved-performance-on-gpu-systems | here]] for details.
 As a prerequisite the [[https://developer.nvidia.com/cuda-toolkit |Nvidia CUDA Toolkit ]] has to be installed.
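
Before building, it can help to confirm that the libraries named on the current LIBS line (cudart, cublas, cufft, nvrtc) are visible on the system. The following is a minimal sketch, assuming Python's standard-library `ctypes.util.find_library` is an acceptable proxy for what the linker will find; the helper name `find_cuda_libs` is hypothetical and not part of CP2K or the CUDA Toolkit.

```python
from ctypes.util import find_library

# Library names taken from the LIBS line (without the -l prefix).
CUDA_LIBS = ("cudart", "cublas", "cufft", "nvrtc")


def find_cuda_libs(names=CUDA_LIBS):
    """Map each library name to what find_library reports (None if not found)."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, location in find_cuda_libs().items():
        print(f"-l{name}: {location or 'NOT FOUND'}")
```

A missing entry only means the library was not found in the default search locations; a toolkit installed under a custom prefix such as /path_to_cuda may still link correctly if its library directory is passed to the linker explicitly.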
  
  
 ===== Libcusmm =====
-The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}≤80, see [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here ]] for more details. The //DBCSR Statistics// are printed at the end of every CP2K-run, example
+The acceleration of DBCSR is performed by libcusmm. This library provides a number of kernels. Each of these kernels can multiply blocks of specific blocksizes. The blocksizes of a simulation are determined by the employed basis-set. As of DBCSR 1.1, by default libcusmm is able to generate any kernel for {m,n,k}≤80, see [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here]] for more details. The //DBCSR Statistics// are printed at the end of every CP2K-run, example
  
 <code> <code>
Line 38:
 </code> </code>
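
The block-size limit stated in the Libcusmm section ({m,n,k}≤80 by default as of DBCSR 1.1) can be expressed as a simple check. This is only an illustrative sketch: the function name `covered_by_default` and the constant `MAX_DEFAULT_DIM` are hypothetical and do not exist in DBCSR or libcusmm, and the requirement that dimensions be positive is an added assumption.

```python
# Limit quoted in the text: default kernels cover {m,n,k} <= 80.
MAX_DEFAULT_DIM = 80


def covered_by_default(m, n, k, limit=MAX_DEFAULT_DIM):
    """Return True if every block dimension is positive and within the limit."""
    return all(0 < d <= limit for d in (m, n, k))


if __name__ == "__main__":
    # Block-size triples chosen purely for illustration.
    for triple in [(23, 23, 23), (80, 80, 80), (81, 5, 5)]:
        print(triple, covered_by_default(*triple))
```

Block sizes outside this range would need the approach described in the linked libcusmm documentation.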
  
-More supported GPUs can be added, please refer to the description [[ https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/README.md | here ]].
-
-New kernel parameters have to be optimized, which [[howto:libcusmm | this howto]] explains in detail.
+More supported GPUs can be added, please refer to [[https://github.com/cp2k/dbcsr/blob/develop/src/acc/libsmm_acc/libcusmm/tune.md | this howto]].
  
 ===== Profiling =====
-If you are interested in profiling CP2K with nvprof have a look at [[dev:profiling#nvprof | these remarks ]].
+If you are interested in profiling CP2K with nvprof have a look at [[dev:profiling#nvprof | these remarks]].
howto/compile_with_cuda.1554804382.txt.gz · Last modified: 2020/08/21 10:15 (external edit)