
howto:compile

Differences

This shows you the differences between two versions of the page.

howto:compile [2014/02/09 23:08] – [Small Matrix Multiplication Library] ltong
howto:compile [2023/11/13 13:04] (current) oschuett
Line 1: Line 1:
-====== How to Compile CP2K ====== +This page has been moved to: https://github.com/cp2k/cp2k/blob/master/INSTALL.md
- +
-===== Prerequisites ===== +
- +
-You need the following before you can compile CP2K +
- +
-  * [[compile#Obtaining a copy of CP2K source|A copy of CP2K code]] +
-  * [[compile#GNU make|GNU make]] +
-  * [[compile#Fortran 95 Compiler|A Fortran 95 compiler]] +
-  * [[compile#yacc|yacc]] +
-  * [[compile#BLAS and LAPACK|A version of BLAS and LAPACK linear algebra libraries]] +
-  * [[compile#MPI and SCALAPACK|MPI (version 2) and SCALAPACK]] (optional, required for parallel version) +
-  * [[compile#Exchange-Correlation Functionals Library|libxc]] (optional, for exchange-correlation functionals used in QUICKSTEP) +
-  * [[compile#Fast Fourier Transform Library|FFTW]] (optional) +
-  * [[compile#Hartree-Fock Exchange|libint]] (optional, for Hartree-Fock exchange) +
-  * [[compile#Small Matrix Multiplication Library|libsmm]] (optional) +
-  * [[compile#Library ELPA|Library ELPA]] (optional, replaces SCALAPACK ''SYEVD'' if used) +
-  * [[compile#CUDA Support|CUDA]] (optional, for utilising GPUs) +
-  * [[compile#Machine Architecture Abstraction Support|Machine architecture abstraction support]] (optional) +
-  * [[compile#Process Mapping Support|Process mapping support]] (optional) +
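
A quick way to see which of the mandatory build tools are already on the ''PATH'' is a small shell loop. The sketch below is only an illustration: the helper name ''have_tool'' is made up here and is not part of CP2K, and ''gfortran'' stands in for whichever Fortran 95 compiler you intend to use.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME is an executable found on the PATH.
# (have_tool is a hypothetical helper, not part of CP2K.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Mandatory tools named in the list above; gfortran is an assumed compiler choice.
for tool in make gfortran yacc; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

The loop only reports what it finds and never aborts, so it can be run safely on a machine where some tools are still missing.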
- +
-==== Obtaining a copy of CP2K source ==== +
- +
-To obtain a copy of CP2K source, please follow the instructions given [[:download|here]]. +
- +
-==== GNU make ==== +
- +
-GNU make should be on your system (gmake or make on Linux) and it is used for the build. If you do not have it, go to +
- +
-http://www.gnu.org/software/make/make.html +
- +
-and download from +
- +
-http://ftp.gnu.org/pub/gnu/make/ +
- +
-==== Fortran 95 Compiler ==== +
- +
-A Fortran 95 compiler should be installed on your system. We have good experience with gfortran 4.4.X and above.  Be aware that some compilers have bugs that might cause them to fail (internal compiler errors, segfaults) or, worse, yield a miscompiled CP2K.  +
- +
-//Please report bugs to compiler vendors; they (and we) have an interest in fixing them.// +
- +
-==== yacc ==== +
- +
-''yacc'' is needed to compile the dependency generator. It can be found as a part of the GNU bison package: +
- +
-http://www.gnu.org/software/bison/ +
- +
-==== BLAS and LAPACK ==== +
- +
-BLAS and LAPACK linear algebra libraries should be installed. Using vendor-provided libraries can make a very significant difference (up to 100%, e.g., ACML, MKL, ESSL). +
- +
-//Note that the BLAS/LAPACK libraries must match the Fortran compiler used//.  +
- +
-Use the latest versions available and download all patches! The canonical BLAS and LAPACK can be obtained from the Netlib repository. +
- +
-  * http://www.netlib.org/blas/ +
-  * http://www.netlib.org/lapack/ and see also +
-  * http://www.netlib.org/lapack-dev/ +
- +
-A faster alternative is to use the ATLAS project. It provides BLAS and enough of LAPACK to run CP2K, both optimized for the local machine upon installation: [[http://math-atlas.sourceforge.net/]] +
- +
-GotoBLAS is a yet faster BLAS alternative: [[http://www.tacc.utexas.edu/resources/software/]] +
- +
-If compiling with OpenMP support then it is recommended to use a //non-threaded// version of BLAS. +
- +
-==== MPI and SCALAPACK ==== +
- +
-MPI (version 2) and ScaLAPACK are needed for parallel code (''popt'' and ''psmp'' versions). Use the latest versions available and download all patches! +
- +
-If your computing platform does not provide MPI, there are several freely available alternatives: +
- +
-  * MPICH2 MPI:  http://www-unix.mcs.anl.gov/mpi/mpich/ +
-  * OpenMPI MPI: http://www.open-mpi.org/ +
-   +
-ScaLAPACK can be part of ACML or cluster MKL.  These libraries are recommended if available.   +
- +
-Canonical ScaLAPACK can be obtained from +
-   +
-http://www.netlib.org/scalapack/ +
-  +
-and see also http://www.netlib.org/lapack-dev/ +
-     +
-Recently a ScaLAPACK installer has been added that makes installing ScaLAPACK easier: http://www.netlib.org/scalapack/scalapack_installer.tgz +
- +
-==== Exchange-Correlation Functionals Library ==== +
- +
-The version 2.0.1 (**ONLY this one**) of libxc needs to be downloaded from +
- +
-http://www.tddft.org/programs/octopus/wiki/index.php/Libxc +
- +
-and installed (to ''$LIBXC_DIR'', whichever directory that may be). +
- +
-During the installation, the directory ''$LIBXC_DIR/lib'' is created. Add the preprocessor flag +
- +
-| %%-D__LIBXC2%% | +
- +
-to ''DFLAGS'', and +
- +
-| %%-L$(LIBXC_DIR)/lib -lxc%% | +
- +
-to ''LIBS'' in your [[compile#The arch File|arch file]]. +
- +
-==== Fast Fourier Transform Library ==== +
- +
-FFTW can be used to improve FFT speed on a wide range of architectures. +
- +
-It is strongly recommended to install and use FFTW3.  The current version of CP2K works with FFTW 3.X: +
- +
-http://www.fftw.org/ +
- +
-Note that FFTW must know the Fortran compiler you will use in order to install properly (e.g., ''export F77=gfortran'' before configure if you intend to use gfortran). On machines and compilers that support SSE, you can configure FFTW3 with ''--enable-sse2''. +
- +
-Compilers/systems that do not align memory (NAG f95, Intel IA32/gfortran) should either not use ''--enable-sse2'' or otherwise add +
- +
-| %%-D__FFTW3_UNALIGNED%% | +
- +
-to ''DFLAGS'' in the [[compile#The arch File|arch file]]. +
- +
-When building an OpenMP parallel version of CP2K (''ssmp'' or ''psmp''), the FFTW3 threading library ''libfftw3_threads'' (or ''libfftw3_omp'') is required. These can be generated using the ''--enable-threads'' and ''--enable-openmp'' flags during configuration of FFTW3. +
- +
-When using FFTW3 library, add +
- +
-| %%-D__FFTW3%% | +
- +
-to ''DFLAGS'', and link to the appropriate libraries in your [[compile#The arch File|arch file]]. +
- +
-==== Hartree-Fock Exchange ==== +
- +
-Hartree-Fock exchange (optional) requires the libint package to be installed. +
- +
-It is easiest to install with a Fortran compiler that supports ISO_C_BINDING and Fortran procedure pointers (recent gfortran, xlf90, ifort). +
- +
-Additional information can be found in +
-<code> +
-cp2k/tools/hfx_tools/libint_tools/README_LIBINT +
-</code> +
-Tested against libint-1.1.4 and currently hardcoded to the default angular momentum  +
-<code> +
-LIBINT_MAX_AM 5 +
-</code> +
-(check your ''include/libint/libint.h'' to see if it matches) +
- +
-http://www.chem.vt.edu/chem-dept/valeev/software/libint/libint.html +
-      +
-//Note, do **NOT** use libint-1.1.3.// +
- +
-When using libint library, add +
- +
-| %%-D__LIBINT%% | +
- +
-to ''DFLAGS'', and link to the appropriate libraries in your [[compile#The arch File|arch file]]. +
- +
-==== Small Matrix Multiplication Library ==== +
- +
-A library for small matrix multiplications comes with the CP2K package. This library, if built and used with CP2K, should speed up your calculations significantly (depending on the problem and your machine). +
- +
-The library can be built from the included source: +
-<code> +
-cp2k/tools/build_libsmm +
-</code> +
-See the README file inside the ''build_libsmm'' directory. +
- +
-Usually only the double precision real and perhaps complex is needed. Add the following to ''DFLAGS'' in your [[compile#The arch File|arch file]] +
- +
-| %%-D__HAS_smm_dnn%% | to make the code use the double precision real library | +
-| %%-D__HAS_smm_snn%% | to make the code use the single precision real library | +
-| %%-D__HAS_smm_znn%% | to make the code use the double precision complex library | +
-| %%-D__HAS_smm_cnn%% | to make the code use the single precision complex library | +
-| %%-D__HAS_smm_vec%% | to enable the new vectorized interface of ''libsmm''+
- +
-==== Library ELPA ==== +
- +
-This is an alternative library to ScaLAPACK for the solution of eigenvalue problems. A version of ELPA can be downloaded from +
- +
-http://elpa.rzg.mpg.de/software +
- +
-ELPA replaces the ScaLAPACK ''SYEVD'' to improve the performance of the diagonalization. For specific architectures it may be better to install specifically optimized kernels and/or employ a higher optimization level to compile it. +
- +
-During the installation, the ''libelpa.a'' (or ''libelpa_mt.a'' if multi-thread support is enabled) is created. We tested the version of November 2013, with generic kernel and with/without OpenMP. +
- +
-To use ELPA, add +
- +
-| %%-D__ELPA%% | +
- +
-to ''DFLAGS'' and +
- +
-| %%-L$(ELPA_DIR)/lib -lelpa%% |  +
- +
-to ''LIBS'' in your [[compile#The arch File|arch file]]. +
- +
-==== CUDA Support ==== +
- +
-**This is still experimental.** +
- +
-Add +
- +
-| %%-D__DBCSR_CUDA%% | +
- +
-to ''DFLAGS'' in your [[compile#The arch File|arch file]] to compile with CUDA support for matrix multiplication. For linking, add +
- +
-| %%-lcudart -lrt%% | +
- +
-to ''LIBS'' in your [[compile#The arch File|arch file]]. The compiler must support ISO_C_BINDING. +
- +
-Use +
- +
-| %%-D__PW_CUDA%% | +
- +
-in ''DFLAGS'' for CUDA support for PW (gather/scatter/fft) calculations. The Fortran compiler must use an appended underscore for linking C subroutines. +
- +
-Use +
- +
-| %%-D__CUDA_PROFILING%% | +
- +
-in ''DFLAGS'' to turn on NVIDIA Tools Extensions. +
- +
-Consult ''cp2k/cuda_tools/README'' in the CP2K source for more information. +
- +
-==== Machine Architecture Abstraction Support ==== +
- +
-**Still under development** +
- +
-Add +
- +
-| %%-D__HWLOC%% | +
- +
-or +
- +
-| %%-D__LIBNUMA%% | +
- +
-to ''DFLAGS'' to compile with ''hwloc'' or ''libnuma'' support for machine architecture and process/thread/memory placement and visualization. It is necessary to link with +
- +
-| %%-lhwloc%% | +
- +
-or +
- +
-| %%-lnuma%% | +
- +
-The compiler must support ISO_C_BINDING. +
- +
-Machine architecture visualization is supported only with ''hwloc''. Process/threads/memory placement and visualization is supported by both ''hwloc'' and ''libnuma''+
- +
-Note that it is not possible to use ''hwloc'' and ''libnuma'' at the same time. +
- +
-Consult ''cp2k/machine/README'' in CP2K source for more information. +
- +
-==== Process Mapping Support ==== +
- +
-**Still under development** +
- +
-Use the target machine to compile with topology support. +
- +
-You can also define the strategy to be used using a command line, with +
-<code> +
--mpi-mapping [1,2,3,4,5,6,7] +
-</code> +
- +
-  * ''1'' = SMP-style rank ordering +
-  * ''2'' = file based rank ordering +
-  * ''3'' = Hilbert space-filling curve +
-  * ''4'' = Peano space-filling curve +
-  * ''5'' = Round-Robin rank ordering +
-  * ''6'' = Hilbert-Peano space-filling curve +
-  * ''7'' = Cannon pattern mapping +
- +
-The compiler must support ISO_C_BINDING +
- +
-Consult ''cp2k/machine/README'' in CP2K source for more information. +
- +
-===== Compiling the Code ===== +
- +
-==== I am Feeling Lucky ==== +
- +
-The "I'm feeling lucky" version of building will try to guess what architecture you are on. Just type +
-<code> +
-make sopt +
-</code> +
-in ''cp2k/makefiles'', and the script ''cp2k/tools/get_arch_code'' will try to guess your architecture. You can set the ''FORT_C_NAME'' to indicate the compiler part of the architecture string: +
-<code> +
-export FORT_C_NAME=gfortran +
-</code> +
- +
-If you are not feeling lucky... Or you want to know exactly what you are doing when compiling CP2K, and what options are available, please read on. +
- +
-==== The arch File ==== +
- +
-The locations of the compilers and libraries need to be specified, together with compilation options, in an "arch" file in ''cp2k/arch'' of the CP2K source. Examples for a number of common architectures are already available in the directory (e.g., ''Linux-x86-64-gfortran.sopt''). +
- +
-Conventionally, there are four versions: +
- +
-  * ''sopt'' = serial version +
-  * ''popt'' = parallel, MPI only version --- recommended for general usage +
-  * ''ssmp'' = parallel, OpenMP only version +
-  * ''psmp'' = parallel, MPI + OpenMP +
- +
-You will need to modify one of these files to match your system's settings. +
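
As a rough illustration of what such a file contains, the sketch below writes a hypothetical serial arch file ''foo.sopt'' into a scratch directory. Only ''DFLAGS'' and ''LIBS'' are named on this page; the other variable names, the compiler, the optimisation flags and the library paths are assumptions and have not been tested against any CP2K release. Start from one of the shipped examples rather than from this fragment.

```shell
#!/bin/sh
# Write a hypothetical serial arch file, foo.sopt, into a scratch directory.
# Only DFLAGS and LIBS are named on this page; every other variable name,
# the compiler, the flags and the library paths are assumptions.
archdir="${TMPDIR:-/tmp}/cp2k-arch-example"
mkdir -p "$archdir"

# The quoted 'EOF' keeps $(DFLAGS) and $(FCFLAGS) literal for make to expand.
cat > "$archdir/foo.sopt" <<'EOF'
CC       = cc
FC       = gfortran
LD       = gfortran
AR       = ar -r
DFLAGS   = -D__GFORTRAN -D__FFTSG -D__FFTW3
FCFLAGS  = -O2 $(DFLAGS)
LDFLAGS  = $(FCFLAGS)
LIBS     = -L/path/to/fftw3/lib -lfftw3 -llapack -lblas
EOF

echo "wrote $archdir/foo.sopt"
```

Every optional library described above adds one preprocessor flag to ''DFLAGS'' and, where it is a separate library, one entry to ''LIBS''.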
- +
-==== Compilation Commands ==== +
- +
-After you have finished creating or editing your own arch file in ''cp2k/arch'', you can build CP2K in directory ''cp2k/makefiles'' using the following commands: +
-<code> +
-make -j N ARCH=architecture VERSION=version +
-</code>  +
-where ''-j N'' allows for a parallel build using ''N'' processes; ''architecture'' corresponds to the root-name of your arch file, and version is one of ''sopt'', ''popt'', ''ssmp'' or ''psmp''+
- +
-For example, if you have created ''foo.sopt'' in ''cp2k/arch'', then in ''cp2k/makefiles'', you type in the command: +
-<code> +
-make -j 4 ARCH=foo VERSION=sopt +
-</code> +
-to compile (with 4 processes in parallel) the serial version of CP2K, with compilers, libraries and options specified in the file ''cp2k/arch/foo.sopt''+
- +
-As a shortcut, you can build several versions of the code at once: +
-<code> +
-make -j N ARCH=architecture sopt popt ssmp psmp +
-</code> +
-provided you have the corresponding arch files already in place. +
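
Before starting a multi-version build it can help to check which arch files actually exist. The sketch below does this with a hypothetical helper, ''list_buildable'', which is not part of CP2K; the directory ''cp2k/arch'' and the architecture name ''foo'' simply follow the examples on this page. It prints the resulting ''make'' command instead of running it.

```shell
#!/bin/sh
# list_buildable ARCHDIR ARCH VERSION...: print each VERSION for which
# the file ARCHDIR/ARCH.VERSION exists, separated by spaces.
# (Hypothetical helper, not part of CP2K.)
list_buildable() {
    dir=$1
    arch=$2
    shift 2
    for version in "$@"; do
        if [ -f "$dir/$arch.$version" ]; then
            printf '%s ' "$version"
        fi
    done
}

# Show the make command without running it.
versions=$(list_buildable cp2k/arch foo sopt popt ssmp psmp)
echo "make -j 4 ARCH=foo $versions"
```

If no matching arch file is found, the printed command has no version targets, which is a hint that the arch files still need to be created.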
- +
-After a successful compilation, an executable should appear in ''cp2k/exe/*'' +
- +
-All compiled files, libraries, executables, etc. of all architectures and versions can be removed with +
-<code> +
-make distclean +
-</code> +
-in ''cp2k/makefiles''+
- +
-To remove only objects and mod files (i.e. keep exe) for a given ARCH/VERSION, use +
-<code> +
-make ARCH=architecture VERSION=version clean +
-</code> +
- +
-To remove everything for a given ARCH/VERSION use: +
-<code> +
-make ARCH=architecture VERSION=version realclean +
-</code> +
- +
-==== DFLAGS Options ==== +
- +
-The following flags should be present (or not) in the arch file: +
- +
-=== For parallel versions === +
- +
-| %%-D__parallel%%  | +
-| %%-D__BLACS%%     | +
-| %%-D__SCALAPACK%% | +
- +
-=== If using libint (needed for HF exchange) === +
- +
-| %%-D__LIBINT%% | +
- +
-=== For libxc (needed by QUICKSTEP DFT calculations) === +
- +
-| %%-D__LIBXC%% | +
- +
-=== If using ELPA in place of ''SYEVD'' to solve eigenvalue problems === +
- +
-| %%-D__ELPA%% | +
- +
-=== Various FFTs === +
- +
-| %%-D__FFTSG%%   | Stefan Goedecker FFT (should always be there)     | +
-| %%-D__FFTW3%%   | FFTW version 3                                    | +
-| %%-D__PW_CUDA%% | CUDA FFT and associated gather/scatter on the GPU | +
- +
-=== Various compilers/architectures needing their own machine_* file === +
- +
-| %%-D__NAG%% | if using NAG F95 compiler | +
-| %%-D__AIX%% | if using AIX compiler | +
-| %%-D__ABSOFT%% | if using ABSoft compiler | +
-| %%-D__PGI%% | if using PGI compiler | +
-| %%-D__INTEL%% | if using Intel compiler | +
-| %%-D__GFORTRAN%% | if using GNU ''gfortran'' compiler | +
-| %%-D__G95%% | if using ''g95'' compiler | +
-| %%-D__SX%% | | +
-| %%-D__DEC%% | | +
-| %%-D__XT3%% | if compiling on a XT3 machine | +
-| %%-D__XT5%% | if compiling on a XT5 machine | +
- +
-=== Various network interconnections === +
- +
-| %%-D__GEMINI%% | if Gemini interconnect is used in the cluster | +
-| %%-D__SEASTAR%% | if SeaStar interconnect is used in the cluster | +
-| %%-D__BLUEGENE%% | if BlueGene interconnect is used in the cluster | +
-| %%-D__NET%% | | +
- +
-Specific optimized core routines can be selected with +
- +
-| %%-D__GRID_CORE=X%% | +
- +
-with ''X''=1..6. Reasonable defaults are provided (see ''cp2k/src/lib/collocate_fast.F''), but trial and error might yield a small (~10%) speedup. +
- +
-Tuned versions of integrate and collocate routines can be generated using +
- +
-| %%-D__HAS_LIBGRID%% | +
- +
-and ''-L/path/to/libgrid.a'' in ''LIBS''. See ''cp2k/tools/autotune_grid/README'' for details. +
- +
-| %%-D__PILAENV_BLOCKSIZE=1024%% | +
- +
-or similar is a hack to overwrite (if the linker allows this) the ''PILAENV'' function provided by ScaLAPACK. This can lead to much improved ''PDGEMM'' performance. The optimal value depends on hardware (GPU?) and precise problem. +
- +
-=== Options controlling MPI behavior and capabilities === +
- +
-| %%-D__NO_MPI_THREAD_SUPPORT_CHECK%% | Workaround for MPI libraries that do not declare they are thread safe but you want to use them with OpenMP anyways. | +
-| %%-D__NO_MPI_MEMORY%% | Do not use MPI memory allocation/deallocation routines | +
- +
-=== Options on language features === +
- +
-CP2K currently assumes full Fortran 95 compliance and expects the ISO_C_BINDING module of Fortran 2003 to be present, which is commonly available in current compilers. For OpenMP, version 3.0 is assumed. +
- +
-If you get compilation errors about unsupported language features, then some flags may be used to reduce the language features required. +
- +
-In addition, some flags are used to declare compiler support for additional language features. +
- +
-== Subparts of Fortran 2003 or later that help various aspects of the code: == +
- +
-| %%-D__PTR_RANK_REMAP%% | compiler supports pointer rank remapping | +
-| %%-D__HAS_NO_ISO_C_BINDING%% | compiler does not support all needed ISO_C_BINDING features. (At least g95 0.91 silently fails with segfaults since it does not support C_F_POINTER.) | +
- +
-== Other language capabilities and support: == +
- +
-| %%-D__HAS_NO_OMP_3%% | CP2K assumes that compilers support OpenMP version 3. If this is not the case, specify this flag to compile. Runtime performance will be poorer on low number of processors. | +
-| %%-D__CRAY_POINTERS%% | Compiler supports CRAY pointers | +
-| %%-D__HAS_NO_CUDA_STREAM_PRIORITIES%% | Needed for CUDA sdk version < 5.5 | +
- +
-=== Additional esoteric, development and debugging options === +
- +
-This section can be safely skipped over. Listed here just for completeness besides the flags described in this document. +
- +
-| %%-D__NO_STATH_ACCESS%% | Do not try to read from ''/proc/self/statm'' to get memory usage information. This is otherwise attempted on several Linux-based architectures or when using the NAG or gfortran compilers. | +
-| %%-D__mp_timeset__%% | Timing of MPI routines. | +
-| %%-D__USE_LEGACY_WEIGHTS%% | Use legacy atomic weights | +
-| %%-D__NO_ASSUMED_SIZE_NOCOPY_ASSUMPTION%% | Do not assume that assumed-size dummy arguments will always be passed in by reference. Unless the ISO_C_BINDINGS is present, CP2K will //not// compile with this option. | +
-| %%-D__cray_pointers%% | CRAY pointers will be used in preference to the ISO_C_BINDINGS call to MPI_ALLOC_MEM | +
-| %%-D__PLASMA%% | PLASMA support for DBCSR (neglected, may not work) | +
-| %%-D__USE_PAT%% | Use with CRAY-PAT profiling | +
-| %%-D__HMD%% |   | +
-| %%-D__HPM%% |   | +
-| %%-D_USE_GA%% | Use Global Arrays Toolkit | +
- +
-===== Compiling Together With PLUMED v1.3 ===== +
- +
-  - Get version 1.3 of ''plumed'' from their svn repository +
-  - Unpack the ''plumed-1.3'' archive somewhere +
-  - Set the environment variable ''$plumedir'' to the root directory of the plumed distribution: ''export plumedir=/path/to/plumed-1.3'' +
-  - Symlink ''plumed-1.3/patches/plumedpatch_cp2k.sh'' into the CP2K ''src'' directory: ''ln -s $plumedir/patches/plumedpatch_cp2k.sh cp2k/src/'' +
-  - Run the ''plumedpatch_cp2k.sh'' script with parameter ''-patch'': ''./plumedpatch_cp2k.sh -patch''. It should create a subdirectory ''src-plumed'' containing a number of cpp files and a ''plumed.inc'' +
-  - Compile CP2K and PLUMED together (it is safer to run ''make distclean'' before compiling): ''make plumed -j ARCH=... VERSION=popt PLUMED=yes'' +
- +
-===== Tests ===== +
- +
-If CP2K compiled okay, you can run one of the test cases to try out the executable (most inputs in any of the ''cp2k/tests/*regtest*/'' directories are tested on a daily basis). +
- +
-<code> +
-cd /path/to/cp2k/cp2k/tests/QS/ +
-/path/to/cp2k/cp2k/exe/YOURMACHINE/cp2k.sopt C.inp +
-</code>  +
- +
-Systematic testing can be done following the description on [[dev:regtesting|regression testing]]. +
- +
- +
-===== Troubleshooting ===== +
- +
-  * If things fail, take a break... have a look at section [[compile#Options on language features]] and go back to section [[compile#The arch File]]. +
-  * If your compiler/machine is really special, it should not be too difficult to support it. Only ''cp2k/src/machine*.F'' (and possibly ''cp2k/src/dbcsr_lib/machine.F'') should be affected.+
howto/compile.1391987338.txt.gz · Last modified: 2020/08/21 10:15 (external edit)