This text is probably out of date and needs to be revised.

How to Compile CP2K

Prerequisites

You need the following before you can compile CP2K

Obtaining a copy of CP2K source

To obtain a copy of CP2K source, please follow the instructions given here.

GNU make

GNU make (invoked as gmake or make on Linux) is used for the build and should already be on your system. If you do not have it, go to

http://www.gnu.org/software/make/make.html

and download from

http://ftp.gnu.org/pub/gnu/make/

Fortran 95 Compiler

A Fortran 95 compiler should be installed on your system. We have good experience with gfortran 4.4.X and above. Be aware that some compilers have bugs that might cause them to fail (internal compiler errors, segfaults) or, worse, yield a miscompiled CP2K.

Please report bugs to compiler vendors; they (and we) have an interest in fixing them.

yacc

yacc is needed to compile the dependency generator. It can be found as a part of the GNU bison package:

http://www.gnu.org/software/bison/

BLAS and LAPACK

BLAS and LAPACK linear algebra libraries should be installed. Using vendor-provided libraries can make a very significant difference (up to 100%, e.g., ACML, MKL, ESSL).

Note that the BLAS/LAPACK libraries must match the Fortran compiler used.

Use the latest versions available and download all patches! The canonical BLAS and LAPACK can be obtained from the Netlib repository.

A faster alternative is to use the ATLAS project. It provides BLAS and enough of LAPACK to run CP2K, both optimized for the local machine upon installation: http://math-atlas.sourceforge.net/

GotoBLAS is another fast BLAS alternative: http://www.tacc.utexas.edu/resources/software/

If compiling with OpenMP support then it is recommended to use a non-threaded version of BLAS.
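
For illustration only (the path below is a placeholder and depends on where the libraries were installed; vendor libraries such as MKL, ACML or ESSL come with their own recommended link lines), linking the reference Netlib BLAS and LAPACK might look like this in the LIBS variable of your arch file:

LIBS = -L/path/to/lapack -llapack -lblas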

MPI and SCALAPACK

MPI (version 2) and ScaLAPACK are needed for parallel code (popt and psmp versions). Use the latest versions available and download all patches!

If your computing platform does not provide MPI, there are several freely available alternatives, e.g., MPICH or OpenMPI.

ScaLAPACK can be part of ACML or cluster MKL. These libraries are recommended if available.

Canonical ScaLAPACK can be obtained from

http://www.netlib.org/scalapack/

and see also http://www.netlib.org/lapack-dev/

Recently a ScaLAPACK installer has been added that makes installing ScaLAPACK easier: http://www.netlib.org/scalapack/scalapack_installer.tgz
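
As a hedged sketch (library names and the need for separate BLACS libraries depend on the ScaLAPACK version; recent releases bundle BLACS, older ones require it separately), a LIBS line for a self-built ScaLAPACK stack might read:

LIBS = -L/path/to/scalapack -lscalapack -L/path/to/lapack -llapack -lblas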

Exchange-Correlation Functionals Library

Version 2.0.1 of libxc (and ONLY this version) needs to be downloaded from

http://www.tddft.org/programs/octopus/wiki/index.php/Libxc

and installed (to $LIBXC_DIR, whatever directory that may be).

During the installation, the directory $LIBXC_DIR/lib is created. Add the preprocessor flag

-D__LIBXC2

to DFLAGS, and

-L$(LIBXC_DIR)/lib -lxc

to LIBS in your arch file.
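
Taken together, the arch-file additions might look like the following sketch (LIBXC_DIR is whatever prefix you installed to; the include path is an assumption and is only needed if your build uses the libxc Fortran modules):

DFLAGS  += -D__LIBXC2
FCFLAGS += -I$(LIBXC_DIR)/include
LIBS    += -L$(LIBXC_DIR)/lib -lxc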

Fast Fourier Transform Library

FFTW can be used to improve FFT speed on a wide range of architectures.

It is strongly recommended to install and use FFTW3. The current version of CP2K works with FFTW 3.X:

http://www.fftw.org/

Note that FFTW must know which Fortran compiler you will use in order to install properly (e.g., export F77=gfortran before running configure if you intend to use gfortran). On machines and compilers which support SSE, you can configure FFTW3 with --enable-sse2.

Compilers/systems that do not align memory (NAG f95, Intel IA32/gfortran) should either not use --enable-sse2 or add

-D__FFTW3_UNALIGNED

to DFLAGS in the arch file.

When building an OpenMP parallel version of CP2K (ssmp or psmp), the FFTW3 threading library libfftw3_threads (or libfftw3_omp) is required. These can be generated using the --enable-threads and --enable-openmp flags when configuring FFTW3.
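
For example, a threaded FFTW3 build for use with gfortran could be configured roughly as follows (the install prefix is a placeholder):

export F77=gfortran
./configure --prefix=$HOME/libs/fftw3 --enable-sse2 --enable-threads
make && make install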

When using the FFTW3 library, add

-D__FFTW3

to DFLAGS, and link to the appropriate libraries in your arch file.
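
The corresponding arch-file additions might then be (FFTW3_DIR is a placeholder for the install prefix; the threads library is only needed for ssmp/psmp builds):

DFLAGS  += -D__FFTW3
FCFLAGS += -I$(FFTW3_DIR)/include
LIBS    += -L$(FFTW3_DIR)/lib -lfftw3 -lfftw3_threads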

Hartree-Fock Exchange

Hartree-Fock exchange (optional) requires the libint package to be installed.

It is easiest to install with a Fortran compiler that supports ISO_C_BINDING and Fortran procedure pointers (recent gfortran, xlf90, ifort).

Additional information can be found in

cp2k/tools/hfx_tools/libint_tools/README_LIBINT

Tested against libint-1.1.4 and currently hardcoded to the default angular momentum

LIBINT_MAX_AM 5

(check your include/libint/libint.h to see if it matches)

http://www.chem.vt.edu/chem-dept/valeev/software/libint/libint.html

Note: do NOT use libint-1.1.3.

When using the libint library, add

-D__LIBINT

to DFLAGS, and link to the appropriate libraries in your arch file.
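
An illustrative fragment (the library names correspond to a default libint 1.1.4 build, and -lstdc++ is assumed to be needed when linking the C++ parts from a Fortran driver; adjust to your installation):

DFLAGS += -D__LIBINT
LIBS   += -L$(LIBINT_DIR)/lib -lderiv -lint -lstdc++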

Small Matrix Multiplication Library

A library for small matrix multiplies can be built from the included source:

cp2k/tools/build_libsmm

See the README file inside the build_libsmm directory.

Usually only the double precision real (and perhaps the double precision complex) library is needed. Add the appropriate flags to DFLAGS in your arch file (see the sketch after this list):

-D__HAS_smm_dnn to make the code use the double precision real library
-D__HAS_smm_snn to make the code use the single precision real library
-D__HAS_smm_znn to make the code use the double precision complex library
-D__HAS_smm_cnn to make the code use the single precision complex library
-D__HAS_smm_vec to enable the new vectorized interface of libsmm
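
For example, if only the double precision real library was built, the arch file might gain the following lines (LIBSMM_DIR and the library name -lsmm_dnn are placeholders; the exact name produced by build_libsmm may include the compiler, so check its output):

DFLAGS += -D__HAS_smm_dnn
LIBS   += -L$(LIBSMM_DIR) -lsmm_dnn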

Library ELPA

This is an alternative library to ScaLAPACK for the solution of eigenvalue problems. A version of ELPA can be downloaded from

http://elpa.rzg.mpg.de/software

ELPA replaces the ScaLAPACK SYEVD to improve the performance of the diagonalization. For specific architectures it may be better to install specifically optimized kernels and/or employ a higher optimization level to compile it.

During the installation, the library libelpa.a (or libelpa_mt.a if multi-threading support is enabled) is created. We tested the November 2013 version, with the generic kernel and with/without OpenMP.

To use ELPA, add

-D__ELPA

to DFLAGS and

-L$(ELPA_DIR)/lib -lelpa

to LIBS in your arch file.
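
A combined sketch of the arch-file additions (ELPA_DIR and the module include path are assumptions that depend on how ELPA was configured; use -lelpa_mt for the multi-threaded library):

DFLAGS  += -D__ELPA
FCFLAGS += -I$(ELPA_DIR)/include
LIBS    += -L$(ELPA_DIR)/lib -lelpa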

CUDA Support

This is still experimental.

Add

-D__DBCSR_CUDA

to DFLAGS in your arch file to compile with CUDA support for matrix multiplication. For linking, add

-lcudart -lrt

to LIBS in your arch file. The compiler must support ISO_C_BINDING.

Use

-D__PW_CUDA

in DFLAGS for CUDA support for PW (gather/scatter/fft) calculations. The Fortran compiler must use an appended underscore for linking C subroutines.

Use

-D__CUDA_PROFILING

in DFLAGS to turn on NVIDIA Tools Extensions.

Consult cp2k/cuda_tools/README in the CP2K source for more information.
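
A rough sketch of the CUDA-related arch-file pieces (the NVCC/NVFLAGS variable names, the sm_35 architecture, the CUDA_PATH location and the -lcufft library for the PW part are assumptions; verify against the example arch files and cp2k/cuda_tools/README):

NVCC    = nvcc
NVFLAGS = $(DFLAGS) -O3 -arch sm_35
DFLAGS += -D__DBCSR_CUDA -D__PW_CUDA
LIBS   += -L$(CUDA_PATH)/lib64 -lcudart -lcufft -lrt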

Machine Architecture Abstraction Support

Still under development

Add

-D__HWLOC

or

-D__LIBNUMA

to DFLAGS to compile with hwloc or libnuma support for machine architecture and process/thread/memory placement and visualization. It is necessary to link with

-lhwloc

or

-lnuma

The compiler must support ISO_C_BINDING.

Machine architecture visualization is supported only with hwloc. Process/thread/memory placement and visualization is supported by both hwloc and libnuma.

Note that hwloc and libnuma cannot be used at the same time.

Consult cp2k/machine/README in CP2K source for more information.

Process Mapping Support

Still under development

Use the target machine to compile with topology support.

You can also define the mapping strategy to be used via the command line, with

-mpi-mapping [1,2,3,4,5,6,7]
  • 1 = SMP-style rank ordering
  • 2 = file based rank ordering
  • 3 = Hilbert space-filling curve
  • 4 = Peano space-filling curve
  • 5 = Round-Robin rank ordering
  • 6 = Hilbert-Peano space-filling curve
  • 7 = Cannon pattern mapping

The compiler must support ISO_C_BINDING

Consult cp2k/machine/README in CP2K source for more information.

Compiling the Code

I am Feeling Lucky

The “I'm feeling lucky” version of building will try to guess what architecture you are on. Just type

make sopt

in cp2k/makefiles, and the script cp2k/tools/get_arch_code will try to guess your architecture. You can set the FORT_C_NAME environment variable to indicate the compiler part of the architecture string:

export FORT_C_NAME=gfortran

If you are not feeling lucky, or you want to know exactly what you are doing when compiling CP2K and what options are available, please read on.

The arch File

The locations of the compilers and libraries need to be specified, together with the compilation options, in an “arch” file in the cp2k/arch directory of the CP2K source. Examples for a number of common architectures are already available in that directory (e.g., Linux-x86-64-gfortran.sopt).

Conventionally, there are four versions:

  • sopt = serial version
  • popt = parallel, MPI only version — recommended for general usage
  • ssmp = parallel, OpenMP only version
  • psmp = parallel, MPI + OpenMP

You will need to modify one of these files to match your system's settings; a sketch of the typical layout is shown below.
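
As a rough sketch only (all compilers, flags and libraries below are illustrative placeholders; start from the closest example shipped in cp2k/arch), a gfortran/MPI popt arch file typically has this shape:

CC       = cc
FC       = mpif90
LD       = mpif90
AR       = ar -r
DFLAGS   = -D__GFORTRAN -D__FFTSG -D__FFTW3 -D__parallel -D__BLACS -D__SCALAPACK
FCFLAGS  = -O2 -ffree-form -ffast-math -funroll-loops $(DFLAGS)
LDFLAGS  = $(FCFLAGS)
LIBS     = -lscalapack -llapack -lblas -lfftw3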

Compilation Commands

After you have finished creating or editing your own arch file in cp2k/arch, you can build CP2K in directory cp2k/makefiles using the following commands:

make -j N ARCH=architecture VERSION=version

where -j N allows for a parallel build using N processes; architecture corresponds to the root-name of your arch file, and version is one of sopt, popt, ssmp or psmp.

For example, if you have created foo.sopt in cp2k/arch, then in cp2k/makefiles, you type in the command:

make -j 4 ARCH=foo VERSION=sopt

to compile (with 4 processes in parallel) the serial version of CP2K, with the compilers, libraries and options specified in the file cp2k/arch/foo.sopt.

As a short-cut, you can build several versions of the code at once:

make -j N ARCH=architecture sopt popt ssmp psmp

provided you have the corresponding arch files already in place.

After a successful compilation, an executable should appear in cp2k/exe/*

All compiled files, libraries, executables, etc. of all architectures and versions can be removed with

make distclean

in cp2k/makefiles.

To remove only objects and mod files (i.e. keep exe) for a given ARCH/VERSION, use

make ARCH=architecture VERSION=version clean

To remove everything for a given ARCH/VERSION use:

make ARCH=architecture VERSION=version realclean

DFLAGS Options

The following flags can be set (or omitted) in the arch file, depending on the desired features:

For parallel versions

-D__parallel
-D__BLACS
-D__SCALAPACK

If using libint (needed for HF exchange)

-D__LIBINT

For libxc (needed by QUICKSTEP DFT calculations)

-D__LIBXC

If using ELPA in place of ''SYEVD'' to solve eigenvalue problems

-D__ELPA

Various FFTs

-D__FFTSG Stefan Goedecker FFT (should always be there)
-D__FFTW3 FFTW version 3
-D__PW_CUDA CUDA FFT and associated gather/scatter on the GPU

Various compilers/architectures needing their own machine_* file

-D__NAG if using NAG F95 compiler
-D__AIX if using AIX compiler
-D__ABSOFT if using ABSoft compiler
-D__PGI if using PGI compiler
-D__INTEL if using Intel compiler
-D__GFORTRAN if using GNU gfortran compiler
-D__G95 if using g95 compiler
-D__SX
-D__DEC
-D__XT3 if compiling on a XT3 machine
-D__XT5 if compiling on a XT5 machine

Various network interconnections

-D__GEMINI if Gemini interconnect is used in the cluster
-D__SEASTAR if SeaStar interconnect is used in the cluster
-D__BLUEGENE if BlueGene interconnect is used in the cluster
-D__NET

Specific optimized core routines can be selected with

-D__GRID_CORE=X

with X=1..6. Reasonable defaults are provided (see cp2k/src/lib/collocate_fast.F), but trial-and-error might yield a small (~10%) speedup.

Tuned versions of integrate and collocate routines can be generated using

-D__HAS_LIBGRID

and -L/path/to/libgrid.a in LIBS. See cp2k/tools/autotune_grid/README for details.

-D__PILAENV_BLOCKSIZE=1024

or similar is a hack to overwrite (if the linker allows this) the PILAENV function provided by ScaLAPACK. This can lead to much improved PDGEMM performance. The optimal value depends on the hardware (GPU?) and on the problem at hand.

Options controlling MPI behavior and capabilities

-D__NO_MPI_THREAD_SUPPORT_CHECK Workaround for MPI libraries that do not declare themselves thread safe but that you want to use with OpenMP anyway.
-D__NO_MPI_MEMORY Do not use MPI memory allocation/deallocation routines

Options on language features

CP2K currently assumes full Fortran 95 compliance and expects the ISO_C_BINDING module of Fortran 2003 to be present, which is commonly available in current compilers. For OpenMP, version 3.0 is assumed.

If you get compilation errors about unsupported language features, then some flags may be used to reduce the language features required.

In addition, some flags are used to declare compiler support for additional language features.

Subparts of Fortran 2003 or later that help various aspects of the code:
-D__PTR_RANK_REMAP compiler supports pointer rank remapping
-D__HAS_NO_ISO_C_BINDING compiler does not support all needed ISO_C_BINDING features. (At least g95 0.91 silently fails with segfaults since it does not support C_F_POINTER.)
Other language capabilities and support:
-D__HAS_NO_OMP_3 CP2K assumes that compilers support OpenMP version 3. If this is not the case, specify this flag to compile. Runtime performance will be poorer at low processor counts.
-D__CRAY_POINTERS Compiler supports CRAY pointers
-D__HAS_NO_CUDA_STREAM_PRIORITIES Needed for CUDA sdk version < 5.5

Additional esoteric, development and debugging options

This section can safely be skipped. The flags are listed here only for completeness, in addition to those described elsewhere in this document.

-D__NO_STATM_ACCESS Do not try to read from /proc/self/statm to get memory usage information. This is otherwise attempted on several Linux-based architectures or when using the NAG or gfortran compilers.
-D__mp_timeset__ Timing of MPI routines.
-D__USE_LEGACY_WEIGHTS Use legacy atomic weights
-D__NO_ASSUMED_SIZE_NOCOPY_ASSUMPTION Do not assume that assumed-size dummy arguments will always be passed by reference. Unless ISO_C_BINDING is available, CP2K will not compile with this option.
-D__cray_pointers CRAY pointers will be used in preference to the ISO_C_BINDING call to MPI_ALLOC_MEM
-D__PLASMA PLASMA support for DBCSR (neglected, may not work)
-D__USE_PAT Use with CRAY-PAT profiling
-D__HMD
-D__HPM
-D_USE_GA Use Global Arrays Toolkit

Compiling Together With PLUMED v1.3

  1. Get version 1.3 of plumed from their svn repository
  2. Unpack the plumed-1.3 archive somewhere
  3. Set the environment variable $plumedir to the root directory of the plumed distribution: export plumedir=/path/to/plumed-1.3
  4. Symbolic link the plumed-1.3/patches/plumedpatch_cp2k.sh into the CP2K src directory: ln -s $plumedir/patches/plumedpatch_cp2k.sh cp2k/src/
  5. run the plumedpatch_cp2k script with the parameter -patch: ./plumedpatch_cp2k.sh -patch; it should create a subdirectory src-plumed containing a number of cpp files and a plumed.inc
  6. compile cp2k and plumed together (it is safer to run a distclean before compiling): make plumed -j ARCH=… VERSION=popt PLUMED=yes (the whole sequence is sketched below)
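
The whole procedure, as a hedged shell sketch (all paths and the ARCH name are placeholders):

export plumedir=/path/to/plumed-1.3
ln -s $plumedir/patches/plumedpatch_cp2k.sh /path/to/cp2k/src/
cd /path/to/cp2k/src && ./plumedpatch_cp2k.sh -patch
cd ../makefiles && make distclean
make plumed -j 4 ARCH=Linux-x86-64-gfortran VERSION=popt PLUMED=yes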

Tests

If CP2K compiled okay, you can run one of the test cases to try out the executable (most inputs in any of the cp2k/tests/*regtest*/ directories are tested on a daily basis).

cd /path/to/cp2k/cp2k/tests/QS/
/path/to/cp2k/cp2k/exe/YOURMACHINE/cp2k.sopt C.inp

Systematic testing can be done following the description on regression testing.

Troubleshooting

  • If things fail, take a break… have a look at section Options on language features and go back to section The arch File.
  • If your compiler/machine is really special, it should not be too difficult to support it. Only cp2k/src/machine*.F (and possibly cp2k/src/dbcsr_lib/machine.F) should be affected.