====== Debugging ======

Debugging CP2K can be challenging. The suggestions and techniques collected here can make it easier.

===== reproducer =====

To debug the code, one needs an input that reliably triggers the problem. If the bug is really hard (i.e. requires repeatedly running the testcase), finding a sufficiently small testcase is very valuable. See if the bug reproduces with few atoms, a lower cutoff, a small basis, energy instead of md, ... Understanding what is needed to trigger the bug usually helps a lot in fixing it.

===== environment dependence =====

Check if the bug only happens on machine X or with library Y. Buggy libraries are unfortunately very common. As a reference, try binaries compiled with gfortran (-O0) on Linux and linked against netlib scalapack/blas. If you find bugs in the libraries/tools, report them to the vendors; only this way do things improve in the long run.

===== use the trace =====

Run CP2K with the '[[ http://manual.cp2k.org/trunk/CP2K_INPUT/GLOBAL.html#desc_TRACE | TRACE]]' keyword enabled in the '&GLOBAL' section. The additional output gives a good idea of where things might go wrong.

===== debug build =====

A large number of issues can be caught using a debug build with bounds checking. With gfortran, the following flags are useful in the arch file:

  FCFLAGS = -O1 -fstrict-aliasing -g -fno-omit-frame-pointer -fno-realloc-lhs \
            -fcheck=bounds,do,recursion,pointer -ffree-form $(DFLAGS)

Also, link against netlib blas/lapack/scalapack (compiled with the same options).

===== valgrind =====

==== undefined variables ====

[[ http://valgrind.org/ | valgrind ]] is very useful for finding additional bugs, most commonly related to undefined variables. Unfortunately, the slowdown caused by valgrind makes this practical only for test cases that run within seconds/minutes. The full regtester is run from time to time under valgrind.
The following arch file works well with valgrind. In particular, it uses no libraries that return false positives. It needs gfortran >4.8.X to avoid spurious warnings.

  CC       = cc
  CPP      =
  FC       = gfortran
  LD       = gfortran
  AR       = ar -r
  CPPFLAGS =
  DFLAGS   = -D__GFORTRAN -D__FFTSG -D__LIBINT -D__FFTW3 -D__LIBINT_MAX_AM=6 -D__LIBDERIV_MAX_AM1=5 -D__LIBXC2
  FCFLAGS  = -O0 -g -ffree-form $(DFLAGS)
  LDFLAGS  = $(FCFLAGS) -L/data/vjoost/libint_ham/install/lib/ -L/data/vjoost/scalapack/scalapack_installer_1.0.2/install/lib/ -L/data/vjoost/libxc-2.0.1/install/lib
  LIBS     = -lderiv -lint -lstdc++ -lfftw3 -lreflapack -lrefblas -lxc
  OBJECTS_ARCHITECTURE = machine_gfortran.o

Run valgrind with

  valgrind --max-stackframe=2100192 --leak-check=full --track-origins=yes

to get the origin of undefined variables in addition to a leak check report.

==== understanding memory usage ====

valgrind also comes with the 'massif' tool, which can provide detailed information about memory usage. It shows how large the peak allocated memory is and where most of these allocations come from. This is rather easy with

  valgrind --tool=massif ../../../exe/local_valgrind/cp2k.sopt test.inp
  ms_print massif.out.XYZ

The valgrind homepage has a [[http://valgrind.org/docs/manual/ms-manual.html | detailed description of massif]].

==== valgrind in parallel ====

valgrind can also be used for debugging parallel code, e.g.:

  mpirun -np 2 -x OMP_NUM_THREADS=2 valgrind --max-stackframe=2100192 --leak-check=full --track-origins=yes cp2k.psmp cp2k.inp

===== Memory leak checking =====

Starting from gcc 4.9.0, good memory leak checking is integrated with gfortran. Compile CP2K with

  -fno-omit-frame-pointer -O1 -g -fsanitize=leak

to get detailed memory leak reports. CP2K should be fully clean; however, this also finds leaks in libraries such as mpi and scalapack.
It is possible to write suppression files, which are enabled with an export like

  export LSAN_OPTIONS=suppressions=suppr.txt

For the format of the file, see the [[ https://code.google.com/p/address-sanitizer/wiki/LeakSanitizer | LeakSanitizer ]] docs.

===== Compiler warnings =====

Unfortunately, the GNU Fortran compiler is not on the same level as its C/C++ counterparts when it comes to warnings. In particular, ''-Wuninitialized'', which is part of ''-Wall'', may give spurious warnings of the following kind when building with ''-O1'' (or greater):

  warning: ‘arr.offset’ may be used uninitialized in this function [-Wuninitialized]
  warning: ‘arr.dim[1].stride’ may be used uninitialized in this function [-Wuninitialized]
  warning: ‘arr.dim[0].ubound’ may be used uninitialized in this function [-Wuninitialized]

This is tracked upstream at GNU/gfortran in [[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66459|Bug 66459]] and [[https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60500|Bug 60500]].
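While these upstream bugs remain open, one way to keep the known false positives from burying real warnings is to filter them out of a saved build log. The following is only a sketch under stated assumptions: the function name ''filter_descriptor_warnings'' and the log file name ''build.log'' are hypothetical and not part of CP2K, and the pattern only covers array-descriptor fields of the kind quoted in this section (''offset'', ''dim[N].stride'', ''dim[N].ubound'') plus ''dim[N].lbound'', which is assumed to behave the same way.

```shell
# Hypothetical helper (not part of CP2K): read a compiler log on stdin and
# drop "may be used uninitialized" warnings that refer to array-descriptor
# fields (offset, dim[N].stride/lbound/ubound). All other lines pass through.
# Caveat: a user-defined component that happens to be called "offset" would
# be filtered as well.
filter_descriptor_warnings() {
    # grep -v exits with status 1 when every line was filtered out;
    # treat that as success, but keep real grep errors (status 2) visible.
    grep -v -E '\.(offset|dim\[[0-9]+\]\.(stride|lbound|ubound))[^ ]* may be used uninitialized' \
        || [ "$?" -eq 1 ]
}

# Example usage, assuming the build output was saved to build.log:
#   filter_descriptor_warnings < build.log
```

Because the filter works on the message text, it has to be revisited if the wording of the compiler message changes between GCC versions or with the compiler's locale.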