Profile-guided optimization (PGO) helps generate faster CP2K executables, typically by a few percent. The basic procedure is fairly easy if a recent gcc/gfortran is used (e.g. gcc 4.9.2, as tested below; older versions may not work).
1. In the arch file you use (e.g. local.sopt), add the variable $(PROFOPT) to FCFLAGS.
FCFLAGS = -I$(CP2KINSTALLDIR)/include -std=f2003 -fimplicit-none -ffree-form -fno-omit-frame-pointer -g -O3 -march=native -ffast-math $(PROFOPT) $(DFLAGS) $(WFLAGS)
2. Remove any leftovers from previous compilations, deleting all relevant directories (i.e. realclean)
make -j ARCH=local VERSION=sopt realclean
3. Build the code with extra instrumentation (this binary is slow and is used only for training)
make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-generate
4. Run the instrumented binary either on a specific test case or on the full test suite (do_regtest) for good coverage. Only those parts of the code executed during the training run can benefit from PGO. The run writes additional profile data files (.gcda) to the obj directory.
../../exe/local/cp2k.sopt -i test.inp -o test.out
5. Remove the old instrumented object files, retaining the .gcda files (i.e. clean)
make -j ARCH=local VERSION=sopt clean
6. Recompile to build the optimized binary using the profile data.
make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-use
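The steps above can be chained in a small wrapper script. The following is only a sketch under stated assumptions: the names DRY_RUN, CP2K_EXE, and TRAIN_INP are hypothetical and not part of the CP2K build system, and the script assumes that both the make invocation and the relative path to the executable resolve from the directory it is started in, which may not match your checkout. By default it only prints the commands so they can be reviewed before anything is rebuilt.

```shell
#!/bin/sh
# Sketch of the PGO cycle (steps 2-6) as one script.
# ARCH, VERSION, and the training input follow the examples in the text.
# DRY_RUN, CP2K_EXE, and TRAIN_INP are hypothetical names introduced here.
ARCH=${ARCH:-local}
VERSION=${VERSION:-sopt}
CP2K_EXE=${CP2K_EXE:-../../exe/$ARCH/cp2k.$VERSION}
TRAIN_INP=${TRAIN_INP:-test.inp}
DRY_RUN=${DRY_RUN:-1}   # 1: only print the commands; 0: also execute them

# Print a command, and execute it unless DRY_RUN=1.
# Stops the script on the first failing command.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != 1 ]; then
        "$@" || exit 1
    fi
}

pgo_build() {
    mk="make -j ARCH=$ARCH VERSION=$VERSION"
    run $mk realclean                                           # step 2
    run $mk PROFOPT=-fprofile-generate                          # step 3
    run "$CP2K_EXE" -i "$TRAIN_INP" -o "${TRAIN_INP%.inp}.out"  # step 4
    run $mk clean                                               # step 5
    run $mk PROFOPT=-fprofile-use                               # step 6
}

pgo_build
```

Running it with DRY_RUN=0 executes the commands instead of only listing them. If the training run has to start from a different directory than the build, the step 4 line needs to be adapted accordingly.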