howto:pgo [2015/08/28 14:43] – update for makefile based testing vondele
howto:pgo [2020/08/21 10:15] (current) – external edit 127.0.0.1
Line 1:
 ====== Profile guided optimization for CP2K ======
  
-Using profile guided optimization (PGO) helps to generate faster CP2K executables, typically a few percent. The basic procedure is rather easy if a recent gcc/gfortran is used (e.g. gcc 4.9.2, as tested below, older versions will/may not work).
+Using profile guided optimization (PGO) helps to generate faster CP2K executables, e.g. up to 20 percent for hybrid functional calculations. The basic procedure is rather easy if a recent gcc/gfortran is used (e.g. gcc 4.9.2, as tested below, older versions will/may not work).
  
-1. Introduce in the used arch file (e.g. local.sopt) the variable $(PROFOPT) as part of the FCFLAGS.
+1. Introduce in the used arch file (e.g. local.sopt) the variable $(PROFOPT) as part of the FCFLAGS (available by default in the toolchain arch files).
    FCFLAGS  = -I$(CP2KINSTALLDIR)/include -std=f2003 -fimplicit-none -ffree-form -fno-omit-frame-pointer -g -O3 -march=native -ffast-math $(PROFOPT) $(DFLAGS) $(WFLAGS)
 2. Clean any eventual leftovers from previous compilations, removing all relevant directories (i.e. realclean)
Line 9:
 3. Build the code with extra instrumentation (this binary is slow, and used only for training purposes)
    make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-generate
-4. Run the binary either on a specific testcase, or on the full testsuite for good coverage. Only those parts of the code executed during the training run can benefit from PGO. This will write additional (.gcda) files in the obj directory.
+4. Run the binary either on a specific testcase, or better on the full testsuite for good coverage. Only those parts of the code executed during the training run can benefit from PGO. This will write additional (.gcda) files in the obj directory.
    make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-generate test
-5. Remove the old instrumented object files, retaining the .gcda files (i.e. clean)
+5. Remove the old instrumented object files, retaining the .gcda files (i.e. clean not realclean)
    make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-use clean
 6. Recompile to build the optimized binary using the profile data.
    make -j ARCH=local VERSION=sopt PROFOPT=-fprofile-use
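
Steps 2 to 6 can be chained in a small wrapper script. The following is only a sketch, not part of the CP2K build system: the names `run`, `pgo_cycle` and `DRY_RUN` are hypothetical, the `ARCH`/`VERSION` defaults are copied from the examples above, and it assumes that the `realclean` target accepts the same `ARCH` and `VERSION` variables as the other targets. By default it only prints the make commands, so the sequence can be checked before a long build is started.

```shell
#!/bin/sh
# Sketch of the PGO cycle described in steps 2-6 (hypothetical helper script).
ARCH=${ARCH:-local}        # arch file name, as in the examples above
VERSION=${VERSION:-sopt}   # build version, as in the examples above
DRY_RUN=${DRY_RUN:-1}      # 1: only print the commands, 0: execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

pgo_cycle() {
    # Step 2: remove leftovers from earlier builds
    # (assumption: realclean takes the same ARCH/VERSION variables).
    run make -j ARCH="$ARCH" VERSION="$VERSION" realclean &&
    # Step 3: build the slow, instrumented training binary.
    run make -j ARCH="$ARCH" VERSION="$VERSION" PROFOPT=-fprofile-generate &&
    # Step 4: training run on the testsuite, which writes the .gcda files.
    run make -j ARCH="$ARCH" VERSION="$VERSION" PROFOPT=-fprofile-generate test &&
    # Step 5: drop the instrumented objects but keep the .gcda files
    # (clean, not realclean).
    run make -j ARCH="$ARCH" VERSION="$VERSION" PROFOPT=-fprofile-use clean &&
    # Step 6: rebuild using the collected profile data.
    run make -j ARCH="$ARCH" VERSION="$VERSION" PROFOPT=-fprofile-use
}

pgo_cycle
```

Saved under a name such as `pgo.sh` (hypothetical) in the CP2K root directory, running `DRY_RUN=0 sh pgo.sh` would execute the full cycle. The commands are joined with `&&` so that a failed build or training run stops the sequence instead of compiling against missing or stale profile data.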
howto/pgo.1440773002.txt.gz · Last modified: 2020/08/21 10:15 (external edit)