### Table of Contents

# The H$_2$O molecule: DFT basics

In this exercise, we are going to learn the basics of using CP2K by calculating the total energy, optimizing the geometry, and printing the projected density of states (PDOS) of a single H$_2$O molecule. We will also investigate how different parameters such as the DFT functional and basis sets affect the accuracy and duration of our calculations.

## Part 1: Basic input

In the first part of the exercise, we will go through the input and steps needed to perform a single-point calculation of a water molecule. The first thing we need is to create an input file describing our system and choosing the methods we want to use. Below is a simple example input that we need.

- water.inp
&GLOBAL PROJECT h2o RUN_TYPE ENERGY IOLEVEL LOW &END GLOBAL &FORCE_EVAL METHOD QS &DFT BASIS_SET_FILE_NAME GTH_BASIS_SETS POTENTIAL_FILE_NAME GTH_POTENTIALS LSD 0 &QS EPS_DEFAULT 1.0E-10 &END QS &SCF MAX_SCF 300 EPS_SCF 1.0E-06 SCF_GUESS ATOMIC &MIXING METHOD DIRECT_P_MIXING ALPHA 0.6 &END MIXING &DIAGONALIZATION ALGORITHM STANDARD &END DIAGONALIZATION &END SCF &MGRID NGRIDS 4 CUTOFF 300 REL_CUTOFF 60 &END &XC &XC_FUNCTIONAL BLYP &END XC_FUNCTIONAL &END XC &END DFT &SUBSYS &CELL ABC 15.0 15.0 15.0 ANGLES 90.00 90.00 90.00 &END CELL &COORD O 0 0 0 H 0.7 0.7 0 H -0.7 0.7 0 &END COORD &KIND H BASIS_SET DZVP-GTH POTENTIAL GTH-BLYP-q1 &END KIND &KIND O BASIS_SET DZVP-GTH POTENTIAL GTH-BLYP-q6 &END KIND &PRINT &ATOMIC_COORDINATES &END &END PRINT &END SUBSYS &END FORCE_EVAL

The input file is structured in different sections, started by the “&SECTION_NAME” and ended by “&END SECTION NAME” where the including the section name after &END is optional but recommended. Each section may contain additional sections, and keywords used to control the parameters. There are three main input sections that we will use in this exercise:

- “&GLOBAL” - contains the general options such as the kind of run (in this case ENERGY) and the name of the project.
- “&FORCE_EVAL” - sets the parameters needed to evaluate the forces on the atoms, including the atomic positions, DFT functional, and basis set.
- “&MOTION” - controls e.g. geometry optimizations and MD.

The &GLOBAL section is quite straightforward and controls the type of calculation, our project’s name, etc. but the &FORCE_EVAL section is more complicated and we will go through the most important parts more carefully. The keyword

METHOD Quickstep

chooses that we want to use the Quickstep code which means DFT with the Gaussian and Planewaves method (GPW). We then move in two the &DFT section where we have to specify in which files the program should search for the basis sets and potentials.

BASIS_SET_FILE_NAME GTH_BASIS_SETS POTENTIAL_FILE_NAME GTH_POTENTIALS

In the &QS section, the EPS_DEFAULT keyword sets all tolerances within the Quickstep method. Individual tolerances can also be set, overwriting the EPS_DEFAULT value in the process. The &SCF section controls the self-consistent optimization method for the Kohn-Sham orbitals, with important keywords such as the maximum number of self-consistent loops, MAX_SCF, the initial guess of the wavefunction SCF_GUESS, and the convergence tolerance EPS_SCF. The &MIXING subsection

&MIXING METHOD DIRECT_P_MIXING ALPHA 0.6 &END MIXING

sets how much of the new density that should be mixed with the old density in each step. In this case, 60 % of the new density will be combined with 40 % of the old density when moving on to the next step. A small mixing parameter ALPHA will slow down the calculation but can help with convergence in hard cases. Finally the &MGRID section

&MGRID NGRIDS 4 CUTOFF 300 REL_CUTOFF 60 &END

controls how to set up the integration grid used in the Quickstep method. NGRIDS chooses the number of multigrids to use, CUTOFF selects the number of plane-wave basis functions to describe the electron density, and the REL_CUTOFF determines the grid spacing of the Gaussian basis functions.

After the &DFT section, the other main part of &FORCE_EVAL is the &SUBSYS. Here we set up the system we would like to study including basis sets and potentials. The &CELL and the &COORD sections

&CELL ABC 15.0 15.0 15.0 ANGLES 90.00 90.00 90.00 &END CELL &COORD O 0.0 0.0 0.0 H 0.7 0.7 0.0 H -0.7 0.7 0.0 &END COORD

sets up our simulation cell and atomic coordinates respectively. By default, CP2K will run your calculations with periodic boundary conditions but since we are using a large simulation box, the molecule does not interact with its own copies and we are effectively looking at it in gas phase. In the &KIND sections,

&KIND H BASIS_SET DZVP-GTH POTENTIAL GTH-BLYP-q1 &END KIND &KIND O BASIS_SET DZVP-GTH POTENTIAL GTH-BLYP-q6 &END KIND

the basis sets and potentials associated with each atom is given. In this case, we use GTH pseudopotentials and corresponding basis sets optimized for them.

Remember that a lot of information about the different sections and the keywords can be found in the CP2K manual https://manual.cp2k.org/.

## Part 2: Running an single-point calculation

We are now ready to run the input in CP2K. To start the calculation, type

/cp2k.sopt -i water.inp -o water.out &

or

sbatch cp2k.sh

if you are logged in to the cluster and have modified the *cp2k.sh* accordingly. When the program is finished, you will have two new files in your working directory:

water.out h2o-RESTART.wfn

The first file ending with .out is the main output file of the calculation which contains information about the optimization

SCF PARAMETERS Density guess: ATOMIC -------------------------------------------------------- max_scf: 300 max_scf_history: 0 max_diis: 4 -------------------------------------------------------- eps_scf: 1.00E-06 eps_scf_history: 0.00E+00 eps_diis: 1.00E-01 eps_eigval: 1.00E-05 -------------------------------------------------------- level_shift [a.u.]: 0.00 -------------------------------------------------------- Mixing method: DIRECT_P_MIXING -------------------------------------------------------- No outer SCF Number of electrons: 8 Number of occupied orbitals: 4 Number of molecular orbitals: 4 Number of orbital functions: 23 Number of independent orbital functions: 23 Extrapolation method: initial_guess SCF WAVEFUNCTION OPTIMIZATION Step Update method Time Convergence Total energy Change ------------------------------------------------------------------------------ 1 P_Mix/Diag. 0.60E+00 2.3 0.81098873 -17.0423550185 -1.70E+01 2 P_Mix/Diag. 0.60E+00 2.6 0.50875403 -17.1088252836 -6.65E-02 3 P_Mix/Diag. 0.60E+00 2.7 0.31846629 -17.1438061995 -3.50E-02 4 P_Mix/Diag. 0.60E+00 2.7 0.23820162 -17.1640209526 -2.02E-02 5 P_Mix/Diag. 0.60E+00 2.7 0.16669867 -17.1752275067 -1.12E-02 6 P_Mix/Diag. 0.60E+00 2.8 0.13086134 -17.1817453111 -6.52E-03 7 P_Mix/Diag. 0.60E+00 2.7 0.09423552 -17.1854419077 -3.70E-03 8 DIIS/Diag. 0.55E-01 2.7 0.02674149 -17.1876018813 -2.16E-03 9 DIIS/Diag. 0.12E-03 2.6 0.00054023 -17.1905995511 -3.00E-03 10 DIIS/Diag. 0.16E-03 2.7 0.00033668 -17.1905995449 6.19E-09 11 DIIS/Diag. 0.39E-05 2.7 0.00000741 -17.1905995597 -1.48E-08 12 DIIS/Diag. 0.12E-05 2.6 0.00000090 -17.1905995597 -1.00E-11 *** SCF run converged in 12 steps ***

as well as the converged results:

Electronic density on regular grids: -7.9999999840 0.0000000160 Core density on regular grids: 7.9999999997 -0.0000000003 Total charge density on r-space grids: 0.0000000157 Total charge density g-space grids: 0.0000000157 Overlap energy of the core charge distribution: 0.00000001850899 Self energy of the core charge distribution: -44.54061621428737 Core Hamiltonian energy: 12.90570300282121 Hartree energy: 18.65475043862882 Exchange-correlation energy: -4.21043680541513 Total energy: -17.19059955974348 ENERGY| Total FORCE_EVAL ( QS ) energy (a.u.): -17.190599559743479

The output file also contains information about timing, i.e. how much time was spent on the different parts in the code, and also ends with references to cite, relevant to your particular calculation.

The second file contains the optimized Kohn-Sham wavefunction, which can be used to restart your calculation using the keyword

SCF_GUESS RESTART

in the &SCF section.

## Part 3: Geometry optimization

In our previous calculation, our guess for the geometry of the water molecule, e.g. with a bond angle of 90°, was not very good. It is very often a good idea to optimize the geometry of your system before extracting any information from your calculations. Make another directory and move a copy of our previous input into it. To perform a geometry optimization, we must add an additional section to our input file.

&MOTION &GEO_OPT OPTIMIZER BFGS MAX_DR 3.0E-03 MAX_FORCE 4.5E-04 RMS_DR 1.5E-03 RMS_FORCE 3.0E-04 &END &CONSTRAINT &FIXED_ATOMS COMPONENTS_TO_FIX XYZ LIST 1 &END &END CONSTRAINT &END MOTION

In the &GEO_OPT section we select which kind of optimizer we want as well as the convergence criterion, such as the maximum force component. Then, since we only want to optimize the hydrogen relative to the oxygen, the &CONSTRAINT section allows us to freeze all components of the first atom in our input coordinates (make sure this is the oxygen by examining the &SUBSYS section again). Finally, we must also change the RUN_TYPE to GEO_OPT in the &GLOBAL section.

&GLOBAL PROJECT h2o_opt RUN_TYPE GEO_OPT IOLEVEL LOW &END GLOBAL

We are now ready to start the calculation. As before, type e.g.

./cp2k.sopt -i water_opt.inp -o water_opt.out &

or

./cp2k.sopt water.inp | tee water.out

into your terminal to start the calculation. A geometry optimization contains more SCF cycles and so this calculation will take more time than before. Once the system has converged, we can examine the output file to evaluate our calculation. At each step in the optimization, we get the same information as in the last exercise, along with

-------- Informations at step = 2 ------------ Optimization Method = BFGS Total Energy = -17.1947218061 Real energy change = -0.0002999093 Predicted change in energy = -0.0002336556 Scaling factor = 0.0000000000 Step size = 0.0178822087 Trust radius = 0.4724315332 Decrease in energy = YES Used time = 22.246 Convergence check : Max. step size = 0.0178822087 Conv. limit for step size = 0.0030000000 Convergence in step size = NO RMS step size = 0.0091540732 Conv. limit for RMS step = 0.0015000000 Convergence in RMS step = NO Max. gradient = 0.0026713842 Conv. limit for gradients = 0.0004500000 Conv. for gradients = NO RMS gradient = 0.0017198827 Conv. limit for RMS grad. = 0.0003000000 Conv. for gradients = NO ---------------------------------------------------

describing each step. It is very important to check that your geometry optimization has converged properly, and this can be done by looking at the end of the output file. The last step should look something like this:

-------- Informations at step = 4 ------------ Optimization Method = BFGS Total Energy = -17.1947556096 Real energy change = -0.0000002510 Predicted change in energy = -0.0000002260 Scaling factor = 0.0000000000 Step size = 0.0009840890 Trust radius = 0.4724315332 Decrease in energy = YES Used time = 20.295 Convergence check : Max. step size = 0.0009840890 Conv. limit for step size = 0.0030000000 Convergence in step size = YES RMS step size = 0.0006410487 Conv. limit for RMS step = 0.0015000000 Convergence in RMS step = YES Max. gradient = 0.0000198391 Conv. limit for gradients = 0.0004500000 Conv. in gradients = YES RMS gradient = 0.0000096694 Conv. limit for RMS grad. = 0.0003000000 Conv. in RMS gradients = YES --------------------------------------------------- ******************************************************************************* *** GEOMETRY OPTIMIZATION COMPLETED *** *******************************************************************************

Make sure that the calculation has converged by all criteria. You should also have several more files in your working directory, including e.g.

- h2o_opt-1.restart
- h2o_opt-1.restart.bak-1
- h2o_opt-pos-1.xyz

The h2o_opt-1.restart contains all the input required to restart a calculation from the last step in the calculation. Its h2o_opt-1.restart.bak-1 analogue is the backup containing information from the previous step, and so on. The last file h2o_opt-pos-1.xyz contains the atomic trajectory of the optimization.

Try to open it in e.g. Molden(?) to visualize the steps.

Here we can see that the bond angle is now 102.8°, which is better but still quite far from the accepted 104.45°. Also comparing the final total energy to our first single-point calculation, we can see that our new geometry is more energetically stable.

## Part 4: Computing PDOS

Now that we have obtained a more reasonable geometry, we will know attempt to compute the eigenvalues of our Kohn-Sham orbitals and compare the energy levels to the attached experimental spectra. We will also use this opportunity to try out the option to restart a calculation. -0.316280 To restart our calculation from the previous geometry optimization, we need to copy two files from the old working directory, e.g.

h2o_opt-RESTART.wfn h2o_opt-1.restart

to our current working directory. To make the code read these files, we have to add a section

&EXT_RESTART RESTART_FILE_NAME h2o_opt-1.restart &END EXT_RESTART

as well as adding

SCF_GUESS RESTART RESTART_FILE_NAME h2o_opt-RESTART.wfn

to the &SCF and &DFT sections respectively. To print the PDOS, we also need to add the &PDOS section under &DFT

&PRINT &PDOS FILENAME ./pdos NLUMO 2 &END &END

and specify the spectra file names for example if we want to include unoccupied virtual orbitals. The final input can e.g. look like this:

&DFT BASIS_SET_FILE_NAME GTH_BASIS_SETS POTENTIAL_FILE_NAME GTH_POTENTIALS LSD 0 RESTART_FILE_NAME h2o_opt-RESTART.wfn &QS METHOD GPW EPS_DEFAULT 1.0E-10 &END QS &SCF MAX_SCF 300 EPS_SCF 1.0E-06 SCF_GUESS RESTART &MIXING METHOD DIRECT_P_MIXING ALPHA 0.6 &END MIXING &DIAGONALIZATION ALGORITHM STANDARD &END DIAGONALIZATION &END SCF &MGRID NGRIDS 4 CUTOFF 300 REL_CUTOFF 60 &END &XC &XC_FUNCTIONAL BLYP &END XC_FUNCTIONAL &END XC &PRINT &PDOS FILENAME ./h2o NLUMO 2 &END &END &END DFT

When the calculation is done, open the file

- h2o-k1-1.pdos

and compare your calculated energies to the experimental measurements reported in the attached paper, e.g. Fig. 1. As a reminder, one 1 a.u. = 27.212 eV. Look in particular on the energy differences between the three valence states.

## Part 5: (Optional) Testing your parameters

If you followed this exercise closely, chances are that the computed orbital energies in your PDOS files are quite similar to the experimental ones, given an ad hoc shift of the energies. However, the 1b2 orbital will be slightly off. Can we do better by tweaking our input, or can we get as good results with a faster method? In this exercise, we will investigate and try to get a feel for how different parameters affect the accuracy and time consumption of your calculation. Go back to Part 3 and Part 4 and redo the calculation while changing the following parameters

&XC_FUNCTIONAL {PADE BLYP B3LYP} CUTOFF {100 300 600} BASIS_SET {SZV-GTH DZVP-GTH TZVP-GTH} EPS_SCF {1.0E-04 1.0E-06 1.0E-0.8}

(choose one of the suggested in the brackets). The suggestions are given from less accurate to more accurate. When you feel comfortable, feel free to experiment with other keywords! Note that using e.g. a larger basis set will make the calculation take longer, so you might want to do something else in the mean time.