−Table of Contents

Simple metadynamics simulation using the coordination numbers as variables

Simple metadynamics simulation using the coordination numbers as variables

Problem: Dissociation reaction of nitric acid on graphene and atomic rearrangements of a Si₆H₈ cluster described using coordination numbers

Original author: Marcella Iannuzzi
Complete source and output files: MTD1.tar.xz

Introduction

For this tutorial some input and output files are given in order to present a complete procedure to solve the given problem. Some hints are also given to help in the analysis of the results. In order to be able to run these examples, some paths need to be correctly set in the input files (i.e. set the variables LIBPATH, XYZPATH, RUNPATH). The coordinates are always read from xyz files. All the coordinate files needed for these exercises are collected in XYZ, whereas LIB_TOOLS contains the PP, basis sets and DFTB parameter file.

The results presented in these examples are to be considered only as toy cases. The obtained processes are not very accurate, because no optimal parameters have been selected to speed-up a bit the calculation.

The tasks to be completed are:

Set up and preliminary simulations to learn about the dynamics of nitric acid on graphene, as obtained at the DFTB level of theory
Metadynamics simulation to trigger the dissociation of nitric acid, by following changes of three different coordination numbers
Set up and preliminary simulations to learn about the dynamics of the selected small Si cluster saturated by H atoms
Metadynamics simulation aimed at observing atomic rearrangements of the cluster bay changing the coordination of both Si and H species.

First task: dynamics of two HNO3 molecules over a graphene sheet

The examples on this system are in the directory GR_2HNO3. The goal is to simulate the dissociation of the HNO₃ molecules with formation of products like H₂O and/or NO or NO₂ fragments. These reaction can occur in gas phase. However, the reaction should be catalyzed in presence of C particles (sooth). In the proposed example, the molecules are located in the vicinity of graphene, that should mimic the role of sooth. Graphene indeed is not very reactive, better models can be considered using defective or functionalized graphene.

The initial coordinates are given in XYZ/grly5x3_2hno3.xyz. The study is started with a simple MD simulation at constant temperature (300 K), to learn about the dynamics of the two molecules over graphene. The DFTB description is employed to speed up the simulation, even if this might not be the optimal choice to faithfully describe the dissociation reaction. The provided input file GR_2HNO3/gr2hno3_nvt.inp contains all the instructions to run with either DFT, or DFTB, or PM6. It is sufficient to set the input variable METHOD_TO_USE giving the name of the method to be used, and the parts of the input related to the selected method are activated. In the present case DFTB is selected and the relevant part of the input is

@IF ( ${METHOD_TO_USE} == DFTB )
  &DFT
    &QS
      METHOD DFTB
      &DFTB
        SELF_CONSISTENT    T
        DO_EWALD           T
        DISPERSION         T
        &PARAMETER
          PARAM_FILE_PATH  ${LIBPATH}/scc
          PARAM_FILE_NAME  scc_parameter
          UFF_FORCE_FIELD  uff_table
        &END PARAMETER
      &END DFTB
    &END QS
    &POISSON
      PERIODIC XYZ
      &EWALD
       EWALD_TYPE SPME
       GMAX 25
       O_SPLINE 5
      &END EWALD
    &END POISSON
    .....

The system is fully periodic, and enough space is left above the graphene layer in order to avoid interactions with the images along $z$ . This is a very simple model of the type of particles that might trigger the dissociation reaction, and we are not interested in the dynamics of the layer itself. Therefore, a a few atoms of the layer are constrained to fixed positions by

  &CONSTRAINT
    &FIXED_ATOMS
      LIST    48 51 54 57 60 45 59 44 58 43
    &END
  &END

Otherwise the MOTION section is quite standard for NVT simulations with a Nose-Hoover thermostat. Since the total number of degrees of freedom is small, even with the thermostat the equilibration to the desired temperature might result difficult. Hence, the rescaling of temperature is activated by the MOTION/MD#TEMP_TOL keyword, whenever the difference between the instantaneous temperature and the desired value exceeds the given tolerance.

The output of this short simulation (5 ps) is stored in DFTB_NVT. Kinetic energy (3rd col.), temperature (4th col.), potential energy (5th col), and total energy (6th col) can be monitored from GR_2HNO3/DFTB_NVT/gr2hno3_nvt-1.ener. By visualizing the short trajectory, it is observed that the two molecule move very fast, and explore a large space of configurations, being often far from each other and far from graphene. In order to restrict the exploration to region where the dissociation catalyzed by graphene might occur, and thus avoid configurations that are not interesting for the specific study, it is necessary to limit the movement of the two molecules. To this purpose, an external potential defining a spherical potential centered on the center of the system coordinates is added, which acts only on the two molecules. In order to simplify the definition of the external potential, the coordinates are first centered in zero (XYZ/grly5x3_2hno3_cc.xyz). The new input file invoking the interaction with the spherical potential is GR_2HNO3/gr2hno3_nvt_epot.inp, where the only difference in the FORCE_EVAL section is

  &EXTERNAL_POTENTIAL
     ATOMS_LIST   61..70
     FUNCTION   0.000000000001*(Z^2)^4
  &END

  &EXTERNAL_POTENTIAL
     ATOMS_LIST   61..70
     FUNCTION   0.0000000000001*(X^2)^4
  &END

  &EXTERNAL_POTENTIAL
     ATOMS_LIST   61..70
     FUNCTION   0.0000000000001*(Y^2)^4
  &END

Along the resulting 10 ps trajectory, the two molecules remain close to graphene, where they should be.

The next step is to set up a few collective variables (CV) that can be later used for metadynamics (MTD) simulations. It is important to select good CV that can describe the relevant configuration along the reaction path. Moreover, it is useful to learn about the typical behavior of the selected CV along an unbiased MD run. Hence, after selecting a set of CV, preliminary runs should be performed in order to monitor the dynamics of these variables. This can be done setting up MTD simulations, where in facts no bias is added. The evolution of the variables is then monitored while the system explores the configurations around the initial one, i.e. belonging to the same (initial) basin of attraction on the free energy surface (FES). The evaluation of typical fluctuation amplitudes of the CV is particularly important in order to set the width of the Gaussian beads that are going to build up the penalty potential along the “real” MTD run. Moreover, it is important to learn which variations in the CV can occur spontaneously, i.e. do not need any bias, and where the CV cannot move without activation.

The input GR_2HNO3/gr2hno3_mtd_4cv_h0_p1.inp has been prepared with the definition of four CV. The first is the coordination number (CN) of O to graphene

    &COLVAR
       &COORDINATION
          KINDS_FROM  O
          KINDS_TO   C
          R_0 [angstrom]  1.8
          NN  8
          ND  14
       &END COORDINATION
    &END COLVAR

where NN and ND determine the curvature of the function used to compute the CN and R_0 is the reference OC distance. $CN_{\text{ OC}} = \frac{1}{N_{\text{ O}}} \sum_{i_{\text{O}}} \sum_{j_{\text{C}}} \frac{1-(\frac{r_{ij}}{R_0})^{nn}}{1-(\frac{r_{ij}}{R_0})^{nd}}$ This CV describes the interaction between the O atoms that may dissociate from N and graphene, where adsorption might occur. It should be approximately zero when the molecules are far from the layer. It becomes larger than zero, but always smaller than one, when one or more O get closer to the layer. With the given parameter the CN start being larger than zero for OC distances below 4 Å.

The second Cv is the CN of N to O, that is about 3 for not dissociate molecules and smaller when the dissociation begins.

    &COLVAR
       &COORDINATION
          KINDS_FROM  N
          KINDS_TO   O
          R_0 [angstrom]  1.8
          NN  8 
          ND  14
       &END COORDINATION
    &END COLVAR

The third C is between H and graphene, since also H can be lost from the molecules and be adsorbed on graphene.

       &COLVAR
       &COORDINATION
          KINDS_FROM  H
          KINDS_TO   C
          R_0 [angstrom]  1.6
          NN  8
          ND  14
       &END COORDINATION
    &END COLVAR

Finally the fourth CV is the distance between a point and a plane, where the point is the geometric center between the two N atoms of the system, while the plane is determined by the coordinates of three C atoms of graphene.

    &COLVAR
       &DISTANCE_POINT_PLANE
        &POINT
          TYPE GEO_CENTER
          ATOMS 1
        &END
        &POINT
          TYPE GEO_CENTER
          ATOMS 48
        &END
        &POINT
          TYPE GEO_CENTER
          ATOMS 60
        &END
        &POINT
          TYPE GEO_CENTER
          ATOMS  69 70
        &END
          ATOMS_PLANE 1 2 3 
          ATOM_POINT 4
       &END DISTANCE_POINT_PLANE
    &END COLVAR

This last CV controls the distance of the molecules from the layer, which is an important factor to determine whether the dissociation is somehow favored by the presence of graphene. The output of the unbiased MD that monitors the behavior of these 4 CV is in DFTB_MTD_4CV_H0. It is obtained by invoking a MTD run where no penalty potential is added. Hence, the MOTION/FREE_ENERGY subsection is added within the section MOTION. The MTD run is controlled from the MOTION/FREE_ENERGY/METADYN subsection. In the present example, where no bias has to be added, the MOTION/FREE_ENERGY/METADYN section contains very few parameters

  &FREE_ENERGY
    &METADYN
      DO_HILLS  .FALSE.

      &METAVAR
        SCALE 0.08
        COLVAR 1
      &END METAVAR

      &METAVAR
        SCALE 0.3
        COLVAR 2
      &END METAVAR

     &METAVAR
        SCALE 0.08
        COLVAR 3
      &END METAVAR

      &METAVAR
        SCALE 1.5
        COLVAR 4
      &END METAVAR

      &PRINT
        &COLVAR
           COMMON_ITERATION_LEVELS 3
           &EACH
             MD 1
           &END
        &END
      &END
    &END METADYN
  &END FREE_ENERGY

With MOTION/FREE_ENERGY/METADYN#DO_HILLS set to .FALSE., it is required that no bias is added. Then for each defined COLVAR a MTD variable is initialized. The PRINT%COLVAR section controls the printing of the COLVAR output file, containing the instantaneous values of the CV as well as other parameters when needed. For the run without bias, no other information are needed, and the only interesting data in the output GR_2HNO3/DFTB_MTD_4CV_H0/gr2hno3_mtd_4cv_h0_p1-COLVAR.metadynLog are the second, third, fourth, and fifth columns, which are the instantaneous values of the CV at the indicated time (in fs, first column).

By plotting the CV as recorded along the short MD trajectory (3 ps), the amplitude of the equilibrium fluctuations can be evaluated and then used to set up the size of the Gaussian hills that build up the biasing potential. The first CV fluctuates close to zero, with fluctuations smaller than 0.2. The second is around 2.8. The fluctuations are smaller due to the stiffness of the three NO bonds. The coordination of H to C is also typically zero, but it can change a lot when the molecules approach the layer, even if there is no dissociation of H and no binding to C. This indicates that this variable is difficult to control and might turn out to be tricky to use it to distinguish among different states of the reaction process. The point to plane distance shows quite large fluctuations and it is clearly not suited to distinguish a specific state along the reaction path. Moreover, its minima, when the two molecules are closer to the layer, correspond to the maxima of the third CV, i.e. the CN of H to C. At least before dissociation, the information that this variable provide is redundant. It might be interesting to run again this preliminary simulation after modifying the definition of the CV. For example, by changing the two exponents or even the reference distance of the CN, the range of the function can be made shorter or longer. It is maybe important to remind that the function defining the CV must have a gradient different from zero to affect the behavior of the system in a MTD run. Namely, the MTD force term affecting the dynamics of the atoms involved in the definition of the CV is proportional to the gradient of the CV function.

Second task: Metadynamics of the dissociation of HNO3 over a graphene sheet

The presented MTD run employs as CV only the three CN described above. The related input file is GR_2HNO3/gr2hno3_mtd_3cv_p1.inp and the output is stored in DFTB_MTD_3CV. The MOTION/FREE_ENERGY/METADYN input section has been modified to activate the MTD algorithm.

    &METADYN
      DO_HILLS 
      NT_HILLS 100
      WW 3.0e-3

      &METAVAR
        SCALE 0.2
        COLVAR 1
      &END METAVAR

      &METAVAR
        SCALE 0.3
        COLVAR 2
      &END METAVAR

     &METAVAR
        SCALE 0.2
        COLVAR 3
      &END METAVAR

      &PRINT
        &COLVAR
           COMMON_ITERATION_LEVELS 3
           &EACH
             MD 1
           &END
        &END
        &HILLS
           COMMON_ITERATION_LEVELS 3
           &EACH
             MD 1
           &END
        &END
      &END
    &END METADYN

One Gaussian hill is spawned every NT_HILLS MD steps, while the height of the hill in hartree is given by WW. These parameters together with the width of the Gaussian hills are important to determine the accuracy of the description of the FES through the MTD biasing potential. Since each variable has in principle different dimensions and different dynamics, the shape of the hills filling up the $N_{\text{CV}}$ -dimensional configurations space, as defined by the selected CVs, is not isotropic. The parameter SCALE associated to the $i$ -th MTD variable determines the amplitude of the Gaussian in the $i$ -th space-direction of the $N_{\text{CV}}$ -dimensional configuration space. This parameter, as well as the hill’s height and the frequency of collocation, can be changed along the same MTD run by restarting with different values in the input. This feature is useful when the dynamics of some variable changes after some event has occurred (e.g. the fluctuation of a distance become larger after a bond breaking), or to modulate the resolution of the biasing potential (coarser or finer) in different regions of the space (e.g. coarser at the bottom of the FES basin and finer in the vicinity of the transition state). For an efficient exploration of the configurations space, it is important to spawn hills that are not too big, otherwise important features of the topography of the FES might be not sufficiently well resolved, or even the MTD-trajectory could follow the wrong path, missing the minimum energy path (MEP). On the other hand, filling up the whole space with too small hills might require excessively long simulation time. Given the hill size (height and width) and knowing approximately the size of the space spanned by the CVs and the barrier height, it is possible to estimate the number of hills needed to fill the basin of the FES and move to the next minimum.

The MOTION/FREE_ENERGY/METADYN/PRINT/HILLS print key controls the printing of the HILLS file where the information on the spawned hills is stored: timestep, coordinates of the center in the CVs space (3 CV $=$ 3 columns), width in each dimension of the space (3 more columns), height (last column).

The provided trajectory is about 100 ps long and indeed it shows the dissociation of the two molecules into NO₂ and OH, whose OH fragments tend to interact with the graphene layer. From the behavior of the first and second CN the evolution of the system can be somehow deduced. In particular the are clear changes in the NO coordination, that becomes larger when the to molecules get closer, and becomes smaller when the OH is dissociated. Soon after the dissociation, the coordination is again close to 3, because the lost O is compensated by the fact that the two NO₂ fragments stay close together, i.e. each N sees the O of the other fragments. The coordination of O to C, becomes larger when the molecules are closer to the layer, and fluctuates a lot due to the rapid movement of the molecules. After the dissociation, higher values of the CN are kept for longer time, indicating some more stable interaction of O with C. A better choice of the parameters defining this CN might help in resolving more clearly the two states, O interacting and O not interacting with the layer. As predicted the very large fluctuations of the H to C CN are of difficult interpretation and make this variable not very useful for the description of the process. A CN N to H, describing the dissociation of H from HNO₃ be a better choice as third CV.

Other quantities that can be monitored from the MOTION/FREE_ENERGY/METADYN/PRINT/COLVAR output, beside the instantaneous values of the three CVs (2nd, 3rd, and 4th col.) are : the instantaneous gradient of the bias potential computed with respect to CV (5th, 6th, 7th col.), the gradients wit respect to the CVs of wall potentials, if present, (8th,9th,10th col), the instantaneous value of the bias potential, and the instantaneous values of the wall potentials.

Third task: dynamics of Si6H8

The data file for this example are in SI6_CLU. In this case, a small Si cluster of 6 Si atoms saturated by 8 H atoms is studied. Si clusters show different arrangements. The equilibrium structure should be such that Si atoms keep the preferred tetrahedral coordination. In the presence of H saturating the dangling Si bonds, the structure can be open, like the chair structure that is used here as starting conformation. By loosing H atoms, through the formation of molecular hydrogen, the cluster undergoes some rearrangement. The structure should become more compact in order to saturate the Si coordination shell.

As in the previous example, preliminary MD runs are carried out to study the dynamics of the system. The electronic structure is computed at the DFT level, using the PBE functional. A standard constant temperature MD is simulated running the input SI6_CLU/si6_clu_nvt.inp. The temperature is set at 300 K and it is controlled by applying the Nose-Hoover thermostat and the temperature rescaling (the small number of degrees of freedom does not allow thermalization within a few ps). The output of this simulation is in NVT. The energy curve over the 6 ps of simulation time and the visualization of the trajectory confirm that the cluster with the initial stoichiometry can be easily equilibrated in the chair structure.

Possible changes in stoichiometry (loosing H) and structure are going to induce variations in the coordination shell of the Si atoms. Therefore, three CN are selected as CV : Si to Si, Si to H, and H to H. These variables are monitored over the equilibrium trajectory by running the input SI6_CLU/si6_clu_mtd_h0_p1.inp. As in the previous example, this is a MTD input where the the variable MOTION/FREE_ENERGY/METADYN#DO_HILLS is set to false, so that no biasing potential is added. The output obtained running this test is in MTD_H0. By monitoring the three CN along the 10 ps log trajectory (2nd, 3rd and 4th col. of SI6_CLU/MTD_H0/si6_clu_mtd_h0-COLVAR.metadynLog), it is observed that the three variables are well equilibrate, with relatively small fluctuations around the average. Given the chosen parameters

    &COLVAR
       &COORDINATION
          KINDS_FROM  Si
          KINDS_TO   Si
          R_0 [angstrom]  2.55
          NN  8
          ND  14
       &END COORDINATION
    &END COLVAR

the Si-Si CN oscillates around 2. Actually, in the chair configuration 4 Si are three fold coordinated with neighboring Si atoms and 2 are two fold coordinated. The bond length is about 2.4 Å. By changing the curvature of the function, the average value of the CN can be easily moved towards 3 and it can become more sensitive to fluctuations of the Si-Si bond length. The fluctuations of the two other CN are even smaller. Si-H fluctuates around 1.4 and H-H is very close to zero, since in the initial configuration the H atoms do not see each other.

The output of a second simulation that monitors the same three CVs is in MTD_L_H0, and the corresponding input is SI6_CLU/si6_clu_mtd_l_h0_p1.inp. Nothing has been changed in the definition of the CVs, but the Lagrangian MTD formalism has been used. With this scheme, an auxiliary variable is associated to each CV, and when the biasing potential is added, it is defined as function of the auxiliary variables rather than of the CVs. The auxiliary variable behaves as additional degree of freedom. Therefore, an inertial mass is associated to it and its dynamics is determined by integrating the same type of equations of motion as for all the other degrees of freedom. The variable is coupled to the corresponding CV through a harmonic potential, and the forces acting on it are those derived from the harmonic potential and from the MTD biasing potential, when it is present. Hence, in the MOTION/FREE_ENERGY/METADYN/METAVAR input section, two additional parameters are needed, which are the mass of the auxiliary variable and the coupling constant for the harmonic potential:

       &METAVAR
        LAMBDA  2.5
        MASS   30.
        SCALE 0.1
        COLVAR 1
      &END METAVAR

A temperature is associated to the auxiliary variables and can be controlled by temperature rescaling. The use of thermostats for such few degrees of freedom is questionable. The Lagrangian MTD formalism is used in order to better control the kinetics of the CV. This control is obtained through the coupled to the auxiliary variables, whose dynamics depends on the mass and the temperature, besides the intensity of two contribution to the force. Hence, by tuning properly the coupling constant and the mass, the desired effect can be obtained. This might become important in order to collect the correct probability distribution in the configurations space defined by the CV, it is important that the system visits all the accessible conformations. By controlling the dynamics of the selected variables, a sort of adiabatic separation can be imposed between the relevant reaction parameters and all the other degrees of freedom.

Whenever a MOTION/FREE_ENERGY/METADYN/METAVAR is defined (also in a not Lagrangian scheme), it is possible to limit the range of values that are going to be explored by setting a so called WALL potential. This is typically done to avoid the time-consuming exploration of regions of the configurations space that are not relevant for the process that is under investigation. When auxiliary variables are employed, the WALL potential can also be activated in order to avoid the exploration of unphysical values, that can happen by too large fluctuations away from the corresponding CV. In the case of the CN, negative values are unphysical and must be avoided. For this reason the subsection MOTION/FREE_ENERGY/METADYN/METAVAR/WALL is added MOTION/FREE_ENERGY/METADYN/METAVAR coupled to the H-H CN, which is known to oscillate close to zero in the initial state:

         &WALL
            POSITION 0.0
            TYPE QUARTIC
            &QUARTIC
               DIRECTION WALL_MINUS
               K  100.0
            &END
        &END

In this case the potential function is $f(s)=K (s-s_0) ^4$ , and it is activated whenever the variable becomes lower (DIRECTION WALL_MINUS) than the given $s_0$ (POSITION).

When such Lagrangian scheme is used more columns appear in the COLVAR, which contain all the relevant information. The 1st column is always the time in fs, the next $N_{\text{CV}}$ columns are the instantaneous values of the auxiliary variables, followed by $N_{\text{CV}}$ columns where the instantaneous values of the CVs are reported. The next columns report the values of the potential gradients: $N_{\text{CV}}$ columns for the gradients of the harmonic potential, $N_{\text{CV}}$ for the gradients of the MTD biasing potential, and $N_{\text{CV}}$ for the gradient of the WALL potential (these are zeros when the corresponding potential is not activated). The following $N_{\text{CV}}$ are the velocities of the auxiliary variables. Then there are the instantaneous values of the harmonic potential, of the MTD potential, and of the WALL potential. The last column is the temperature of the auxiliary variables.

The dynamics of the CVs along the two simulations, with and without Lagrangian MTD scheme, is equivalent. The CV and the auxiliary variables, in the Lagrangian MTD simulation, closely follow each other, which points to a strong enough coupling (could be also a bei looser). The masses assigned to the auxiliary variables seem not to affect significantly the time evolution at equilibrium, i.e. the inertia effect is quite small, also because the temperature of the auxiliary variables is also set at 300 K. In order to slowdown the oscillations of the three CN, and sample better the accessible configurations at each point of the in the CV space, the parameters to be tuned are then the mass and the temperature of the auxiliary variables. In the input SI6_CLU/si6_clu_mtd_l_h0_p2.inp, the definition of the Si-Si CN has been slightly changed, and the mass of the auxiliary variables has been increased. The effect f these changes can be investigated by running this input and comparing the results with the previous results (both the input can be run at a lower level of theory, just to explore the effects of the different parameters on the dynamics of the CV). A third input is proposed, where the MTD temperature is reduced to 100 K, SI6_CLU/si6_clu_mtd_l_h0_p3.inp.

Fourth task: Lagrangian MTD of the atomic rearrangement of Si6H8

MTD_L_P2 contains the output of the MTD run performed with the parameters tested by running SI6_CLU/si6_clu_mtd_l_h0_p2.inp. The corresponding input is SI6_CLU/si6_clu_mtd_l_p2.inp. The MOTION/FREE_ENERGY/METADYN/METAVAR#SCALE parameter is the same for the three variables, since the fluctuations of the three CN are going to be quite similar. The collocation rate is every 100 MD steps, which is quite often, but reasonable, also because the hill size is not too big (about 1 Kcal/mol for the height).