====== QM/MM study of UREA Zwitterion in water ======

Problem: study of the urea zwitterion in water by means of a QM/MM Hamiltonian.

  * Original author: Marcella Iannuzzi
  * Complete source and output files: [[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA.tar.xz|UREA.tar.xz]]

===== Introduction =====

For this tutorial, input and output files are provided so that the complete procedure for solving the problem can be followed. Some hints are also given to help in the analysis of the results. To run these examples, some paths need to be set correctly in the input files (for instance the variable ''ROOT'').

In this tutorial exercise, we will apply several theoretical aspects covered during the lectures:

  * NDDO Hamiltonian
  * QM/MM Hamiltonians
  * Molecular Dynamics
  * Metadynamics

In order to have a reasonable QM/MM starting structure, we need to prepare a fully functional classical setup. This requires a classical force field for both UREA (zwitterionic form) and water. This tutorial does not cover that step in detail; it was carried out with the [[http://ambermd.org|ambertools]]. For readers who want to understand and master these procedures, the link above provides extensive information. For your convenience, I have included in the main distribution of the source input/output files everything related to the preparation of the force field.
The main UREA example distribution contains the following directories:

  * ''Files'' contains FORCE_EVAL and topology/force field files
  * ''Prepare_Molecules'' contains all files relevant to the creation of the force field for the UREA zwitterion and water using the ambertools
  * ''Prepare_Solvated_Box'' contains files and setup to prepare a box of water solvating the UREA zwitterion
  * ''RUN01_EQUIL_MM'' contains the classical NPT equilibration
  * ''RUN01_EQUIL_MM_AVG'' contains the classical NVT equilibration, using the box averaged over the previous run
  * ''RUN01_EQUIL_QMMM'' contains the QM/MM NVT equilibration
  * ''RUN02_QMMM_MTD1'' contains the sampling (by means of metadynamics) of the reaction from the zwitterionic to the neutral form
  * ''RUN02_QMMM_MTD2'' contains the sampling (by means of metadynamics) of the elimination reaction, i.e. elimination of NH3 and formation of cyanic acid

The tasks we will complete in this tutorial exercise are:

  * Equilibrate the simulation box with a classical Hamiltonian in the NPT ensemble
  * Based on the NPT averages, equilibrate in the NVT ensemble with an averaged simulation box
  * Once the system is equilibrated with the classical Hamiltonian, switch to a QM/MM Hamiltonian, employing an NDDO scheme for the QM part, and equilibrate further in the NVT ensemble
  * Study the chemical reactivity of the zwitterion in solution by inspecting the possibility of the reverse reaction, with formation of neutral urea, or alternatively the elimination reaction, with formation of NH3 and cyanic acid

===== Theoretical Background =====

Urea is formed in large quantities as a product of the catabolism of nitrogen-containing compounds. Owing to its resonance stabilization, urea is highly stable in aqueous solution. For example, urea spontaneously eliminates ammonia to form cyanic acid with a half-life of 3.6 years at 38 °C. The cyanate ion then readily undergoes conversion to CO2 and ammonia.
In contrast, when catalyzed by ureases, urea is generally believed to undergo hydrolysis rather than ammonia elimination, producing either HCO3- and NH4+ or ammonium carbamate, depending on the buffer system. Activation energies for urea decomposition in water at different pH have been obtained experimentally. For neutral pH, the reported activation energy ranges from 28.4 kcal/mol to 32.4 kcal/mol.

There have also been numerous theoretical investigations of the decomposition of urea and related systems. In all of them the explicit representation of the solvent was found to be essential for a detailed resolution of the mechanism, the identification of the rate-determining step and the evaluation of the barrier. In particular, in a paper by Jorgensen et al. the hydrogen-bonded water molecules were found to act as a hydrogen shuttle for the first step of the elimination reaction. The resulting zwitterionic intermediate, H3N(+)CONH(-), participates in 8-9 hydrogen bonds with water molecules, and its decomposition is found to be the rate-limiting step. The overall free energy of activation for the decomposition of urea in water is computed to be 37 kcal/mol, while the barrier for hydrolysis by an addition/elimination mechanism is found to be 40 kcal/mol.

The goal of this exercise is to inspect the chemical reactivity of the zwitterion surrounded by water molecules. In a very simplistic way, only urea will be treated at the QM level, while the rest of the system will be described with a classical Hamiltonian. High-level calculations reported in the literature indicate that the elimination mechanism, leading to cyanic acid and ammonia, proceeds with a barrier of 5.0 kcal/mol.

===== First task: MM isobaric/isothermal ensemble =====

The first task to complete is the NPT equilibration of the entire system (UREA+water) with a classical force field.
In principle, for very difficult molecular cases that are hard to parametrize, one could start directly with a QM/MM equilibration. Because QM/MM calculations are much more expensive than classical ones, it is always better, whenever possible, to equilibrate first with a classical Hamiltonian. The force field, as said in the introduction, has been prepared using the AMBERTOOLS: refer to the link provided above to acquire the proper knowledge on how to set up a classical force field.

In the directory ''Files'', you will find two files, ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/Files/mol_solv.crd|Files/mol_solv.crd]]'' and ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/Files/mol_solv.top|Files/mol_solv.top]]'', as generated by leap in the AMBER format. Feel free to open them and inspect their structure. These two files can also be opened with VMD, in order to inspect what the initial system looks like.

The classical equilibration is performed in the directory ''RUN01_EQUIL_MM''. Let’s have a more detailed look at the input file used to perform the NPT equilibration. In the SUBSYS section the information on the structure and the simulation box is specified as:

  &SUBSYS
    &CELL
      ABC [angstrom] 38.8605230 39.1154930 39.2709120
    &END CELL
    &TOPOLOGY
      CONN_FILE_NAME ${ROOT}/Files/mol_solv.top
      CONNECTIVITY AMBER
      COORD_FILE_NAME ${ROOT}/Files/mol_solv.crd
      COORDINATE CRD
    &END TOPOLOGY
  &END SUBSYS

The initial cell size was provided by LEAP. In the topology we specify both the starting coordinates ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/Files/mol_solv.crd|Files/mol_solv.crd]]'' and the connectivity (mandatory for meaningful classical systems) ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/Files/mol_solv.top|Files/mol_solv.top]]'', specifying it as AMBER type.
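To cross-check the ''ABC'' values against what LEAP wrote, the box can be read back from the coordinate file. The following is a minimal Python sketch, assuming the file follows the classic ASCII AMBER restart layout (title line, atom count, coordinates in 12-character fields, and a final line with box lengths and angles). The function name ''read_amber_box'' and the two-atom sample are hypothetical and only serve the illustration.

```python
def read_amber_box(lines):
    """Return the box lengths (a, b, c) in angstrom from AMBER coordinate text.

    Assumed layout (classic ASCII restart file): line 1 is a title, line 2
    starts with the atom count, coordinates follow in 12-character fields
    (six per line), and the last line holds box lengths, then box angles.
    """
    rows = [ln.rstrip() for ln in lines]
    while rows and not rows[-1]:
        rows.pop()  # ignore trailing blank lines only (the title may be blank)
    natom = int(rows[1].split()[0])
    n_coord_lines = (3 * natom + 5) // 6  # ceil(3 * natom / 6)
    trailing = rows[2 + n_coord_lines:]
    if not trailing:
        raise ValueError("no box line found after the coordinates")
    box_line = trailing[-1]
    fields = [float(box_line[i:i + 12]) for i in range(0, len(box_line), 12)]
    return tuple(fields[:3])


def fixed(values):
    """Format numbers in 12.7 fixed-width fields, as in the assumed layout."""
    return "".join(f"{v:12.7f}" for v in values)


# Hypothetical two-atom file, built in memory for the illustration.
sample = [
    "hypothetical two-atom example",
    "     2",
    fixed([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]),
    fixed([38.8605230, 39.1154930, 39.2709120, 90.0, 90.0, 90.0]),
]
print(read_amber_box(sample))
```

For the real file one would pass ''open("mol_solv.crd").readlines()'' instead of ''sample''. If the file carries no box line, the function raises an error or returns meaningless numbers, so the result should always be compared with the input.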
We run the equilibration at the MM level with the following setup:

  &MM
    &FORCEFIELD
      parm_file_name ${ROOT}/Files/mol_solv.top
      parmtype AMBER
      &spline
        rcut_nb 9.0
      &end
    &END FORCEFIELD
    &POISSON
      &EWALD
        EWALD_TYPE spme
        ALPHA .4
        GMAX 54
        O_SPLINE 4
      &END EWALD
    &END POISSON
  &END

In the ''[[inp>FORCE_EVAL/MM/FORCEFIELD]]'' section we specify the same AMBER topology file used for the connectivity, since it stores the force-field information as well. In FIST (the classical module) the non-bonded potential is mapped onto splines, and in the spline section above we specify the cutoff for these interactions (in this case 9 Å).

The core of a classical run is the evaluation of the electrostatics. We can adjust its parameters in the ''[[inp>FORCE_EVAL/MM/POISSON]]'' section (similarly to DFT calculations). For classical runs we can employ standard Ewald summation, Particle-Mesh Ewald (PME) or Smooth Particle-Mesh Ewald (SPME). For this exercise we employ SPME with a grid mesh of 54 points in all 3 dimensions, and the $\alpha$ parameter for the reciprocal space contributions is equal to 0.4.

The control of the NPT equilibration is specified instead in the ''[[inp>MOTION/MD]]'' section:

  &MD
    ENSEMBLE NPT_I
    STEPS 100000
    TIMESTEP 0.5
    TEMPERATURE 298
    &BAROSTAT
      PRESSURE 1.
      TIMECON [fs] 100.
    &END BAROSTAT
    &THERMOSTAT
      TYPE NOSE
      REGION MOLECULE
      &NOSE
        TIMECON [fs] 100.
      &END
    &END
    &PRINT
      &PROGRAM_RUN_INFO
        &EACH
          MD 100
        &END
      &END
      &ENERGY
        &EACH
          MD 100
        &END
      &END
    &END
  &END MD

For the equilibration we employ an isotropic NPT ensemble (NPT_I), using a time step of 0.5 fs for 100000 steps, i.e. 50 ps overall. The thermostat is a Nosé-Hoover thermostat with a time constant of 100 fs. The barostat uses a time constant of 100 fs and an external pressure of 1 bar (close to 1 atm). It is important to also pay attention to the output frequency of files and other additional information, since it may significantly affect the performance of your calculations at home.
So always be careful about the amount of output in your calculations. At this point you can launch the run: for the provided setup it takes almost 5 hours on a single processor (3 GHz).

Upon completion of the equilibration we need to inspect whether 50 ps were enough to equilibrate the system. Check the convergence of the cell parameters by plotting (with gnuplot or your preferred plotting program) the content of the file ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/RUN01_EQUIL_MM/UREA-ZW.cell|RUN01_EQUIL_MM/UREA-ZW.cell]]''. Take the averages of the cell parameters; we will employ them for the next run, the isothermal equilibration of the classical ensemble.

===== Second task: MM isothermal ensemble =====

Using the average cell parameters determined in the previous run, we set up an input file to run an NVT equilibration, restarting all information except the ''[[inp>FORCE_EVAL/SUBSYS/CELL]]'' from the previous run.

  [..]
  &CELL
    # these are the averaged cell lengths
    ABC [bohr] 38.5881 38.8412 38.9956
  &END CELL
  [..]
  &EXT_RESTART
    RESTART_FILE_NAME ../RUN01_EQUIL_MM/UREA-ZW-1_100000.restart
    RESTART_CELL F
  &END

All the files generated in this task can be found in the directory ''RUN01_EQUIL_MM_AVG''. To equilibrate the system in the NVT ensemble we will run 5 ps. Check the ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/RUN01_EQUIL_MM/UREA-ZW-1.ener|RUN01_EQUIL_MM/UREA-ZW-1.ener]]'' for the conservation of the energy, the temperature fluctuations and the potential energy.

===== Third task: QM/MM isothermal ensemble =====

Starting from the MM system, equilibrated at the right pressure and temperature, we now start an equilibration at the QM/MM level in the directory ''RUN01_EQUIL_QMMM''. Compared to the previous input files, we need to specify everything related to the QM/MM Hamiltonian.
In particular, for this system, in order to keep the computational load small, I decided to treat only the zwitterion at the QM level. We need to specify which atoms will be treated QM and also the atomic kinds of these atoms. In this tutorial example we will use a semi-empirical (SE) method as quantum Hamiltonian, but the extension to DFT is immediate, since the only differences are the specification of the basis set and pseudopotential and of the proper DFT section (none of these settings is related to the QM/MM itself).

All the information about a QM/MM run is specified in the ''[[inp>FORCE_EVAL/QMMM]]'' section. In particular, we first need to specify the QM CELL. This is mandatory and important for DFT calculations (performance, correctness, efficiency), while in principle for SE runs one may use a cell as large as the MM one.

  &CELL
    ABC [angstrom] 20.4199430428 20.5538777943 20.6355827553
    PERIODIC NONE
  &END CELL

The next thing to set up is the type of coupling. For SE we will use

  ECOUPL COULOMB

Finally we need to specify which classical atoms will be treated at the QM level, grouped by kind:

  &QM_KIND O
    MM_INDEX 8
  &END QM_KIND
  &QM_KIND N
    MM_INDEX 1 6
  &END QM_KIND
  &QM_KIND C
    MM_INDEX 5
  &END QM_KIND
  &QM_KIND H
    MM_INDEX 2 3 4 7
  &END QM_KIND

For this specific example, we also need to introduce an additional modification in the force field. In the classical force field, there is no direct Lennard-Jones interaction between the hydrogens of water and the oxygen and nitrogens of UREA. The possible collapse of the hydrogens onto the oxygen or nitrogens is avoided by the presence of Lennard-Jones terms between the oxygen and nitrogens of UREA and the oxygen of water (inspect the force-field file). When performing QM/MM calculations, we may face an additional problem, commonly known as electron spill-out. It is possible, especially for the COULOMB coupling scheme, that the electrons interact in an unphysical way with the classical charges.
This leads to a strong attraction of classical charges into the QM electron density. To avoid this, we add a force-field term that prevents the (extremely light) hydrogens of water from collapsing onto the oxygen or nitrogens of UREA, which would make the system explode. We can do that with an additional ''[[inp>FORCE_EVAL/QMMM/FORCEFIELD]]'' section inside the ''[[inp>FORCE_EVAL/QMMM]]'' one:

  &QMMM
    ...
    &FORCEFIELD
      &NONBONDED
        &LENNARD-JONES
          ATOMS HW N2
          EPSILON [kcalmol] 0.052
          SIGMA [angstrom] 2.42
          RCUT [angstrom] 9.0
        &END
        &LENNARD-JONES
          ATOMS HW N4
          EPSILON [kcalmol] 0.052
          SIGMA [angstrom] 2.42
          RCUT [angstrom] 9.0
        &END
        &LENNARD-JONES
          ATOMS HW O
          EPSILON [kcalmol] 0.058
          SIGMA [angstrom] 2.2612
          RCUT [angstrom] 9.0
        &END
      &END
    &END
  &END QMMM

No other modifications are necessary to perform this task. Run the MD equilibration and inspect the temperature, potential energy and total conserved quantity.

==== Homeworks ====

Try to convert this input to use GPW. Hints: when setting up a correct ''[[inp>FORCE_EVAL/DFT]]'' section, keep in mind that the QM/MM multigrid approach requires the usage of ''[[inp>FORCE_EVAL/DFT/MGRID#COMMENSURATE]]'' grids in the ''[[inp>FORCE_EVAL/DFT/MGRID]]'' section. Moreover, instead of using the COULOMB interaction scheme for the QM/MM coupling, use the GAUSS coupling. Do not forget to provide reasonable basis sets and pseudopotentials in the SUBSYS section for the QM kinds, as defined in the QMMM section.
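For the inspection of temperature, potential energy and conserved quantity requested in the third task, a small script can complement the plots. The following is a minimal Python sketch, assuming the ''.ener'' file has the usual column order (step, time in fs, kinetic energy, temperature, potential energy, conserved quantity, used time) and that comment lines start with ''#''. The function name ''ener_summary'' and the four synthetic data rows are hypothetical; check the header of your own file before relying on the column indices.

```python
def ener_summary(lines, skip_fraction=0.2):
    """Summarize temperature, potential energy and conserved-quantity drift.

    Assumed columns (0-based): 1 = time [fs], 3 = temperature [K],
    4 = potential energy [a.u.], 5 = conserved quantity [a.u.].
    The first skip_fraction of the rows is discarded as equilibration.
    """
    rows = []
    for ln in lines:
        s = ln.strip()
        if not s or s.startswith("#"):
            continue  # skip blank and header lines
        rows.append([float(x) for x in s.split()])
    kept = rows[int(len(rows) * skip_fraction):]
    n = len(kept)
    if n < 2:
        raise ValueError("need at least two data rows")
    temps = [r[3] for r in kept]
    pots = [r[4] for r in kept]
    mean_t = sum(temps) / n
    std_t = (sum((t - mean_t) ** 2 for t in temps) / n) ** 0.5
    # Drift of the conserved quantity per fs between first and last kept row.
    drift = (kept[-1][5] - kept[0][5]) / (kept[-1][1] - kept[0][1])
    return {"mean_T": mean_t, "std_T": std_t,
            "mean_pot": sum(pots) / n, "cons_drift_per_fs": drift}


# Synthetic rows for illustration only, not output of a real run.
sample = [
    "# Step Nr. Time[fs] Kin.[a.u.] Temp[K] Pot.[a.u.] Cons Qty[a.u.] UsedTime[s]",
    "0 0.0 1.0 290.0 -10.0 -9.000 0.1",
    "1 0.5 1.0 300.0 -10.0 -8.999 0.1",
    "2 1.0 1.0 310.0 -10.0 -8.998 0.1",
    "3 1.5 1.0 300.0 -10.0 -8.997 0.1",
]
summary = ener_summary(sample, skip_fraction=0.0)
print(summary)
```

A large drift of the conserved quantity or a mean temperature far from the target suggests that the run is not yet equilibrated or that the time step is too large. The same column-averaging idea can be applied to the ''.cell'' file of the first task, once its column layout has been checked in the file header.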
===== Fourth task: QM/MM Metadynamics simulations =====

Starting from the equilibrated QM/MM system, we will perform two metadynamics runs to inspect:

  * the zwitterionic-to-neutral reaction in solution (in directory ''RUN02_QMMM_MTD1'')
  * the elimination reaction, producing cyanic acid and ammonia (in directory ''RUN02_QMMM_MTD2'')

==== Zwitterion-Neutral mechanism ====

In order to sample the reverse reaction, from zwitterion to neutral, we employ two collective variables based on coordination:

  &COLVAR
    &COORDINATION
      ATOMS_FROM 1
      ATOMS_TO 2 3 4 7
      R0 [angstrom] 1.2
    &END
  &END
  &COLVAR
    &COORDINATION
      ATOMS_FROM 6
      ATOMS_TO 2 3 4 7
      R0 [angstrom] 1.2
    &END
  &END

The two CVs defined above represent the coordination of each nitrogen with all QM hydrogens. We define the following setup to run MTD (see the lecture notes on MTD):

  &FREE_ENERGY
    METHOD METADYN
    &METADYN
      DO_HILLS
      NT_HILLS 60
      WW [kcalmol] 1.0
      &METAVAR
        COLVAR 1
        SCALE 0.2
      &END METAVAR
      &METAVAR
        COLVAR 2
        SCALE 0.2
      &END METAVAR
      &PRINT
        &COLVAR
          &EACH
            MD 10
          &END
          COMMON_ITERATION_LEVELS 1
        &END COLVAR
      &END PRINT
    &END METADYN
  &END FREE_ENERGY

Inspect the ''[[http://cp2k.org/static/exercises/2015_cecam_tutorial/UREA/RUN02_QMMM_MTD1/UREA-ZW-COLVAR-1.metadynLog|RUN02_QMMM_MTD1/UREA-ZW-COLVAR-1.metadynLog]]'' until you see the exploration of a second basin. Inspecting the trajectory file is always strongly recommended. To determine the free energy profile, employ the fes.sopt program. How deep is the basin?

==== Elimination ====

The elimination reaction is sampled along the CV representing the bond between the NH3 and the HNCO moiety:

  &COLVAR
    &DISTANCE
      ATOMS 1 5
    &END
  &END

Similarly to the previous mechanism, inspect the COLVAR files and the trajectory, and determine the barrier for the elimination process.

===== Questions =====

Evaluate the free energy for both processes: from the zwitterionic to the neutral form, and for the elimination pathway. How do these numbers compare with the 5 kcal/mol predicted in several published works?
Inspect the metadynamics trajectory carefully in order to find an explanation (what does the hydrogen of NH3(+) first attempt to do before moving towards the NH(-) group? What would happen if the nearby water molecules were treated QM?).

===== Homeworks =====

Take into account a primary solvation shell of water molecules as part of the QM subsystem, using a FLEXIBLE_PARTITIONING scheme to prevent the diffusion of the QM water molecules. Re-run the equilibration steps and perform the zwitterionic-neutral metadynamics. Do you see any change in the barrier? Why?
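As an aid for interpreting the coordination collective variables of the fourth task, the following minimal Python sketch evaluates a coordination number for a given geometry. It assumes a rational switching function with exponents 6 and 12 and a normalization by the number of ''ATOMS_FROM''; this functional form should be checked against the CP2K manual for the version in use. Periodic images are ignored, and the function names and the toy geometry are hypothetical.

```python
import math


def switching(r, r0, nn=6, nd=12):
    """Rational switching function (1 - x**nn) / (1 - x**nd) with x = r / r0."""
    x = r / r0
    if abs(x - 1.0) < 1e-12:
        return nn / nd  # analytic limit at r = r0, avoids 0/0
    return (1.0 - x ** nn) / (1.0 - x ** nd)


def coordination(from_xyz, to_xyz, r0, nn=6, nd=12):
    """Average over the 'from' atoms of the summed switching function.

    Coordinates and r0 must share the same length unit (angstrom here).
    No periodic boundary conditions are applied in this sketch.
    """
    total = 0.0
    for a in from_xyz:
        for b in to_xyz:
            total += switching(math.dist(a, b), r0, nn, nd)
    return total / len(from_xyz)


# Toy geometry: one nitrogen, three hydrogens at 1.0 angstrom, one at 3.0.
nitrogen = [(0.0, 0.0, 0.0)]
hydrogens = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (3.0, 0.0, 0.0)]
cn = coordination(nitrogen, hydrogens, r0=1.2)
print(round(cn, 3))
```

With ''R0'' at 1.2 Å, a bonded hydrogen at about 1.0 Å contributes noticeably less than 1 to the CV, so an NH3(+) group does not give exactly 3. This is worth keeping in mind when reading the basins in the COLVAR output.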