====== C2H2 and C2H4 bond energy ====== **REMEMBER TO LOAD THE MODULE FOR THE CP2K TRUNK VERSION: module load cp2k/trunk.2.5.13191 and to submit the job chain with bsub < c2h2.a.chain ** Create a new directory for this exercise: you@brutusX ~$ mkdir mmm_exercise_2.1 you@brutusX ~$ cd mmm_exercise_2.1 In this exercise you will test the general AMBER force field, which is suitable for a large class of molecules. In this example you will compute the classical energy of a simple acetylene molecule and apply small variations to the C-C bond lengths. All files of this exercise (**input and scripts are all commented**) can be downloaded from the wiki: {{exercise_2.1.zip|}} Please copy them to your directory. The input file structure is the following: &FORCE_EVAL ! This section defines method for calculating energy and forces METHOD FIST ! Using Molecular Mechanics &MM &FORCEFIELD ! This section specifies forcefield parameters parm_file_name c2h2.top ! This file contains force field parameters parmtype AMBER &SPLINE ! This section specifies parameters to set up the splines used in the nonboned interactions (both pair body potential and many body potential) EMAX_SPLINE 15.0 ! Specify the maximum value of the potential up to which splines will be constructed &END &END FORCEFIELD &POISSON ! This section specifies parameters for the Poisson solver &EWALD ! This section specifies parameters for the EWALD summation method (for the electrostatics) EWALD_TYPE SPME ! Smooth particle mesh using beta-Euler splines ALPHA .36 GMAX 128 ! Number of grid points &END EWALD &END POISSON &PRINT ! This section controls printing options &FF_INFO &END &FF_PARAMETER_FILE &END &END &END MM &SUBSYS ! This section defines the system &CELL ! Unit cell set up ABC 50 50 50 ! Lengths of the cell vectors A, B, and C &END CELL &COORD ! This section specify all the atoms and their coordinates H 2.468 3.488 4.000 C 3.534 3.516 4.000 C 4.715 3.514 4.000 H 5.782 3.482 4.000 &END COORD &TOPOLOGY CHARGE_BETA ! Read MM charges from the BETA field of PDB file CONNECTIVITY AMBER ! Use AMBER topology file for reading connectivity CONN_FILE_NAME c2h2.top ! This file contains connectivity data &END TOPOLOGY &PRINT &TOPOLOGY_INFO AMBER_INFO &END &END &END SUBSYS &END FORCE_EVAL &GLOBAL ! Section with general information regarding which kind of simulation to perform an parameters for the whole PROGRAM PRINT_LEVEL LOW ! Global print level PROJECT c2h2 ! Name of the project. This word will appear as part of a name of all ouput files (except main ouput file, specified with -o option) RUN_TYPE ENERGY ! Calculating of energy for the fixed position of atoms &END GLOBAL The script to modify the C-C distance in the input and generate the curve is the following: #!/bin/bash d0=0.0002 #definition of a step multiplicator nmax=11 # number of steps rm curve.a.c2h2 # delete file curve.a.c2h2 if it exists for n in $(eval echo "{1..$nmax}") # loop, where n changes from 1 to 11 (nmax) do delta=`echo \( $n \- 6 \) \* $d0 | bc -l ` # This command performs delta=n-6*d0 # Where "bc" is a linux calculator. # symbol "\" says to bash that next symbol is a text symbol not a command awk -v delta=$delta '{if (($1 == "H" || $1 == "C" )) { if (ind<=1) {printf "%s %10.8f %10.8f %10.8f \n", $1,$2-delta,$3,$4; ind=ind+1} else {printf "%s %10.8f %10.8f %10.8f \n", $1,$2+delta,$3,$4; ind=ind+1} } else {print $0} }' < c2h2.amber.inp > c2h2.$n.inp # awk reads file c2h2.amber.inp line by line and prints output in the file cp2k.1.inp (if n equals 1) # If the firs word of a line is H or C, then x coordinate is decreased (for the first and the second atom) or increased # (for the third and the fourth atom) by the value of "delta" cp2k.popt -i c2h2.$n.inp > c2h2.$n.out # running cp2k dist=`grep -A 4 '&COORD' c2h2.$n.inp | grep ^C | awk '{x[NR]=$2;y[NR]=$3;z[NR]=$4} END {dx=x[1]-x[2]; dy=y[1]-y[2]; dz=z[1]-z[2];print sqrt(dx*dx+dy*dy+dz*dz)}' ` # ^ # reading the cp2k ouput file # and printing line which # contains the world # "&COORD" + 4 lines below # ^ # printing only # the lines # which begins # from C # ^ # NR is the number of current line (in our case it can be only 1 or 2, because there are only two carbon atoms in C2H2) # When awk finished reading the input file it performs directives, which are written in END { ... } # in our case it's simply calculation the distance between two carbon atoms # ^ finally the variable dist is equal to the distance between two carbon atoms grep 'ENERGY|' c2h2.$n.out | grep -v Run | awk -v dist=$dist '{print dist,$9}' >> curve.a.c2h2 # ^ # print lines from the file c2h2.1.inp (if n equals 1) , which contains the word "ENERGY|" # ^ # don't print lines, which contains the word "Run" # ^ # print distance between two atoms and 9th word from the line which correspond to the value of energy in the file curve.a.c2h2 done # finishing the loop mkdir Logs # creating directory Logs mv c2h2.?.inp c2h2.??.inp c2h2.*.out Logs rm RUN* *restart* # removing all the files, which has name starting from "RUN.." or contains word "restart" At this point submit the job grid, first loading the module for cp2k version 2.5 entering you@brutusX mmm_exercise_2.1$ module load cp2k/trunk.2.5.13191 you@brutusX mmm_exercise_2.1$ bsub < c2h2.a.chain Finally, the gnuplot code to fit the curve to a parabolic function - and to extract the parameters of the bond interaction is the following #!/usr/bin/gnuplot f(x)=k*(x-x0)*(x-x0)+e0 # definition of the function to be fitted delta = 0.002 # e0=0.007720330647107 # initial value of e0 k = 500 # initial value of k x0=1 # initial value of x0 #a0 = 1.3327436840 # set xlabel 'd (Angstrom)' # label of the x-axis set ylabel 'E (kcal/mol)' # label of the y-axis set title "c2h2 triple bond" # main title fit f(x) 'curve.a.c2h2' u ($1):(($2)*627) via k,x0,e0 # fitting procedure # ^ function to be fitted # ^ take data from the file curve.a.c2h2 # and use the first column as X and the second one # ax Y (all values of the second column are multiplied by 627. 1 Hartree = 627 kcal/mol) # ^ k, x0, e0 - are parameters to be adjusted plot 'curve.a.c2h2' u ($1):(($2)*627)-e0 t 'computed AMBER' w lp lw 2 ps 2 pt 7, f(x)-e0 t 'k*(x-x0)^2' w l lw 2 # ^ take data for the first plot from the file curve.a.c2h2 # ^ use the firts column as X and the second as Y # ^ title of the first plot: computed AMBER # ^ draw points and lines # ^ line width = 2 # ^ point size = 2 # ^ point type = 7 # ^ plot function f(x)-e0 # ^ title of the second plot: k*(x-x0)^2 # ^ draw only line # ^ line width = 2 pause mouse # exit by right-clicking on the plot window # EOF Compare the values that you obtain with the ones listed in the "human readable" potential file c2h2-force_field.pot that was generated by cp2k. Now, perform the same exercise in another directory for the molecule C2H4. * Copy the input file and change the coordinates to the following ones: H 2.7558925119 4.9201502245 3.9999999733 H 2.7617214834 3.0599837703 4.0000000267 C 3.3235178526 3.9916405319 4.0000000002 C 4.6493228423 3.9951429630 3.9999999995 H 5.2118318595 4.9261779106 4.0000000269 H 5.2167134504 3.0669045998 3.9999999735 ALL INSTANCES OF C2H2 (filename, potential, etc) have to be changed as well * The script to modify the bond lengths changes slightly as well d0=0.0002 nmax=11 rm curve.a.c2h4 for n in $(eval echo "{1..$nmax}") do delta=`echo \( $n \- 6 \) \* $d0 | bc -l ` awk -v delta=$delta '{if (($1 == "H" || $1 == "C" )) { if (ind<=2) {printf "%s %10.8f %10.8f %10.8f \n", $1,$2-delta,$3,$4; ind=ind+1} else {printf "%s %10.8f %10.8f %10.8f \n", $1,$2+delta,$3,$4; ind=ind+1} } else {print $0} }' < c2h4.inp > c2h4.$n.inp #YOUR Cp2K COMMAND dist=`grep -A 4 '&COORD' c2h4.$n.inp | grep ^C | awk '{x[NR]=$2;y[NR]=$3;z[NR]=$4} END {dx=x[1]-x[2]; dy=y[1]-y[2]; dz=z[1]-z[2];print sqrt(dx*dx+dy*dy+dz*dz)}' ` grep 'ENERGY|' c2h4.$n.out | grep -v Run | awk -v dist=$dist '{print dist,$9}' >> curve.a.c2h4 done mkdir Logs mv c2h4.*.inp c2h4.*.out Logs rm RUN* *restart* * Use a fitting gnuplot script to verify the value of the parameters. What has changed with respect to the C2H2 case? Why? Assignment: - Report the values of the bond length and force constants in all cases. - Discuss these values in view of the kind of molecules we are simulating - Try a calculation (DFT) series with a wider range of bond lengths (say, from +0.3 to -0.3 Angstrom). How does the curve look like?