Replica exchange of the disordering of a cluster

Replica exchange of the disordering of a cluster

For this job we will use the cluster HYPATIA available at Empa. There we have access to parallel facilities with reserved nodes for the lecture. How to connect to HYPATIA:

Dear Student,

In order to be able to run simulations at high priority, today we will work on the Empa Cluster. We have created a personal account for you. Since the cluster is behind a firewall, we must connect to a gate machine (jumphost) to be allowed to access to the cluster. For security reasons, there are two temporary passwords that you should change to a personal password (can be the same for the gate and for the cluster.

Here the instructions to connect. Your username/password (EMPA-USER, TEMP-PASSW1, TEMP-PASSW2) are listed at the end of this message.

1) Decide a password (we will call it EMPA-PASSW )

2) connect to the jumphost:

ssh -X EMPA-USER@jump1.empa.ch Password: TEMP-PASSW1

3) Accept the contract

4) Set a new password (input old password, TEMP-PASSW1, write new password, EMPA-PASSW)

5) Connect to hypatia: ssh -X hypatia password: TEMP-PASSW2

6) Accept the contract

7) Change your password as in point 4) using TMP-PASSW2 as old password and set EMPA-PASSW

User-specific information (note: TMP-PASSW1 ist the password for jump1, that is the FIRST one, but is listed as second):

EMPA-USER:TMP-PASSW2:TMP-PASSW1

[you@hypatia ~]$ mmm-init
[you@hypatia ~]$ cd /mnt/scratch/YOURUSER/
[you@hypatia ~]$ cp -r /home/psd/exercise_7 .
[you@hypatia ~]$ cd exercise_7

The commands that you need to do to perform the exercise are, in this order:

[you@hypatia ~]$ qsub 00_run
[you@hypatia ~]$ ./01_adapt_files
[you@hypatia ~]$ ./02_reorder
[you@hypatia ~]$ ./03_extract_allaverages

Running the job

The script contains the directives for the queuing system, including 16 cores on one nodes reserved for the job.

#=== job name:
#PBS -N parallel 
#=== wall time limit (h:m:s)
#PBS -l walltime=1:00:00
#choice of the number of nodes and proc. per node
#PBS -l nodes=1:ppn=16 
#PBS -q short
#which queue
#=== memory usage
##PBS -l mem=1024mb
#=== join stdout and stderr
#PBS -j oe
#======================================
 
#
# set environment variables
#
 
module unload mvapich2
module load openmpi
module load lammps/17Nov16/openmpi/2.0.1/gcc/4.9.4
cd $PBS_O_WORKDIR          
 
rm parallel.o* log.* screen*
 
mpiexec -np 16 lmp_mpi -partition 16x1   -in input

The last line is the command to run a parallel lammps job with the input file input

The input file for lammps

The file input contains information for the program lammps. Details on the documentation can be found here

There is an initialization section, showing the kind of units (see this page), the dimensionality, the boundary conditions.

# Initialization
units            metal
dimension        3
boundary         p p p
atom_style       atomic

In the second part of the input file a spherical region is defined (to confine the cluster). Then the atoms are read from input.dat. We also assign a mass to the kind number 1 (there is just one atomic type for Argon).

region rs sphere 0 0 0 12.66
read_data        input.dat
mass 1 39.948

Then, we define the parameters for the Lennard-Jones potential. The units are eV for epsilon, and angstrom for sigma. The last number is the cutoff, in Angstrom.

pair_style lj/cut 8.5
pair_coeff 1 1 0.01042 3.405  8.5

Then, we initialize the fix and the velocity as well as the temperature of each replica, which have been previously generated using the program t.x present in the same directory. We distribute the temperature exponentially between 2 K and 40 K. In LAMMPS, a fix is any operation that is applied to the system during timestepping or minimization. Here we have a fix for controlling temperature with NVT (a different temperature for each temperature), and a fix for applying a harmonic restraint to the spherical region confining the cluster. In this way, the atoms going beyond this region will be elastically pushed back into the sphere.

variable i equal part
variable t world  2.00    2.44    2.98    3.64    4.45    5.43    6.63    8.09    9.88   12.07   14.74   17.99   21.97   26.83   32.76   40.00
velocity        all        create        $t        293288
velocity all zero linear
velocity all zero angular
fix        1        all        nvt        temp        $t        $t        0.1
fix  2 all wall/region rs harmonic 2.0 0.0 0.4

The next section is about writing out each 1000 steps the relevant information about temperature and energy. We also dump a restart file at the end, and every 10000 steps a structure in xyz format.

thermo         1000
thermo_style        custom        step  temp     pe ke etotal
thermo_modify       line        one
restart 5000000 restart.*
dump         2 all xyz 10000 structure_$i.xyz
dump_modify  2 element "Ar" sort id

Finally, this is the command to run the tempering, with an exchange move attempted every 1000 step of molecular dynamics and an initial temperature $t that is different from replica to replica. The last numbers are random seeds that are used for choosing which replica to exchange and for the Metropolis criterion.

temper 5000000 1000 $t 1 3678 3490

Adapting the output files

We must now make some postprocessing on the output files. The goal is to performs averages at different temperatures. These averages are enhanced by the exchanges that were performed between different molecular dynamics replica. Note that temperature is set by a thermostat.

Example. Processor 0 starts with temperature T0=2 K, processor 1 with temperature T1=2.44 K. After 1000 steps, an exchange step is attempted and accepted with some probability (see theory slides, and also the paper 10.1063/1.481671. After the exchange move, the temperature of processor 0 is 2.44 and the one of processor 1 is 2 K. But you can see it also as the configurations of T0=2 K and the one of T1=2.44 K are changing, thus improving the sampling at both temperature.

The script 01_adapt_files performs the following operations:

prunes the log.lammps file which contains a log of all exchanges between the replicas. Take only the steps for which we also have a dump of the atomic coordinates.
For all the log.lammps.* files from each replica take only the lines for which we also have a dump of the atomic coordinates. These lines are put in a file *.nxyz, one for each replica. Each line contains temperatures, potential energies, etc.
Compute the q4 order parameter for all structure files and create *.q4 files, one for each replica.
now paste the *.nxyz and the *.q4 files into a file t_q4_epot_etot.*.out containing the dump of temperature, energy, q4 every 10000 steps.

Reordering the replica: one temperature, one file

At this point, we have a set of t_q4_epot_etot.*.out, one for each replica (processor). But along each of these files, the temperatures change a lot due to the exchanges. So, we use the file exchanges_nxyz.log that keeps track of the exchanges, and tells us at a given timestep which replica has which temperature: we scramble the t_q4_epot_etot.*.out files, and at the end we will have one file for each temperature. This is accomplished by the script 02_reorder.

Consider each file t_q4_epot_etot.*.out (processor by processor). Say you consider the number 5 (6th replica): t_epot_q4_etot.5.out.
At the step 50000, the file shows the following line:

50000 6.7133746 -1.7636174 0.189 -1.7315099

indicating a temperature of 6.7133746.

The file exchanges_nxyz.log, at the step 50000, gives us the following line:

50000 7 0 3 2 1 6 10 5 12 8 11 4 9 13 15 14

indicating that at the 6th replica (column 7), we have the temperature 6, which is (see input file) T=6.63 K. Meaning that at step 50000, the thermostat is keeping replica 5 around the temperature T=6.63 K.

This means that this line has to be stored in the temperature file number 6.

At the end of the above procedure performed by the small script section:

NP=16
NP1=$[NP-1]
rm torder*
for repl in `seq 0 $NP1`
do
   echo $repl
   awk -v rep=$repl '{r2=rep+2;print $r2}'  < exchanges_nxyz.log  > rep_$repl
   i=0
   for a in `cat rep_$repl`
   do
        i=$[i+1]
        head -$i t_q4_epot_etot.$repl.out | tail -1  >> torder.$a
   done
done

we will have a set of files, one for each temperature. The file torder.6 (showing the temperature log around T=6.63 K shows something like that:

110000 6.0832407 0.188 -1.7669426 -1.7378488
300000 5.3292135 0.189 -1.7741021 -1.7486144
460000 7.270977 0.188 -1.7594967 -1.7247223
850000 5.547995 0.189 -1.7583209 -1.7317869
900000 6.0463203 0.190 -1.7563726 -1.7274553
1100000 7.4527984 0.189 -1.7608437 -1.7251998
1160000 7.660013 0.189 -1.7653205 -1.7286855
1290000 7.634912 0.188 -1.7551173 -1.7186023
1520000 6.7791476 0.190 -1.7719473 -1.7395252
1530000 5.562028 0.189 -1.7551797 -1.7285786
1540000 5.9499865 0.189 -1.7682706 -1.739814
1560000 8.0181451 0.186 -1.7549744 -1.7166267
1670000 6.4413007 0.189 -1.7601051 -1.7292988
1740000 5.5362416 0.188 -1.7592589 -1.7327812
1750000 6.8539271 0.189 -1.7645124 -1.7317327
2030000 7.8928443 0.188 -1.7657447 -1.7279962
2040000 5.3275227 0.189 -1.763795 -1.7383155
2100000 5.7265507 0.189 -1.7645332 -1.7371452
2550000 8.1985344 0.189 -1.7581595 -1.7189489
2580000 7.3481203 0.190 -1.7668799 -1.7317366
2780000 6.7587102 0.189 -1.7581622 -1.7258378
2800000 7.1581346 0.188 -1.7609368 -1.7267022
...

As you see, the number of steps is not ordered. This is easily achieved by the last part of the script 02_reorder

for repl in `seq 0 $NP1` 
do
    sort -nk1 torder.$repl > temp 
    mv temp torder.$repl
done

and now the same file torder.6 shows the following lines:

0 6.5781351 0.191 -1.7950808 -1.7636201
10000 5.4632389 0.188 -1.7609687 -1.7348401
20000 5.498244 0.189 -1.7597787 -1.7334826
30000 5.5142334 0.190 -1.7559687 -1.7295962
40000 7.4876442 0.189 -1.7622814 -1.7264708
50000 6.7133746 0.189 -1.7636174 -1.7315099
60000 5.9256132 0.188 -1.7593177 -1.7309777
70000 5.8414791 0.182 -1.7619757 -1.7340381
80000 3.9373038 0.189 -1.7687489 -1.7499183
90000 9.949782 0.189 -1.7640962 -1.7165101
100000 7.5855163 0.189 -1.7616613 -1.7253826
110000 6.0832407 0.188 -1.7669426 -1.7378488
120000 7.047375 0.189 -1.7588753 -1.7251703
130000 6.3651424 0.188 -1.7596141 -1.729172
140000 8.268057 0.188 -1.7647263 -1.7251833
150000 5.9081219 0.189 -1.7641776 -1.7359213
160000 5.2026849 0.188 -1.7603192 -1.7354367
170000 7.1694387 0.190 -1.762217 -1.7279282
180000 5.3619579 0.188 -1.7596472 -1.7340029
190000 7.9061423 0.188 -1.7631399 -1.7253278
200000 8.0048742 0.188 -1.7612416 -1.7229573
210000 9.5218385 0.189 -1.758481 -1.7129416
220000 6.3793891 0.189 -1.7658995 -1.7353892
230000 7.5105967 0.189 -1.7545324 -1.7186121
240000 7.6066407 0.188 -1.7643938 -1.7280141
250000 5.969687 0.189 -1.7611185 -1.7325677
260000 6.6266784 0.189 -1.761914 -1.730221
270000 6.8500414 0.181 -1.7615648 -1.7288036
280000 4.0299504 0.187 -1.7663177 -1.747044
...

Extract averages

Now we are ready to extract averages at each temperature. This is achieved by the m_* function m_average (hint: look for the code of this function in the file /share/apps/m_functions.bash), which is used in the script 03_extract_allaverages.

. /share/apps/m_functions.bash
rm averages_t_q4_epot_etot
for a in torder.? torder.??
do
 t=`m_average $a 2`
 q4=`m_average $a 4`
 epot=`m_average $a 3`
 etot=`m_average $a 5`
 echo $t $q4 $epot $etot  >> averages_t_q4_epot_etot
done

At this point you have a file averages_t_q4_epot_etot with the corresponding averages for each temperature.

ASSIGNMENTS

Using gnuplot, plot the steps vs. q4 (columns 1 and 3) from the file t_q4_epot_etot.0.out, t_q4_epot_etot.13.out, t_q4_epot_etot.15.out. Comment what you observe.
Compare using gnuplot the plot of the nsteps vs. potential energy (columns 1 and 4) for t_q4_epot_etot.13.out and torder.13. Comment the differences
Plot the q4 in torder.0, torder.5, torder.10, torder.15. Comment the differences.
Using the averages file, try to reproduce figure 2, top panel of the paper 10.1063/1.481671.
ADVANCED . Describe what you would need to reproduce Fig. 1 of the same paper. What does this figure show? Find the reference to this figure in the text of the paper.
ADVANCED . Using the torder.* files, and using eq. (14), obtain Fig. 1 of the paper.

Table of Contents