Resurrection script
Setting up the calculation
First, copy the three scripts listed on this page into your ~/bin/ folder on MareNostrum, and run chmod +x on each of them to make them executable.
Next, create the main directory for your dynamics and the first subfolder inside it, for example:
you@login1:~> mkdir my_dynamics
you@login1:~> mkdir my_dynamics/1
you@login1:~> cd my_dynamics/1
Now put your basic input files (INCAR, KPOINTS, POTCAR and POSCAR) in folder 1. Your first run.sh script should look like this (here, running in the class_c queue):
#!/bin/bash
#BSUB -J name_of_job_1
#BSUB -q class_c
#BSUB -n 64
#BSUB -W 23:59
#BSUB -o o_name_of_job_1.%J
#BSUB -e e_name_of_job_1.%J
#BSUB -u youremail@iciq.es
#BSUB -R"span[ptile=16]"

### Load environment variables ###########
module load VASP/5.3.3

### Run job ##############################
resurrection_timecontrol 23 30 r_name_of_job_1 &
mpirun vasp.complex ; touch stopflag ; resurrection name_of_job_ 1 16c 64 23 30 ; echo the dynamics has been resurrected >> r_name_of_job_1 ; exit
Explanation:
- In this example, you are running on 64 processors in the class_c queue.
- You have three quickly accessible log files: o_* is the standard output, e_* contains the errors, and r_* contains the information related to the resurrection process.
- Your time limit is 23:59 hours; the maximum allowed by class_c is 24:00.
- Before starting VASP, the script launches resurrection_timecontrol in the background. It stops the calculation after 23:30 hours by writing a STOPCAR file (LSTOP = .TRUE.).
- Then the script runs VASP in the current folder.
- If the VASP calculation ends before the time limit, the script creates a flag file (stopflag) that makes resurrection_timecontrol exit, so that no phantom process lingers for hours.
- The calculation is then resurrected with the name name_of_job_2, in folder 2 (see Script 1 for more details), on the same queue, with the same number of processors and the same time control. The resurrection script calls rungen_resurrection internally, but you can merge the two if you prefer.
- This set of scripts is fully self-contained.
- The scripts have been tested and debugged.
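As a quick sanity check of the timing used in this example, the margin between the moment STOPCAR is written (23 h 30 min) and the limit requested with #BSUB -W (23:59) can be computed. The helper name stop_margin_minutes is hypothetical and is not one of the three scripts; this is only a sketch.

```shell
#!/bin/bash
# Hypothetical helper (not one of the three scripts): minutes left between
# the moment STOPCAR is written and the wall-clock limit given to "#BSUB -W".
# $1 = -W limit as HH:MM
# $2 = hours and $3 = minutes passed to resurrection_timecontrol
stop_margin_minutes() {
  local wall_h=${1%%:*} wall_m=${1##*:}
  # 10# forces base 10 so that values such as "08" are not read as octal
  echo $(( (10#$wall_h * 60 + 10#$wall_m) - (10#$2 * 60 + 10#$3) ))
}

stop_margin_minutes 23:59 23 30   # prints 29
```

VASP only acts on LSTOP = .TRUE. at the end of the current ionic step, so this margin should be comfortably longer than one ionic step of your system.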
Now that you know how this works, start the calculation by typing:
you@login1:~/my_dynamics/1> bsub < run.sh
Do not forget to check on your calculations every day, and verify that all your electronic cycles have converged.
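One possible way to spot unconverged electronic cycles is to look for ionic steps whose electronic loop ran for NELM iterations. The sketch below assumes the usual OSZICAR layout (electronic lines starting with a label such as DAV: or RMM:, followed by one summary line containing F=). The function name unconverged_steps is hypothetical and is not part of the scripts on this page.

```shell
#!/bin/bash
# Hypothetical helper: print the index of every ionic step whose electronic
# loop used at least NELM iterations, i.e. steps that probably did not converge.
# Assumed OSZICAR layout: electronic lines such as "DAV:  12 ..." or
# "RMM:  12 ...", then one summary line per ionic step containing " F= ".
# $1 = path to OSZICAR, $2 = value of NELM used in your INCAR
unconverged_steps() {
  awk -v nelm="$2" '
    /^[A-Z]+: +[0-9]+/ { n = $2 + 0 }     # remember the last electronic iteration
    / F= /             { ion++            # the summary line closes an ionic step
                         if (n >= nelm) print ion
                         n = 0 }
  ' "$1"
}

# Example call from inside a run folder (60 is the VASP default for NELM):
#   unconverged_steps OSZICAR 60
```

An empty output means that no ionic step reached NELM. It does not replace a look at the energies themselves.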
Script 1: resurrection
#!/bin/bash
# Rodrigo García-Muelas
# 28/03/2013
#
# Input:
# $1 Name of work
# $2 Suffix (number id)
# $3 Queue
# $4 Number of processors
# $5 Number of hours of runtime
# $6 Extra number of minutes of runtime
#
# Motivation: I create a directory for the next step.
# Then, I create the new run.sh, which shall call this script,
# and submit it.
# run.sh has an internal time control

i=$(($2+1))
mkdir ../$i

# Restart from the last geometry: CONTCAR becomes the new POSCAR
cp ./INCAR ../$i/INCAR
cp ./KPOINTS ../$i/KPOINTS
cp ./CONTCAR ../$i/POSCAR
cp ./POTCAR ../$i/POTCAR

# Large files are moved, not copied, to save disk space
mv ./WAVECAR ../$i/WAVECAR
mv ./CHGCAR ../$i/CHGCAR
rm ./CHG

cd ../$i/
rungen_resurrection $1 $i $3 $4 $5 $6 # generate run.sh
bsub < run.sh # submit run.sh
exit
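To see which folder and job name the next link of the chain will get, the suffix arithmetic of Script 1 can be tried in isolation. The function next_step below is a hypothetical dry-run helper: it copies nothing and submits nothing.

```shell
#!/bin/bash
# Hypothetical dry-run helper: print the folder and job name that
# "resurrection" would use next, without copying files or calling bsub.
# $1 = name of work, $2 = current suffix
next_step() {
  local i=$(( $2 + 1 ))          # same increment as in the resurrection script
  echo "folder ../$i job $1$i"
}

next_step name_of_job_ 1   # prints: folder ../2 job name_of_job_2
```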
Script 2: rungen_resurrection
#!/bin/bash
# Rodrigo García-Muelas
# 28/03/2013
#
# Input:
# $1 Name of work
# $2 Suffix (number id)
# $3 Queue
# $4 Number of processors
# $5 Runtime hours
# $6 Runtime minutes (add)
#
# Motivation: I create a directory for the next step.
# Then, I create the new run.sh, which shall call this script

case $3 in
  16a) queue=class_a ; mar=1 ; procqueue=16 ; maxhours=47 ;;
  16b) queue=class_b ; mar=1 ; procqueue=16 ; maxhours=22 ;; # maybe they give priority to shorter works
  16c) queue=class_c ; mar=1 ; procqueue=16 ; maxhours=22 ;; # idem
  *) echo "Error in queue name!!! " ; exit ;;
esac

# Check that the number of processors is a multiple of procqueue
let AAA=`expr $4 % $procqueue` ; if [ 0 != $AAA ] ; then exit 1 ; fi

# Generating the run.sh file
# NOTE: replace the address after "#BSUB -u" with your own e-mail
cat >run.sh<<!
#!/bin/bash
#BSUB -J $1$2
#BSUB -q $queue
#BSUB -n $4
#BSUB -W $5:59
#BSUB -o o_$1$2.%J
#BSUB -e e_$1$2.%J
#BSUB -u rgarcia@iciq.es
#BSUB -R"span[ptile=16]"

### Load environment variables ###########
module load VASP/5.3.3

### Run job ##############################
resurrection_timecontrol $5 $6 r_$1$2 &
mpirun vasp.complex ; touch stopflag ; resurrection $1 $2 $3 $4 $5 $6 ; echo the dynamics has been resurrected >> r_$1$2 ; exit
!
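The processor check in Script 2 can also be written with bash arithmetic instead of expr. The sketch below is an alternative, not the author's code: the function name valid_nproc is hypothetical, 16 is the procqueue value that the script sets for all three queues, and the extra test for a positive number is an addition.

```shell
#!/bin/bash
# Sketch of the processor-count check using bash arithmetic instead of expr.
# $1 = number of processors, $2 = processors per node (defaults to 16)
valid_nproc() {
  local nproc=$1 procqueue=${2:-16}
  # succeed only for a positive multiple of procqueue
  [ "$nproc" -gt 0 ] && [ $(( nproc % procqueue )) -eq 0 ]
}

valid_nproc 64 && echo "64 is valid"        # prints: 64 is valid
valid_nproc 40 || echo "40 is not valid"    # prints: 40 is not valid
```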
Script 3: resurrection_timecontrol
#!/bin/bash
#
# Rodrigo García-Muelas
# Improved on May 17th, 2013
#
# INPUT
#
# $1 number of hours +
# $2 number of minutes
#    (before generating file STOPCAR)
# $3 name of file
#
# INTERNAL
#
# timeini : The calculation starts
# timeend : The calculation ends
# timenow : Current time

timeini=`date +'%s'`
timenow=$timeini
timeend=$(($timeini+3600*$1+60*$2))

echo resurrection flags are timeini $timeini timeend $timeend >> $3

# If VASP finishes before timeend, kill this process
while [ $timenow -lt $timeend ] ; do
  if [ -e stopflag ] ; then rm stopflag ; echo resurrection: VASP finished normally at $timenow >> $3 ; exit ; fi
  sleep 5s # Verify status every 5 seconds
  timenow=`date +'%s'`
done

# If timeend is reached, write STOPCAR
echo resurrection: writing STOPCAR at $timenow >> $3
cat >STOPCAR<<!
LSTOP = .TRUE.
!
exit
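The polling logic of Script 3 can be tried in a few seconds with short, made-up time limits. In the sketch below the function name wait_for_flag and its parameters are hypothetical. It only prints which branch was taken, whereas the real script writes to the r_* log and creates STOPCAR.

```shell
#!/bin/bash
# Sketch of the polling loop with short, hypothetical parameters.
# $1 = flag file, $2 = timeout in seconds, $3 = polling interval in seconds
wait_for_flag() {
  local deadline=$(( $(date +%s) + $2 ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if [ -e "$1" ]; then
      rm -f "$1"
      echo finished          # the real script logs this and exits
      return 0
    fi
    sleep "$3"
  done
  echo timeout               # the real script writes STOPCAR at this point
}

# Example: create the flag first, then poll for at most 5 seconds
#   touch stopflag ; wait_for_flag stopflag 5 1    # prints: finished
```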