Getting started on the NCI supercomputer (Australian National Supercomputing Facility) ======================================================================================= Apply for an account at http://nci.org.au If you are in Daniel Price’s research group, request to join project “wk74” Log in ------- Please read the :doc:`general instructions for how to log in/out and copy files to/from a remote cluster ` :: ssh -Y USER@gadi.nci.org.au Configure your environment --------------------------- First edit your .bashrc file in your favourite text editor:: vi ~/.bashrc Mine has:: export SYSTEM=nci export PROJECT=wk74 export OMP_STACKSIZE=512M export OMP_NUM_THREADS=32 export SPLASH_DIR=/g/data/$PROJECT/splash export PATH=$PATH:$SPLASH_DIR/bin export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$SPLASH_DIR/giza/lib export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/apps/hdf5/1.10.5/lib export MAXP=2000000 ulimit -s unlimited source ~/.modules If you are using phantom+mcfost, you will need the following lines:: export MCFOST_DIR=/g/data/$PROJECT/mcfost-src/ export MCFOST_AUTO_UPDATE=0 export MCFOST_INSTALL=/g/data/$PROJECT/mcfost/ export MCFOST_GIT=1 export MCFOST_NO_XGBOOST=1 export MCFOST_LIBS=/g/data/$PROJECT/mcfost export MCFOST_UTILS=/g/data/$PROJECT/mcfost/utils export HDF5ROOT=/apps/hdf5/1.10.5/lib Then relevant modules in your .modules file:: vi ~/.modules Mine contains:: module load intel-compiler module load intel-mpi module load git module load hdf5 where the last line is needed for git's large file storage (LFS) to work. Finally, make a shortcut to the /g/data filesystem:: cd /g/data/$PROJECT mkdir $USER cd ln -s /g/data/$PROJECT/$USER runs cd runs pwd -P Get phantom ------------ Clone a copy of phantom into your home directory:: $ cd $HOME $ git clone https://github.com/danieljprice/phantom.git $ cd phantom and tell git who you are:: $ git config --global user.name "Joe Bloggs" $ git config --global user.email "joe.bloggs@monash.edu" Run a calculation ------------------ then make a subdirectory for the name of the calculation you want to run (e.g. tde):: $ cd; cd runs $ mkdir tde $ cd tde $ ~/phantom/scripts/writemake.sh grtde > Makefile $ make setup $ make then run phantomsetup to create your initial conditions:: $ ./phantomsetup tde (just press enter to all the questions to get the default) To run the code, you need to write a pbs script. You can get an example by typing “make qscript”:: $ make qscript INFILE=tde.in JOBNAME=myrun > run.q should produce something like:: $ cat run.q #!/bin/bash ## PBS Job Submission Script, created by "make qscript" Tue Mar 31 12:32:08 AEDT 2020 #PBS -l ncpus=48 #PBS -N myrun #PBS -q normal #PBS -P wk74 #PBS -o tde.in.pbsout #PBS -j oe #PBS -m e #PBS -M daniel.price@monash.edu #PBS -l walltime=48:00:00 #PBS -l mem=16G #PBS -l storage=gdata/wk74 #PBS -l other=hyperthread ## phantom jobs can be restarted: #PBS -r y cd $PBS_O_WORKDIR echo "PBS_O_WORKDIR is $PBS_O_WORKDIR" echo "PBS_JOBNAME is $PBS_JOBNAME" env | grep PBS cat $PBS_NODEFILE > nodefile echo "HOSTNAME = $HOSTNAME" echo "HOSTTYPE = $HOSTTYPE" echo Time is `date` echo Directory is `pwd` ulimit -s unlimited export OMP_SCHEDULE="dynamic" export OMP_NUM_THREADS=48 export OMP_STACKSIZE=1024m echo "starting phantom run..." export outfile=`grep logfile "tde.in" | sed "s/logfile =//g" | sed "s/\\!.*//g" | sed "s/\s//g"` echo "writing output to $outfile" ./phantom tde.in >& $outfile You can then proceed to submit the job to the queue using:: qsub run.q Check the status using:: qstat -u $USER How to keep your job running for more than 48 hours ---------------------------------------------------- Often you will want to keep your calculation going for longer than the 48-hour maximum queue limit. To achieve this you can just submit another job with the same script that depends on completion of the previous job First find out the job id of the job you have already submitted:: $ qstat Job id Name User Time Use S Queue --------------------- ---------------- ---------------- -------- - ----- 18780261.gadi-pbs croc abc123 402:48:0 R normal-exec Then submit another job that depends on this one:: qsub -W depend=afterany:18780261 run.q The job will remain in the queue until the previous job completes. Then when the new job starts phantom will just carry on where it left off. A more sophisticated version of the above can be achieved by generating your PBS script with PBSRESUBMIT=yes:: make qscript INFILE=tde.in PBSRESUBMIT=yes > run.q you can check the details of this using:: cat run.q and submit your script using:: qsub -v NJOBS=10 run.q which will automagically submit 10 jobs to the queue, each depending on completion of the previous job. how to not annoy everybody else --------------------------------- Do not fill the disk quota! Use a mix of small and full dumps where possible and set dtmax to a reasonable value to avoid generating large numbers of unnecessary large files. For how to move the results of your calculations off gadi see :doc:`here ` how to use splash to make movies without your job getting killed ----------------------------------------------------------------- If you try to make a sequence of images using splash on the login node (e.g. by typing /png or file.png at the device prompt), your job will get killed due to the runtime limits:: Graphics device/type (? to see list, default /xw):/png ... Killed A simple workaround for this is to launch N instances of splash using a bash loop:: $ for x in dump_0*; do echo $x; done Then replace "echo $x" with the relevant splash command:: $ for x in dump_0*; do splash -r 6 -dev $x.png $x; done If you still get prompts that need answers you can follow the procedure `here `, or simply list the answers to the prompts in a file (here called answers.txt) and use:: $ for x in dump_0*; do splash -r 6 -dev $x.png $x < answers.txt; done this way each process is short and your movie-making can proceed without getting killed. more info ---------- See :doc:`general instructions for how to log in/out and copy files to/from a remote cluster ` For more information on the actual machine `read the userguide `__