Reading and writing Phantom output files in HDF5 format
From Wikipedia:
Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data. Originally developed at the National Center for Supercomputing Applications, it is supported by The HDF Group, a non-profit corporation whose mission is to ensure continued development of HDF5 technologies and the continued accessibility of data stored in HDF.
HDF5 has the following useful features:

- It is widely available.
- It has bindings in C and Fortran.
- It has command line tools for reading data.
- It has Python packages to read data into NumPy arrays.
- It has built-in compression.
Compiling and installing HDF5 libraries
Note
On most supercomputing clusters HDF5 is available, e.g. Ozstar. The following instructions are for a machine you control where the HDF5 libraries are not yet available.
Using a package manager
On macOS you can install HDF5 with Homebrew.
brew install hdf5
The shared object library and include files are at /usr/local/opt/hdf5. Use this directory as HDF5_DIR (see below).
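To confirm the install prefix, which can differ between machines (e.g. /opt/homebrew/opt/hdf5 on Apple Silicon Macs), you can ask Homebrew directly:
brew --prefix hdf5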
On Ubuntu 18.04, for example, you can install HDF5 with apt.
sudo apt install libhdf5-serial-dev
The library is then located at /usr/lib/x86_64-linux-gnu/hdf5/serial. Use this directory as HDF5_DIR (see below).
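You can verify the library location with, for example:
ls /usr/lib/x86_64-linux-gnu/hdf5/serial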
Compiling
Phantom uses the HDF5 Fortran bindings. This means that you must compile Phantom with the same compiler that was used to build the HDF5 library on your machine. Typically this is GCC (gfortran), not Intel Fortran (ifort). If you want to compile Phantom with ifort, you must compile HDF5 yourself.
First download the source from https://www.hdfgroup.org/downloads/hdf5/source-code/. Then run the following commands; compilation may take a while.
# Set compilers to Intel.
export CC=icc
export F9X=ifort
export CXX=icpc
# Extract tar ball. I assume you downloaded the tar.gz file.
tar -zxvf hdf5-1.10.5.tar.gz
cd hdf5-1.10.5
# Configure and Make
./configure --prefix=/usr/local/hdf5 --enable-fortran --enable-cxx
make
# Run tests
make check
# Install to /usr/local/hdf5
sudo mkdir /usr/local/hdf5
# You need ownership over the directory to which you will install
# Replace <user> appropriately
sudo chown <user> /usr/local/hdf5
make install
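To check that the Fortran bindings were built and installed, you can query the h5fc compiler wrapper that HDF5 installs alongside the library (the path assumes the install prefix used above):
/usr/local/hdf5/bin/h5fc -showconfig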
Compiling Phantom
Writing HDF5 output is a compile-time option and requires access to the Fortran HDF5 library. To compile for HDF5 output, set HDF5_DIR. For example, if HDF5 was installed with Homebrew on macOS:
HDF5_DIR=/usr/local/opt/hdf5
or if it was installed with APT on Ubuntu:
HDF5_DIR=/usr/lib/x86_64-linux-gnu/hdf5/serial
Then compile with:
make HDF5=yes
The variable HDF5_DIR specifies the location of the HDF5 library.
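Putting this together, a full build command might look like the following; the SETUP value is illustrative and should match your simulation:
make SETUP=disc HDF5=yes HDF5_DIR=/usr/local/opt/hdf5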
Note
You may need to add to LD_LIBRARY_PATH (on Linux) or DYLD_LIBRARY_PATH (on macOS) to point to the HDF5 library location.
For example, on Linux with HDF5 compiled with ifort and installed to /usr/local/hdf5:
export LD_LIBRARY_PATH=/usr/local/hdf5/lib:$LD_LIBRARY_PATH
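The macOS equivalent, assuming HDF5 was installed with Homebrew as above:
export DYLD_LIBRARY_PATH=/usr/local/opt/hdf5/lib:$DYLD_LIBRARY_PATH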
Ozstar
On Ozstar you need to make sure that the OpenMPI and HDF5 modules are loaded. Once the HDF5 module is loaded, the variable HDF5_DIR gives the location of the HDF5 library:
module load iccifort/2018.1.163-gcc-6.4.0
module load openmpi/3.0.0
module load hdf5/1.10.1
Then, when you compile Phantom, ensure HDF5_DIR is set correctly:
make SYSTEM=ozstar HDF5=yes phantom setup
Note that you must have the HDF5 module loaded when running phantom, phantomsetup, etc., so make sure to put module load hdf5/1.10.1 in your Slurm job file.
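For example, a minimal Slurm job script might look like the following sketch; the job name, resource requests, and input file name (disc.in) are placeholders to adapt to your run:

#!/bin/bash
#SBATCH --job-name=phantom
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --time=24:00:00

# Load the same modules that were used to compile Phantom
module load iccifort/2018.1.163-gcc-6.4.0
module load openmpi/3.0.0
module load hdf5/1.10.1

# Match the OpenMP thread count to the allocation, then run
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./phantom disc.in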
Converting standard output files to HDF5 format with phantom2hdf5
phantom2hdf5 is a utility that converts standard Phantom dump files to HDF5 format.
You need to compile it with the same options that were used to compile the Phantom build that produced the original dump. For example, if you have a dusty disc dump to convert from a build originally compiled with
make SETUP=dustydisc MAXP=10000000
then you would compile phantom2hdf5 as follows:
make SETUP=dustydisc MAXP=10000000 HDF5=yes phantom2hdf5
Recall that you will need to set HDF5_DIR appropriately for your system.
Now pass a file (or a list of files) to the converter:
./phantom2hdf5 dump_00*
This returns an HDF5 version of each dump file:
$ ls
dump_00000 dump_00001 dump_00002 dump_00003
dump_00000.h5 dump_00001.h5 dump_00002.h5 dump_00003.h5
...
Reading Phantom HDF5 dump files in Python
You can now read the data from the dump files with the command-line tools that ship with HDF5 or with the Python package h5py.
Command line
To see all the available datasets:
h5ls -r dump_00000.h5
This produces output like:
/ Group
/header Group
/header/Bextx Dataset {SCALAR}
/header/Bexty Dataset {SCALAR}
/header/Bextz Dataset {SCALAR}
/header/C_cour Dataset {SCALAR}
/header/C_force Dataset {SCALAR}
/header/RK2 Dataset {SCALAR}
/header/alpha Dataset {SCALAR}
/header/alphaB Dataset {SCALAR}
/header/alphau Dataset {SCALAR}
/header/angtot_in Dataset {SCALAR}
/header/dtmax Dataset {SCALAR}
/header/dum Dataset {SCALAR}
/header/etot_in Dataset {SCALAR}
/header/fileident Dataset {SCALAR}
/header/gamma Dataset {SCALAR}
/header/get_conserv Dataset {SCALAR}
/header/graindens Dataset {2}
/header/grainsize Dataset {2}
/header/hfact Dataset {SCALAR}
/header/idust Dataset {SCALAR}
/header/ieos Dataset {SCALAR}
/header/iexternalforce Dataset {SCALAR}
/header/isink Dataset {SCALAR}
/header/majorv Dataset {SCALAR}
/header/massoftype Dataset {7}
/header/mdust_in Dataset {2}
/header/microv Dataset {SCALAR}
/header/minorv Dataset {SCALAR}
/header/nblocks Dataset {SCALAR}
/header/ndustlarge Dataset {SCALAR}
/header/ndustsmall Dataset {SCALAR}
/header/npartoftype Dataset {7}
/header/nparttot Dataset {SCALAR}
/header/nptmass Dataset {SCALAR}
/header/ntypes Dataset {SCALAR}
/header/polyk2 Dataset {SCALAR}
/header/qfacdisc Dataset {SCALAR}
/header/rhozero Dataset {SCALAR}
/header/time Dataset {SCALAR}
/header/tolh Dataset {SCALAR}
/header/totmom_in Dataset {SCALAR}
/header/udist Dataset {SCALAR}
/header/umagfd Dataset {SCALAR}
/header/umass Dataset {SCALAR}
/header/utime Dataset {SCALAR}
/header/xmax Dataset {SCALAR}
/header/xmin Dataset {SCALAR}
/header/ymax Dataset {SCALAR}
/header/ymin Dataset {SCALAR}
/header/zmax Dataset {SCALAR}
/header/zmin Dataset {SCALAR}
/particles Group
/particles/divv Dataset {10250000}
/particles/dt Dataset {10250000}
/particles/h Dataset {10250000}
/particles/itype Dataset {10250000}
/particles/pressure Dataset {10250000}
/particles/vxyz Dataset {10250000, 3}
/particles/xyz Dataset {10250000, 3}
/sinks Group
/sinks/h Dataset {4}
/sinks/hsoft Dataset {4}
/sinks/m Dataset {4}
/sinks/maccreted Dataset {4}
/sinks/spinxyz Dataset {4, 3}
/sinks/tlast Dataset {4}
/sinks/vxyz Dataset {4, 3}
/sinks/xyz Dataset {4, 3}
You can read a particular dataset with h5dump:
h5dump -d "/header/npartoftype" dump_00000.h5
This produces output like:
HDF5 "dump_00000.h5" {
DATASET "/header/npartoftype" {
DATATYPE H5T_STD_I32LE
DATASPACE SIMPLE { ( 7 ) / ( 7 ) }
DATA {
(0): 10000000, 250000, 0, 0, 0, 0, 0
}
}
}
Python with h5py
The Python package h5py comes with Anaconda. Alternatively, you can install it with pip or conda:
conda install h5py
To read a dump file:
>>> import h5py
>>> f = h5py.File('dump_00000.h5', 'r')
Then you can access datasets, which are returned as NumPy arrays:
>>> f['particles/xyz'][:]
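For a fuller sketch, here is how you might read header values and particle arrays into NumPy, using the dataset names from the h5ls output above (the file name is a placeholder):

import h5py

with h5py.File('dump_00000.h5', 'r') as f:
    time = f['header/time'][()]               # scalar dataset
    npartoftype = f['header/npartoftype'][:]  # small integer array
    xyz = f['particles/xyz'][:]               # (N, 3) array of positions
    h = f['particles/h'][:]                   # smoothing lengths

print(time, npartoftype.sum(), xyz.shape)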
Plonk
Plonk is a Python package for analysis and visualisation of SPH data in HDF5 format. Plonk is open source and available at https://github.com/dmentipl/plonk.
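As a minimal sketch of reading a dump with Plonk (assuming a recent Plonk release; check the Plonk documentation for the current API):

import plonk

# Load a Phantom HDF5 snapshot; the file name is a placeholder
snap = plonk.load_snap('dump_00000.h5')
print(snap['position'])  # particle positions, with units attached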