Converting Phantom output files to HDF5 format =============================================== From `Wikipedia `__: Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data. Originally developed at the National Center for Supercomputing Applications, it is supported by The HDF Group, a non-profit corporation whose mission is to ensure continued development of HDF5 technologies and the continued accessibility of data stored in HDF. HDF5 has the following nice features. - It is widely available. - It has bindings in C and Fortran. - It has command line tools for reading data. - It has Python packages to read data into NumPy arrays. - It has compression built-in. Compiling and installing HDF5 libraries --------------------------------------- .. note:: On most supercomputing clusters HDF5 is available, e.g. Ozstar. The following instructions are for a machine you control where the HDF5 libraries are not yet available. Using a package manager ~~~~~~~~~~~~~~~~~~~~~~~ On macOS you can install HDF5 with Homebrew. :: brew install hdf5 The shared object library and include files are at ``/usr/local/opt/hdf5``. Use this directory as ``HDF5_DIR`` (see below). On Ubuntu 18.04, for example, you can install HDF5 with apt. :: sudo apt install libhdf5-serial-dev The location of the library is then ``/usr/lib/x86_64-linux-gnu/hdf5/serial``. Use this directory as ``HDF5_DIR`` (see below). Compiling ~~~~~~~~~ We use the Fortran bindings in phantom2hdf5. This means that you must use the same compiler to compile phantom2hdf5 as was used for the HDF5 library on your machine. Typically this is GCC (gfortran) *not* Intel Fortran (ifort). If you want to compile Phantom with ifort you must compile HDF5 yourself. First download the source from https://www.hdfgroup.org/downloads/hdf5/source-code/. Then run the following code. It may take a while. :: # Set compilers to Intel. export CC=icc export F9X=ifort export CXX=icpc # Extract tar ball. I assume you downloaded the tar.gz file. tar -zxvf hdf5-1.10.5.tar.gz cd hdf5-1.10.5 # Configure and Make ./configure --prefix=/usr/local/hdf5 --enable-fortran --enable-cxx make # Run tests make check # Install to /usr/local/hdf5 sudo mkdir /usr/local/hdf5 # You need ownership over the directory to which you will install # Replace appropriately sudo chown /usr/local/hdf5 make install Converting standard output files to HDF5 format with phantom2hdf5 ----------------------------------------------------------------- ``phantom2hdf5`` is a utility that can convert standard Phantom dump files to HDF5 format. Writing HDF5 output requires access to the Fortran HDF5 library. To compile for HDF5 output set ``HDF5_DIR``, for example if HDF5 was installed with Homebrew on macOS:: HDF5_DIR=/usr/local/opt/hdf5 or if it was installed with APT on Ubuntu:: HDF5_DIR=/usr/lib/x86_64-linux-gnu/hdf5/serial Then compile with:: make phantom2hdf5HDF5=yes The variable ``HDF5_DIR`` specifies the location of the HDF5 library. .. note:: You **may** need to add to ``LD_LIBRARY_PATH`` (in Linux) or ``DYLD_LIBRARY_PATH`` (in macOS) to point to the HDF5 library location. For example, on Linux with HDF5 compiled with ifort and installed to ``/usr/local/hdf5`` ``export LD_LIBRARY_PATH=/usr/local/hdf5/lib:$LD_LIBRARY_PATH`` Ozstar ~~~~~~ On Ozstar you need to make sure that the OpenMPI and HDF5 modules are loaded. The variable ``HDF5_DIR`` gives the location of the HDF5 library once the HDF5 module is loaded:: module load iccifort/2018.1.163-gcc-6.4.0 module load openmpi/3.0.0 module load hdf5/1.10.1 Then when you compile Phantom2hdf5 ensure ``HDF5_DIR`` is set correctly:: make SYSTEM=ozstar HDF5=yes phantom2hdf5 Note that you must have the HDF5 module loaded when running phantom2hdf5. So make sure to put ``module load hdf5/1.10.1`` in your Slurm job file if performing the conversion as part of your job script. Using phantom2hdf5 ~~~~~~~~~~~~~~~~~~ Now pass a file (or a list of files) to the converter :: ./phantom2hdf5 dump_00* Which returns an HDF5 version of each dumpfile :: $ ls dump_00000 dump_00001 dump_00002 dump_00003 dump_00000.h5 dump_00001.h5 dump_00002.h5 dump_00003.h5 ... Reading Phantom HDF5 dump files in Python ----------------------------------------- You can now read the data from the dump file with the command line tools available with HDF5 or with the Python package h5py. Command line ~~~~~~~~~~~~ To see all the available datasets: :: h5ls -r dump_00000.h5 This produces output like :: / Group /header Group /header/Bextx Dataset {SCALAR} /header/Bexty Dataset {SCALAR} /header/Bextz Dataset {SCALAR} /header/C_cour Dataset {SCALAR} /header/C_force Dataset {SCALAR} /header/RK2 Dataset {SCALAR} /header/alpha Dataset {SCALAR} /header/alphaB Dataset {SCALAR} /header/alphau Dataset {SCALAR} /header/angtot_in Dataset {SCALAR} /header/dtmax Dataset {SCALAR} /header/dum Dataset {SCALAR} /header/etot_in Dataset {SCALAR} /header/fileident Dataset {SCALAR} /header/gamma Dataset {SCALAR} /header/get_conserv Dataset {SCALAR} /header/graindens Dataset {2} /header/grainsize Dataset {2} /header/hfact Dataset {SCALAR} /header/idust Dataset {SCALAR} /header/ieos Dataset {SCALAR} /header/iexternalforce Dataset {SCALAR} /header/isink Dataset {SCALAR} /header/majorv Dataset {SCALAR} /header/massoftype Dataset {7} /header/mdust_in Dataset {2} /header/microv Dataset {SCALAR} /header/minorv Dataset {SCALAR} /header/nblocks Dataset {SCALAR} /header/ndustlarge Dataset {SCALAR} /header/ndustsmall Dataset {SCALAR} /header/npartoftype Dataset {7} /header/nparttot Dataset {SCALAR} /header/nptmass Dataset {SCALAR} /header/ntypes Dataset {SCALAR} /header/polyk2 Dataset {SCALAR} /header/qfacdisc Dataset {SCALAR} /header/rhozero Dataset {SCALAR} /header/time Dataset {SCALAR} /header/tolh Dataset {SCALAR} /header/totmom_in Dataset {SCALAR} /header/udist Dataset {SCALAR} /header/umagfd Dataset {SCALAR} /header/umass Dataset {SCALAR} /header/utime Dataset {SCALAR} /header/xmax Dataset {SCALAR} /header/xmin Dataset {SCALAR} /header/ymax Dataset {SCALAR} /header/ymin Dataset {SCALAR} /header/zmax Dataset {SCALAR} /header/zmin Dataset {SCALAR} /particles Group /particles/divv Dataset {10250000} /particles/dt Dataset {10250000} /particles/h Dataset {10250000} /particles/itype Dataset {10250000} /particles/pressure Dataset {10250000} /particles/vxyz Dataset {10250000, 3} /particles/xyz Dataset {10250000, 3} /sinks Group /sinks/h Dataset {4} /sinks/hsoft Dataset {4} /sinks/m Dataset {4} /sinks/maccreted Dataset {4} /sinks/spinxyz Dataset {4, 3} /sinks/tlast Dataset {4} /sinks/vxyz Dataset {4, 3} /sinks/xyz Dataset {4, 3} You can access a particular value like :: h5dump -d "/header/npartoftype" dump_00000.h5 This produces output like :: HDF5 "dump_00000.h5" { DATASET "/header/npartoftype" { DATATYPE H5T_STD_I32LE DATASPACE SIMPLE { ( 7 ) / ( 7 ) } DATA { (0): 10000000, 250000, 0, 0, 0, 0, 0 } } } Python with h5py ~~~~~~~~~~~~~~~~ The Python package h5py comes with Anaconda. Alternatively you can install it with pip or Conda. :: conda install h5py To read a dump file :: >>> import h5py >>> f = h5py.File('dump_00000.h5') Then you can access datasets like :: >>> f['particles/xyz'][:] Plonk ~~~~~ Plonk is a Python package for analysis and visualisation of SPH data in hdf5 format. Plonk is open source and available at https://github.com/dmentipl/plonk.