Publishing the data from your phantom calculations
Recommended best practice for open science is to make the parameter files, initial conditions and snapshots from your phantom calculations publicly available when the corresponding paper is published.
FAIR Principles
According to the FAIR principles for scientific data management, your data should be:
Findable: e.g. with links to and from the paper publishing the simulations
Accessible: available for free in a publicly accessible repository
Interoperable: data are labelled and can be reused or converted
Reusable: enough information is included to reproduce the simulations
Data curation
For calculations with phantom that have been published in a paper, ideal practice is to upload the entire calculation, including .in and .setup files, .ev files and all dump files, to a public repository.
See for example a dataset from Mentiplay et al. (2020) using figshare: https://doi.org/10.6084/m9.figshare.11595369.v1
Or this example from Wurster, Bate & Price (2018) in the University of Exeter repository: https://doi.org/10.24378/exe.607
However, size limitations may restrict preservation of all data, in which case we recommend saving:
.in files
.setup files
.ev files
dump files used to create figures in your paper, with a link to splash or sarracen in the metadata explaining how to read/convert these files (see the sketch after this list)
dump files containing initial conditions, if these are non-trivial
metadata including a link to your publication or arXiv preprint, a link to the phantom code, code version information, and labelling that matches the archived data to the simulations listed in your paper
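As a sketch of how this might look in practice (the file names disc.in, disc.setup, disc01.ev and disc_00000 are hypothetical, and the exact splash syntax may differ between versions, so check splash --help), you could convert a dump file to ascii for interoperability, record the code version, and bundle the recommended files for upload:

# convert a dump file to ascii so others can read it without phantom
$ splash to ascii disc_00000
# record which version of phantom produced the data
$ git -C $HOME/phantom describe --tags > code_version.txt
# bundle the recommended files into a single archive for upload
$ tar czf disc-test1-data.tar.gz disc.in disc.setup disc01.ev disc_00000 code_version.txt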
Zenodo community
To facilitate better data sharing between phantom users, we have set up a Zenodo community. Please join this community and let’s learn from each other to create best-practice data curation. Zenodo currently has a 50 GB limit on the size of each upload, which is sufficient for the recommended list of files above.
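Uploads can be made through the Zenodo web interface, or scripted via the Zenodo REST API if you have many files to deposit. As a minimal sketch (assuming you have created a personal access token on zenodo.org and exported it as ZENODO_TOKEN, a variable name chosen here for illustration), the following creates an empty deposition to which files can then be attached:

$ curl -X POST "https://zenodo.org/api/deposit/depositions" \
    -H "Authorization: Bearer $ZENODO_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{}'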
Archiving your data to Google Drive using rclone
You can use rclone to copy data from a remote cluster or supercomputing facility to Google Drive. This is not recommended as a long-term storage solution, but can facilitate short-term data sharing between users.
Set this up by logging into your supercomputer and typing:
$ rclone config
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> dan-google-drive
Storage> drive
Google Application Client Id
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
client_id> (enter client id you got from the instructions)
client_secret> (enter client secret you got from the instructions)
scope> drive
root_folder_id> (leave this blank)
Edit advanced config? (y/n) n
Use auto config?
y/n> n (answer n, since you are logged into a remote machine without a browser)
Please go to the following link: https://accounts.google.com/o/oauth2/auth?access_type=offline&client_id=...
(open this link in a browser on your local machine, approve access, then paste the verification code back into the terminal)
Configure this as a team drive?
y/n> n
Check that the above was successful by listing the files on your remote drive:
$ rclone ls dan-google-drive:
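Two related commands can help here: rclone lsd lists only the top-level directories, and rclone about reports how much of your Drive quota is in use:

$ rclone lsd dan-google-drive:
$ rclone about dan-google-drive: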
To COPY files to your Google Drive, LEAVING a copy on the local machine, use:
$ rclone copy local_path remote_path
For example:
$ rclone copy $HOME/runs/phantom/disc-test1 dan-google-drive:phantom/disc-test1
To MOVE files to your Google Drive and DELETE them from the cluster (e.g. to clear disc space), use:
$ rclone move $HOME/runs/phantom/disc-test1 dan-google-drive:phantom/disc-test1
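Since move deletes the local copies once transferred, a safer pattern is to copy first, verify that source and destination match (rclone check compares file sizes and checksums where the remote supports them), and only then remove the local files:

$ rclone copy $HOME/runs/phantom/disc-test1 dan-google-drive:phantom/disc-test1
$ rclone check $HOME/runs/phantom/disc-test1 dan-google-drive:phantom/disc-test1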
To SYNC an entire directory tree onto your Google Drive, DELETING any files on the Drive that are NOT also present locally, use (the -i flag asks for confirmation before each change):
$ rclone sync -i $HOME/runs dan-google-drive:runs
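Because sync can delete files on the destination, it is worth previewing what it would do first. The --dry-run flag reports the changes without making any:

$ rclone sync --dry-run $HOME/runs dan-google-drive:runs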