Bulk downloads
A complete set of MaveDB data and metadata is available as a bulk download hosted on Zenodo.
Note
The associated DOI for the most recent version of the archive is 10.5281/zenodo.11201736.
The archive will be updated twice yearly in May and November.
The archive contains a single JSON document called main.json that provides
the structured metadata for every experiment set,
experiment, and score set.
Score set data is provided in .csv format,
with separate score and count files for each record as appropriate.
Each file is named using the score set URN.
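As a rough illustration of the layout described above, the following Python sketch opens a downloaded copy of the archive, loads main.json, and lists the .csv files it contains. It rests on assumptions not stated here: that the archive is distributed as a ZIP file, and that main.json may sit at any directory level inside it. The function name `load_archive_index` is hypothetical, and the sketch makes no assumption about the internal structure of main.json or about the exact file naming pattern.

```python
import json
import zipfile
from pathlib import PurePosixPath


def load_archive_index(archive_path):
    """Return (metadata, csv_names) for a downloaded archive.

    Assumption: the archive is a ZIP file. main.json is located by
    file name, so its directory inside the archive does not matter.
    """
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
        # Find main.json without assuming where it sits in the archive.
        main_name = next(
            n for n in names if PurePosixPath(n).name == "main.json"
        )
        with archive.open(main_name) as handle:
            metadata = json.load(handle)
        # Collect every CSV member (score and count files alike).
        csv_names = sorted(n for n in names if n.lower().endswith(".csv"))
    return metadata, csv_names
```

Calling `load_archive_index("path/to/archive.zip")` with the path to a local copy returns the parsed metadata together with a sorted list of CSV member names, which can then be matched against score set URNs once the naming pattern has been checked in the actual archive.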
Users who want to download a large number of MaveDB datasets are strongly encouraged to use and cite these archival releases, particularly for machine learning or AI-based studies, where the data used must be clearly identified for reproducibility.
Datasets released using the CC0 public domain license are included in the archive, and this is the license applied to the archive itself. Datasets provided by MaveDB under other licenses are not currently included.