This repository contains the output of bioCADDIE WG3 (Descriptive Metadata for Datasets), which defines the DatA Tag Suite (DATS) model, including the different versions of the DATS specification. The presentations and notes from the WG3 activities can be found at this website.
The (work in progress) documentation about DATS can be found at readthedocs.
The material in this repository is distributed under the CC BY-SA 3.0 license.
Update
DATS is currently being used and further refined under the new phase of the NIH Data Commons programme, and it has its own DATS GitHub organization. For the latest version, refer to this new GitHub organization.
- Xiaoling Chen, Anupama E Gururaj, Burak Ozyurt, Ruiling Liu, Ergin Soysal, Trevor Cohen, Firat Tiryaki, Yueling Li, Nansu Zong, Min Jiang, Deevakar Rogith, Mandana Salimi, Hyeon-eui Kim, Philippe Rocca-Serra, Alejandra Gonzalez-Beltran, Claudiu Farcas, Todd Johnson, Ron Margolis, George Alter, Susanna-Assunta Sansone, Ian M Fore, Lucila Ohno-Machado, Jeffrey S Grethe, Hua Xu; DataMed – an open source discovery index for finding biomedical datasets, Journal of the American Medical Informatics Association, Volume 25, Issue 3, 1 March 2018, Pages 300–308, https://doi.org/10.1093/jamia/ocx121
- Alejandra N Gonzalez-Beltran, John Campbell, Patrick Dunn, Diana Guijarro, Sanda Ionescu, Hyeoneui Kim, Jared Lyle, Jeffrey Wiser, Susanna-Assunta Sansone, Philippe Rocca-Serra; Data discovery with DATS: exemplar adoptions and lessons learned, Journal of the American Medical Informatics Association, Volume 25, Issue 1, 1 January 2018, Pages 13–16, https://doi.org/10.1093/jamia/ocx119
- Lucila Ohno-Machado, Susanna-Assunta Sansone, George Alter, Ian Fore, Jeffrey Grethe, Hua Xu, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Anupama E Gururaj, Elizabeth Bell, Ergin Soysal, Nansu Zong & Hyeon-eui Kim; Finding useful data across multiple biomedical data repositories using DataMed; Nature Genetics 49, 816–819 (2017) https://doi.org/10.1038/ng.3864
- Susanna-Assunta Sansone#, Alejandra Gonzalez-Beltran#, Philippe Rocca-Serra#, George Alter, Jeffrey S. Grethe, Hua Xu, Ian M. Fore, Jared Lyle, Anupama E. Gururaj, Xiaoling Chen, Hyeon-eui Kim, Nansu Zong, Yueling Li, Ruiling Liu, I. Burak Ozyurt & Lucila Ohno-Machado; DATS, the data tag suite to enable discoverability of datasets; Scientific Data volume 4, Article number: 170059 (2017) DOI: 10.1038/sdata.2017.59 #=equal contribution; Pre-print: bioRxiv DOI: 10.1101/103143 .
All DATS releases can be accessed in Zenodo through the DATS Community. In addition, links to individual releases can be found below.
The document provides links to the different appendix files.
- [Metadata Specification - version 1.1 - Draft open for comments (doc)](https://github.com/biocaddie/WG3-MetadataSpecifications/blob/master/doc/v1.1/NIH-BS2K-bioCADDIE-WG3-MetadataElements-Specification-v1.1.docx)
- Metadata Specification - version 1.0 (PDF file)
- Appendix 1 - Metadata Mapping File v1 (Spreadsheet)
- Appendix 2 - Metadata Elements File v1 (Spreadsheet)
The Python code included in this repository validates the DATS JSON schemas, and validates the DATS JSON instances against those schemas. We recommend running the code in a virtual environment, following these steps:
- If not already installed on your system, first install the virtual environment tool via `pip`: `pip install virtualenv`
- Create a virtual environment: `virtualenv venv`
- Then, activate the virtual environment: `source venv/bin/activate`
- Install the requirements: `pip install -r requirements.txt`
- Finally, you can inspect and run the tests to validate the DATS schemas and JSON instances against the schemas: `python setup.py test`
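To give a rough idea of what validating an instance against a schema means, here is a minimal, standard-library-only sketch. It is not the repository's validation code, and it covers only the `type`, `required`, and `properties` keywords of JSON Schema. The `check_instance` helper, the schema, and the instances are all hypothetical and not taken from the DATS specification.

```python
import json

# Map JSON Schema primitive type names to Python types.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_instance(instance, schema):
    """Return a list of error messages; an empty list means the checks passed.

    Hypothetical helper: only `type`, `required`, and `properties` are handled.
    """
    errors = []
    expected = schema.get("type")
    if expected in JSON_TYPES:
        ok = isinstance(instance, JSON_TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(instance, bool):
            ok = False
        if not ok:
            errors.append(f"expected {expected}, got {type(instance).__name__}")
            return errors
    if isinstance(instance, dict):
        # Every property listed under `required` must be present.
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"missing required property '{name}'")
        # Recurse into the properties that are present and have a subschema.
        for name, subschema in schema.get("properties", {}).items():
            if name in instance:
                errors.extend(
                    f"{name}: {e}" for e in check_instance(instance[name], subschema)
                )
    return errors


# Illustrative schema and instances (assumptions, not DATS content).
schema = json.loads("""
{
  "type": "object",
  "required": ["title", "types"],
  "properties": {
    "title": {"type": "string"},
    "types": {"type": "array"}
  }
}
""")

valid = {"title": "Example dataset", "types": []}
invalid = {"title": 42}

print(check_instance(valid, schema))
print(check_instance(invalid, schema))
```

The first call prints an empty list. The second reports the missing `types` property and the wrongly typed `title`. A full JSON Schema validator handles many more keywords (references, formats, enumerations), so treat this only as an illustration of the idea.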