Metadata standards
When writing metadata, one should use predefined standards, chosen according to the kind of information to record, so that each piece of information can be read correctly.
- Dates
For dates, one should use the ISO 8601 string format YYYY-MM-DD or YYYYMMDD. This format has a further advantage: when it is used at the start of a file name, sorting the files by name also sorts them by date (oldest first).
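The sorting property can be checked with a short Python sketch. The file names and the `dated_name` helper are made up for illustration; only the standard `datetime` module is used.

```python
from datetime import date


def dated_name(day: date, label: str) -> str:
    """Prefix a file name with an ISO 8601 calendar date (YYYY-MM-DD)."""
    return f"{day.isoformat()}_{label}"


# Hypothetical files, deliberately listed out of chronological order.
names = [
    dated_name(date(2024, 3, 9), "calibration.txt"),
    dated_name(date(2023, 11, 21), "run_log.txt"),
    dated_name(date(2024, 1, 15), "analysis.txt"),
]

# Plain alphabetical sorting also gives chronological order (oldest first).
for name in sorted(names):
    print(name)

# The compact variant YYYYMMDD sorts the same way.
compact = date(2024, 3, 9).strftime("%Y%m%d")
print(compact)
```

This works because the fields run from most to least significant (year, month, day) and are zero-padded to a fixed width.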
- File type
For file type and format, the MIME scheme is a great choice. It will not usually include science-specific types (such as .root or .fast), but text, images, PDFs, and generic binary files will be covered.
- Language
When indicating the language of a document, following the ISO 639-1 scheme is a good option. It is simple, and one will most generally use en (English).
- Places
The website GeoNames offers references to many places, with permalinks to them (for example, Strasbourg), and can also reference coordinates (48.584/7.746). This is a nice way to reference a specific place on a map.
- Laboratories and institutions
The Research Organization Registry provides ROR IDs, DOI-like references for laboratories, universities, … (for example: IPHC).
- Researchers
For researchers, the Open Researcher and Contributor ID (ORCID) is the standard way to identify a researcher across platforms (example: Greg Henning).
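An ORCID iD is a 16-character identifier whose last character is a checksum (ISO 7064 MOD 11-2), so a recorded iD can be sanity-checked before it is stored in metadata. The sketch below is one possible way to do this in Python; the function names are illustrative, and the iD used is the sample identifier from ORCID's own documentation, not a researcher mentioned in this document.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape (16 characters, hyphens ignored) and the checksum."""
    compact = orcid.replace("-", "")
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == compact[15].upper()


print(is_valid_orcid("0000-0002-1825-0097"))  # sample iD from the ORCID documentation
print(is_valid_orcid("0000-0002-1825-0098"))  # same iD with a wrong check digit
```

A passing check only shows that the iD is well formed; it does not confirm that the iD is registered or that it belongs to the intended person.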
- Research Topics
For research topics, one can refer to the old-style Physics and Astronomy Classification Scheme (PACS) numbers, but these should be replaced by the newer Physics Subject Headings (PhySH). For more general topics, the Library of Congress Subject Headings (LCSH) cover a broader range of subjects. Wikidata and the Bibliothèque nationale de France also provide subject ID schemes, but their lists of headings are not as easy to search.
- Researcher contributions
To indicate the roles and contributions of collaborators in a project, several schemes exist. The DataCite scheme includes, for the Contributor fields, several contributorType values, with an emphasis on project management. An alternative scheme, aimed at publications (of either papers or data), is the Contributor Roles Taxonomy (CRediT), with 14 roles for collaborators.
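To illustrate how the file-level standards from this section (ISO 8601 dates, MIME types, ISO 639-1 language codes) might be combined, here is a minimal Python sketch. The `describe_file` function, its field names, and the example file name are assumptions made for illustration and do not follow any particular metadata schema. The language check only verifies the two-letter form, not that the code is actually assigned.

```python
import mimetypes
from datetime import date


def describe_file(filename: str, created: date, language: str) -> dict:
    """Build a small metadata record (hypothetical field names)."""
    # ISO 639-1 codes are two lowercase letters; this checks the form only.
    if not (len(language) == 2 and language.isascii()
            and language.isalpha() and language.islower()):
        raise ValueError(f"not a two-letter language code: {language!r}")

    # Guess the MIME type from the file extension. Unknown extensions
    # fall back to the generic binary type.
    mime_type, _encoding = mimetypes.guess_type(filename)

    return {
        "filename": filename,
        "date": created.isoformat(),  # ISO 8601, YYYY-MM-DD
        "format": mime_type or "application/octet-stream",
        "language": language,
    }


record = describe_file("2024-03-09_report.pdf", date(2024, 3, 9), "en")
print(record)
```

Falling back to `application/octet-stream` matches the remark above that science-specific formats are usually missing from the MIME scheme while generic binary is covered.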