Metadata standards

When writing metadata, one should follow established standards so that each piece of information can be read unambiguously. Which standard to use depends on the kind of information being recorded.

Dates

For dates, one should use the ISO 8601 string format, written either as YYYY-MM-DD or as YYYYMMDD. This format has a further advantage: when the date is placed at the start of a file name, sorting the files by name also sorts them by date (oldest first).
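
The sorting property can be checked with a short Python sketch. The file labels and the helper name `dated_name` are hypothetical and chosen only for illustration; the date handling relies on the standard `datetime` module.

```python
from datetime import date


def dated_name(day: date, label: str) -> str:
    """Prefix a file label with an ISO 8601 calendar date (YYYY-MM-DD)."""
    return f"{day.isoformat()}_{label}"


# Hypothetical file labels, deliberately listed out of chronological order.
names = [
    dated_name(date(2023, 11, 5), "calibration.txt"),
    dated_name(date(2021, 3, 17), "run_notes.txt"),
    dated_name(date(2023, 2, 28), "analysis.txt"),
]

# A plain sort by name is also a sort by date, oldest first.
sorted_names = sorted(names)
for name in sorted_names:
    print(name)

# The compact form YYYYMMDD can be produced with strftime.
compact = date(2021, 3, 17).strftime("%Y%m%d")
print(compact)
```

The property holds because the most significant field (the year) comes first and every field is zero-padded to a fixed width.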

File type

For file types and formats, the MIME scheme is a good choice. It usually does not include science-specific types (such as .root or .fast), but it covers text, images, PDFs, and generic binary data.
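
As an illustration, Python's standard `mimetypes` module can guess a MIME type from a file name. The sketch below assumes that unrecognized extensions should fall back to the generic binary type `application/octet-stream`; the helper name `mime_type_for` and the file names are hypothetical.

```python
import mimetypes

# Assumption: anything the module cannot identify is recorded as generic binary.
GENERIC_BINARY = "application/octet-stream"


def mime_type_for(filename: str) -> str:
    """Return the guessed MIME type of a file name, or the generic binary type."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed if guessed is not None else GENERIC_BINARY


for name in ["notes.txt", "figure.png", "paper.pdf"]:
    print(name, "->", mime_type_for(name))
```

The guess is based on the file extension only, not on the file content, so a science-specific extension that the module does not know will typically end up as generic binary.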

Language

To indicate the language of a document, the ISO 639-1 scheme is a good option. It is simple, and in most cases one will use en (English).

Places

The GeoNames website offers references to many places, each with a permalink (for example, Strasbourg), and can also reference coordinates (48.584/7.746). This is a convenient way to point to a specific place on a map.

Laboratories and institutions

The Research Organization Registry (ROR) provides ROR IDs, DOI-like identifiers for laboratories, universities, … (for example: IPHC).

Researchers

For researchers, the Open Researcher and Contributor ID (ORCID) is the standard way to identify a person across platforms (example: Greg Henning).
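
When an ORCID identifier is typed into metadata by hand, its final character can be used to catch typos: it is a check digit computed from the first 15 digits with the ISO 7064 MOD 11-2 algorithm. The following is a minimal sketch of that check, not an official implementation. The function names are hypothetical, and the sample identifier is the fictitious example record used in ORCID's own documentation.

```python
import re


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and the check digit of an identifier such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if re.fullmatch(r"[0-9]{15}[0-9X]", compact) is None:
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))
print(is_valid_orcid("0000-0002-1825-0098"))
```

A passing check only shows that the identifier is well formed; it does not confirm that the record exists or belongs to the intended person.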

Research Topics

For research topics, one can refer to the old-style Physics and Astronomy Classification Scheme (PACS) numbers, but these should be replaced by the newer Physics Subject Headings (PhySH). For more general topics, the Library of Congress Subject Headings (LCSH) cover a wider range of subjects. Wikidata and the Bibliothèque nationale de France also provide subject ID schemes, but their lists of headings are not as easy to search.

Researcher contributions

To indicate the roles and contributions of collaborators in a project, several schemes exist. The DataCite scheme includes, for its Contributor fields, several contributorType values, with an emphasis on project management. An alternative scheme, aimed at publications (of either papers or data), is the Contributor Roles Taxonomy (CRediT), which defines 14 roles for collaborators.
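
As a sketch of how the Contributor Roles Taxonomy could be used in practice, the following Python snippet lists its 14 role names and flags any role in a contribution record that is not part of the taxonomy. The record layout, the contributor names, and the helper `unknown_roles` are assumptions made for illustration; they are not prescribed by the taxonomy or by DataCite.

```python
# The 14 roles of the Contributor Roles Taxonomy.
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})


def unknown_roles(contributions: dict) -> dict:
    """Map each contributor to the listed roles that are not in the taxonomy."""
    problems = {}
    for person, roles in contributions.items():
        bad = [role for role in roles if role not in CREDIT_ROLES]
        if bad:
            problems[person] = bad
    return problems


# Hypothetical contribution record: contributor name -> list of roles.
contributions = {
    "A. Researcher": ["Conceptualization", "Supervision"],
    "B. Student": ["Software", "Data analysis"],
}
print(unknown_roles(contributions))
```

Keeping the role names in a fixed vocabulary like this makes contribution metadata comparable between projects.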