Metadata literally means ‘data about data’. In the context of research data management, metadata may refer to a highly structured, machine-readable subset of data documentation that can be indexed and stored within a database. The terms metadata and data documentation are often used interchangeably.
Metadata may be created manually, for example by filling in fields in a spreadsheet or database, or automatically, for example when recorded by an instrument. Manual metadata creation costs significantly more than automatic creation. Various tools are available for automating metadata capture – some workflow and lab notebook management tools are listed by the DCC.
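As an illustration of automatic capture, the sketch below collects basic technical metadata for a file without any manual input. It is a minimal example and does not represent any of the tools listed by the DCC; the function name `capture_file_metadata`, the field names and the sample file are assumptions made for illustration.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def capture_file_metadata(path):
    """Collect basic technical metadata for a file without manual input."""
    stat = os.stat(path)
    with open(path, "rb") as handle:
        checksum = hashlib.sha256(handle.read()).hexdigest()
    return {
        "file_name": os.path.basename(path),
        "size_bytes": stat.st_size,
        # Last-modified time as an ISO 8601 timestamp in UTC.
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "sha256": checksum,
    }


# Usage: write a small sample file (hypothetical content) and capture its metadata.
sample_content = b"time,temperature\n0,21.5\n"
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "readings.csv")
    with open(sample_path, "wb") as handle:
        handle.write(sample_content)
    sample_metadata = capture_file_metadata(sample_path)

print(sample_metadata)
```

Metadata captured this way covers only technical properties. Descriptive information, such as who created the data and why, still has to be supplied by a person or taken from another system.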
Metadata needs to be standardised to be useful – common aspects such as language, spelling and date formats need agreement. A metadata schema defines how metadata are structured, specifying the content, format and organisation of metadata elements. The schema specifies fields (or parameters) with standardised formats for their content (or value). Metadata schemas range from the generic, such as Dublin Core, through generic schemas for research data, such as the DataCite Metadata Schema and the Data Documentation Initiative (DDI), to those developed for specific disciplines and the sorts of data they produce. For more information on these, see the DCC page on Disciplinary Metadata.
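The idea of a schema that specifies fields and standardised value formats can be sketched in a few lines. The schema below is hypothetical and is not Dublin Core, DataCite or DDI; the field names (`title`, `date_collected`, `language`) and the chosen formats (ISO 8601 dates, two-letter language codes) are assumptions made for illustration.

```python
import re
from datetime import date


def is_non_empty_text(value):
    return isinstance(value, str) and value.strip() != ""


def is_iso_date(value):
    """Accept dates written as YYYY-MM-DD."""
    try:
        date.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def is_language_code(value):
    """Accept two-letter lowercase codes such as 'en'."""
    return isinstance(value, str) and re.fullmatch(r"[a-z]{2}", value) is not None


# Hypothetical schema: field name -> check that the value has the agreed format.
SCHEMA = {
    "title": is_non_empty_text,
    "date_collected": is_iso_date,
    "language": is_language_code,
}


def validate(record, schema=SCHEMA):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, check in schema.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not check(record[field]):
            problems.append(f"{field}: invalid value {record[field]!r}")
    return problems


good = {"title": "Soil samples", "date_collected": "2015-04-03", "language": "en"}
bad = {"title": "Soil samples", "date_collected": "03/04/2015"}

print(validate(good))
print(validate(bad))
```

The second record fails because its date does not follow the agreed format and the language field is absent, which is the kind of inconsistency a schema is meant to catch.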
Types of Metadata
Catalogue metadata
This is the information required to identify a dataset, to search for and discover a dataset in a repository catalogue, and to cite a dataset. It is the information that a data repository requires when a dataset is deposited.
The DataCite schema has five mandatory elements necessary for a citation:
Several optional elements:
All repositories require the mandatory metadata elements; many make description, subject, resource type and rights metadata mandatory too. Most will require additional information, in particular funder and grant code. Some repositories employ specialised metadata schemas appropriate for their subject area, which require input of discipline-specific information.
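A catalogue record and its use in a citation can be sketched as follows. The sketch assumes the five mandatory properties of DataCite Metadata Schema versions 2 and 3 (identifier, creator, title, publisher, publication year) and the citation pattern DataCite recommends for them; later schema versions differ, and individual repositories may require more. The record values, including the DOI, are placeholders, and the function names are hypothetical.

```python
# Assumption: the five mandatory properties of DataCite Metadata Schema v2/v3.
MANDATORY = ("identifier", "creator", "title", "publisher", "publication_year")


def missing_mandatory(record):
    """Return the mandatory elements that are absent or empty."""
    return [name for name in MANDATORY if not record.get(name)]


def format_citation(record):
    """Build a citation of the form: Creator (Year): Title. Publisher. Identifier"""
    missing = missing_mandatory(record)
    if missing:
        raise ValueError("missing mandatory elements: " + ", ".join(missing))
    return "{creator} ({publication_year}): {title}. {publisher}. {identifier}".format(**record)


example_record = {
    "identifier": "https://doi.org/10.1234/example",  # placeholder, not a real DOI
    "creator": "Smith, J.",
    "title": "Example temperature readings",
    "publisher": "Example Data Repository",
    "publication_year": 2015,
    # Additional information that many repositories ask for.
    "funder": "Example Funder",
    "grant_code": "EX/000000/1",
}

print(format_citation(example_record))
print(missing_mandatory({"title": "Untitled dataset"}))
```

Checking for missing elements before deposit gives an early warning that a record could not be cited as it stands.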
Discipline-specific / reuse metadata
Specialised schemas employed by some research data repositories include additional fields, controlled vocabularies and specialised ontologies, allowing a dataset record to provide enough information to make the data understandable and reusable. For more information on these, see the DCC page on Disciplinary Metadata.
Basic ‘catalogue’ metadata schemas, such as the DataCite schema, may be adequate for providing this richer metadata, as information about the methods used to create, collect and process the data may be included in the ‘description’ field of a ‘catalogue’ metadata schema. Alternatively, it may be more convenient to give detailed information in a separate ‘data document’ (which will also require its own catalogue metadata description). Such data documents may be associated with the dataset file by being included in the same file set or by being identified in the Related identifier / Reference field of the dataset metadata. For more information, see the page on Describing your data.
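Linking a dataset to a separate data document through a related identifier might look like the sketch below. The dictionary layout is a simplified illustration, not the DataCite XML or JSON format; `IsDocumentedBy` and `IsSupplementTo` are relation types defined by the DataCite schema, while the identifiers, field names and the helper `documentation_links` are hypothetical.

```python
dataset_record = {
    "title": "Example temperature readings",
    # Methods summarised briefly in the catalogue description field.
    "description": "Hourly readings from a single sensor; see the data document for methods.",
    "related_identifiers": [
        {
            "identifier": "https://doi.org/10.1234/example-methods",  # placeholder
            "identifier_type": "DOI",
            "relation_type": "IsDocumentedBy",
        },
        {
            "identifier": "https://doi.org/10.1234/example-article",  # placeholder
            "identifier_type": "DOI",
            "relation_type": "IsSupplementTo",
        },
    ],
}


def documentation_links(record):
    """Return identifiers of data documents that describe the dataset."""
    return [
        related["identifier"]
        for related in record.get("related_identifiers", [])
        if related.get("relation_type") == "IsDocumentedBy"
    ]


print(documentation_links(dataset_record))
```

Because the data document has its own identifier, it can also carry its own catalogue metadata record, as noted above.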
For more on metadata, see the ANDS Metadata Guide (working level).
For further information, please contact email@example.com