What is Research Data Management?

The easiest way to illustrate what Research Data Management encompasses is to look at the headings and questions that the University of Virginia in the United States asks researchers to consider when planning for data management.

Their five major headings cover:

  1. Types of data
  2. Data and Metadata Standards
  3. Policies for access and sharing and provisions for appropriate protection/privacy
  4. Policies and provisions for re-use, re-distribution
  5. Plans for archiving and preservation of access

Each heading can be broken down further to spell out what Research Data Management should cover:

1. Types of data
Samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.

  1. What data will be generated in the research? (Give a short description, including the amount, if known, and the content of the data.)
  2. What data types will you be creating or capturing? (e.g. experimental measures, observational or qualitative, model simulation, processed, etc.)
  3. How will you capture or create the data?
  4. If you will be using existing data, state that fact and include where you got it. What is the relationship between the data you are collecting and the existing data?

2. Data and Metadata Standards
Standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).

  1. Which file formats will you use for your data, and why?
  2. What contextual details (metadata) are needed to make the data you capture or collect meaningful?
  3. How will you create or capture these details?
  4. What form will the metadata take?
  5. Which metadata standards will you use?
  6. Why have you chosen particular standards and approaches for metadata and contextual documentation? (e.g. recourse to staff expertise, Open Source, accepted domain-local standards, widespread usage)
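
To make the questions above concrete, here is a minimal sketch of a metadata record kept alongside a dataset. It is only an illustration: the field names are borrowed from the Dublin Core element set, the choice of which fields count as required is an assumption made for this example, and the helper `missing_fields` and all values in `record` are hypothetical placeholders.

```python
import json

# Field names follow the Dublin Core element set; which ones are
# "required" is an assumption for this sketch, not a rule of the standard.
REQUIRED_FIELDS = ("title", "creator", "date", "format", "rights")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Hypothetical dataset description; every value here is a placeholder.
record = {
    "title": "Example survey responses",
    "creator": "A. Researcher",
    "date": "2024-01-31",
    "format": "text/csv",
    "description": "Placeholder description of how the data were collected.",
    "rights": "",  # licence not yet decided, so the check below flags it
}

# List the contextual details that still need to be filled in.
print(missing_fields(record))

# Serialise the record as JSON so it can be stored next to the data file.
print(json.dumps(record, indent=2))
```

Storing the record in a plain, widely readable form such as JSON is one possible answer to "What form will the metadata take?"; a discipline-specific standard may be a better fit where one exists.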

3. Policies for access and sharing and provisions for appropriate protection/privacy

  1. How will you make the data available? (Resources needed: equipment, systems, expertise, etc.)
  2. When will you make the data available? (Give details of any embargo periods for political/commercial/patent reasons.)
  3. What is the process for gaining access to the data?
  4. Will access be chargeable?
  5. Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
  6. Provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements:
  • Are there ethical and privacy issues?
  • If so, how will these be resolved? (e.g. anonymisation of data, institutional ethical committees, formal consent agreements.)
  • What have you done to comply with your obligations in your IRB Protocol?
  • Is the dataset covered by copyright? If so, who owns the copyright and other intellectual property?
  • How will the dataset be licensed if rights exist? (e.g. any restrictions or delays on data sharing needed to protect intellectual property, copyright or patentable data.)

4. Policies and provisions for re-use, re-distribution

  1. Will any permission restrictions need to be placed on the data?
  2. Which bodies/groups are likely to be interested in the data?
  3. What are the intended or foreseeable uses of the data, and who are its likely users?
  4. Are there any reasons not to share or re-use data? (Suggestions: ethical, non-disclosure, etc.)

5. Plans for archiving and preservation of access
Plans for archiving data, samples, and other research products, and preservation of access to them.

  1. What is the long-term strategy for maintaining, curating and archiving the data?
  2. Which archive/repository/central database/data centre have you identified as a place to deposit data?
  3. What transformations will be necessary to prepare data for preservation/data sharing? (e.g. data cleaning/anonymisation where appropriate.)
  4. What metadata/documentation will be submitted alongside the data or created on deposit/transformation in order to make the data reusable?
  5. What related information will be deposited? (e.g. references, reports, research papers, fonts, the original bid proposal, etc.)
  6. How long will/should data be kept beyond the life of the project?
  7. What procedures does your intended long-term data storage facility have in place for preservation and backup?

*Adapted from: http://www2.lib.virginia.edu/brown/data/

"Research data management (RDM) is an umbrella term to describe ‘what researchers do with their data’. This includes activities such as storing, organising, documenting and sharing data, and also covers issues such as confidentiality and data protection." (Rachel Kane, University of Sheffield)