Describe your data

Identify the data you plan to produce

A first step in developing a data management plan is to determine what data will be collected or generated during your research. Research data are digital information, structured by a formal methodology, that is created for the purpose of producing new research or scholarship and that is commonly accepted in the scientific community as necessary to validate research findings (see also the Office of Management and Budget definition below).

Four general categories of data are:

  • Observational data (e.g. sensor readings, survey instruments)
  • Experimental data (e.g. lab equipment readings)
  • Simulation data (e.g. climate models)
  • Derived or compiled data (e.g. compiled databases, text or data mining)

Specific examples of research data include:

  • Digital texts or digital copies of text
  • Spreadsheets
  • Audio, video
  • Computer Aided Design (CAD)
  • Waveforms
  • Statistics (SPSS, SAS)
  • Databases
  • Geographic Information Systems (GIS) and spatial data
  • Digital copies of images
  • Matlab files
  • Computer code
  • Protein or genetic sequences
  • Artistic products
  • Web files

Questions to consider when writing your data management plan:

  • What types of data will you be creating or capturing (e.g. experimental measures, qualitative, raw, processed), and how do these data fit the needs of your research plan?
  • How will you create or capture the data? Your answer should cover content selection, instrumentation, the technologies and approaches chosen, methods for naming, versioning, etc., and should be sensitive to the location in which data capture takes place.
  • Do you plan to use existing data, and what is the relationship between these existing data and the data you are generating yourself?
  • How much data do you expect to collect?
  • Will you collect any data of a sensitive nature?
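
Two of the questions above, how files will be named and versioned and how much data you expect to collect, can often be answered with a simple convention and a rough calculation. The Python sketch below is only an illustration: the function names, the naming pattern (project, instrument, date, version), and all figures are hypothetical placeholders, not a recommended standard.

```python
from datetime import date


def build_filename(project, instrument, collected_on, version, ext):
    """Compose a file name from project, instrument, ISO date, and a
    zero-padded version number (hypothetical naming pattern)."""
    return f"{project}_{instrument}_{collected_on.isoformat()}_v{version:02d}.{ext}"


def estimate_volume_gb(files_per_day, avg_file_mb, collection_days, copies=1):
    """Rough storage estimate in gigabytes, taking 1 GB = 1000 MB.
    'copies' counts every retained copy, including backups."""
    total_mb = files_per_day * avg_file_mb * collection_days * copies
    return total_mb / 1000


# Hypothetical project: 200 sensor files per day at about 5 MB each,
# collected for 90 days, kept as one working copy plus two backups.
print(build_filename("soilstudy", "sensorA", date(2024, 3, 15), 2, "csv"))
print(estimate_volume_gb(200, 5, 90, copies=3))
```

Even an approximate figure like this can help when describing storage and backup needs in the plan; the real inputs depend on your instruments and retention choices.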

The U.S. Federal Government's Office of Management and Budget Circular A-110 (36.d.2.i Property Standards; Intangible property; definition) states:

Research data is defined as the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues. This "recorded" material excludes physical objects (e.g., laboratory samples). Research data also do not include:

  • Trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law; and
  • Personnel and medical information and similar information the disclosure of which would constitute a clearly unwarranted invasion of personal privacy, such as information that could be used to identify a particular person in a research study.

Published is defined as either when:

  • Research findings are published in a peer-reviewed scientific or technical journal; or
  • A Federal agency publicly and officially cites the research findings in support of an agency action that has the force and effect of law.