Documentation and metadata

Data documentation

Documenting data is important for data use and preservation. Data documentation explains how the data were created or digitised, what the data contain, how they are structured, and any manipulations that may have taken place.

Good data documentation includes concise information on:

  • The context of data collection: project aim and objectives
  • Data collection methods: sampling, data collection process, instruments, hardware and software used
  • Structure of data files
  • Quality assurance procedures carried out
  • Version control
  • Information on access and use conditions or data confidentiality
  • Names, labels and descriptions for variables, records and their values
  • Explanation or definition of codes and classification schemes used
  • Definitions of specialist terminology or acronyms used
  • Codes of, and reasons for, missing values
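
The last four items in the list above are often collected into a codebook. The following is a minimal sketch of one, assuming a single hypothetical survey variable (age_group) with invented value codes and missing-value codes. It illustrates the idea and is not a prescribed format.

```python
# Hypothetical codebook: the variable name, labels and codes are invented for illustration.
codebook = {
    "age_group": {
        "label": "Age group of respondent",
        "values": {1: "18-29", 2: "30-49", 3: "50 or older"},
        # Missing-value codes are documented together with the reason for missingness.
        "missing": {-9: "Refused to answer", -8: "Question not asked"},
    },
}


def describe_value(variable, code):
    """Translate a stored code into its documented meaning."""
    entry = codebook[variable]
    if code in entry["missing"]:
        return f"missing ({entry['missing'][code]})"
    # Codes that appear in the data but not in the codebook are flagged.
    return entry["values"].get(code, "undocumented code")


print(describe_value("age_group", 2))
print(describe_value("age_group", -9))
```

Keeping a structure like this next to the data files means a later user can interpret every stored code without contacting the original researcher.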

High-quality data are well organised, structured, named and versioned. Well-organised file names and folder structures make it easier to find and keep track of data files. Develop a system that works for your project and use it consistently.

File names can contain project acronyms, researchers’ initials, file type information, a version number, file status information and a date. Think carefully about how best to structure files in folders so that files and versions are easy to locate and organise. Whenever data are used, sufficient contextual information is required to make sense of those data.

Tips for file naming:

  • Create meaningful but brief names
  • Use file names to classify broad types of files
  • Avoid using spaces and special characters
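
A naming convention along these lines can be applied consistently with a small helper. The sketch below is one possible scheme, not a standard: the element order, the underscore separator, the two-digit version number and the example values (project acronym HYPO, initials KH) are all assumptions made for illustration.

```python
import re
from datetime import date


def build_file_name(project, initials, content, version, status, day, extension):
    """Join the naming elements with underscores, always in the same order."""
    parts = [project, initials, content, f"v{version:02d}", status, day.strftime("%Y%m%d")]
    # Replace spaces and special characters with hyphens so names stay portable.
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "-", part).strip("-") for part in parts]
    return "_".join(cleaned) + "." + extension


example_name = build_file_name(
    "HYPO", "KH", "interview transcripts", 3, "draft", date(2020, 9, 14), "csv"
)
print(example_name)  # HYPO_KH_interview-transcripts_v03_draft_20200914.csv
```

Writing the date as year, month, day and padding the version number with a leading zero keeps files in chronological and version order when they are sorted by name.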

Data description and metadata

In the context of data management, metadata are a subset of standardised and structured data documentation that explains the origin, purpose, time reference, geographic location, creator, access conditions and terms of use of a data collection.

Metadata for research data can be structured according to international standards or schemes such as the Data Documentation Initiative (DDI), Dublin Core, the Metadata Encoding and Transmission Standard (METS) or ISO 19115 for geographic information. DDI in particular has been recommended as a metadata standard for research data. Find information about disciplinary metadata standards. Alternatively, metadata can be produced informally without using metadata standards. Overall, as part of documentation and metadata creation, consider what information is needed to understand and use your data now and in the future.
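
As an illustration of structured metadata, the sketch below writes a small record using Dublin Core element names (title, creator, date, coverage, description, rights) with Python's standard library. The wrapping metadata element and all field values are invented placeholders. A repository will normally define its own required fields and format, so treat this as an example of the idea only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a simple XML record with one Dublin Core element per field."""
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# All values below are placeholders, not a real dataset.
record_xml = dublin_core_record({
    "title": "Example interview study",
    "creator": "Researcher, Example",
    "date": "2020",
    "coverage": "Finland",
    "description": "Interview transcripts collected for an example project.",
    "rights": "Available for research use on request.",
})
print(record_xml)
```

Each element corresponds to one of the descriptive items named earlier: creator, time reference (date), geographic location (coverage), purpose (description) and access conditions (rights).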

If you plan to deposit your data in a data repository, the repository usually determines what metadata are needed and provides guidelines.

Read more about metadata for research data (see Documentation and metadata).

Read also Making a research project understandable – Guide for data documentation.

Descriptive metadata can be stored, for example, in Etsin, the Finnish research data finder service. Etsin contains metadata about research datasets, while the actual data can be stored in a discipline-specific or national/international data repository (see Research data services). Publishing metadata openly is, in principle, always recommended, because it increases the visibility of the research even when the actual research data cannot be opened.

Think: How will you document and describe your data throughout the research project?

Further information:

Data description and metadata. Finnish Social Science Data Archive.

(9/2020 KH)