Metadata, the key to FAIR
Metadata plays an important role in making data FAIR (Findable, Accessible, Interoperable, Reusable). Metadata is the descriptive information about the research data. Documenting your data is essential for making it easier to find and understand. Documentation considerably facilitates further use of the data and makes reproducibility possible. Well-documented data are used and cited more frequently, which increases the reputation of the creator. Documentation also helps you: details are easily forgotten over time, so it is advisable to document the data while you work.
Good metadata enables you to understand, use, and share your own data now and in the future, and helps other researchers discover, access, use and cite your data in the long term. It also facilitates long-term preservation of the data. It ensures that the context in which your data was created, analysed and stored is clear and detailed, and therefore that your work is reproducible.
With metadata you describe who the responsible researcher is, when, where and why the data was collected, how the research data should be cited, etc. The content and format of metadata are often guided by a specific discipline and/or repository.
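As an illustration of the description above, a metadata record can be kept as a small structured file next to the data. The sketch below is only an assumption about what such a record might contain: the field names, the REQUIRED_FIELDS list and all values are hypothetical placeholders, and a real discipline or repository will define its own schema.

```python
import json

# Hypothetical minimal field set (who, when, where, why, how to cite).
# Real schemas are defined by the discipline or repository.
REQUIRED_FIELDS = [
    "title", "creator", "date_collected", "location", "purpose", "citation",
]


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values for illustration only.
record = {
    "title": "Example interview recordings",
    "creator": "Example Researcher",
    "date_collected": "2024-01-15",
    "location": "Example City",
    "purpose": "Placeholder description of why the data was collected",
    "citation": "Example Researcher (2024). Example interview recordings.",
}

# Serialise the record so it can be stored alongside the data files.
print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
```

Running a check like missing_fields before depositing a dataset is one simple way to notice undocumented details while they can still be recovered.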
Here are a few rules you can follow for organising and structuring your data as part of data management and documentation:
- Use existing best practices.
- Store data in folders.
- Use systematic, content-related folder naming.
- Use no more than three subfolder levels.
- Name files clearly. Put the date in YYYYMMDD form so that files sort chronologically, and avoid spaces and special characters. Use documented naming conventions or abbreviations, e.g. [project]_[interview]_[place]_[personID]_[YYYYMMDD].mp4
- After the end of the project, check what data is still needed.
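The naming convention in the rules above can be applied consistently with a small helper. This is a minimal sketch, not a prescribed tool: the function name build_filename and the example values are hypothetical, and the cleaning rule (keep only ASCII letters, digits and hyphens) is one possible reading of "avoid spaces and special characters".

```python
import re
from datetime import date


def build_filename(project, interview, place, person_id, day, extension="mp4"):
    """Build [project]_[interview]_[place]_[personID]_[YYYYMMDD].<extension>."""
    parts = [project, interview, place, person_id]
    # Drop spaces and special characters from each part. Note that this
    # simple rule also drops non-ASCII letters such as "ü".
    cleaned = [re.sub(r"[^A-Za-z0-9-]+", "", part) for part in parts]
    # YYYYMMDD keeps alphabetical order identical to chronological order.
    stamp = day.strftime("%Y%m%d")
    return "_".join(cleaned + [stamp]) + "." + extension


# Hypothetical example values.
name = build_filename("oralhist", "int01", "New York", "P007", date(2024, 3, 5))
print(name)
```

Generating names from one function rather than typing them by hand keeps the convention documented in a single place.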
Within the context of open science and for optimal long-term archiving, data files should not be compressed or saved in proprietary formats; open formats should be favoured instead. This keeps the content accessible and reusable. Note that some file formats cannot be converted to open formats. Check whether the repository where you want to deposit a dataset has a list of preferred formats.
Read more about recommended file formats.
How to make your data FAIR
Particularly when you plan to share your data, it is recommended to follow the FAIR principles in research data management.
In short, the FAIR principles mean that data is:
- Findable: The first step in (re)using data is to find them. Make sure that your data can be found by both humans and machines. Discovery of datasets depends on persistent identifiers (PIDs) and metadata. Make your data findable by describing them with rich metadata. Choose a data repository that assigns a persistent identifier when archiving a dataset. The advantage of a persistent identifier over a normal web address is that the PID always points to the data, even if the data have moved to a new location. Several types of PID exist, such as DOI, Handle and URN. DOI is currently the most widely used PID, but other types can also be used. Use an ORCID iD as your personal persistent researcher identifier and attach it to all your research outputs.
- Accessible: Once someone has found your data, they need to know how they can get access to them. This can include e.g. authorisation and/or authentication processes. FAIR does not necessarily mean that data need to be open! If access to the data is restricted, make sure to provide sufficient contact information for other researchers who are interested in the data. If the data cannot be made openly accessible, it is usually still possible to make the metadata publicly available.
- Interoperable: Ensure that your data can be integrated with other data and that they can be utilised by applications or workflows for analysis, storage, and processing. Open, non-proprietary or common formats will increase accessibility. Think about software tools that are needed to access your data. If necessary, include documentation about the software (version, etc.). The concept of interoperability applies both at the data and metadata level.
- Reusable: To support proper data interpretation and to maximise the potential reuse of your data, make sure that your data and related metadata are well-described. Have a clear and accessible data usage license (e.g. Creative Commons) so others know what kinds of reuse are permitted.
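The persistent identifiers mentioned under Findable are designed to be handled by machines as well as humans. The sketch below shows two small checks under stated assumptions: a DOI is turned into a resolvable address through the doi.org resolver, and an ORCID iD is validated with the ISO 7064 MOD 11-2 check digit that ORCID uses. The function names are hypothetical, the DOI is a made-up placeholder, and the ORCID iD is a commonly cited sample value.

```python
import re
from urllib.parse import quote


def doi_to_url(doi):
    """Return the resolver address for a DOI (the resolver follows the data)."""
    return "https://doi.org/" + quote(doi)


def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def orcid_is_valid(orcid):
    """Check the 0000-0000-0000-000X layout and the final check character."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]", orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_digit(compact[:15]) == compact[15]


print(doi_to_url("10.1234/example"))          # placeholder DOI, not a real dataset
print(orcid_is_valid("0000-0002-1825-0097"))  # sample iD
```

A check like this only confirms that an identifier is well formed; it does not confirm that the identifier has actually been registered.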
The FAIR principles are guiding principles, not standards. FAIR describes qualities or behaviours that are required to make data maximally reusable. Those qualities can be achieved by different standards. You may not always be able to adhere to all of the principles, but applying even some of them will improve the findability, accessibility, interoperability and reusability of your research data.
- FAIR = Findable, Accessible, Interoperable, Reusable
- Good metadata enables you to understand, use, and share your data now and in the future, and helps other researchers discover, access, use and cite your data.
- Adhere to the FAIR principles to the extent you can with the data in question.