Preparing data collection

The research questions largely determine what kind of data are collected. The data types, in turn, influence how the data can be created, processed and made openly available.

The quality of the data collection methods strongly influences data quality, and documenting in detail how the data are collected provides evidence of that quality. Quality control measures during data collection may include, for example, calibrating instruments, taking multiple measurements and using standardised methods and protocols.
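
As an illustration of the "multiple measurements" measure, the following minimal Python sketch summarises repeated readings of one quantity and flags cases where the readings disagree more than expected. The function name `summarise_repeats` and the 5 % tolerance are assumptions made for this example; a suitable tolerance depends on the instrument and the discipline.

```python
from statistics import mean, stdev


def summarise_repeats(readings, max_relative_spread=0.05):
    """Summarise repeated measurements of one quantity.

    Returns the mean, the sample standard deviation and a flag telling
    whether the spread exceeds the tolerance relative to the mean.
    The default tolerance (5 %) is an assumption for illustration only.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate spread")
    avg = mean(readings)
    spread = stdev(readings)
    # A zero mean makes the relative spread undefined, so flag it for a check.
    needs_check = avg == 0 or spread / abs(avg) > max_relative_spread
    return {"mean": avg, "stdev": spread, "needs_check": needs_check}


# Made-up example readings: one consistent series and one inconsistent series.
stable = summarise_repeats([20.1, 20.0, 19.9])
noisy = summarise_repeats([20.1, 23.5, 17.2])
print(stable["needs_check"], noisy["needs_check"])
```

Storing the summary together with the raw readings is one way to document how the data were collected and checked.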

The format and software in which research data are created usually depend on how researchers plan to analyse the data, the hardware used and the availability of software. They may also be determined by discipline-specific standards and customs. Consider in advance which file formats and software are suitable for your research data.

When research involves obtaining data from people, researchers are expected to maintain high ethical standards, such as those recommended by professional bodies, institutions and funding organisations, both during research and when sharing data.

Researchers are usually expected to obtain informed consent for people to participate in research and for use of the information collected. Where possible, consent should also take into account any future uses of data, such as the sharing, preservation and long-term use of research data.

Read more about informing research participants about the processing of their personal data in the FSD guidelines.


Personal data processing must be planned thoroughly and executed carefully. Before data obtained from research with people can be published or shared, they need to be anonymised so that individuals cannot be identified from the data.

Personal data means any information relating to an identified or identifiable natural person. Identifiers can be grouped as follows:

Direct identifiers such as full name, social security number, email address containing the personal name, biometric identifiers (fingerprints, facial image, voice patterns).

Strong indirect identifiers such as postal address, phone number, vehicle registration number, unusual job title, rare disease.

Indirect identifiers: information that on its own is not enough to identify someone but that, when linked with other available information, could be used to deduce the person's identity.
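
The three groups above can be used to plan anonymisation variable by variable. The sketch below is a minimal Python illustration, not an FSD procedure: the variable names, their classification and the rule "remove direct and strong indirect identifiers, review indirect ones" are all assumptions made for the example. Removing identifiers in this way does not by itself guarantee that the data are anonymous.

```python
# Hypothetical variable inventory. The names and their classification are
# illustrative assumptions; each dataset needs its own assessment.
IDENTIFIER_TYPES = {
    "full_name": "direct",
    "email": "direct",
    "postal_address": "strong_indirect",
    "phone_number": "strong_indirect",
    "occupation": "indirect",
    "municipality": "indirect",
    "age": "indirect",
    "q1_satisfaction": "none",
}


def plan_anonymisation(columns):
    """Sort column names into remove / review / keep lists.

    Direct and strong indirect identifiers are listed for removal and
    indirect identifiers for manual review (for example coarsening).
    Columns missing from the inventory are also sent to review, so that
    nothing is released without having been classified.
    """
    plan = {"remove": [], "review": [], "keep": []}
    for name in columns:
        kind = IDENTIFIER_TYPES.get(name)
        if kind in ("direct", "strong_indirect"):
            plan["remove"].append(name)
        elif kind == "none":
            plan["keep"].append(name)
        else:
            plan["review"].append(name)
    return plan


def drop_columns(record, to_remove):
    """Return a copy of one record (a dict) without the listed columns."""
    return {key: value for key, value in record.items() if key not in to_remove}


# A made-up record used only to demonstrate the two functions.
record = {
    "full_name": "Example Person",
    "email": "example.person@example.org",
    "phone_number": "000 000 0000",
    "occupation": "teacher",
    "age": 41,
    "q1_satisfaction": 4,
}
plan = plan_anonymisation(record.keys())
cleaned = drop_columns(record, plan["remove"])
print(plan)
print(cleaned)
```

The variables left in the "review" list still need a decision, because indirect identifiers such as occupation and age can identify a person when combined.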

Read more about personal data and anonymisation in the FSD guidelines.

(8/2019 AK)