2. Ethical and legal issues in research data management

Ethical and legal factors affect research data management: how research data is collected and processed, what rights are attached to the data, where the data can be stored, and how and with whom the data can be shared. Research data can include personal data, personal data belonging to a special category, sensitive species information (e.g. on endangered animals or plants), or other confidential data such as patents, information related to national defense, or trade secrets. It is important that you, together with your supervisor, identify the legal and ethical aspects and limitations related to the research data, and that you take them into account throughout the master’s thesis and the data management process.

The ethical and legal aspects to be taken into account when collecting and processing research data are guided by, for example:

Data protection legislation (e.g. the General Data Protection Regulation, GDPR)
Copyrights
Responsible conduct of research

Personal data as research data of the master’s thesis

Personal data are any information related to a natural person that can be used to identify the person directly or indirectly. Such information includes:

Direct identifiers
Information that alone is enough to identify a person, such as full name, social security number, email address containing the person’s name, or biometric identifiers (facial image, voice patterns, fingerprints, iris, palm shape, handwritten signature).
Strong indirect identifiers
Information with which a person can be identified with reasonable effort, such as postal address, phone number, IP address, student number, insurance number, bank account number, exact annual income, vehicle registration number, unusual job title, rare disease, or a position held by only one person at a time (e.g. chairperson of an organisation).
Indirect identifiers
Information that on its own is not enough to identify someone but that, when linked with other available information, could be used to deduce the person’s identity, such as gender, age, principal place of residence, profession, workplace, education, school, or dates (date of birth, death, accident).
Special categories of personal data, crime and punishment (sensitive data)
Racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, data concerning health, sexual orientation or activity, genetic and biometric data used to identify a person, and crimes and convictions. As a rule, processing personal data belonging to special categories is prohibited unless it is permitted by virtue of the GDPR or by other separate legislation or agreements in addition to the GDPR.

All activities involving personal data, such as collecting, storing, organising, searching, viewing, preserving, deleting and destroying, constitute processing of personal data. A controller is an individual or organisation that determines the purposes and means of the processing of personal data.

The thesis author is usually considered the controller. For more instructions, see the Data protection guide for students at Kamu.

There must be a legal basis for the processing of personal data. In master’s theses, processing personal data can be permitted when it is required by the public interest, when the study participant gives his or her consent, or when it is necessary for the legitimate interests pursued by the controller.

Personal data should be minimised when collecting research data: only the personal data necessary for the research purpose should be collected. Personal data can also be protected by anonymising or pseudonymising them.
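As a rough illustration of data minimisation, the following Python sketch keeps only the fields a study actually needs when raw records are imported, for example from a survey export. The field names and the `minimise` helper are hypothetical and not part of any guideline. Ideally, unnecessary personal data are never collected in the first place; the filter only shows how to drop them as early as possible.

```python
# Hypothetical list of the fields the research question actually requires.
REQUIRED_FIELDS = {"age_group", "study_programme", "answer"}


def minimise(record, required=REQUIRED_FIELDS):
    """Return a copy of the record containing only the required fields."""
    return {key: value for key, value in record.items() if key in required}


# Made-up example record with more personal data than the study needs.
raw_record = {
    "name": "Example Person",
    "email": "person@example.org",
    "age_group": "20-29",
    "study_programme": "Biology",
    "answer": "yes",
}

minimal_record = minimise(raw_record)
```

Note that the remaining fields can still be indirect identifiers, so the minimised record may still be personal data.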

Anonymisation refers to processing personal data in a manner that makes it impossible to identify individuals from them. The prevention of identification must be permanent, so that neither the controller nor a third party can convert the data back into identifiable form with the information they hold. Simply deleting names is not enough: the data must be carefully reviewed, and all information that can be used to identify a person must be removed. This means that no identification codes can remain after the anonymisation process. Anonymised data are no longer considered personal data. Note, however, that the act of anonymising is itself processing of personal data.

Pseudonymisation means processing personal data in such a manner that they can no longer be attributed to a specific person without the use of additional information, which is kept separately. In practice, it means removing identifiers or replacing them with codes or false names. Pseudonymised data are still considered personal data, and their processing is subject to data protection regulations.
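The idea of replacing identifiers with codes and keeping the key separately can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a prescribed method: the `pseudonymise` function, the `name` field and the `participant_code` field are hypothetical names chosen for illustration, and the example records are made up.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the direct identifier in each record with a random code.

    Returns (pseudonymised_records, key_table). The key table maps each
    code back to the original identifier and must be stored separately
    from the research data.
    """
    code_for = {}       # original identifier -> code
    used_codes = set()  # guards against the (unlikely) reuse of a code
    pseudonymised = []
    for record in records:
        identifier = record[id_field]
        if identifier not in code_for:
            code = "P" + secrets.token_hex(4)
            while code in used_codes:
                code = "P" + secrets.token_hex(4)
            used_codes.add(code)
            code_for[identifier] = code
        # Copy every field except the direct identifier, then add the code.
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["participant_code"] = code_for[identifier]
        pseudonymised.append(cleaned)
    key_table = {code: identifier for identifier, code in code_for.items()}
    return pseudonymised, key_table


# Made-up interview records; the same person appears twice.
interviews = [
    {"name": "Example Person A", "answer": "yes"},
    {"name": "Example Person B", "answer": "no"},
    {"name": "Example Person A", "answer": "maybe"},
]

data, key = pseudonymise(interviews)
```

The same person always receives the same code, so answers can still be linked within the data set. Because the key table allows re-identification, the output remains personal data; the sketch also leaves any indirect identifiers in the other fields untouched.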

Read more about personal data and anonymisation in the Finnish Social Sciences Data Archive (FSD) guidelines.