Research data

With the rise of open science, and particularly following the publication of the Second National Plan for Open Science, research data is now a key concern for the academic community. For this reason, researchers and laboratories are questioning how it should be managed. Like an article, research data is intended to be shared, cited, and reused.

  • But what exactly is research data?

In 2007, INIST (Institut de l’Information Scientifique et Technique) defined research data as “factual records (figures, texts, images, and sounds) that are used as primary sources for scientific research and are generally recognized by the scientific community as necessary for validating research results.”

  • What is not included in the definition of research data?

Preliminary analyses, future work programs, peer reviews, personal communications (e.g., emails), physical objects, training materials, administrative data.

The aim of the FAIR principles is to promote the discovery, access, interoperability, and reuse of shared data. Each FAIR principle is broken down into a set of characteristics that data and metadata must have in order to facilitate their discovery and use by humans as well as machines. The four main principles are:

  • Findable

The Findable principle aims to make data easy to discover for both humans and computer systems, in particular through the use of metadata standards and persistent identifiers (e.g., DOIs).

  • Accessible

The Accessible principle aims to make data and metadata easy to access and download. It encourages their long-term storage and requires that the conditions of access (open or restricted) and of use (license) be stated.

  • Interoperable

The Interoperable principle aims to make data usable regardless of the IT environment. In practice, the data should be downloadable, usable, understandable, and combinable with other data, by both humans and machines.

  • Reusable

The Reusable principle highlights the characteristics that make data reusable for future research or for other purposes (education, innovation, reproduction/transparency of science).

Interactive visualization of the four FAIR principles provided by the DoRANum service platform:

https://view.genial.ly/5d64fbbd8352350fa3d22603/interactive-content-les-principes-fair
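As a rough illustration of how the four principles translate into metadata, the sketch below checks a dataset record for a few fields associated with each principle. The field names, the sample record, and the simplified DOI pattern are assumptions made for this example; they are not taken from any specific metadata standard or repository.

```python
import re

# Hypothetical mapping from each FAIR principle to a few metadata fields.
# These field names are illustrative assumptions, not a real standard.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_conditions", "repository_url"],
    "interoperable": ["file_format"],
    "reusable": ["license", "description"],
}

# Simplified DOI shape check (prefix "10.", registrant code, "/", suffix).
# It does not confirm that the DOI is registered or resolves.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def fair_gaps(record):
    """Return, per principle, the fields that are missing or empty."""
    gaps = {}
    for principle, field_names in REQUIRED_FIELDS.items():
        missing = [name for name in field_names if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


def looks_like_doi(identifier):
    """Return True if the string has the general shape of a DOI."""
    return bool(DOI_PATTERN.match(identifier))


# Made-up record for illustration only.
record = {
    "identifier": "10.1234/example-dataset",
    "title": "Example survey responses",
    "keywords": ["survey", "example"],
    "access_conditions": "open",
    "repository_url": "https://repository.example.org/dataset/1",
    "file_format": "CSV",
    "license": "",  # left empty on purpose: the check should flag it
    "description": "Anonymized responses collected for a demo.",
}

print(fair_gaps(record))
print(looks_like_doi(record["identifier"]))
```

A check like this only confirms that fields are present; whether the data is genuinely findable or reusable still depends on the quality of the metadata and on the repository where it is deposited.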

To manage data effectively from start to finish, universities have developed models of the research data life cycle, which describe how data is handled from its creation to its publication and reuse.

Several models describing this life cycle have been proposed, but they always include the following main phases:

A Data Management Plan (DMP) is an essential document that describes how research data will be collected, organized, stored, shared, and preserved throughout and after the completion of a research project. It ensures rigorous data management, thereby promoting data reuse and sustainability.

Drafted before the project starts, it prompts you to ask the right questions about each stage of your data life cycle. It is a living document that can be updated throughout your project.

  • Why is a DMP important?

Compliance with requirements: More and more funding bodies, such as the ANR (French National Research Agency) and the European Union through Horizon 2020, require the creation of a DMP to ensure that data from the research they fund is managed correctly.

Facilitating sharing and reuse: A DMP promotes transparency and encourages the reuse of data by other researchers, thereby contributing to the advancement of scientific research.

Data preservation: By defining clear protocols for storage and backup, the DMP helps preserve data over the long term, reducing the risk of information loss or corruption.
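One common way to detect loss or corruption in stored data is to record a checksum for each file and compare it again later. The sketch below is a minimal example using only the Python standard library; the function names and the temporary demo file are assumptions for illustration, not a procedure prescribed by any funder or by the library.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(folder)
    return sorted(
        name for name in manifest
        if current.get(name) != manifest[name]
    )


# Demo on a temporary folder with a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "results.csv"
    data_file.write_text("id,value\n1,42\n")
    manifest = build_manifest(tmp)           # recorded at backup time
    unchanged = changed_files(tmp, manifest)  # nothing has changed yet
    data_file.write_text("id,value\n1,43\n")  # simulate an alteration
    modified = changed_files(tmp, manifest)

print(unchanged, modified)
```

Storing the manifest alongside each backup makes it possible to verify later copies against the state recorded at backup time.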

  • What does a Data Management Plan contain?

A DMP generally includes the following elements:

Data description: Type of data collected, format, estimated volume, etc.

Data collection and processing: Collection methods, verification procedures, and quality control.

Storage and backup: Data location, backup frequency, secure access, etc.

Sharing and access: Sharing methods (e.g., deposit in a data repository), anonymization, possible access restrictions.

Long-term preservation: Data retention period, plan for format migration if necessary, etc.
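The elements listed above can be treated as a simple checklist. The sketch below models them as a small Python data structure and reports which sections are still empty; the class name, attribute names, and sample values are assumptions for illustration and do not correspond to any official DMP template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Illustrative DMP skeleton; one text attribute per section."""
    data_description: str = ""
    collection_and_processing: str = ""
    storage_and_backup: str = ""
    sharing_and_access: str = ""
    long_term_preservation: str = ""

    def unfilled_sections(self):
        """Return the names of sections that are still empty."""
        return [
            f.name for f in fields(self)
            if not getattr(self, f.name).strip()
        ]


# Made-up draft with only two sections filled in.
plan = DataManagementPlan(
    data_description="Interview transcripts, plain text, about 2 GB",
    storage_and_backup="Institutional server, weekly backup",
)

print(plan.unfilled_sections())
```

Because a DMP is a living document, a structure like this can be filled in progressively and rechecked at each project milestone.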

  • How do you write a DMP?

At the University Library, we can help you create your Data Management Plan with tools and advice tailored to your research project. You can use DMP templates available online (such as those offered by DMP OPIDoR) and contact us with any questions about data management.