Research Data Management from A to Z
Archiving means the unchangeable, long-term storage of your data. It is therefore very important that you choose a suitable medium for storing your research data: saving your files on an external data carrier and keeping it in your drawer is not an option here!
For smaller amounts of data that are not personal, we recommend simpleArchive. For regular archiving of larger or personal data, the RWTH IT Center's archive service is a suitable option.
Documenting your data correctly is very important for archiving, and you should ensure that all the essential information, such as metadata, is included.
The RWTH IT Center's archive service offers the assignment of ePIC persistent identifiers as an archiving option.
Please refer to our Instructions on Archiving Data for a Publication, which are in German.
You can find information on copyright in the Fact sheet on the Copyright Protection of Research Data.
Data Management Plan
A data management plan, DMP for short, means the systematic and targeted documentation of your research data. A DMP takes into account the handling, storage and archiving, access and use of your data and metadata. Creating a DMP means much thought has gone into the quality of your data, your resources, and your intellectual property right from the start of your project.
The following online tools can help you create a DMP:
- RWTH Aachen University's own DMP template
- DMPonline is a tool developed by the British Digital Curation Centre, DCC for short, and hosted at the University of Edinburgh. It provides different templates from funding organizations as well as a generic template that is suitable for every research project. DMPonline helps you create a DMP according to EU guidelines.
- The DMPTool is offered by the California Digital Library. It contains instructions for certain funding organizations that already require a DMP. Integrated resources and services from certain partner institutions make it easier to complete a DMP in some cases. The tool also offers a generic DMP template and is freely accessible to everyone. In addition, the website offers some examples of DMPs.
Different domains, as well as working environments, can be identified within a research project. The domains differ in the type of data exchange, the circle of exchange partners, and the type of use.
- The private domain indicates each researcher’s working environment.
- The group domain denotes the research group’s common working environment.
- The permanent domain means the working environment for long-term archiving.
- The access and reuse domain is the cross-project, interdisciplinary working environment of all researchers around the world.
Every research project involves at least the first three domains over its duration.
The critical points are the transitions between the domains. Extensive planning, for example via a data management plan, is therefore required for this to be a smooth process.
- It is important to lay the foundations in the researcher’s private domain using an overall concept that will also be applied for the later transitions.
- In order to transition to the group domain, basic specifications for the common use and creation of research data are necessary.
- If permanent storage is required and publication is planned, information for cross-disciplinary understanding and reuse should be elaborated.
- You should bear in mind that data are often relevant to more than one research context. There are frequently overlaps with other fields, and the data from one discipline today form the basis for research in another tomorrow. In order to create these new opportunities, it is important to provide access to research data.
The spectrum of data types and formats of research data is very diverse.
Examples of data types are:
- Models: statistical, 3D modeling
- Multimedia data: JPEG, TIFF, MPEG
- Numerical data: Excel, SPSS, CSV
- Software: Java, C++
- Text documents: Word, PDF, XML
Good Academic Practice
German: Gute wissenschaftliche Praxis
Good academic practice entails storing research data for at least ten years.
Institutional policies can give you and your employees certainty and guidance. In the RWTH Institutional Policy template, you will find proposals that are not binding and can be individually adapted to your working group, institute, et cetera. Institutional policies cover the handling of data management plans, usage rights, copyright, and the storage and archiving of research data.
Long-term archiving generally means ensuring data availability for a period of over ten years. Besides preserving the data content at the bit level, you should also bear in mind the following requirements for the future interpretability of data:
- Is the data format suitable for long-term archiving?
- Is special software required for interpretation?
- Are the metadata complete?
Technical and descriptive metadata are particularly important to ensure it will be possible to use the data in technical infrastructures of the future.
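Preservation at the bit level is commonly checked by recording a checksum for each file at the time of archiving and recomputing it later. The following is a minimal sketch of that idea, not a description of how any RWTH service works; the function names and the manifest layout (relative path mapped to SHA-256 digest) are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict:
    """Map each file's relative path to its checksum (hypothetical layout)."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict) -> list:
    """List files that are missing or no longer match the manifest."""
    problems = []
    for rel_path, expected in manifest.items():
        target = data_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

A check like this only confirms that the bits are unchanged. Whether the format can still be opened and the metadata are complete has to be assessed separately, as the questions above indicate.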
You can find more detailed information on the long-term archiving of research data in the NESTOR manuals: Long-Term Archiving of Research Data or Digital Curation of Research Data, or via the nestor wiki.
One option for long-term archiving, which the RDM team has tested, is Ex Libris’ Rosetta software.
The term "metadata" denotes further information about your research data. They describe your data in more detail and make them interpretable at any time. Metadata are particularly important for the documentation, management, and classification of digital research data, since they are essential in answering the following:
- Where does the data come from?
- Who created the data, when, and how?
To ensure the exchange and reusability of metadata via digital information systems, you should use standardized metadata schemata as consistently as possible.
Jisc infoKit provides an introduction to metadata. This guide informs you about the most important metadata goals and concepts and is suitable for those who do not have any prior knowledge in the area.
A very short introduction to documentation and metadata can be found in the presentation Explain It.
The interactive Mantra course offers training on documentation and metadata. You will quickly understand why it is important to document your own research – both for you and for others. In addition, you are taught when and why to use metadata.
A metadata schema is a compilation of permitted data elements used to uniquely describe a resource. Which metadata schema suits you depends on a number of factors, such as the data type or the context in which the data were created and used.
There are a variety of metadata schemas for data from different disciplines. The first step to take when designing your research data descriptions is to check whether a suitable schema already exists for your discipline. You can find an ever-growing list on FAIRsharing.org, for example, while Dublin Core and RADAR are two of the best-known standardized metadata schemas.
The metadata tool lets you fill in metadata according to a schema created for your institution. The schema not only specifies which metadata fields, for example author and subject, must or may be filled in, but also allows you to use controlled vocabularies. Selecting or creating a suitable metadata schema is by no means trivial and the RDM project team will be happy to assist you here.
Once you have decided on a metadata schema, you must define the content of the data fields. To ensure the greatest possible chances of reusability and to optimally support research, we recommend you use controlled vocabularies, thesauri, and classifications. You can also find a large number of both interdisciplinary and discipline-specific solutions for this.
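To make the interplay of schema and controlled vocabulary more concrete, here is a minimal sketch of checking a metadata record against a schema. The schema is hypothetical: the field names loosely follow Dublin Core elements, and the vocabularies are invented for illustration rather than taken from any RWTH schema or tool.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class FieldRule:
    required: bool = False
    # None means free text; otherwise only the listed terms are allowed.
    vocabulary: Optional[FrozenSet[str]] = None


# Hypothetical schema for illustration only.
SCHEMA = {
    "creator": FieldRule(required=True),
    "title": FieldRule(required=True),
    "date": FieldRule(required=True),
    "subject": FieldRule(vocabulary=frozenset({"chemistry", "physics", "linguistics"})),
    "type": FieldRule(required=True, vocabulary=frozenset({"Dataset", "Software", "Text"})),
}


def validate(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    errors = []
    # Fields that the schema does not permit at all.
    for name in record:
        if name not in SCHEMA:
            errors.append(f"{name}: field not permitted by schema")
    # Required fields and controlled vocabularies.
    for name, rule in SCHEMA.items():
        value = record.get(name)
        if value is None:
            if rule.required:
                errors.append(f"{name}: required field missing")
            continue
        if rule.vocabulary is not None and value not in rule.vocabulary:
            errors.append(f"{name}: '{value}' not in controlled vocabulary")
    return errors
```

For example, a record with `"type": "Picture"` would be rejected under this sketch, because "Picture" is not among the permitted terms. Restricting values in this way is what keeps records comparable across datasets.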
German: Persistenter Identifikator
An identifier signifies the unique denotation of a resource, usually digital. A classic example of an identifier in printed resources is the International Standard Book Number, ISBN for short. The Uniform Resource Locator, URL for short, is often used for digital resources. URLs have a half-life of around 100 days. Due to this short lifespan, URLs are not a suitable identifier for the permanent and distinct citation of research data.
This is where persistent identifiers, PID for short, come into play. PIDs represent a middle layer between the reference and the object, whereby the object is uncoupled from its "electronic" location. This reduces the number of broken links, which produce the “Error 404: Page not found” message. In other words, references remain stable even if the data’s storage location changes.
PIDs give research data a permanent and unchangeable identifier, called a Uniform Resource Identifier, or URI for short, which is assigned throughout their lifecycle and beyond.
The best-known example of a PID is the Digital Object Identifier, DOI for short.
RWTH offers its members ePIC PID assignments.
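The "middle layer" idea behind PIDs can be sketched as a lookup table that sits between the citation and the storage location. This is a toy model only: the class, its method names, and the example identifier and URLs are all hypothetical and do not reflect the ePIC or DOI APIs.

```python
class PidResolver:
    """Toy resolver: maps a persistent identifier to its current location."""

    def __init__(self) -> None:
        self._locations = {}

    def register(self, pid: str, url: str) -> None:
        """Assign a PID once; an assigned PID is never reused for another object."""
        if pid in self._locations:
            raise ValueError(f"{pid} is already assigned")
        self._locations[pid] = url

    def relocate(self, pid: str, new_url: str) -> None:
        """Update the storage location; the PID itself stays the same."""
        if pid not in self._locations:
            raise KeyError(pid)
        self._locations[pid] = new_url

    def resolve(self, pid: str) -> str:
        """Return the current location for a PID."""
        return self._locations[pid]


resolver = PidResolver()
resolver.register("example/0001", "https://old-server.example.org/data/42")
# The data move to a new server; only the resolver entry changes.
resolver.relocate("example/0001", "https://new-server.example.org/archive/42")
current_location = resolver.resolve("example/0001")
```

A publication that cites `example/0001` does not need to be corrected after the move, whereas one that cited the old URL directly would now point to a dead link.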
Personal Data Management
German: Persönliches Datenmanagement
To implement data management according to your plan, you must carefully organize your general research activities on a daily basis. You should consider matters such as documentation, labelling samples, and organizing the data structure. You should therefore specify the following as early as possible:
- Data organization, storage structures, versioning
- Documentation, metadata
- Data backup during the project period
- Responsibilities, access rights, collaboration rules
- Archiving or publishing after the end of the project
The Data Management Plan or Institute Policy tools are suitable for supporting your personal data management. The Research Data team also advises you in individual or group consultations, where you can develop solution strategies tailored to your subject and the conditions at your institute using RWTH’s technical services.
There are both discipline-specific and institutional repositories. You can find a good overview of research data repositories in the Registry of Research Data Repositories, re3data for short, which is funded by the German Research Foundation, DFG for short, and offered as a service by DataCite. Furthermore, you can also use the institutional repository RWTH Publications.
Many research data repositories, including RWTH Publications, can assign a Digital Object Identifier, DOI for short, to your data. The RWTH Aachen University Library is already registered with the Leibniz Information Centre for Science and Technology and University Library, TIB for short, as a data center that assigns DOIs.