Fundamentals of Repositories
Repositories are online archives for the digital storage of research outputs in the broadest sense of the term: scientific publications, research data, registered reports, software, etc. In general, a repository is a system of servers that store and maintain digital objects in an organized manner and provide an online user interface through which interested individuals can upload and access files. Repositories are managed by various organizations, from research institutions (e.g., DiRROS, Slovenian Social Science Data Archives, Harvard Dataverse, 4TU.ResearchData, Zenodo) to non-profit organizations (e.g., Dryad, Global Biodiversity Information Facility, Avibase), private technology companies (e.g., Figshare) and scientific publishers (e.g., Mendeley Data). A registry of research data repositories from around the world is available on the re3data.org website.
Research repositories are managed and maintained with varying degrees of quality. The highest-quality ones meet recognized professional guidelines or standards (e.g., CoreTrustSeal, DIN 31644, ISO 16363) or are long established in the scientific community. Such repositories are deemed trustworthy. CTK UL recommends that you use trustworthy repositories to store your data whenever possible. Other selection criteria will depend on the requirements of your research institution or funder and the possibilities offered by your research area. Various directories of repositories and their policies (e.g., ROAR, OpenDOAR, ROARMAP) can help you make your selection.
Domain-specific repositories, if they exist in your research field, are the first choice for your data. They were mostly created on the initiative of the scientific community, e.g., consortia of different research institutions and non-profit organizations (professional associations) that recognized the need for orderly storage of research data even before the open science movement took off. The main advantage of these repositories is that they mostly prescribe domain-specific criteria for describing the provenance of research data, sometimes also in the form of standard metadata schemas. By choosing a domain-specific repository, you ensure that your data are stored alongside related data from other researchers and are thus easier to find. In addition, because of the well-established criteria for reporting the research process, your data will be as reusable as possible for other interested parties.
Some examples of international trustworthy domain-specific repositories are:
- ArkeoGIS – a joint project of 27 French and German research organisations for data in the field of the sciences of the past, especially archaeology and palaeontology;
- repositories of the CLARIN association for linguistics;
- Worldwide Protein Data Bank – an archive of experimentally determined three-dimensional structures of biological macromolecules;
- World Glacier Monitoring Service – an archive of standardised observations of glacier changes;
- Qualitative Data Repository – an archive of digital data collected through qualitative and multi-method research in the social sciences;
- Sound and Vision – one of the largest audiovisual archives in Europe;
- Databrary – a repository for video and audio files, which can also accept and distribute sensitive data and is therefore particularly suitable for research data.
Two domain-specific repositories currently operate in Slovenia:
- CLARIN.SI, the Slovenian node of the international network of linguistic repositories CLARIN,
- and Social Science Data Archives, which will be described below.
There used to be two other domain-specific repositories in Slovenia, namely MODES for data from weather models, simulations, analyses and forecasts and InGeoCloudS for geological, geophysical and other geospatial data. Unfortunately, both have been discontinued.
Social Science Data Archives (ADP, from the Slovenian name Arhiv družboslovnih podatkov) is a trustworthy domain-specific repository dedicated to storing research data of interest for social science analyses. It focuses on data related to or otherwise important for Slovenian society and the social sciences, regardless of geographic boundaries. It is also the national partner and data service provider within the CESSDA Consortium. Complete scientific databases (typology 2.20) stored in ADP are included in the BIBLIO-D list of the Slovenian Research Agency (ARRS) and are awarded 30 points according to the Bibliographic Criteria of Scientific and Professional Performance.
Institutional repositories primarily serve the research organisations that founded them, but some have outgrown their original scope and opened up to the international scientific community. Most Slovenian institutional repositories were established with the aim of digitising university theses, a role that was later broadened to include storing open-access versions of scientific publications. The infrastructure for storing research data is currently being established as well. The Slovenian institutional repositories are:
- Repository of the University of Ljubljana (RUL),
- Digital Library of the University of Maribor (DKUM),
- Repository of the University of Primorska (RUP),
- Repository of the University of Nova Gorica (RUNG),
- Repository of Independent Higher Education and Post-secondary Education Organizations (ReVIS),
- Digital Repository of Slovenian Research Organizations (DiRROS).
CTK UL recommends that STEM researchers use the 4TU.ResearchData repository if there is no domain-specific repository in their research field and the Slovenian institutional repositories do not meet their needs. 4TU.ResearchData was established as the institutional repository of the 4TU.Federation, an association of the Delft University of Technology, Eindhoven University of Technology, University of Twente and Wageningen University, and paved the way for research data management in Europe. Today, 4TU.ResearchData is a trustworthy, CoreTrustSeal-certified repository open to science, engineering and design research data from around the world.
You can use generalist repositories if there is no suitable domain-specific repository in your research field and institutional repositories do not meet your needs. Generalist repositories accept research data regardless of data type, file format, research content or research area. For this very reason, however, they do not prescribe standard metadata schemas for describing the provenance of the data. It is therefore important that you voluntarily follow the domain-specific criteria for describing the experimental process and attach this description to the data in the form of a ReadMe file.
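As a minimal sketch of such a ReadMe file, the following Python snippet assembles a plain-text description from a handful of fields. The function name build_readme, the section names and all example values are hypothetical and are not prescribed by any repository; replace them with the criteria used in your research field.

```python
def build_readme(title, creators, description, methods, files, license_name):
    """Assemble a plain-text ReadMe describing a dataset.

    The section names are illustrative assumptions, not a standard schema.
    """
    lines = [
        f"TITLE: {title}",
        f"CREATORS: {'; '.join(creators)}",
        f"LICENSE: {license_name}",
        "",
        "DESCRIPTION",
        description,
        "",
        "METHODS",
        methods,
        "",
        "FILES",
    ]
    # One line per deposited file: name, then a short note on its content.
    for name, content in files.items():
        lines.append(f"- {name}: {content}")
    return "\n".join(lines) + "\n"


# Placeholder values only; none of them refer to a real dataset.
readme_text = build_readme(
    title="Example dataset (placeholder)",
    creators=["A. Researcher", "B. Researcher"],
    description="Placeholder summary of what the data contain.",
    methods="Placeholder description of how the data were collected.",
    files={"measurements.csv": "raw measurements, one row per sample"},
    license_name="CC BY 4.0",
)
print(readme_text)
```

The resulting text can be saved as a ReadMe file and uploaded together with the data files.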
The most established generalist repositories are:
- Zenodo – operated by CERN,
- Figshare – operated by the British technology company Digital Science,
- Dryad – operated by a non-profit organization of the same name,
- Harvard Dataverse – operated by Harvard University,
- Mendeley Data – operated by the scientific publishing house Elsevier,
- OSF – operated by the US-based Center for Open Science.
For a comparison of the features of the listed generalist repositories, see the Generalist Repository Comparison Chart document, which you can consult when choosing the repository best suited to your needs.
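The order of preference described in this section can be summarised as a small decision function. This is only one possible reading of the guidance, written as a Python sketch: the function name, its parameters and the returned labels are hypothetical, and it deliberately ignores funder or institutional requirements, which may override this order.

```python
def choose_repository_type(has_domain_repository, institutional_meets_needs, is_stem):
    """Return a suggested repository type, following the order in this guide.

    Assumption: each argument is a simple yes/no answer given by the researcher.
    """
    if has_domain_repository:
        # Domain-specific repositories are the first choice.
        return "domain-specific"
    if institutional_meets_needs:
        return "institutional"
    if is_stem:
        # CTK UL's recommendation for STEM researchers.
        return "4TU.ResearchData"
    return "generalist"


print(choose_repository_type(False, False, True))
```

For a STEM researcher with no suitable domain-specific or institutional option, the sketch suggests 4TU.ResearchData; for other fields it falls back to a generalist repository.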
Last update: 2 September 2022