Licensing Research Data

To enable reuse of research data, it is imperative to assign an appropriate license to the data and metadata. By convention, research data are licensed under open licenses, such as Creative Commons. As a rule, data are licensed under CC BY and metadata under CC0. Creative Commons licenses are suitable for licensing all types of research data except software code, for which dedicated open licenses are used, such as GNU and Apache. The use of bespoke licenses to license open research data and metadata is discouraged.
Creative Commons Licenses
Creative Commons (Slovenian Ustvarjalna gmajna) is an American non-profit organization that develops the system of free, standardized copyright licenses by which authors grant the public the right to share and use their creative work on terms they set. In science, Creative Commons licenses were first implemented to license articles published in green or gold open access. Their use has expanded to include research data after the open sharing of research data and metadata became mandatory in the European Union.
In case you might be wondering: Creative Commons licenses are not a decoration, but an internationally valid copyright tool for licensing various scientific results, which helps authors reserve certain rights that can be recovered in court in case of violations. Creative Commons maintains a record of such court decisions.
The Spectrum of Creative Commons Licenses and Their Meaning
Creative Commons licenses cover the full spectrum of openness, from the fully public domain (CC0) to the strictest CC BY-NC-ND (Attribution-NonCommercial-NoDerivatives) license, which is the last level before all copyrights are withheld. Licenses are created by combining standard parts with a precisely defined meaning to obtain licenses with increasingly complex conditions of reuse.
1. CC0: “No Rights Reserved”
With this license, the authors waive all rights and place their work entirely in the public domain, so that others may use, improve, and modify it without restrictions of copyright or database rules. Some examples of institutional use of this license are described here .
CC0 is most appropriate for licensing metadata, as metadata are mostly factual information for which no copyright can be claimed (e.g., the fact that you took the measurement at a temperature of 25 °C is not a copyrighted creative work). Plan S also defines CC0 as an acceptable license for licensing research data that is of broader societal importance. The exception are very specific cases when metadata could be used to uncover information that would, for example, jeopardize the intellectual property protection process or the safety of persons, threatened areas, groups, species and sensitive data. Such cases are subject to the principles of eligible exemptions from openness.
2. CC BY: Attribution
All other Creative Commons licenses require users of licensed works to credit their authors under the terms they require, but not in a way that suggests the authors are promoting the users or their work. The CC BY license enables sharing (copying and distribution) of content in any medium and form, as well as processing (editing, remixing and incorporating content into one's own works) for all purposes, including commercial ones. When reusing licensed content, users must indicate the authors, a link to the license and mark changes, if any. If users wish to use the work without attribution or for self-promotional purposes, they must obtain separate permission from the authors.
CC BY License - Attribution 4.0 International (CC BY 4.0) is the default license for open access research publication and research data licensing, except for eligible exceptions. It is based on the paradigm that research results, which are the result of public funding, are a public good. The wider society therefore has the right to use them, regardless of whether it is for commercial or non-commercial use. However, they are still an author's work, so the authors have the right to be cited.
3. CC BY-SA: Attribution-ShareAlike
Authors permit users to copy, distribute, display and modify their work, as long as the distribution and modification take place under the same conditions and with reference to the original authorship. If users wish to distribute modified works under other conditions, they must obtain separate permission from the authors. Commercial use is permitted.
The funders affiliated with cOAlition S recognize the CC BY-SA license as an acceptable alternative to the default CC BY license for licensing research data, articles and other research outputs.
4. CC BY-NC: Attribution-NonCommercial
Authors permit users to copy, distribute, display, and modify their work for any purpose other than commercial, and users must credit the authors. To use the work for commercial purposes, separate permission must be obtained from the authors. This license is not acceptable under Plan S.
5. CC BY-ND: Attribution-NoDerivatives
Authors permit users to copy, distribute, and display only original copies of their work with attribution. If users wish to modify the work, they must obtain separate permission from the authors. Commercial use is permitted.
Authors who received funding from funders affiliated with cOAlition S can request to use the CC BY-ND license, but their request must be substantiated and approved by the funder. CC BY-ND is most often used in the social sciences and humanities, where the meaning of original scientific works may be distorted due to translation, inconsistent citation or superficial interpretation.
6. CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
Authors permit users to copy, distribute, display and modify their work for non-commercial purposes, as long as the distribution and modification take place under the same conditions and with reference to the original authorship. If users want to use the work for commercial purposes, they must obtain separate permission from the authors. This license is not acceptable under Plan S.
7. CC BY-NC-ND: Attribution-NonCommercial-NoDerivatives
Authors permit users to copy, distribute and display only original copies of their work for non-commercial purposes and with attribution. If users wish to modify the work or use it for commercial purposes, they must obtain separate permission from the authors. This license is not acceptable under Plan S.
Open Licenses for Software
Here, we will briefly summarize the guidelines for publishing software code prepared by the University of Reading. For more information, we recommend reading the source document.
As collaborative development, reuse, and dissemination of knowledge are encouraged in the scientific community, the use of open licenses for software is also encouraged. These are not necessarily incompatible with commercial use. Dedicated Open Source licenses are recommended for substantial original software, whereas it is more convenient to combine short code snippets (e.g., a statistical analysis script written in R) with the related data set and archive the bundle in a repository under a general open license such as Creative Commons.
Open licenses for software are divided into:
- permissive licenses, which allow mostly unrestricted reuse and re-licensing of the code with reference to the original authorship,
- and copyleft licenses, which require that any adaptations of the code be shared on the same terms, effectively limiting its commercial use.
If you decide to use open licenses to license your software, it is recommended that you use one recognized by the Open Source Initiative. The online tools Choose a License and tl;drLegal may aid you in choosing the best license for your needs.
Permissive Licenses
The most commonly used permissive software licenses are Apache 2.0 and MIT. Apache 2.0 defines certain aspects of reuse more precisely than MIT. For example, it licenses user contributions to the code under the same conditions by default, therefore the written consent of the original authors for the publication of the derivative or upgraded code under the same license is not required. Apache 2.0 also contains a clause whereby anyone who contributes to the code grants the licensor the right to a patent. These two licenses are probably the simplest choice for most research software, such as scripts and code snippets, where the main purpose of the licensing is to make the code as accessible and reusable as possible.
Copyleft Licenses
Copyleft licenses are sometimes further divided into "strong" copyleft and "weak" copyleft licenses. The distinction makes sense particularly in the case of composite software, where the new code has been integrated into the existing, already licensed one. “Strong” copyleft licenses require that the combined code be licensed in its entirety under the copyleft terms under which the source code was licensed. On the other hand, "weak" copyleft licenses refer only to the already licensed part and its derivatives, and do not apply to the new part that was included in the already licensed code. A commonly used “strong” copyleft license is the GNU General Public License 3.0, and the commonly used "weak" copyleft licenses are Mozilla Public License 2.0 and GNU Lesser General Public License 3.0.
Dual Licensing
Software can be released simultaneously under both an open license (usually the GNU General Public License 3.0) and a proprietary license. Dual licensing is particularly appropriate for software with commercial potential, with an open license encouraging code development and reuse, wheres a proprietary license allows the source version of the software and plug-ins to be sold and incorporated into other commercial applications. Examples of dual-licensed software are MySQL and FFTW. Dual licensing is rarely appropriate for academic works unless they have significant market potential.
Nazadnje spremenjeno: 25. 8. 2022