Since the 1990s, calls for transparency about the origin and use of third-party funds have entered debates on university autonomy and governance in many European countries. They are closely interwoven with initiatives to intensify the documentation and evaluation of publicly funded research at universities and non-university research institutions. The desire for more meaningful performance measurement and resource allocation has steadily increased the demands placed on reporting as a management and control instrument for quality management at these institutions. However, the use of highly condensed quantitative indicators to evaluate complex social relationships has been viewed critically for a number of years.
What is needed is a consistent, quality-assured and complete database that can do justice to the different reporting purposes, recipients and reporting cycles. The process initiated by the Science Council to develop a research core data set (Kerndatensatz Forschung, KDSF) represents an important step towards standardizing research reporting. However, implementing the KDSF recommendations presents the institutions concerned with the challenge of making their existing reporting more transparent and more efficient. The required data must be collected in high quality, consolidated and prepared for the specific reporting requirements.
Automating data preparation brings its own challenges, since the available internal and publicly accessible data sources often do not provide the data required for research reporting in sufficient quality and precision. A major problem is the disambiguation of entities of the types person and organization.
One approach to meeting this challenge is to use persistent identifiers to represent the various entity types: person, organization and publication. Ideally, each entity is assigned exactly one persistent identifier and is linked to other entities via relationships described in ontologies.
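To illustrate why a persistent identifier helps with disambiguation, the following minimal sketch groups raw author records by ORCID iD. The record layout (`name`, `orcid`) and the function name are assumptions made for this example and do not come from the project. The sample data uses the fictitious test person "Josiah Carberry".

```python
from collections import defaultdict


def group_by_identifier(records):
    """Group raw author records by ORCID iD.

    Records without an identifier cannot be resolved reliably and are
    returned separately so they can be reviewed manually.
    """
    resolved = defaultdict(list)
    unresolved = []
    for record in records:
        orcid = record.get("orcid")
        if orcid:
            resolved[orcid].append(record["name"])
        else:
            unresolved.append(record["name"])
    return dict(resolved), unresolved


# Hypothetical sample records: three spellings, two of them carrying an iD.
records = [
    {"name": "J. Carberry", "orcid": "0000-0002-1825-0097"},
    {"name": "Carberry, Josiah", "orcid": "0000-0002-1825-0097"},
    {"name": "Josiah Carberry", "orcid": None},
]

resolved, unresolved = group_by_identifier(records)
```

The two name variants that share an identifier collapse into one entity, while the record without an identifier remains ambiguous. That remaining ambiguity is the reason for aiming at exactly one identifier per entity.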
To make the solution reusable for as wide a range of users as possible, proprietary data sources are explicitly excluded at this point. Instead, the interfaces to databases of initiatives from the open science movement are to be used, such as:
- ORCID (ORCID iD to identify authors),
- Crossref (DOI to identify publications),
- ROR (ROR ID to identify organizations) or
- aggregators such as FREYA, OpenAIRE and DataCite Commons
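Before identifiers from such sources are used for linking, it can be worth checking them locally. As a sketch only, the following function validates the check character of an ORCID iD, which ORCID documents as ISO 7064 MOD 11-2. The function names are chosen for this example, and the check says nothing about whether the iD is actually registered.

```python
import re

# Four groups of four characters; the last character may be the check value "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has ORCID iD syntax and a correct check character."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]
```

A check like this catches typing and transcription errors early, before a request is sent to an external interface.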
(Partly) automated reporting
Another challenge is the (partially) automated creation of standard reports, for whose transmission no technical standard has yet been established. Building on established open source software, this project will develop a reporting component for VIVO, the free, linked-data-based research information system, that outputs predefined standard reports in various target formats.
There should be a strong focus on using the definitions of the research core data set (KDSF) and on reporting requirements such as the Lower Saxony guidelines on transparency in research. Output should be possible as KDSF-XML or as reports based on configurable Word templates.
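How an XML export of a predefined report might be assembled can be sketched with Python's standard library. This is an illustration under assumptions: the element and attribute names (`Report`, `Organization`, `Publication` and so on) are placeholders and are not taken from the KDSF-XML schema, and the identifiers in the sample data are dummy values.

```python
import xml.etree.ElementTree as ET


def build_report_xml(organization_id, publications):
    """Serialize a list of publication records into a simple XML report.

    NOTE: all tag and attribute names are hypothetical placeholders,
    not the element names defined by the KDSF specification.
    """
    root = ET.Element("Report")
    ET.SubElement(root, "Organization", id=organization_id)
    pubs = ET.SubElement(root, "Publications")
    for pub in publications:
        item = ET.SubElement(pubs, "Publication", doi=pub["doi"])
        ET.SubElement(item, "Title").text = pub["title"]
        ET.SubElement(item, "Year").text = str(pub["year"])
    return ET.tostring(root, encoding="unicode")


# Dummy input; the identifiers below are placeholders, not real records.
report_xml = build_report_xml(
    "https://ror.org/EXAMPLE",
    [{"doi": "10.1234/example", "title": "Sample title", "year": 2020}],
)
```

Keeping the report data in a neutral structure and serializing it in a separate step would allow the same data to feed other target formats, such as a Word template, without changing how the data is collected.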