Documentation

Structured description of research contributions

With ORKG you can describe the research contributions traditionally reported in scholarly articles in a structured and semantic manner. This is done by adding a paper to ORKG. First, you include basic metadata about the paper (e.g., title and authors), which can be looked up by DOI. The main step uses a specialized editor for the structured description of research contributions. Typically, a research contribution should describe the addressed problem, the utilized materials and methods, and the obtained results. For instance, a research contribution could describe the p-value resulting from a statistical hypothesis test addressing a specific research problem. Naturally, a paper can describe multiple research contributions. The following figure exemplifies entering a structured description of a research contribution, with the effect of ozone on plant growth as the research problem and information about a conducted statistical hypothesis test comparing the mean height of plants exposed to elevated ozone and plants exposed to ambient ozone levels. Existing papers and their research contributions can be edited to modify existing information or include new information. Hence, when you add a paper you can describe the essential information and return at any time to improve the description. Indeed, others can chime in and improve the description of the paper you added.
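Papers can also be added programmatically. The sketch below is illustrative only: it assumes the `orkg` Python client's `papers.add` method and a simplified contribution payload; the exact schema (field names, resource and predicate identifiers) and authentication details should be checked against the current ORKG documentation.

```python
from orkg import ORKG  # the ORKG Python client

# Connect to an ORKG instance; write access requires credentials.
orkg = ORKG(host="https://orkg.org", creds=("user@example.org", "password"))

# A simplified paper payload: basic metadata plus one research contribution.
# Field names, property labels, and IDs are placeholders for illustration;
# consult the ORKG documentation for the schema your instance expects.
paper = {
    "paper": {
        "title": "Effect of elevated ozone on plant growth",
        "doi": "10.1234/example-doi",
        "authors": [{"label": "Jane Doe"}],
        "researchField": "R11",  # ID of an existing research field resource
        "contributions": [
            {
                "name": "Contribution 1",
                "values": {
                    "research problem": [{"label": "Effect of ozone on plant growth"}],
                    "has result": [{"text": "p-value = 0.03"}],
                },
            }
        ],
    }
}

response = orkg.papers.add(params=paper)
print(response.content)  # the created paper resource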

Templates

The structured description of research contributions is no easy task; describing scholarly information is complicated. You need to decide at what level of granularity you want to describe a research contribution, the addressed problem, its results, and the employed materials and methods. Also, research contributions addressing the same problem should be described in a comparable manner. Finally, while for humans it is best to capture only essential information, for machines shallow structures typically carry little, ambiguous, or no semantics. To address these issues, ORKG supports creating templates that specify the structure of content types and using these templates when describing research contributions. The following figure shows the specification of the attributes of a process that estimates the basic reproduction number of a population using a specific method.
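To make the idea concrete, the snippet below sketches what a template-conformant contribution could look like as plain data. The property names and values are illustrative and not taken from an actual ORKG template; a real template fixes the exact properties, their value types, and their cardinalities.

```python
# Illustrative only: a contribution shaped by a hypothetical
# "basic reproduction number estimate" template. Property names and
# values are placeholders, not an actual ORKG template or dataset.
r0_estimate = {
    "research problem": "COVID-19 basic reproduction number",
    "has value": 2.7,
    "confidence interval (95%)": {"lower": 2.5, "upper": 2.9},
    "location": "Example City",
    "time period": {"start": "2019-12-01", "end": "2020-01-31"},
    "estimation method": "Exponential growth model",
}

# Because every estimate described with the same template has the same
# structure, estimates from different papers can be compared automatically.
```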

Comparisons

Given described papers and their research contributions, it is possible to compare the contributions addressing a specific problem across the scholarly literature. Given structured and comparable descriptions, we can leverage algorithms to automatically create such ORKG comparisons. A classic example in computer science is a comparison of the best/average/worst case performance of sorting algorithms. Another is the precision and recall of algorithms for vehicle recognition in images. In virology, we may want to compare the basic reproduction number of a particular virus. In materials science, we may want to compare the solubility parameters of compounds. The following figure shows a comparison of the COVID-19 basic reproduction number as published in numerous papers, specifically the value and its 95% confidence interval as well as the location and period of the infectious agent population. Such comparisons provide us with an overview of key information about a research problem across dozens or hundreds of papers and are thus a valuable instrument that can help discover, for instance, the leading sorting algorithm or how dangerous a virus is relative to other viruses.
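Conceptually, a comparison is a table with one column per contribution (or paper) and one row per shared property. The sketch below builds such a table with pandas for the sorting algorithm example; it only illustrates the shape of a comparison, not actual ORKG output, and the cell values are textbook complexity classes rather than retrieved data.

```python
import pandas as pd

# Illustrative only: a comparison with one column per contribution and
# one row per compared property, mirroring the layout of an ORKG comparison.
comparison = pd.DataFrame(
    {
        "Quicksort":   {"best case": "O(n log n)", "average case": "O(n log n)", "worst case": "O(n^2)"},
        "Merge sort":  {"best case": "O(n log n)", "average case": "O(n log n)", "worst case": "O(n log n)"},
        "Bubble sort": {"best case": "O(n)",       "average case": "O(n^2)",     "worst case": "O(n^2)"},
    }
)

print(comparison)
```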

Graph visualization

Above we have seen how research contribution data is shown in what we call the statement browser. Since ORKG is a knowledge graph, paper and research contribution descriptions can also be visualized as a graph. The graph view thus provides an alternative and complementary way to interact with ORKG content. It is a user interface for the visual exploration of graph data and includes several features that make exploring highly structured graph data intuitive: the graph is automatically laid out on screen, nodes can easily be expanded, collapsed, or removed, and users can search for information in the graph.
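Underneath, ORKG content is a set of subject-predicate-object statements, so it can also be handled with ordinary graph tooling outside the browser. The sketch below uses made-up statements (not real ORKG data) and assumes the networkx library to show how such statements map onto a labeled directed graph, similar to what the graph view renders.

```python
import networkx as nx

# Illustrative statements (subject, predicate, object); not real ORKG data.
statements = [
    ("Paper: Ozone and plant growth", "has contribution", "Contribution 1"),
    ("Contribution 1", "addresses problem", "Effect of ozone on plant growth"),
    ("Contribution 1", "employs method", "Two-sample t-test"),
    ("Contribution 1", "has result", "p-value = 0.03"),
]

# Build a directed graph with the predicate stored as an edge label,
# mirroring how the ORKG graph view displays statements.
graph = nx.DiGraph()
for subject, predicate, obj in statements:
    graph.add_edge(subject, obj, label=predicate)

print(graph.number_of_nodes(), "nodes,", graph.number_of_edges(), "edges")
for subject, obj, data in graph.edges(data=True):
    print(f"{subject} --[{data['label']}]--> {obj}")
```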

Observatories

While we try to automate as much as can be automated, the ORKG strongly relies on expert content curation and knowledge organization. In order to pool disciplinary expertise, we developed ORKG Observatories. Observatories are groups of experts (e.g., senior researchers) affiliated with different institutions that curate and organize ORKG content for a specific discipline, and within that typically a specific research area or even a single research problem. Since such knowledge curation and organization is time-consuming, ORKG acknowledges the contributions of experts as well as the observatories and institutions they are members of. The following figure shows how we implement this acknowledgement as provenance information for the description of research contributions. Observatories and their experts can contribute in numerous ways to ORKG. In addition to adding and describing papers or curating existing papers, observatories play a crucial role in knowledge organization for a particular research area. In ORKG, observatories can, for instance, specify templates for the information types that are most relevant to their area of research. In doing so, observatories help ensure the creation of high-quality and comparable structured scholarly knowledge for their area.

Data Science

The structured and semantic description of scholarly knowledge enables easier reuse of knowledge. The comparisons described above are only one way of reusing knowledge. In science, knowledge found in the literature is reused in innumerable ways. To support this diversity, the ORKG implements a web-based interface (REST API), which can be used, for instance, from Python to access and process ORKG content. For Python, we also provide a specialized ORKG library, which further simplifies accessing ORKG content. With this you can load ORKG content (individual contribution descriptions and comparisons) into a data analysis environment such as Jupyter Notebooks and into language-native data structures such as pandas DataFrames to further process data and create domain-specific applications for data visualization, interpolation, modeling, simulation, etc.
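As a starting point, the sketch below queries the REST API with the requests library to look up resources by label. The endpoint path, query parameter, and response layout follow the publicly documented API at the time of writing and may change; the search term is just an example, so treat this as a sketch rather than a definitive client.

```python
import requests

# Search ORKG resources by label via the REST API. The endpoint path and
# response fields are based on the public API documentation at the time of
# writing; check https://orkg.org/api for the current version.
response = requests.get(
    "https://orkg.org/api/resources/",
    params={"q": "basic reproduction number"},
    headers={"Accept": "application/json"},
    timeout=30,
)
response.raise_for_status()

# Paginated responses typically wrap the results in a "content" field.
for resource in response.json().get("content", []):
    print(resource["id"], resource["label"])
```

From here, the returned identifiers can be fed into further API calls or into the ORKG Python library to pull full contribution descriptions or comparisons into a pandas DataFrame inside a Jupyter Notebook.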