The Computer Science Ontology: A Comprehensive Automatically-Generated Taxonomy of Research Areas

Angelo A Salatino, Aliaksandr Birukou, Francesco Osborne, Enrico Motta, Thiviyan Thanapalasingam, Andrea Mannocci

doi:10.1162/dint_a_00055


Open Access

https://doi.org/10.1162/dint_a_00055


Abstract

Ontologies of research areas are important tools for characterizing, exploring, and analyzing the research landscape. Some fields of research are comprehensively described by large-scale taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. For instance, the ACM classification scheme contains only about 2K research topics and the last version dates back to 2012. In this paper, we introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 14K topics and 162K semantic relationships. It was created by applying the Klink-2 algorithm on a very large data set of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications. CSO powers several tools adopted by the editorial team at Springer Nature and has been used to enable a variety of solutions, such as classifying research publications, detecting research communities, and predicting research trends. To facilitate the uptake of CSO, we have also released the CSO Classifier, a tool for automatically classifying research papers, and the CSO Portal, a Web application that enables users to download, explore, and provide granular feedback on CSO. Users can use the portal to navigate and visualize sections of the ontology, rate topics and relationships, and suggest missing ones. The portal will support the publication of and access to regular new releases of CSO, with the aim of providing a comprehensive resource to the various research communities engaged with scholarly data.

Highlights

Ontologies have proved to be powerful solutions to represent domain knowledge, integrate data from different sources, and support a variety of semantic applications [1,2,3,4,5].
We have recently developed a new version of the Computer Science Ontology (CSO) Classifier [24], which uses a combination of linguistics and semantics to generate a more comprehensive set of topics, including topics that may not be explicitly mentioned in the metadata.
Augur [11] is an approach that aims to detect the emergence of new research areas by analysing topic networks and identifying clusters associated with a significant increase in the pace of collaboration.
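
The idea of returning topics that are not explicitly mentioned in the text can be illustrated with a small sketch. This is not the CSO Classifier itself: the toy taxonomy, the function names, and the plain string matching are all assumptions made for illustration. It matches topic labels in a piece of text and then adds every broader topic reachable through hypothetical "super topic" links.

```python
import re
from collections import defaultdict

# Hypothetical toy fragment of a topic taxonomy, NOT taken from CSO.
# Each pair reads: (broader topic, narrower topic).
SUPER_TOPIC_OF = [
    ("computer science", "artificial intelligence"),
    ("artificial intelligence", "machine learning"),
    ("machine learning", "neural networks"),
    ("computer science", "semantic web"),
    ("semantic web", "ontology"),
]


def build_parent_index(edges):
    """Map each narrower topic to the set of its direct broader topics."""
    parents = defaultdict(set)
    for broader, narrower in edges:
        parents[narrower].add(broader)
    return parents


def explicit_topics(text, topics):
    """Return the topics whose label appears verbatim in the text."""
    # Normalise to lowercase alphanumeric tokens separated by single spaces.
    padded = " " + " ".join(re.findall(r"[a-z0-9]+", text.lower())) + " "
    return {topic for topic in topics if f" {topic} " in padded}


def inferred_topics(found, parents):
    """Return the found topics plus all of their ancestors."""
    result = set(found)
    stack = list(found)
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def classify(text):
    """Return (explicitly mentioned topics, topics added by inference)."""
    parents = build_parent_index(SUPER_TOPIC_OF)
    topics = {topic for edge in SUPER_TOPIC_OF for topic in edge}
    found = explicit_topics(text, topics)
    return found, inferred_topics(found, parents) - found


explicit, inferred = classify(
    "We train neural networks to align an ontology with citation data."
)
print(sorted(explicit))
print(sorted(inferred))
```

In this sketch the broader topics are never named in the sentence; they are recovered only by walking the taxonomy upward from the matched labels. A real classifier would need more robust matching than exact label lookup.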

Summary

Introduction

Ontologies have proved to be powerful solutions to represent domain knowledge, integrate data from different sources, and support a variety of semantic applications [1,2,3,4,5]. Ontologies are often used to facilitate the integration of large datasets of research data [6], the exploration of the academic landscape [7], information extraction from scientific articles [8], and so on. Some fields of research are well described by large-scale and up-to-date taxonomies, e.g., MeSH in Biology and PhySH in Physics. Conversely, current Computer Science taxonomies are coarse-grained and tend to evolve slowly. The current version of the ACM classification scheme, containing only about 2K research topics, dates back to 2012, when it superseded its 1998 release.



Journal: Data Intelligence
Publication Date: Jul 1, 2020
Citations: 32
License type: CC BY
