Abstract

Data science is a new academic field that has received much attention in recent years. One reason for this is that our increasingly digitalized society generates more and more data in all areas of life and science, and we are urgently seeking ways to deal with this growth in data. In this paper, we investigate the academic roots of data science. We use Google Scholar data on scientists who declare an interest in data science, together with their citations, to perform a quantitative analysis of the data science community. Furthermore, to decompose the data science community into its major defining factors, which correspond to the most important research fields, we introduce a statistical regression model that is fully automatic and robust with respect to subsampling of the data. This statistical model allows us to define the ‘importance’ of a field as its predictive ability. Overall, our method provides an objective answer to the question ‘What is data science?’.

Highlights

  • From time to time, new scientific fields emerge as a consequence of adapting to a changing world

  • Examples of newly established academic disciplines are economics, computer science, bioinformatics and, most recently, data science [4,5,6]

  • Using Google Scholar data on scientists who declare a research interest in ‘data science’, we study various publication statistics that provide information about these scientists and the research fields they are interested in

Summary

Introduction

From time to time, new scientific fields emerge as a consequence of adapting to a changing world. Examples of newly established academic disciplines are economics (the first professorship in economics was established at the University of Cambridge in 1890 and held by Alfred Marshall [1]), computer science (the first department of computer science in the United States was established at Purdue University in 1962, whereas the term ‘computer science’ first appeared in [2]), bioinformatics (the term was first used by [3]) and, most recently, data science [4,5,6]. The first appearance of the term ‘data science’ is ascribed to Peter Naur in 1974 [7], but it took nearly 30 years until there were calls for an independent discipline with this name [8]. A commonality of all these papers is that they present a qualitative, descriptive list of attributes in an argumentative way.

