Abstract

Maps of science representing the structure of science can help us understand science and technology (S&T) development. Studies have thus developed techniques for analyzing the relationships among research activities; however, inter-citation and co-citation analysis are difficult to apply to ongoing research projects and recently published papers. Therefore, to characterize what is currently being attempted in the scientific landscape, this paper proposes a new content-based method of locating research projects in a multi-dimensional space using recent word/paragraph embedding techniques. Specifically, to address the unclustered problem associated with the original paragraph vectors, we introduce paragraph vectors based on the information entropies of concepts in an S&T thesaurus. The experimental results show that the proposed method successfully formed a clustered map from 25,607 project descriptions of the 7th Framework Programme of the EU from 2006 to 2016 and 34,192 project descriptions of the National Science Foundation from 2012 to 2016.

Highlights

  • Price (1965) proposed studying science using scientific methods

  • To address the unclustered problem associated with the original paragraph vectors, we introduce paragraph vectors based on the information entropies of concepts in a science and technology (S&T) thesaurus

  • We constructed a funding map using paragraph vectors, a preliminary version of which was described in Kawamura et al (2017). In the map, nodes represent research projects, and the links between them reflect distances derived from content similarity

Summary

Introduction

Price (1965) proposed studying science using scientific methods. Since then, studies have developed techniques for analyzing and measuring the relationships among research activities and have constructed maps of science (Boyack et al 2005), a major topic in scientometrics, to provide a bird’s eye view of the scientific landscape. With such maps, research laboratories and universities that are organized according to the established disciplines can understand an organization’s environment. Such maps are also important to policy analysts and funding agencies. We analyze research projects using a content-based method built on recent natural language processing (NLP) techniques, namely word/paragraph embedding. We constructed a funding map using paragraph vectors, a preliminary version of which was described in Kawamura et al (2017). In the map, nodes represent research projects, and the links between them reflect distances derived from content similarity. The main contribution of this paper is the construction of a content-based map characterizing what is being attempted in research projects based on the latest NLP techniques. Experiments and evaluations are described in the “Experiments and evaluation” section, and conclusions and suggestions for future work are provided in the “Conclusion and future work” section.
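
To make the idea of a content-based map concrete, the following is a minimal sketch of how projects could be linked by content similarity once each project description has a vector. It is not the paper's method: the entropy-based paragraph vectors are not reproduced here, and the toy vectors, the function names `cosine_similarity` and `build_similarity_map`, the use of cosine similarity, and the threshold value are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_similarity_map(project_vectors, threshold=0.8):
    """Return edges (id_a, id_b, distance) for project pairs whose
    cosine similarity is at least `threshold` (a hypothetical cutoff).
    The distance is defined here as 1 - similarity."""
    edges = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(sorted(project_vectors.items()), 2):
        similarity = cosine_similarity(vec_a, vec_b)
        if similarity >= threshold:
            edges.append((id_a, id_b, 1.0 - similarity))
    return edges


# Toy vectors standing in for paragraph vectors of project descriptions (assumed data).
toy_vectors = {
    "project_a": [1.0, 0.0, 0.0],
    "project_b": [0.9, 0.1, 0.0],
    "project_c": [0.0, 0.0, 1.0],
}

edges = build_similarity_map(toy_vectors, threshold=0.8)
for id_a, id_b, distance in edges:
    print(f"{id_a} -- {id_b} (distance {distance:.4f})")
```

In this toy example only `project_a` and `project_b` are linked, since `project_c` is orthogonal to both. Comparing all pairs is quadratic in the number of projects, so a collection of tens of thousands of descriptions would likely need an approximate nearest-neighbor search instead.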

Related work
Experiments and evaluation
Method Strength
Findings
Conclusion and future work