Identification of data mining research frontier based on conference papers

Yue Huang, Hu Liu, Jing Pan

doi:10.1108/ijcs-01-2021-0001

Abstract

Purpose: Identifying the frontiers of a research field is one of the most basic tasks in bibliometrics. Research published in leading conferences is crucial to the data mining community, yet few studies have focused on it. The purpose of this study is to detect the intellectual structure of data mining based on conference papers.

Design/methodology/approach: This study takes as its sample the papers from the top nine conferences in the data mining field, as ranked by Google Scholar Metrics. Based on publication counts, it first examines the annual output of papers and their distribution across conferences. Then, from the perspective of keywords, CiteSpace is used to analyze the conference papers and identify the frontiers of data mining, focusing on keyword term frequency, keyword betweenness centrality, keyword clustering and burst keywords.

Findings: Research interest in data mining showed a linear upward trend between 2007 and 2016. The frontier identification based on the conference papers revealed five research hotspots in data mining: clustering, classification, recommendation, social network analysis and community detection. The conference papers also covered a rich range of research content.

Originality/value: This study detected the research frontier from leading data mining conference papers. Based on the keyword co-occurrence network, it identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016 along four dimensions: keyword term frequency, betweenness centrality, clustering analysis and burst analysis.

Highlights

In the era of "Internet+," big data have become the focus of attention
Whereas previous work on research frontier identification started from journal papers, this paper uses conference papers as the object of analysis
Based on the keyword co-occurrence network, this paper identified and analyzed the research frontiers of the data mining discipline from 2007 to 2016 along four dimensions: keyword term frequency, betweenness centrality, clustering analysis and burst analysis
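
To make two of the four dimensions concrete, the following is a minimal sketch of how keyword term frequency and betweenness centrality could be computed on a keyword co-occurrence network. It is not CiteSpace's implementation and not the authors' code: the toy keyword lists, the function names and the choice of unnormalized, unweighted betweenness (computed with Brandes' algorithm) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque
from itertools import combinations

# Hypothetical toy data: one keyword list per paper (not taken from the study).
papers = [
    ["clustering", "classification"],
    ["clustering", "recommendation"],
    ["clustering", "social network analysis"],
    ["social network analysis", "community detection"],
]

def keyword_frequency(papers):
    """Count the number of papers in which each keyword appears."""
    return Counter(kw for kws in papers for kw in set(kws))

def cooccurrence_graph(papers):
    """Undirected graph: keywords are nodes, linked if they share a paper."""
    graph = defaultdict(set)
    for kws in papers:
        for a, b in combinations(sorted(set(kws)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths
        dist = dict.fromkeys(graph, -1)   # -1 marks an unvisited node
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse breadth-first order.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}

freq = keyword_frequency(papers)
graph = cooccurrence_graph(papers)
bc = betweenness(graph)
for kw in sorted(bc, key=bc.get, reverse=True):
    print(kw, freq[kw], bc[kw])
```

On this toy input, "clustering" is both the most frequent keyword and the one with the highest betweenness, because every shortest path between the other keywords on its side of the network passes through it. A keyword with modest frequency can still score high on betweenness if it bridges otherwise separate groups, which is why the two dimensions are examined separately.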

Summary

Introduction

In the era of "Internet+," big data have become the focus of attention. Mining and making use of this massive volume of information is the reason scientists study big data. Data mining refers to an engineered and systematic process of mining implicit, previously unknown but potentially useful information and patterns from large amounts of data. It can be said to provide new ideas and methods for scientific research. In recent years, the research output of the data mining discipline has kept increasing, and the volume published in well-known international journals and conferences continues to grow. To investigate the research trends of data mining in a more comprehensive and detailed manner, this paper examines the data mining discipline based on the papers of authoritative international conferences.

Objectives

Methods

Conclusion