Clustering of Distributions: A Case of Patent Citations

Nataša Kejžar, Simona Korenjak-Černe, Vladimir Batagelj

doi:10.1007/s00357-011-9084-x

Abstract

Data units are often described by discrete distributions (for example, a work described by its citation distribution over time, or a population pyramid described as an age-sex distribution). When the set of such units is very large, appropriate clustering methods can reveal the typical patterns hidden in the data. In this paper we present an adapted leaders method combined with a compatible adapted agglomerative hierarchical method; both are based on a relative error measure between a unit and the corresponding cluster representative (the leader). The proposed approach is illustrated on citation distributions derived from the data set of US patents from 1980 to 1999. These new methods were developed because clustering units described by distributions with the classical k-means method reveals patterns with single high peaks, each corresponding to a single year, and these patterns prevail over other distribution shapes also present in the data. Compared with the centers in the k-means method, the cluster representatives obtained with the proposed methods better detect the typical distribution shapes of units. The main cluster types obtained for different sets of units show three main patterns: patents with an early peak of importance to the community, patents with a late peak, and patents whose importance increases slowly throughout the time period.
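The abstract does not state the exact form of the relative error measure, so the following is only a minimal sketch of a leaders-type method under an assumed measure: the squared difference between unit and leader, divided by the squared leader component, summed over components. For that assumed measure the optimal leader component works out to sum(x_i^2) / sum(x_i) over the cluster members. The names `rel_error`, `update_leader`, `leaders_method`, and the guard `EPS` are illustrative and do not come from the paper.

```python
EPS = 1e-9  # assumption of this sketch: guards against division by a zero leader component


def rel_error(x, t):
    """Assumed dissimilarity: squared error of unit x relative to leader t."""
    return sum(((xi - ti) / max(ti, EPS)) ** 2 for xi, ti in zip(x, t))


def update_leader(members):
    """Leader minimising the summed rel_error: t_i = sum(x_i^2) / sum(x_i)."""
    leader = []
    for i in range(len(members[0])):
        s1 = sum(x[i] for x in members)
        s2 = sum(x[i] ** 2 for x in members)
        leader.append(s2 / s1 if s1 > 0 else 0.0)
    return leader


def leaders_method(units, initial_leaders, max_iter=100):
    """Alternate between assigning units to the nearest leader and updating leaders."""
    leaders = [list(t) for t in initial_leaders]
    k = len(leaders)
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(k), key=lambda c: rel_error(x, leaders[c])) for x in units
        ]
        if new_assign == assign:  # no unit changed cluster: converged
            break
        assign = new_assign
        for c in range(k):
            members = [x for x, a in zip(units, assign) if a == c]
            if members:  # keep the old leader if a cluster becomes empty
                leaders[c] = update_leader(members)
    return assign, leaders


# Toy citation distributions over three periods: two early-peak, two late-peak units.
units = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]
assign, leaders = leaders_method(units, initial_leaders=[units[0], units[2]])
print(assign)  # [0, 0, 1, 1]
```

In this sketch the initial leaders are passed in explicitly so the result is reproducible; like k-means, a leaders-type method can settle in different local optima for different starting leaders. Under the assumed measure a leader need not sum to 1, so it acts as a cluster representative and not as a distribution in its own right.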

Journal: Journal of Classification
Publication Date: Jun 18, 2011
Citations: 7

Similar Papers

Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal
Michael J Stringer ... Luís A Nunes Amaral
Journal of the American Society for Information Science and Technology | VOL. 61
14 Jun 2010

Microstructure and micromechanics of polydisperse granular materials: Effect of the shape of particle size distribution
Joanna Wiącek ... Marek Molenda
Powder Technology | VOL. 268
23 Aug 2014

Clustering large data sets described with discrete distributions and its application on TIMSS data set
Simona Korenjak‐Černe ... Vladimir Batagelj
Statistical Analysis and Data Mining: The ASA Data Science Journal | VOL. 4
13 Jan 2011

Markov chain correlation based clustering of gene expression data
Youping Deng ... Chaoyang Zhang
01 Jan 2004
