An Automatic K-Means Clustering Algorithm of GPS Data Combining a Novel Niche Genetic Algorithm with Noise and Density

Xiangbing Zhou, Huaming Gong, Hongjiang Ma, Shaopeng Shen, Jianggang Gu, Fang Miao, Hua Zhang

doi:10.3390/ijgi6120392

Abstract

Rapidly growing Global Positioning System (GPS) data play an important role in trajectories and their applications (e.g., on GPS-enabled smart devices). K-means can be used to mine the origins and destinations (OD) behind GPS data, but it suffers from slow convergence, sensitivity to the selection of initial seeds, and a tendency to get stuck in a local optimum. To overcome these shortcomings, this paper proposes a novel niche genetic algorithm (NGA) with density and noise for K-means clustering (NoiseClust). In NoiseClust, an improved noise method and K-means++ are used to produce the initial population and capture higher-quality seeds, which makes it possible to determine the proper number of clusters automatically and to handle genes of different sizes and shapes. A density-based method is presented to determine the number of niches, with the aim of maintaining population diversity. Adaptive probabilities of crossover and mutation are also employed to prevent convergence to a local optimum. Finally, the centers (the best chromosome) are obtained and fed into K-means as initial seeds, which generates even higher-quality clustering results by allowing the initial seeds to readjust as needed. Experimental results on four taxi GPS data sets demonstrate that NoiseClust is effective and performs well, and that it can readily mine the city's situation from these data sets.
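The final step described in the abstract, handing the best chromosome's centers to K-means as initial seeds, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `squared_distance` and `kmeans_from_seeds` are hypothetical, points are treated as planar coordinates with plain Euclidean distance (an assumption; real GPS data may call for a geodesic distance), and the seeds are simply supplied by the caller in place of an NGA result.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_from_seeds(points, seeds, max_iter=100):
    """Run standard K-means (Lloyd) iterations starting from the given seeds.

    Returns (centers, labels), where labels come from the last assignment step.
    The seeds stand in for the centers encoded by the best chromosome
    (assumption: they are passed in by the caller).
    """
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(len(old)))
                )
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(old)
        if new_centers == centers:
            break  # centers stopped moving, so the seeds have finished readjusting
        centers = new_centers
    return centers, labels


# Toy data: two well-separated groups and two rough seeds.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans_from_seeds(points, seeds=[(1, 1), (9, 9)])
```

Because the seeds only fix the starting point, K-means is still free to move them, which is the readjustment the abstract refers to.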

Highlights

With the prevalence of smart Global Positioning System (GPS) devices with positioning ability, a large amount of GPS-based data and trajectories is available.
To test the performance of the NoiseClust algorithm, experiments are conducted on real-world taxi GPS data sets [35]; the results show that NoiseClust is more effective and performs better than GenClust [16] and Genetic algorithm K-means (GAK) [42].
NoiseClust uses the proposed niche genetic algorithm (NGA) with noise and density to avoid getting stuck in a local optimum while achieving high-quality clustering results for taxi GPS data.
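As an illustration of how adaptive crossover and mutation probabilities can help a genetic algorithm avoid a local optimum, the sketch below uses a generic fitness-scaled rule. This is an assumption for illustration only: this page does not give the paper's own formulas, and the function name `adaptive_rates` and the ceilings `pc_max` and `pm_max` are hypothetical.

```python
def adaptive_rates(fitness, f_max, f_avg, pc_max=0.9, pm_max=0.1):
    """Return (crossover_prob, mutation_prob) for one individual.

    Assumes fitness is maximized. Individuals at or above the population
    average get rates scaled down toward zero as they approach the best
    fitness, which protects good solutions. Below-average individuals, and
    every individual in a population whose average has reached the maximum,
    get the full rates, which keeps the search exploring.
    """
    if f_max <= f_avg or fitness < f_avg:
        return pc_max, pm_max
    scale = (f_max - fitness) / (f_max - f_avg)
    return pc_max * scale, pm_max * scale


# Example population statistics: best fitness 10.0, average fitness 5.0.
best = adaptive_rates(10.0, f_max=10.0, f_avg=5.0)   # best individual is left intact
weak = adaptive_rates(3.0, f_max=10.0, f_avg=5.0)    # weak individual is disrupted most
```

The design choice here is that disruption is concentrated on weaker chromosomes, so diversity is reintroduced without destroying the current best solution.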

Summary

Introduction

With the prevalence of smart Global Positioning System (GPS) devices with positioning ability, a large amount of GPS-based data and trajectories is available. The key element of the applications built on these data is location (based on GPS), which is needed to mine the hidden information and understand the meaning of the trajectories, instead of only treating a trajectory as a combination of recorded GPS data points. In these application domains, techniques for mining trajectory patterns and frequent trajectory routes are very important [1]. The patterns have usually been described in terms such as origins and destinations (OD) [2,3,4], stops and moves [5,6], and moving objects [7,8], and a large number of clustering algorithms have been used to mine these patterns and produce clustering results. GGA (Group genetic algorithm) [21] presented a GA-based clustering algorithm with a new grouping method for the initial population. These GAs with K-means can lose population diversity because of global optimization problems and weak exploitation capabilities, and the gene sizes of the chromosomes must be equal in AGCUK (Automatic genetic clustering for unknown K) and GAGR. In the GGA algorithm, the number of clusters requires user input, but the gene sizes need not be equal.

Journal: ISPRS International Journal of Geo-Information	Publication Date: Dec 1, 2017
Citations: 22	License type: CC BY 4.0
