Abstract

As cities grow, traffic congestion has become a nearly unavoidable problem for almost every large city. Road planning is an effective means of alleviating urban congestion; it is a classical non-deterministic polynomial time (NP)-hard problem and has become an important research topic in recent years. K-means is an iterative clustering algorithm that scholars have regarded for several decades as an effective tool for urban road planning problems; however, it is very difficult to determine the number of clusters in advance, and the algorithm is sensitive to the initialization of the cluster centers. To solve these problems, this paper develops a novel K-means clustering algorithm based on a noise algorithm to capture urban hotspots. The noise algorithm is employed to randomly enhance the attribution of data points and the output of clustering results by adding a noise judgment, so that the number of clusters for the given data is obtained automatically and the cluster centers are initialized. Four unsupervised evaluation indexes, namely DB, PBM, SC, and SSE, are used directly to evaluate and analyze the clustering results, and the nonparametric Wilcoxon statistical test is employed to verify the distribution states and differences between clustering results. Finally, five taxi GPS datasets from Aracaju (Brazil), San Francisco (USA), Rome (Italy), Chongqing (China), and Beijing (China) are used to test and verify the effectiveness of the proposed noise K-means clustering algorithm by comparing it with the fuzzy C-means, K-means, and K-means plus approaches. The comparative experimental results show that the noise algorithm can reasonably obtain the number of clusters and initialize the cluster centers, and that the proposed noise K-means clustering algorithm achieves better clustering performance, obtains accurate clustering results, and effectively captures urban hotspots.
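To make the baseline and two of the evaluation indexes concrete, the following is a minimal sketch of plain K-means (Lloyd's iteration) scored with SSE and the Davies-Bouldin (DB) index. It is not the authors' noise-based method: the number of clusters `k` and the random initial centers are supplied by hand, which is exactly the limitation the abstract describes. The function names and the six sample coordinates are hypothetical and are not taken from the taxi GPS datasets.

```python
import math
import random


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mean_point(points):
    """Centroid of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's K-means with k chosen by hand (assumption: baseline only)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = mean_point(members)
    return centers, labels


def sse(points, centers, labels):
    """Sum of squared distances from each point to its cluster center."""
    return sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


def davies_bouldin(points, centers, labels):
    """Davies-Bouldin index (lower is better); assumes distinct centers."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        avg = sum(dist(p, centers[j]) for p in members) / len(members) if members else 0.0
        scatter.append(avg)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max((scatter[i] + scatter[j]) / dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


# Hypothetical pick-up coordinates forming two well-separated groups.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(sample, k=2)
print("centers:", centers)
print("SSE:", sse(sample, centers, labels))
print("DB :", davies_bouldin(sample, centers, labels))
```

On real GPS data the result of this baseline depends on the chosen `k` and on the random seed, which is the sensitivity that the noise step is intended to remove.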

Highlights

  • Modern cities have become important engines and hubs that drive social development. A city represents the most concentrated residence of people and the gathering place of social resources

  • Many taxi GPS datasets exist for many cities across the world

  • The city hotspot marking results in Figure 12, obtained by noise-based K-means clustering of the taxi GPS data from five cities, show a better clustering effect than Fuzzy C-means (FCM), K-means, and K-means plus clustering


Summary

A Novel K-Means Clustering Algorithm with a Noise

Hotspots. Appl. Sci. 2021, 11, 11202. School of Information and Engineering, Sichuan Tourism University, Chengdu 610100, China; School of Computer Science and Technology, Aba Teachers University, Wenchuan 623002, China; School of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China

Introduction
The Idea of the Noise K-Means Clustering Algorithm
The Realization of the Noise-Based K-Means Clustering Algorithm
Obtain the Clustering Number K Value and the Initial Center
Optimize Clustering Center
Obtain Clustering Result and Capture Excellent Cluster Center
Urban Taxi GPS Data
Experimental Environment and Parameter Setting
Experimental Results and Comparison Analysis
Visual
Statistical Analysis of Wilcoxon
Evaluation Method
Conclusions