NS-DBSCAN: A Density-Based Clustering Algorithm in Network Space

Tianfu Wang, Chang Ren, Yun Luo, Jing Tian

doi:10.3390/ijgi8050218

Abstract

Spatial clustering analysis is an important spatial data mining technique. It divides objects into clusters according to their similarities in both location and attributes, and it plays an essential role in density distribution identification, hot-spot detection, and trend discovery. Spatial clustering algorithms in Euclidean space are relatively mature, while those in network space are less well researched. This study aimed to extend a well-known clustering algorithm, density-based spatial clustering of applications with noise (DBSCAN), to network space and proposed a new clustering algorithm named network space DBSCAN (NS-DBSCAN). The NS-DBSCAN algorithm used a strategy similar to that of the DBSCAN algorithm. It also provided a new technique for visualizing the density distribution and indicating the intrinsic clustering structure. Tested on points of interest (POI) in Hanyang district, Wuhan, China, the NS-DBSCAN algorithm was able to accurately detect the high-density regions. The NS-DBSCAN algorithm was compared with the classical hierarchical clustering algorithm and the recently proposed density-based clustering algorithm with network-constraint Delaunay triangulation (NC_DT) in terms of effectiveness. The hierarchical clustering algorithm was effective only when the cluster number was well specified; otherwise, it might separate a natural cluster into several parts. The NC_DT method excessively gathered most objects into one huge cluster. Quantitative evaluation using four indicators (the silhouette, the R-squared index, the Davies–Bouldin index, and the clustering scheme quality index) indicated that the NS-DBSCAN algorithm was superior to the hierarchical clustering and NC_DT algorithms.
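To make the idea of carrying DBSCAN over to network space concrete, the following is a minimal sketch and not the authors' NS-DBSCAN implementation. It assumes that every event point has already been snapped to a node of the road network, and it simply runs the standard DBSCAN expansion with the Euclidean distance replaced by the shortest-path distance along the network. The names `build_graph`, `network_distances`, and `network_dbscan`, the parameters `eps` and `min_pts`, and the toy network are all hypothetical choices made for illustration.

```python
import heapq
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency list from (u, v, length) tuples."""
    graph = defaultdict(list)
    for u, v, length in edges:
        graph[u].append((v, length))
        graph[v].append((u, length))
    return graph


def network_distances(graph, source, cutoff):
    """Dijkstra from `source`, keeping only nodes within `cutoff` along the network."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in graph.get(u, ()):
            nd = d + length
            if nd <= cutoff and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def network_dbscan(graph, points, eps, min_pts):
    """Return {point: cluster id}, with -1 marking noise.

    Assumption: `points` are node ids of `graph` (events snapped to nodes).
    A point counts itself among its eps-neighbours, as in classical DBSCAN.
    """
    neighbours = {}
    for p in points:
        reach = network_distances(graph, p, eps)
        neighbours[p] = [q for q in points if q in reach]

    labels = {}
    cluster = -1
    for p in points:
        if p in labels:
            continue
        if len(neighbours[p]) < min_pts:
            labels[p] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[p] = cluster
        queue = list(neighbours[p])
        while queue:
            q = queue.pop()
            if labels.get(q) == -1:
                labels[q] = cluster  # noise reached from a core point: border point
            if q in labels:
                continue
            labels[q] = cluster
            if len(neighbours[q]) >= min_pts:
                queue.extend(neighbours[q])  # q is a core point, keep expanding
    return labels


# Toy network: two tight groups of nodes joined by a long road, plus a remote node g.
edges = [("a", "b", 1), ("b", "c", 1), ("c", "d", 10),
         ("d", "e", 1), ("e", "f", 1), ("f", "g", 10)]
graph = build_graph(edges)
labels = network_dbscan(graph, list("abcdefg"), eps=1.5, min_pts=2)
print(labels)  # a, b, c share one cluster; d, e, f share another; g is noise
```

Bounding each Dijkstra search by `eps` keeps the neighbourhood queries local, which matters on a real road network where most nodes lie far from any given point.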

Highlights

The first law of geography states that closer spatial entities are more strongly related to each other than the distant ones [1]
The NS-DBSCAN algorithm was capable of distinguishing separate, highly populated regions, and the shapes of the clusters approximately portrayed the shapes of these regions
Dead-end roads do not have any impact on the study

Summary

Introduction

The first law of geography states that closer spatial entities are more strongly related to each other than distant ones [1]. Clustering analysis is generally divided into two categories: one uses spatial point pattern analysis to discover aggregated points with statistical indicators, and the other obtains clusters from the perspective of data mining [16]. Spatial point pattern methods, such as the local K-function [17,18], local Moran’s I [19,20], Getis-Ord Gi [10], scan statistics [21], and local indicators of mobility association (LIMA) [22], are commonly adopted for indicating aggregated regions and discovering the density trends of spatial datasets. In contrast to spatial point pattern methods, generic clustering algorithms for multidimensional features delineate the aggregated configuration of a dataset and precisely depict the specific shapes of separated clusters.

Objectives

Results

Conclusion