Abstract
As location information of numerous Internet of Thing (IoT) devices can be recognized through IoT sensor technology, the need for technology to efficiently analyze spatial data is increasing. One of the famous algorithms for classifying dense data into one cluster is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Existing DBSCAN research focuses on efficiently finding clusters in numeric data or categorical data. In this paper, we propose the novel problem of discovering a set of adjacent clusters among the cluster results derived for each keyword in the keyword-based DBSCAN algorithm. The existing DBSCAN algorithm has a problem in that it is necessary to calculate the number of all cases in order to find adjacent clusters among clusters derived as a result of the algorithm. To solve this problem, we developed the Genetic algorithm-based Keyword Matching DBSCAN (GKM-DBSCAN) algorithm to which the genetic algorithm was applied to discover the set of adjacent clusters among the cluster results derived for each keyword. In order to improve the performance of GKM-DBSCAN, we improved the general genetic algorithm by performing a genetic operation in groups. We conducted extensive experiments on both real and synthetic datasets to show the effectiveness of GKM-DBSCAN than the brute-force method. The experimental results show that GKM-DBSCAN outperforms the brute-force method by up to 21 times. GKM-DBSCAN with the index number binarization (INB) is 1.8 times faster than GKM-DBSCAN with the cluster number binarization (CNB).
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.