Abstract

DBSCAN algorithm is a location-based clustering approach; it is used to find relationships and patterns in geographical data. Because of its widespread application, several data science-based programming languages include the DBSCAN method as a built-in function. Researchers and data scientists have been clustering and analyzing their study data using the built-in DBSCAN functions. All implementations of the DBSCAN functions require user input for radius distance (i.e., eps) and a minimum number of samples for a cluster (i.e., min_sample). As a result, the result of all built-in DBSCAN functions is believed to be the same. However, the DBSCAN Python built-in function yields different results than the other programming languages those are analyzed in this study. We propose a scientific way to assess the results of DBSCAN built-in function, as well as output inconsistencies. This study reveals various differences and advises caution when working with built-in functionality.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.