Abstract

Clustering has been studied by the research community since the 1960s. As databases continue to grow, numerous studies have sought to develop more efficient clustering algorithms and to improve the performance of existing ones. This paper demonstrates a general optimization technique applicable to clustering algorithms that calculate distances and check them against a minimum distance condition. The technique is a simple calculation that finds the minimum possible distance between two points and checks that distance against the minimum distance condition, thereby reusing already computed values and reducing how often the more complicated distance function must be computed. The technique has been applied to agglomerative hierarchical clustering, k-means clustering, and DBSCAN with successful results: runtimes for all three algorithms were reduced, and the clusters they returned were verified to be the same as those of the original algorithms. The technique also shows potential for reducing runtimes substantially on large databases, with the reduction growing as databases grow larger.
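The abstract does not say how the minimum possible distance is obtained. As a minimal sketch, assume it comes from the reverse triangle inequality: once each point's distance to a fixed reference point is cached, |d(a, ref) - d(b, ref)| can never exceed d(a, b), so any pair whose bound is already above the threshold can be skipped without calling the full distance function. The names `euclidean`, `neighbors_within`, `eps`, and `ref` below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def euclidean(p, q):
    """Full distance function (the comparatively expensive computation)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def neighbors_within(points, eps, ref=None):
    """Return (pairs, full_computations).

    pairs: index pairs (i, j), i < j, with euclidean distance <= eps.
    full_computations: number of pairwise full distance evaluations
    that the lower-bound check could not avoid.

    Assumption (not from the paper): the lower bound is the reverse
    triangle inequality applied to cached distances to `ref`.
    """
    if ref is None:
        ref = points[0]  # arbitrary choice of reference point

    # One full distance per point, computed once and then reused.
    to_ref = [euclidean(p, ref) for p in points]

    pairs = []
    full_computations = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # Minimum possible distance between points i and j.
            lower_bound = abs(to_ref[i] - to_ref[j])
            if lower_bound > eps:
                continue  # cannot satisfy the condition; skip the full distance
            full_computations += 1
            if euclidean(points[i], points[j]) <= eps:
                pairs.append((i, j))
    return pairs, full_computations
```

Calling `neighbors_within([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0)], 1.5)` finds the same pairs as a brute-force scan of all six pairs while evaluating the full distance for only two of them. Because the bound never overestimates the true distance, pruning of this kind leaves the resulting neighbor sets unchanged (up to floating-point rounding at the threshold), which is consistent with the abstract's claim that the clusters remain the same.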
