In the big data era, demand is growing for clustering methods that can handle massive, heterogeneous datasets with varying distributions. This study proposes a clustering algorithm for datasets with heterogeneous density, implemented on a dual-mode memristor crossbar array. The array consists of Ta/HfO2/RuO2 memristors that operate in either analog or digital mode, selected by the reset voltage. The digital mode shows low dispersion and a high resistance ratio, whereas the analog mode enables precise conductance tuning. The local outlier factor is introduced to handle heterogeneous density, and the Euclidean distances and k-distances it requires are calculated in parallel in the analog mode. In the digital mode, clustering is then performed based on the connectivity among data points after the detected outliers are excluded. The proposed algorithm has linear time complexity for the entire process. Extensive evaluations on synthetic datasets demonstrate significant improvement over representative density-based algorithms, and datasets with heterogeneous density are clustered successfully. Finally, the proposed algorithm is applied to single-molecule localization microscopy data, demonstrating its feasibility for real-world problems.
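The two-stage flow described in the abstract (score outliers with the local outlier factor, then group the remaining points by connectivity) can be illustrated with a minimal software sketch. This is not the authors' crossbar implementation: it computes all pairwise distances in plain Python and is therefore quadratic in the number of points, and it does not model the analog or digital memristor modes. The function names, the parameters `k`, `lof_threshold`, and `eps`, and the rule that two points are connected when their distance is at most `eps` are all assumptions made for illustration.

```python
import math
from collections import deque


def pairwise_distances(points):
    """Euclidean distance matrix as nested lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def local_outlier_factor(dist, k):
    """Standard LOF scores from a distance matrix.

    Simplifications: ties beyond the k-th neighbour are ignored and
    the points are assumed to be distinct.
    """
    n = len(dist)
    # k nearest neighbours of each point, excluding the point itself
    neighbors = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # k-distance: distance to the k-th nearest neighbour
    k_dist = [dist[i][neighbors[i][-1]] for i in range(n)]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[o], dist[i][o]) for o in neighbors[i]]
        lrd.append(k / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density
    return [sum(lrd[o] for o in neighbors[i]) / (k * lrd[i]) for i in range(n)]


def cluster_by_connectivity(points, k=3, lof_threshold=1.5, eps=1.5):
    """Label points by connected component after dropping LOF outliers.

    Returns (labels, lof); outliers keep the label -1.
    The eps-based connectivity rule is an assumption of this sketch.
    """
    dist = pairwise_distances(points)
    lof = local_outlier_factor(dist, k)
    labels = [-1] * len(points)
    inliers = [i for i, score in enumerate(lof) if score <= lof_threshold]
    cluster = 0
    for seed in inliers:
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        queue = deque([seed])
        while queue:  # breadth-first search over connected inliers
            i = queue.popleft()
            for j in inliers:
                if labels[j] == -1 and dist[i][j] <= eps:
                    labels[j] = cluster
                    queue.append(j)
        cluster += 1
    return labels, lof


# Toy data: a dense 3x3 grid, a sparse 3x3 grid, and one stray point.
dense = [(0.1 * a, 0.1 * b) for a in range(3) for b in range(3)]
sparse = [(10.0 + a, 10.0 + b) for a in range(3) for b in range(3)]
stray = [(1.2, 0.1)]
points = dense + sparse + stray

labels, lof = cluster_by_connectivity(points)
print(labels)
```

In this toy data the stray point lies about 1.0 from the dense grid, the same spacing that is normal inside the sparse grid. A single global distance threshold would therefore attach it to the dense cluster, whereas the density-relative LOF score singles it out before the connectivity step runs.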