Abstract
Clustering algorithms can reveal the inherent properties and regularities of data by learning from unlabeled samples. However, data in many fields and formats contain interfering points (outliers), and clustering such data without preprocessing reduces the credibility of the results. This paper proposes a semi-supervised clustering algorithm based on outlier pruning (LOF-SAP). First, the local outlier factor (LOF) algorithm is used to detect outliers and reduce their influence on the data structure. Semi-supervised clustering can find better partitions of the data when side information is available; here, side information in the form of pairwise constraints obtained through active learning is embedded in the data similarity matrix. Finally, the clustering result is obtained with the affinity propagation (AP) clustering algorithm. The method is compared with the traditional AP clustering algorithm and with the AP clustering algorithm with pairwise constraints in experiments on data from the UCI database, and the proposed method achieves better clustering performance.
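The preprocessing stages described in the abstract (LOF-based outlier pruning, then embedding pairwise constraints in the similarity matrix) can be illustrated with a minimal sketch. This is not the paper's implementation. The function names, the neighborhood size `k`, the LOF `threshold`, the use of negative squared Euclidean distance as similarity, and the specific values written for must-link and cannot-link pairs are all assumptions made for illustration. The active-learning step that selects the constraints and the AP clustering step itself are omitted.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lof_scores(points, k):
    """Local outlier factor of each point (values well above 1 suggest an outlier)."""
    n = len(points)
    dist = [[math.sqrt(sq_dist(p, q)) for q in points] for p in points]
    # Indices of the k nearest neighbours of each point, excluding the point itself.
    neigh = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    # Distance from each point to its k-th nearest neighbour.
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]
    # Local reachability density; the small floor avoids division by zero
    # when a point coincides with all of its neighbours.
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neigh[i])
        lrd.append(len(neigh[i]) / max(reach, 1e-12))
    return [
        sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]


def prune_outliers(points, k=3, threshold=1.5):
    """Return indices of points whose LOF is at most `threshold` (assumed cutoff)."""
    scores = lof_scores(points, k)
    keep = [i for i, score in enumerate(scores) if score <= threshold]
    return keep, scores


def constrained_similarity(points, must_link, cannot_link):
    """Similarity matrix with pairwise constraints written into it.

    Assumptions for illustration: similarity is the negative squared distance,
    must-link pairs get the maximum similarity 0, and cannot-link pairs get a
    value far below every observed similarity. The diagonal is left untouched,
    so AP preferences would still have to be set separately.
    """
    n = len(points)
    s = [[-float(sq_dist(points[i], points[j])) for j in range(n)] for i in range(n)]
    lowest = min(min(row) for row in s)
    for i, j in must_link:
        s[i][j] = s[j][i] = 0.0
    for i, j in cannot_link:
        s[i][j] = s[j][i] = 10.0 * lowest - 1.0
    return s
```

The resulting matrix could then be passed to any affinity propagation implementation that accepts a precomputed similarity matrix. Constraint indices here refer to positions in the pruned point list, not the original one.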