Abstract

Context. Clustering, that is, unsupervised classification of data arrays (classification without a teacher), occupies an important place in Data Mining. Many approaches to this problem have been proposed; they differ in their a priori assumptions about the analyzed arrays and in the mathematical apparatus on which each method is based. Clustering is complicated by the high dimensionality of the observation vectors and by distortions of various types in them.
Objective. The purpose of the work is to introduce a fuzzy clustering procedure that combines the advantages of methods based on the analysis of data distribution densities and their peaks. Such methods are fast and work effectively when classes overlap.
Method. A method of fuzzy clustering of data arrays is introduced, based on the ideas of analyzing the distribution densities of the data, their peaks, and a confidence fuzzy approach. The advantage of the proposed approach is that it reduces the time spent solving the optimization problems of finding attractors of the density functions, since the number of calls to the optimization block is determined not by the size of the analyzed array but by the number of its density peaks.
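The idea in the Method paragraph can be illustrated with a minimal sketch. The abstract does not specify the density estimator, the peak criterion, the optimizer, or the membership rule, so everything here is an assumption made for illustration: a Gaussian kernel density, a density-times-separation score for picking peaks, a mean-shift iteration as the "optimization block", and the standard fuzzy c-means membership formula as a stand-in for the authors' fuzzy approach. All function names and parameters (`bandwidth`, `n_peaks`, `m`) are hypothetical. The point it shows is that the hill-climbing routine is called once per density peak, not once per observation.

```python
import numpy as np

def kernel_density(points, data, bandwidth):
    # Unnormalized Gaussian kernel density of `data`, evaluated at each row of `points`.
    d2 = ((points[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * bandwidth ** 2)).sum(axis=1)

def find_density_peaks(data, bandwidth, n_peaks):
    # Assumed peak criterion: score = density * distance to the nearest denser point.
    rho = kernel_density(data, data, bandwidth)
    dist = np.sqrt(((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=2))
    delta = np.empty(len(data))
    for i in range(len(data)):
        higher = rho > rho[i]
        # The densest point has no denser neighbor; give it the largest distance.
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return data[np.argsort(rho * delta)[-n_peaks:]]

def climb_to_attractor(start, data, bandwidth, n_iter=100, tol=1e-6):
    # Assumed "optimization block": mean-shift fixed-point iteration toward a density mode.
    x = start.copy()
    for _ in range(n_iter):
        w = np.exp(-((data - x) ** 2).sum(axis=1) / (2.0 * bandwidth ** 2))
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

def fuzzy_memberships(data, centers, m=2.0):
    # Standard fuzzy c-means memberships (a stand-in, not the authors' rule).
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def cluster(data, bandwidth, n_peaks):
    peaks = find_density_peaks(data, bandwidth, n_peaks)
    # One optimizer call per peak, independent of the number of observations.
    centers = np.array([climb_to_attractor(p, data, bandwidth) for p in peaks])
    return centers, fuzzy_memberships(data, centers)
```

Calling `cluster(data, bandwidth, n_peaks)` on an `(N, d)` array returns the attractors and an `(N, n_peaks)` membership matrix whose rows sum to one, so overlapping regions receive graded rather than hard assignments. In this sketch the number of peaks is passed in by hand; how the proposed method determines it is not stated in the abstract.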
Results. The method is quite simple to implement numerically and is not sensitive to the choice of optimization procedure. The experimental results confirm the effectiveness of the proposed approach in clustering problems with overlapping clusters and allow us to recommend the method for practical use in the automatic clustering of large data volumes.
Conclusions. The method is quite simple to implement numerically and is not sensitive to the choice of optimization procedure. Its advantage is that it reduces the time spent solving the optimization problems of finding attractors of the density functions, since the number of calls to the optimization block is determined not by the size of the analyzed array but by the number of its density peaks. The experimental results confirm the effectiveness of the proposed approach in clustering problems with overlapping clusters.
