Abstract

Parameter selection is a well-known problem in the fuzzy clustering community. In this paper, we propose to tackle this problem using a computationally intensive approach. We apply this approach to a clustering method recently introduced in the literature: fuzzy c-means with tolerance. This method allows the data to contain some error, which is modeled by letting each data element move in a particular direction, within a particular range, when the clusters are defined. Applying the method properly requires a correct choice of the parameter κ, a value that may differ for each record and that corresponds to the maximum shift allowed for that record. In this paper, we review the method and study the choice of κ when the same value is used for all data elements. Our approach is based on the analysis of data sets with increasing noise and on an exhaustive analysis of the behavior of the algorithm for different values of κ. The analysis is motivated by privacy-preserving data mining. The same approach can be used for parameter selection in other clustering algorithms. © 2010 Wiley Periodicals, Inc.
