Abstract

This paper reports the application of a possibility- and rough-set-based clustering method to semantically segmented real-world databases. The approach is an improved version of the well-known k-modes algorithm. It is a soft clustering method that assigns instances with uncertain categorical values to different clusters according to their membership degrees. Possibility theory is used to handle uncertainty in the attribute values and in the cluster memberships. Rough sets are used to detect clusters with rough boundaries. We demonstrate the effectiveness of the proposed approach on two real-world databases: a retail store transactions data set and a mobile phone data set. The numeric attribute values are segmented into semantically meaningful linguistic values using a novel discretization method. These linguistic values can lead to a more natural interpretation of knowledge using possibilistic degrees. The possibilistic degrees describe our knowledge about the attribute values (fully plausible to occur, may occur, or rejected) and identify the level of uncertainty in the memberships to different clusters. In addition, our method identifies peripheral objects by calculating the approximation sets defined in rough set theory. The k-modes algorithm enhanced with rough set and possibility theories can provide semantically meaningful information for decision making to store owners (retail data set) and telecommunication companies (mobile phone data set).
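
To make the combination of possibilistic memberships and rough approximations concrete, the following is a minimal sketch and not the paper's implementation. The similarity measure, the normalization of memberships so that the best cluster has degree 1, the threshold value, and all function names are assumptions chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A possibility distribution maps each categorical value to a degree in [0, 1].
Distribution = Dict[str, float]


def similarity(obj: List[Distribution], mode: List[Distribution]) -> float:
    """Illustrative similarity: 1 minus the mean absolute difference of
    possibility degrees, averaged over attributes (an assumed measure)."""
    total = 0.0
    for a, b in zip(obj, mode):
        values = set(a) | set(b)
        diff = sum(abs(a.get(v, 0.0) - b.get(v, 0.0)) for v in values) / len(values)
        total += 1.0 - diff
    return total / len(obj)


def memberships(obj: List[Distribution], modes: List[List[Distribution]]) -> List[float]:
    """Possibilistic membership degrees, scaled so the closest mode gets 1."""
    sims = [similarity(obj, m) for m in modes]
    top = max(sims)
    return [s / top if top > 0 else 0.0 for s in sims]


def rough_assignment(
    degrees: List[float], threshold: float = 0.9
) -> Tuple[Optional[int], List[int]]:
    """Return (lower, upper). Clusters whose degree reaches the assumed
    threshold form the upper approximation; the object is in a lower
    approximation only when exactly one cluster qualifies."""
    upper = [k for k, d in enumerate(degrees) if d >= threshold]
    lower = upper[0] if len(upper) == 1 else None
    return lower, upper


# Two hypothetical modes over one attribute with values "low" and "high".
modes = [[{"low": 1.0, "high": 0.0}], [{"low": 0.0, "high": 1.0}]]
certain = [{"low": 1.0, "high": 0.0}]    # value known with certainty
uncertain = [{"low": 1.0, "high": 1.0}]  # both values fully plausible

certain_result = rough_assignment(memberships(certain, modes))
uncertain_result = rough_assignment(memberships(uncertain, modes))
```

In this sketch the object with a certain value falls in the lower approximation of one cluster, while the fully uncertain object lands in the upper approximations of both clusters, which is the sense in which it is a peripheral object.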
