Abstract

In recent years, granular computing has developed into a unified paradigm for data description. Within this paradigm, rough set theory, a popular soft computing model for supervised learning, has been intensively investigated as a data description approach in data mining research. Feasible information granulation and approximation are recognized as the two key features of rough set data descriptors. In this study, we propose a Dempster–Shafer theory-based rough granular description model built on the principle of justifiable granularity. First, we use evidence information to assess the performance of information granules generated from regions of differing data density, and we define the lower and upper approximation sets in terms of data credibility and plausibility, respectively. Second, we propose a robust rough description model that identifies extreme instances such as outliers and noise, and we provide a set of pseudo labels to further enhance the robustness of the model. Finally, to search for an optimal granularity, we quantify justifiable granularity from the perspectives of legitimacy and interpretability and optimize it with a particle swarm optimization algorithm. Extensive comparative experiments against several representative rough granular description models show that the proposed model achieves the best values in almost all cases for approximation quality, number difference, and neighborhood credibility. These results indicate that the proposed approach is reasonable, effective, and robust, and that it is a promising rough granular description model for complex data in real-world applications.

