Abstract
Attribute reduction, an important preprocessing step for knowledge acquisition in data mining, is one of the key issues in rough set theory. A rough set model can deal only with attributes of one specific type in an information system, because it uses one specific binary relation. However, information systems in real-life applications may contain attributes of multiple different types. A composite relation is proposed to process attributes of multiple different types simultaneously in composite information systems. To address the high time cost of traditional heuristic attribute reduction algorithms, this paper proposes a novel attribute reduction algorithm based on a structure discernibility matrix. The proposed algorithm selects the same attribute reduct as its previous version, but it accelerates the heuristic process of attribute reduction by avoiding the intersection operation and adopting a forward greedy attribute reduction approach. Theoretical analysis and experimental results on UCI data sets show that the proposed algorithm accelerates the heuristic process of attribute reduction.
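The idea of a composite relation can be illustrated with a minimal sketch. It assumes, as is common in the composite rough set literature but not stated in the abstract itself, that the composite relation is the intersection of one binary relation per attribute: two objects are related only if every per-attribute relation holds. The relation choices here (equality for a categorical attribute, a distance threshold for a numerical one) and all names such as `same_value`, `within`, and `composite_class` are hypothetical, chosen for illustration only.

```python
from typing import Callable, Dict, Sequence, Set

Relation = Callable[[object, object], bool]

def same_value(a, b) -> bool:
    # Assumed relation for a categorical attribute: plain equality.
    return a == b

def within(delta: float) -> Relation:
    # Assumed relation for a numerical attribute: values no further apart than delta.
    def rel(a, b) -> bool:
        return abs(a - b) <= delta
    return rel

def composite_class(rows: Sequence[Sequence[object]],
                    relations: Dict[int, Relation],
                    i: int) -> Set[int]:
    """Indices of all objects related to object i under every per-attribute relation."""
    return {
        j for j, row in enumerate(rows)
        if all(rel(rows[i][a], row[a]) for a, rel in relations.items())
    }

# Toy composite information system: column 0 is categorical, column 1 is numerical.
rows = [
    ("red", 1.0),
    ("red", 1.2),
    ("red", 3.0),
    ("blue", 1.1),
]
relations = {0: same_value, 1: within(0.5)}
classes = [composite_class(rows, relations, i) for i in range(len(rows))]
print(classes)
```

In this toy table the first two objects fall into the same class, while the third differs on the numerical attribute and the fourth on the categorical one.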
Highlights
Pawlak proposed rough set theory in the 1980s[1], and it has become a powerful mathematical tool for analyzing various types of data[2,3]
Many scholars have introduced composite rough set models and proposed basic ideas for dealing with attributes of multiple different types[12,13,14]. In this paper, we introduce a structure discernibility matrix[15] to reduce the time cost of traditional heuristic attribute reduction algorithms
The MRPR algorithm is a positive-region-based reduction algorithm that uses the structure discernibility matrix
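For readers unfamiliar with positive-region-based reduction, the following is a minimal sketch of the classical forward greedy heuristic on a purely categorical decision table. It is not the MRPR algorithm and does not use a structure discernibility matrix; it only shows the baseline idea of repeatedly adding the attribute that enlarges the positive region the most. The function names and the toy table are assumptions made for illustration.

```python
from collections import defaultdict

def positive_region(rows, decisions, attrs):
    """Indices of objects whose equivalence class on attrs has a single decision value."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    pos = set()
    for members in blocks.values():
        if len({decisions[i] for i in members}) == 1:
            pos.update(members)
    return pos

def greedy_reduct(rows, decisions):
    """Forward greedy selection: add attributes until the full positive region is reached."""
    all_attrs = list(range(len(rows[0])))
    target = len(positive_region(rows, decisions, all_attrs))
    reduct = []
    while len(positive_region(rows, decisions, reduct)) < target:
        remaining = [a for a in all_attrs if a not in reduct]
        # Pick the attribute giving the largest positive region; ties go to the first one.
        best = max(
            remaining,
            key=lambda a: len(positive_region(rows, decisions, reduct + [a])),
        )
        reduct.append(best)
    return reduct

# Toy decision table with three condition attributes; column 2 duplicates column 0.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
decisions = ["n", "y", "y", "y"]
print(greedy_reduct(rows, decisions))  # [0, 1]
```

This sketch recomputes the partition from scratch for every candidate attribute, which is the kind of repeated work that makes traditional heuristic reduction slow on large data sets. It also omits the final backward pass that some formulations use to drop redundant attributes.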
Summary
Pawlak proposed rough set theory in the 1980s[1], and it has become a powerful mathematical tool for analyzing various types of data[2,3]. It can be used in an attribute-value representation model to describe the dependencies among attributes, evaluate the significance of attributes, and derive reducts[4,5]. Many scholars have introduced composite rough set models and proposed basic ideas for dealing with attributes of multiple different types[12,13,14]. Extensive experiments on different UCI data sets show that the proposed structure discernibility matrix-based method can process large data sets efficiently.