Abstract
This paper introduces a rough set model for analyzing an information system in which some condition and decision data are missing. Many studies have focused on missing condition data, but very few have accounted for missing decision data. Common approaches tend to remove objects with missing decision data because such objects are considered worthless from the perspective of decision-making. However, we point out that this removal may lead to information loss, and our method therefore retains such objects. We observe that a scenario involving missing decision data resembles semi-supervised learning, because some objects have complete decision data whereas others do not. This leads to the idea of estimating potential candidates for the missing data from the available data. These candidates are determined by two quantitative indicators: local decision probability and universal decision probability. The candidates allow us to define set approximations and reducts. We also compare the reducts and rules induced from two information systems: one in which objects with missing decision data are removed and one in which they are retained. We show that the knowledge induced from the former can also be induced from the latter using our approach. Thus, our method offers a more general approach to handling missing decision data and prevents information loss.