Abstract

In practical applications of machine learning, only part of the data is labeled because obtaining class labels is relatively costly. This paper explores measures of uncertainty (MUs) for partially labeled real-valued data via a discernibility relation. First, a decision information system with partially labeled real-valued data (p-RVDIS) is separated into two decision information systems: one with labeled real-valued data (l-RVDIS) and one with unlabeled real-valued data (u-RVDIS). Then, based on a discernibility relation, the dependence function, conditional information entropy and conditional information amount, four degrees of importance of an attribute subset in a p-RVDIS are defined. Each is calculated as a weighted sum of its values on the l-RVDIS and the u-RVDIS, with weights determined by the missing rate, and the four degrees can be regarded as MUs for a p-RVDIS. Combining the l-RVDIS and the u-RVDIS in this way gives a more accurate assessment of the importance and classification ability of attribute subsets in a p-RVDIS, which is the novelty of this paper. Finally, experimental analysis on several datasets verifies the effectiveness of these MUs. These findings contribute to understanding the nature of uncertainty in a p-RVDIS.
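
The separation and weighting steps described above can be illustrated with a minimal sketch. The abstract does not state the exact weights, so the code assumes that the labeled part receives weight (1 - missing rate) and the unlabeled part receives weight equal to the missing rate; this weighting, the use of `None` to mark an unlabeled object, and the names `split_by_label`, `missing_rate` and `combine` are all hypothetical and chosen for illustration only. The two input values stand for any one of the four degrees of importance, computed separately on the l-RVDIS and the u-RVDIS.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[float]


def split_by_label(
    rows: Sequence[Row], labels: Sequence[Optional[str]]
) -> Tuple[List[Tuple[Row, str]], List[Row]]:
    """Separate partially labeled data into a labeled and an unlabeled part.

    Assumption: an unlabeled object is marked with None.
    """
    labeled = [(r, y) for r, y in zip(rows, labels) if y is not None]
    unlabeled = [r for r, y in zip(rows, labels) if y is None]
    return labeled, unlabeled


def missing_rate(labels: Sequence[Optional[str]]) -> float:
    """Fraction of objects whose class label is missing."""
    return sum(y is None for y in labels) / len(labels)


def combine(mu_labeled: float, mu_unlabeled: float, rate: float) -> float:
    """Weighted sum of a measure computed on the two parts.

    Assumed weighting (not taken from the paper): the labeled part
    gets weight (1 - rate), the unlabeled part gets weight rate.
    """
    return (1.0 - rate) * mu_labeled + rate * mu_unlabeled


# Toy data: four objects with two real-valued attributes, one unlabeled.
rows = [(0.1, 0.9), (0.4, 0.2), (0.8, 0.7), (0.3, 0.5)]
labels = ["a", "b", None, "a"]

labeled_part, unlabeled_part = split_by_label(rows, labels)
rate = missing_rate(labels)

# Placeholder values for a measure computed on each part separately.
combined = combine(mu_labeled=0.8, mu_unlabeled=0.4, rate=rate)
```

Under this assumed weighting, a fully labeled dataset (missing rate 0) reduces to the measure on the l-RVDIS alone, and a fully unlabeled one reduces to the measure on the u-RVDIS.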

