Abstract

In the age of big data, much of the data collected is of low quality, characterized by heterogeneity and incompleteness; such data are referred to in this paper as heterogeneous incomplete decision systems (HIDSs). Data classification is an important machine learning task that can discover valuable knowledge hidden in HIDSs. However, systematic studies of data classification in HIDSs are rarely reported. In particular, there is a lack of adaptive classification methods for HIDSs that can deal directly with heterogeneous incomplete data without requiring prior discretization of numerical attributes or filling in of missing values. In this paper, a unified representation model, called the parameterized tolerance granulation model (PTGM), is proposed to deal with heterogeneous incomplete data. The principle of an adaptive granulation method that constructs appropriate PTGMs using difference-based collaborative optimization is also described. Based on PTGMs, a decision logic language is used to describe classifiers consisting of decision rules that satisfy given conditions. Then, a discernibility function-based classification method and a heuristic function-based classification method are proposed to obtain all optimized rule sets (classifiers) and to generate one particular optimized rule set, respectively. The heuristic function-based method is an adaptive classification method that can deal directly with heterogeneous incomplete data. Furthermore, detailed theoretical analyses are given to establish the correctness and effectiveness of the proposed methods. The experimental results show that the proposed methods are effective and have clear advantages in directly handling heterogeneous incomplete data.

