Abstract

The possible world model has become one of the most effective tools for handling various types of data uncertainty in uncertain data management. However, few uncertain data classification algorithms based on possible worlds have been proposed. Most existing uncertain data classification algorithms are simple extensions of traditional classification algorithms for certain data. They handle data uncertainty under relatively idealized assumptions about probability distributions and data types, and are therefore difficult to apply across varied application scenarios. In this paper, we propose a novel possible-world-based AdaBoost algorithm for classifying uncertain data, called PwAdaBoost. In the training procedure, PwAdaBoost uses the possible world set generated from the uncertain training set sampled in each iteration to train the sub-basic classifiers, and employs the possible world set generated from the whole uncertain training set to adjust the weights of the sub-basic classifiers and to assess the quality of the basic classifiers. In the prediction procedure, PwAdaBoost uses the possible world set generated from the predicted object to obtain the results of the basic classifiers via majority voting and weighted voting. Furthermore, we analyze the stability of PwAdaBoost and give parallelization strategies for its training and prediction procedures. PwAdaBoost can handle various types of data uncertainty and allows any existing classification algorithm for certain data to be used on uncertain data. To the best of our knowledge, it is the first ensemble classification algorithm for uncertain data. Extensive experimental results demonstrate the superiority of the proposed algorithm in terms of effectiveness and efficiency.
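To make the prediction procedure more concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes each uncertain attribute is given as a discrete list of (value, probability) pairs, that attributes are independent, and that each classifier is a plain callable on a certain instance. All function names (`sample_possible_world`, `majority_vote`, `weighted_vote`) and the way weights are supplied are hypothetical; the abstract does not specify these details.

```python
import random
from collections import Counter, defaultdict


def sample_possible_world(uncertain_object, rng):
    """Draw one certain instance (a possible world) from an uncertain object.

    Assumption: each attribute is a list of (value, probability) pairs and
    attributes are sampled independently.
    """
    world = []
    for attribute in uncertain_object:
        values = [value for value, _ in attribute]
        probs = [prob for _, prob in attribute]
        world.append(rng.choices(values, weights=probs, k=1)[0])
    return world


def majority_vote(classifier, uncertain_object, n_worlds, rng):
    """Classify n_worlds sampled possible worlds and return the most common label."""
    votes = Counter(
        classifier(sample_possible_world(uncertain_object, rng))
        for _ in range(n_worlds)
    )
    return votes.most_common(1)[0][0]


def weighted_vote(classifiers, weights, uncertain_object, n_worlds, rng):
    """Combine per-classifier majority votes, each weighted by its classifier weight."""
    scores = defaultdict(float)
    for classifier, weight in zip(classifiers, weights):
        label = majority_vote(classifier, uncertain_object, n_worlds, rng)
        scores[label] += weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical uncertain object with two attributes.
    obj = [[(1.0, 0.8), (-1.0, 0.2)], [(0.5, 1.0)]]
    # Hypothetical classifiers for certain data.
    clfs = [lambda w: "pos" if w[0] > 0 else "neg",
            lambda w: "pos" if w[1] > 1 else "neg"]
    print(weighted_vote(clfs, [0.7, 0.3], obj, n_worlds=25, rng=rng))
```

Because the classifiers only ever see sampled certain instances, any classifier for certain data can be plugged in unchanged, which is the property the abstract emphasizes.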
