Abstract
With the development of computer networks, it has become easy to build huge databases. Accordingly, it is becoming difficult for users to extract knowledge from them. In this paper we focus on data mining, especially classification. In real‐world data mining, the missing value problem occurs in cases such as speech containing noise, facial occlusion, and the like. When a test sample has missing values, a classification system cannot handle it. Previous studies have developed various imputation methods that address the missing value problem using numerous explanatory variables, even when some of those variables are ineffective for imputation. Because the use of many variables is said to degrade learning efficiency, we believe that imputation methods should take the relations among explanatory variables into account. It is also effective to consider the relations between the test sample and each of the training samples. We therefore propose an imputation method using a Bayesian network with weighted learning. Experiments confirmed that the proposed method imputes missing values with approximate values, and that a classification system classified test samples imputed by the proposed method more successfully than samples imputed by some conventional imputation methods. © 2012 Wiley Periodicals, Inc. Electron Comm Jpn, 95(12): 1–9, 2012; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ecj.11449