Abstract

We propose a method for classifying data with missing values that exploits autoencoders and evidence theory. We augment the complete data by deleting each feature once and imputing it with the nearest-neighbor rule over a set of predefined points generated by a new scheme. We first train an autoencoder on the complete data set to obtain a latent space representation of the input, and then retrain the network on the augmented data to improve that representation. For each class, we train a support vector machine (SVM) with a one-vs-all strategy on the latent space representation of the complete data set. For an r-class problem, the output of each of the r SVMs is used to define a basic probability assignment (BPA), and the r BPAs are combined using Dempster’s rule of combination to make the final decision. To classify a test instance with missing values, we make an initial guess of the missing values using the nearest neighbor rule, compute the latent space representation of the imputed instance, and pass it through each trained SVM. As in training, each SVM output yields a BPA, and the r BPAs are aggregated into a composite BPA. The class label of the test point is then determined from the pignistic probabilities. We compare the proposed method with four state-of-the-art techniques in three experiments on artificial and real datasets; in these experiments, the proposed method performs better than the compared techniques.
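The fusion step described above (one BPA per one-vs-all SVM, Dempster's rule of combination, then a decision from the pignistic probabilities) can be illustrated with a minimal Python sketch. The abstract does not specify how an SVM output is converted into a BPA, so the function `bpa_from_score`, its `discount` parameter, and the example scores below are hypothetical choices made only for illustration; they should not be read as the authors' construction. Only the combination rule and the pignistic transform follow their standard definitions.

```python
from functools import reduce
from itertools import product


def dempster_combine(m1, m2):
    """Combine two BPAs (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass that would fall on the empty set is treated as conflict.
            conflict += wa * wb
    if 1.0 - conflict < 1e-12:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic(m):
    """BetP(x) = sum over focal sets A containing x of m(A) / |A|."""
    betp = {}
    for focal, w in m.items():
        for x in focal:
            betp[x] = betp.get(x, 0.0) + w / len(focal)
    return betp


def bpa_from_score(cls, score, classes, discount=0.9):
    """HYPOTHETICAL mapping from a one-vs-all score in [0, 1] to a BPA.

    Assumption (not from the paper): the score supports {cls}, its
    complement supports the remaining classes, and a fixed share
    (1 - discount) stays on the whole frame to express ignorance.
    """
    frame = frozenset(classes)
    own = frozenset([cls])
    return {
        own: discount * score,
        frame - own: discount * (1.0 - score),
        frame: 1.0 - discount,
    }


# Made-up scores for a 3-class problem, one per one-vs-all classifier.
classes = ["a", "b", "c"]
scores = {"a": 0.8, "b": 0.3, "c": 0.1}

bpas = [bpa_from_score(c, s, classes) for c, s in scores.items()]
m = reduce(dempster_combine, bpas)      # composite BPA
betp = pignistic(m)                     # pignistic probabilities
label = max(betp, key=betp.get)         # predicted class
print(label, betp)
```

In this sketch the residual mass on the whole frame keeps a single overconfident classifier from vetoing a class outright; the value 0.9 is arbitrary and would need tuning or replacement by whatever BPA definition the full paper gives.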
