Classification of Incomplete Data Using Autoencoder and Evidential Reasoning

Suvra Jyoti Choudhury, Nikhil R Pal

doi:10.1007/978-3-030-19823-7_13

Abstract

We propose a method for classifying data with missing values that combines autoencoders and evidence theory. We augment the complete data by deleting each feature once and imputing it using the nearest neighbor to a set of predefined points generated using a new scheme. We first train an autoencoder on the complete data set to obtain a latent space representation of the input, and then retrain the network on the augmented data to improve that representation. For each class, we train a support vector machine (SVM) with a one-vs-all strategy on the latent space representation of the complete data set. For an r-class problem, the output of each of the r SVMs is used to define a basic probability assignment (BPA), and the r BPAs are combined using Dempster’s rule of combination to make the final decision. To classify a test instance with missing values, we make an initial guess of the missing values using the nearest neighbor rule, compute the latent space representation of the imputed instance, and pass it through each trained SVM. As before, each SVM output yields a BPA, and the r BPAs are aggregated into a composite BPA. The class label of the test point is then determined from the pignistic probabilities. We compare the proposed method with four state-of-the-art techniques in three experiments on artificial and real data sets, and find that the proposed method performs better.
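The evidence-combination step can be illustrated with a minimal sketch. The abstract does not say how an SVM output is turned into a BPA, so the `bpa_from_score` function below is a hypothetical construction: it assumes each one-vs-all SVM yields a score in [0, 1] for its class and uses an assumed reliability factor `alpha` to leave some mass on the whole frame. Only `dempster_combine` and `pignistic` follow the standard definitions of Dempster's rule and the pignistic transformation; all function names are illustrative and not taken from the paper.

```python
from functools import reduce
from itertools import product


def bpa_from_score(cls, score, classes, alpha=0.9):
    """Hypothetical BPA for one one-vs-all SVM (an assumption, not the paper's rule).

    Mass alpha*score supports {cls}, alpha*(1-score) supports "any other class",
    and the remaining 1-alpha stays on the whole frame (ignorance).
    """
    frame = frozenset(classes)
    bpa = {}
    for focal, mass in (
        (frozenset([cls]), alpha * score),
        (frame - {cls}, alpha * (1.0 - score)),
        (frame, 1.0 - alpha),
    ):
        if focal:  # skip the empty set (only possible for a one-class frame)
            bpa[focal] = bpa.get(focal, 0.0) + mass
    return bpa


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, intersect focal sets, renormalize."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    if 1.0 - conflict < 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic(m):
    """Pignistic transformation: split each focal set's mass evenly over its members."""
    betp = {}
    for focal, mass in m.items():
        for c in focal:
            betp[c] = betp.get(c, 0.0) + mass / len(focal)
    return betp


def classify(scores, alpha=0.9):
    """Combine one BPA per class and return (predicted class, pignistic probabilities)."""
    classes = list(scores)
    bpas = [bpa_from_score(c, scores[c], classes, alpha) for c in classes]
    betp = pignistic(reduce(dempster_combine, bpas))
    return max(betp, key=betp.get), betp


# Usage: made-up scores from three one-vs-all SVMs for classes 0, 1, 2.
label, betp = classify({0: 0.9, 1: 0.2, 2: 0.1})
```

Keeping a nonzero mass on the whole frame (here `1 - alpha`) is a common way to avoid total conflict between sources, in which case Dempster's rule would be undefined; whether the paper uses such discounting is not stated in the abstract.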

Full Text