Abstract

Recent advances in remote sensing technologies have led to the rapid proliferation of massive and often imbalanced datasets. Direct classification of these datasets is difficult because of their high dimensionality and because minority classes overlap with, and are dwarfed by, majority classes. Deep learning is the state of the art in image classification, with applications in face and text detection, text recognition, and voice classification. However, deep learning requires a favorable ratio between dimensionality and sample size. To address datasets that are both high dimensional and imbalanced, we propose integrating data augmentation into a deep learning classifier for a high dimensional and highly imbalanced photo-thermal infrared hyperspectral dataset of chemical substances. First, we apply a baseline deep learning approach using a convolutional neural network (CNN) to the original dataset. Second, we apply principal component analysis (PCA) to reduce dimensionality before applying the CNN. Third, we prepend an offline data augmentation step to increase the dataset size before applying the CNN. Finally, we evaluate the performance of each approach by calculating the probability of detection (POD) and recall from the counts of true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN).
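
The evaluation step can be illustrated with a minimal sketch of how per-class TP, FN, FP, and TN counts might be derived from predicted labels in a one-vs-rest fashion, and how recall follows from them. The abstract does not give the exact definitions used in the paper, so the formula recall = TP / (TP + FN) is the standard one, and treating POD as the same ratio is an assumption based on common usage. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FN, FP, TN), treating `positive` as the class of interest.

    All other classes are pooled as "negative" (one-vs-rest).
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # a positive sample the classifier missed
        elif pred == positive:
            fp += 1  # a negative sample flagged as positive
        else:
            tn += 1
    return tp, fn, fp, tn


def recall(tp, fn):
    """Standard recall: TP / (TP + FN). Assumed here to coincide with POD."""
    total = tp + fn
    return tp / total if total else 0.0


# Hypothetical toy labels for three substances; not data from the paper.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "a", "c"]

tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive="a")
recall_a = recall(tp, fn)
```

Computing recall per class in this way, rather than a single overall accuracy, keeps a poorly detected minority class from being masked by a well-classified majority class, which is the concern the abstract raises for imbalanced data.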

