Abstract

Time series spectral imaging facilitates a comprehensive understanding of the underlying dynamics of multi-component systems and processes. Most existing classification strategies focus exclusively on spectral features and tend to fail when the spectra of different classes closely resemble each other. This work proposes a hybrid approach of principal component analysis (PCA) and deep learning (i.e., a long short-term memory (LSTM) model) for incorporating and utilizing the combined multi-temporal and spectral information from time series spectral imaging datasets. An example dataset, consisting of time series spectral images of casein-based biopolymers, was used to illustrate and evaluate the proposed hybrid approach. Compared to partial least squares discriminant analysis (PLSDA), the proposed PCA-LSTM method, applying the same spectral pretreatment, achieved a substantial improvement in pixel-wise classification (i.e., accuracy increased from 59.97% for PLSDA to 85.73% for PCA-LSTM). When projecting the pixel-wise model to object-based classification, the PCA-LSTM approach produced an accuracy of 100%, correctly classifying all 21 film samples in the independent test set, while PLSDA only reached an accuracy of 80.95%. The proposed method is powerful and versatile in utilizing the distinctive characteristics of time dependencies in multivariate time series datasets, and it could be adapted to suit non-congruent images over time sequences as well as spectroscopic data.

Highlights

  • We propose a novel strategy that combines principal component analysis (PCA) with long short-term memory (LSTM) networks to improve the multiclass classification of time series spectral imaging datasets, exemplified by a time series dataset of casein-based biopolymers imaged over the course of drying

  • The superiority and versatility of the proposed method are demonstrated using time series spectral imaging data of film samples acquired during the drying process

  • The major challenge resides in the high degree of spectral similarity between different film classes, which leads to unsuccessful prediction via partial least squares discriminant analysis (PLSDA) modelling, as it utilizes only the spectral information from each pixel

Introduction

The second strategy calculates the mean spectrum of each spectral image after eliminating the background and compiles these mean spectra into a matrix on which PCA is applied. The two strategies are expected to deliver similar results in terms of concentrating the spectral variance and trends within the time series dataset, yet using mean spectra is faster because fewer spectra are included. In this case, the non-background pixel spectra were first extracted, averaged and concatenated to form a matrix with 252 rows (7 categories × 4 time points × 3 replicates × 3 repeats) and 104 columns (i.e., spectral variables). Instead of using all PCs, we can opt to use only the first few PCs to represent the original dataset; in this case, the first 30 PCs (explaining more than 99% of the variance) are selected.
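The mean-spectrum strategy can be illustrated with a minimal sketch. It assumes PCA is computed by singular value decomposition of the mean-centered matrix, and it uses random numbers as a stand-in for the real mean spectra, so the explained variance it reports will not match the figure of more than 99% quoted above. The function name `pca_scores` and all variable names are hypothetical and not taken from the original work.

```python
import numpy as np

def pca_scores(X, n_components):
    """Mean-center X and project it onto its first n_components PCs (via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions (loadings), ordered by variance.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_ratio = S**2 / np.sum(S**2)
    loadings = Vt[:n_components]
    scores = Xc @ loadings.T
    return scores, loadings, mean, explained_ratio[:n_components]

# Matrix dimensions as described in the text.
n_categories, n_time_points, n_replicates, n_repeats = 7, 4, 3, 3
n_rows = n_categories * n_time_points * n_replicates * n_repeats  # 252 mean spectra
n_variables = 104                                                 # spectral variables

# Placeholder data (assumption): one synthetic "mean spectrum" per row.
rng = np.random.default_rng(0)
mean_spectra = rng.normal(size=(n_rows, n_variables))

# Keep the first 30 PCs, as in the text.
scores, loadings, mean, ratio = pca_scores(mean_spectra, n_components=30)
print(scores.shape, loadings.shape)
print("cumulative explained variance:", ratio.sum())
```

With real data, `mean_spectra` would hold the averaged non-background spectrum of each image, and the resulting `loadings` and `mean` could then be reused to project other spectra onto the same 30 components.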
