Abstract
Training Deep Learning (DL) models with missing labels is a challenge in diverse engineering applications. Missing value imputation methods have been proposed to address this problem, but their performance degrades under a Massive Proportion of Missing Labels (MPML). This paper presents an approach for handling MPML in Multivariate Long-Term Time Series Forecasting. It is a two-step process in which interpolation (using Gaussian Process Regression (GPR) and domain knowledge from experts) is separated from the prediction model to enable the integration of prior domain knowledge. First, the GPR generates a set of samples of possible interpolations of the missing outputs, based on the domain knowledge. Second, the observed input sensor data and the interpolated labels from the GPR are used to train the prediction model. We evaluated our approach by developing a soft sensor on one real dataset to forecast biomass during recombinant adeno-associated virus (rAAV) production in bioreactors. Our experimental results demonstrate the potential of the approach through quantitative evaluation of the generated forecasts in a case where training a DL model would otherwise be extremely difficult due to MPML.
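The two-step process can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the RBF kernel, the logistic `prior_mean` standing in for expert knowledge, the synthetic data, and all numeric settings are assumptions made for illustration, and a closed-form ridge regression replaces the DL prediction model to keep the example short.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=5.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of time points."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def prior_mean(t):
    """Hypothetical expert prior: biomass follows a logistic growth curve."""
    return 10.0 / (1.0 + np.exp(-0.2 * (t - 24.0)))

def sample_label_trajectories(t_obs, y_obs, t_all, n_samples, noise=1e-2, seed=0):
    """Step 1: draw plausible label trajectories from the GP posterior."""
    K = rbf_kernel(t_obs, t_obs) + noise * np.eye(len(t_obs))
    K_s = rbf_kernel(t_obs, t_all)
    K_ss = rbf_kernel(t_all, t_all)
    solved = np.linalg.solve(K, K_s)  # K^{-1} K_s, shape (n_obs, n_all)
    # Posterior mean: prior mean corrected by the residuals at observed points.
    mean = prior_mean(t_all) + solved.T @ (y_obs - prior_mean(t_obs))
    cov = K_ss - K_s.T @ solved
    # Symmetrize and add jitter for numerical stability.
    cov = 0.5 * (cov + cov.T) + 1e-8 * np.eye(len(t_all))
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_samples, method="svd")

def fit_ridge(X, Y_samples, alpha=1e-3):
    """Step 2 (stand-in for the DL model): ridge regression on the sensor
    inputs paired with every sampled label trajectory."""
    n_samples = Y_samples.shape[0]
    X_b = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    X_rep = np.tile(X_b, (n_samples, 1))            # one copy of X per sample
    y_rep = Y_samples.reshape(-1)                   # row-major, matches X_rep
    A = X_rep.T @ X_rep + alpha * np.eye(X_b.shape[1])
    return np.linalg.solve(A, X_rep.T @ y_rep)

# Synthetic setting: 48 hourly sensor readings, only 4 labels observed.
t_all = np.arange(0.0, 48.0, 1.0)
obs_idx = np.array([0, 16, 32, 47])
t_obs = t_all[obs_idx]
y_obs = prior_mean(t_obs) + 0.3  # synthetic offline measurements

data_rng = np.random.default_rng(1)
X = np.column_stack([
    prior_mean(t_all) + data_rng.normal(0.0, 0.1, t_all.size),  # noisy sensor
    t_all / 48.0,                                               # scaled time
])

Y_samples = sample_label_trajectories(t_obs, y_obs, t_all, n_samples=20)
weights = fit_ridge(X, Y_samples)
pred = np.hstack([X, np.ones((X.shape[0], 1))]) @ weights
```

Training on several sampled trajectories, rather than on the posterior mean alone, exposes the prediction model to the uncertainty of the interpolation between the few observed labels.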