Gaussian processes (GPs) are a powerful and popular framework for machine learning problems, particularly for time-dependent data such as that generated by the Internet of Things (IoT). GPs are a compelling choice for constructing real-valued nonlinear models due to their flexibility and built-in uncertainty quantification. However, exact GP inference scales cubically in the number of observations, making it impractical for the massive and potentially unbounded datasets commonly encountered in IoT applications. To address this issue, researchers have developed various sparse approximation methods that significantly reduce the computational burden of GPs. Among these, pseudo-point approximations have proven highly influential, summarizing the full training set with a small set of pseudo-points (inducing points). The variational sparse GP is a state-of-the-art approach that approximates the posterior distribution of GP models, enabling faster and more efficient predictions on streaming time series. However, integrating variational inference into the GP framework for sequentially arriving data remains a significant challenge, particularly for time series whose underlying data distribution evolves over time. In this paper, we propose the OnLine Variational Gaussian Process (OLVGP) algorithm, which introduces a novel approach for dynamically managing the number of inducing points based on the concept of eigenfunction inducing features. Unlike traditional methods that rely on a fixed number of inducing points, OLVGP adaptively adjusts the number of inducing points as new data arrives and optimizes their locations within the model, keeping computation efficient while maintaining high predictive accuracy. Our method builds on the online variational inference framework, yielding a principled derivation and implementation. We validate the effectiveness and efficiency of our approach through both synthetic examples and real-world experiments. The results demonstrate that OLVGP not only substantially reduces computational costs compared to traditional sparse GP methods but also dynamically adapts to evolving data, delivering improved performance in time series prediction.
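To make the streaming setting concrete, the toy NumPy sketch below illustrates the general mechanism the abstract describes: a sparse GP regressor that keeps a small inducing set and adds a new inducing point only when an incoming observation is poorly covered by the existing ones. This is an illustrative sketch under assumed design choices, not the paper's OLVGP algorithm (which uses eigenfunction inducing features and a full online variational treatment); the RBF kernel, the correlation-based coverage rule, and the `add_threshold` parameter are assumptions made for the example.

```python
# Illustrative toy, NOT the paper's OLVGP: a streaming sparse GP whose
# inducing set grows adaptively. The kernel, the coverage rule, and all
# parameter values below are assumptions made for this sketch.
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between two sets of 1-D inputs.
    d = X1[:, None] - X2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

class StreamingSparseGP:
    def __init__(self, noise=0.1, lengthscale=1.0, add_threshold=0.5):
        self.noise, self.ls, self.tau = noise, lengthscale, add_threshold
        self.Z = np.empty(0)  # inducing inputs; grows as data arrives
        self.X = np.empty(0)  # kept only for the toy; a real online method
        self.y = np.empty(0)  # would compress these into statistics at Z

    def update(self, x_new, y_new):
        # Absorb a new batch; add an inducing point wherever an incoming
        # input is weakly correlated with every existing inducing point.
        self.X = np.concatenate([self.X, x_new])
        self.y = np.concatenate([self.y, y_new])
        for x in x_new:
            if self.Z.size == 0 or rbf(np.array([x]), self.Z, self.ls).max() < self.tau:
                self.Z = np.append(self.Z, x)

    def predict(self, Xs):
        # Predictive mean through the inducing set, using the closed-form
        # optimal variational mean for a Gaussian likelihood (Titsias, 2009).
        Kzz = rbf(self.Z, self.Z, self.ls) + 1e-8 * np.eye(self.Z.size)
        Kzx = rbf(self.Z, self.X, self.ls)
        A = Kzz + Kzx @ Kzx.T / self.noise ** 2   # Kzz + noise^-2 Kzx Kxz
        m = Kzz @ np.linalg.solve(A, Kzx @ self.y) / self.noise ** 2
        return rbf(Xs, self.Z, self.ls) @ np.linalg.solve(Kzz, m)

# Usage: five sequentially arriving batches; the inducing set grows only
# as the input stream covers new regions of the domain.
rng = np.random.default_rng(0)
model = StreamingSparseGP()
for t in range(5):
    x = rng.uniform(t, t + 1, size=20)
    model.update(x, np.sin(x) + 0.1 * rng.standard_normal(20))
print(f"{model.Z.size} inducing points after 100 observations")
print(model.predict(np.linspace(0.0, 5.0, 4)))
```

The coverage rule here is the simplest possible stand-in for a principled criterion: because the inducing set grows with the covered input region rather than with the number of observations, per-step cost stays far below the cubic cost of exact GP inference on the full stream.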