The modeling capabilities of a Gaussian Process (GP), such as generalization, nonlinearity, and smoothness, are largely determined by the choice of its kernel. Spectral mixture (SM) kernels, a popular family of kernels for GPs, have the desirable property that with a sufficiently large number of spectral components they can approximate any stationary kernel. However, using many SM components increases the risk of overfitting and hinders interpretability. To overcome these challenges, we propose a compression algorithm for GPs that combines component pruning with component merging: SM components with small signal variance are removed, and a moment-matching merge method is proposed to further reduce the number of SM components. The main novelty of the proposed method is a similarity measure between SM components based on their normalized cross-correlation, which is related to the Bhattacharyya coefficient. We derive a greedy GP compression algorithm and perform a comparative evaluation over various learning tasks in terms of forecasting performance and compression capability. The results substantiate the beneficial effect of the method, both in terms of generalization and interpretability. Source code: https://github.com/ck2019ML/Gaussian-Process-Model-Compression
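The prune-and-merge scheme summarized above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each SM component is represented by a weighted univariate Gaussian in the spectral domain, a triple `(w, mu, var)` of signal variance (weight), spectral mean, and spectral variance. Pruning drops components with negligible weight; merging replaces the most similar pair with its moment-matched Gaussian; and pairwise similarity is scored with the closed-form Bhattacharyya coefficient between Gaussians, a stand-in for the paper's normalized cross-correlation measure. The threshold names and values are illustrative assumptions.

```python
# Hedged sketch of greedy SM-component compression (assumed representation:
# each component is (weight, spectral mean, spectral variance)).
import numpy as np


def prune(components, min_weight=1e-3):
    """Remove components whose signal variance (weight) is negligible."""
    return [c for c in components if c[0] >= min_weight]


def bhattacharyya(c1, c2):
    """Bhattacharyya coefficient between two univariate Gaussians (weights ignored)."""
    _, m1, v1 = c1
    _, m2, v2 = c2
    return np.sqrt(2.0 * np.sqrt(v1 * v2) / (v1 + v2)) * \
        np.exp(-0.25 * (m1 - m2) ** 2 / (v1 + v2))


def merge_pair(c1, c2):
    """Moment-matching merge: preserve total weight, mean, and second moment."""
    w1, m1, v1 = c1
    w2, m2, v2 = c2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2)) / w - m ** 2
    return (w, m, v)


def compress(components, min_weight=1e-3, sim_threshold=0.9):
    """Greedy compression: prune, then repeatedly merge the most similar pair."""
    comps = prune(components, min_weight)
    while len(comps) > 1:
        pairs = [(i, j) for i in range(len(comps)) for j in range(i + 1, len(comps))]
        i, j = max(pairs, key=lambda p: bhattacharyya(comps[p[0]], comps[p[1]]))
        if bhattacharyya(comps[i], comps[j]) < sim_threshold:
            break  # remaining components are sufficiently distinct
        merged = merge_pair(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps
```

The moment-matching step follows the standard mixture-reduction identity: the merged variance absorbs both the component variances and the spread of their means, so the merged Gaussian matches the first two moments of the pair it replaces.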