Abstract

The global positioning system (GPS) provides daily coordinate time series that support geodetic and geophysical studies. However, because of logistical problems and equipment malfunctions, GPS time series often contain missing values, especially in polar regions. A consistent and complete time series is a prerequisite for accurate and reliable statistical analysis. Previous imputation studies focused on the temporal relationships within a time series; only a few used spatial relationships and/or machine learning methods. In this study, we impute 20 Greenland GPS time series using missForest, a new machine learning method for data imputation. We assess the imputation performance of missForest against four traditional methods and investigate each method's impact on principal component analysis (PCA). Results show that missForest can impute gaps longer than 30 days, and its imputed time series has the least influence on PCA. When the gap size is 30 days, the mean absolute error (MAE) between the imputed and true values for missForest is 2.71 mm, the normalized root mean squared error (NRMSE) is 0.065, and the distance of the first principal component is 0.013. missForest outperforms the other compared methods: it can effectively restore the information in GPS time series and improve the results of related statistical processes, such as PCA.
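
As a rough illustration of the two error metrics quoted above, the following Python sketch computes MAE and NRMSE for an artificially created 30-day gap. It makes several assumptions that are not taken from the paper: the NRMSE is normalized by the variance of the true values (the convention commonly used with missForest), the series is synthetic, and the gap is filled by simple linear interpolation as a stand-in baseline rather than by missForest itself. The function and variable names are hypothetical.

```python
import numpy as np


def mae(true, imputed):
    """Mean absolute error between true and imputed values."""
    true = np.asarray(true, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.mean(np.abs(imputed - true)))


def nrmse(true, imputed):
    """RMSE normalized by the variance of the true values.

    Assumption: this normalization follows the convention commonly used
    with missForest; the paper's exact definition may differ.
    """
    true = np.asarray(true, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return float(np.sqrt(np.mean((imputed - true) ** 2) / np.var(true)))


# Synthetic daily series (mm): annual cycle plus a linear trend.
days = np.arange(365)
series = 5.0 * np.sin(2.0 * np.pi * days / 365.0) + 0.01 * days

# Remove a 30-day block to mimic a data gap.
gap = np.zeros(days.size, dtype=bool)
gap[100:130] = True

# Baseline fill: linear interpolation across the gap (not missForest).
filled = series.copy()
filled[gap] = np.interp(days[gap], days[~gap], series[~gap])

# Score only the values that were removed and then imputed.
gap_mae = mae(series[gap], filled[gap])
gap_nrmse = nrmse(series[gap], filled[gap])
print(f"MAE = {gap_mae:.3f} mm, NRMSE = {gap_nrmse:.3f}")
```

Scoring only the masked positions keeps the untouched observations from diluting the error, which is why both metrics are evaluated on `series[gap]` rather than on the whole series.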

Highlights

  • Modern data measurement and acquisition consistently encounter the problem of missing data

  • We verify the effectiveness of OOB in global positioning system (GPS) time series imputation

  • When the gap size increases to 7 days, the results of orthogonal polynomial worsen, and its MAE and NRMSE values increase to 6.27 and



Introduction

Modern data measurement and acquisition consistently encounter the problem of missing data. Because of logistical problems and equipment malfunctions, missing values are common in GPS daily time series [1]. Most conventional time series analysis methods, such as wavelet transform [2], principal/independent component analysis [3,4,5], and spectrum analysis [6], require complete data. This requirement forces geodetic researchers who wish to analyze GPS time series further to choose between imputing and discarding missing data. We use the term “imputation” instead of “interpolation”, the term more common in geodetic studies, and adopt the definition that imputation fills in missing values in a dataset, whereas interpolation predicts values at unsampled locations [8].
