Abstract

Missing observations may present several problems for statistical analyses of datasets if they are not accounted for. This paper concerns a model-based missing data analysis procedure for estimating the parameters of regression models fit to datasets with missing observations. Both autoregressive-exogenous (ARX) and autoregressive (AR) models are considered. These models are used to simulate datasets and are also fit to existing structural vibration data, after which observations are removed. A missing data analysis is then performed using maximum-likelihood estimation, the expectation-maximization (EM) algorithm, and the Kalman filter to fill in the missing observations and estimate the regression parameters, which are compared with estimates obtained from the complete datasets. Regression parameters from these fits to structural vibration data can thereby be used as damage-sensitive features. Favorable conditions for accurate parameter estimation are found to include lower percentages of missing data, parameters of similar magnitude to one another, and selected model orders close to the true order of the dataset. Favorable conditions for dataset reconstruction are found to include random and periodic missing data patterns, lower percentages of missing data, and proper model order selection. The algorithm is particularly robust to varied noise levels.
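
The role the Kalman filter plays when observations are missing can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a scalar AR(1) state observed with additive noise, treats the noise variances `q` and `r` as known, and replaces the EM step with a plain grid search over the AR coefficient. The function name `kalman_ar1` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def kalman_ar1(y, phi, q, r):
    """Filter a scalar AR(1) state observed with noise; NaN marks a missing sample.

    Assumed model (illustrative only):
        x[t] = phi * x[t-1] + w,  w ~ N(0, q)
        y[t] = x[t] + v,          v ~ N(0, r)
    Returns the filtered state means and the log-likelihood of the observed samples.
    """
    n = len(y)
    x_filt = np.empty(n)
    # Start from the stationary distribution of the state (requires |phi| < 1).
    mean, var = 0.0, q / (1.0 - phi**2)
    loglik = 0.0
    for t in range(n):
        if t > 0:
            # Prediction step: always performed.
            mean, var = phi * mean, phi**2 * var + q
        if not np.isnan(y[t]):
            # Update step: performed only when the sample was observed.
            s = var + r
            innov = y[t] - mean
            loglik += -0.5 * (np.log(2.0 * np.pi * s) + innov**2 / s)
            gain = var / s
            mean = mean + gain * innov
            var = (1.0 - gain) * var
        # For a missing sample the prediction itself is the fill-in value.
        x_filt[t] = mean
    return x_filt, loglik

# Simulate an AR(1) series, then remove about 20% of the samples at random.
rng = np.random.default_rng(0)
true_phi, q, r, n = 0.8, 1.0, 0.1, 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = true_phi * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = x + rng.normal(scale=np.sqrt(r), size=n)
y_missing = y.copy()
y_missing[rng.random(n) < 0.2] = np.nan

# Maximum-likelihood estimate of phi by grid search (a stand-in for EM).
grid = np.linspace(0.05, 0.95, 91)
phi_hat = grid[np.argmax([kalman_ar1(y_missing, p, q, r)[1] for p in grid])]
x_hat, _ = kalman_ar1(y_missing, phi_hat, q, r)
```

Skipping the update step at missing samples means that only observed samples contribute to the likelihood, while the one-step prediction supplies a value wherever a sample is absent. An EM-based procedure would additionally re-estimate the parameters from smoothed state estimates rather than searching a grid.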

Highlights

  • A fundamental task in a variety of fields is extracting useful statistical information from time series data

  • We do not explore loss of the entire dataset, as the algorithm requires at least a portion of the data in order to run. This algorithm joins a relatively short list of those dedicated to regression models, and ARX models, subject to missing observations

  • An algorithm was proposed to identify parameters of both ARX and AR models fit to datasets with missing observations


Summary

Introduction

A fundamental task in a variety of fields is extracting useful statistical information from time series data. In certain applications, the datasets are faced with the possibility of missing measurements. These may result from network communication disruptions, malfunctioning sensing equipment, improper sampling protocol, or observation patterns inherent to the data collection schemes (Little and Rubin, 2002; Matarazzo and Pakzad, 2015). Missing data may present several problems for statistical analyses and for the decisions made as a result of those analyses. If missing value indicators are not present in a data analysis package, inferences about the system being sampled can be biased. Similarly biased inferences may result if missing observations are ignored when an observation’s missingness is a function of its value, for example when the missing observations differ from typical values by orders of magnitude. From a cost perspective, it is not desirable to spend time and resources collecting data that eventually goes unused.

