Abstract

This paper considers the statistical problems of editing and imputing data in multiple time series generated by repetitive surveys. The case under study is the Survey of Cattle Slaughter in Mexico’s Municipal Abattoirs. The proposed procedure consists of two phases. First, the data of each abattoir are edited to correct gross inconsistencies. Second, the missing data are imputed by means of restricted forecasting. This method uses all the historical and current information available for the abattoir, together with multiple time series models, to obtain efficient estimates of the missing data. Some empirical examples illustrate the usefulness of the method in practice.

Highlights

  • The National Institute of Statistics and Geography (INEGI) carries out the Survey of Cattle Slaughter in Mexico’s Municipal Abattoirs (ESGRM for its name in Spanish)

  • Even though INEGI puts a lot of effort into collecting and publishing trustworthy data, the quality of some statistical figures published by this official statistical agency can still be greatly improved

  • Such is the case of the data generated by the ESGRM, since this survey presents the typical problems of: (1) inconsistency of the collected data, and (2) missing data

Summary

Introduction

Even though INEGI puts a lot of effort into collecting and publishing trustworthy data, the quality of some statistical figures published by this official statistical agency can still be greatly improved. Such is the case of the data generated by the ESGRM, since this survey presents two typical problems: (1) inconsistency of the collected data (the informant at the abattoir responded to the questionnaire, but the answers are not considered valid according to the criteria used to verify the information), and (2) missing data (at least one of the variables requested in the questionnaire has no value). All these works suggest building an Auto-Regressive Integrated Moving Average (ARIMA) model for the available data and using all the data (observed before and after the missing values) to obtain efficient estimates of the unobserved values.
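
To illustrate why using the observations on both sides of a gap matters, the following is a minimal sketch under strong simplifying assumptions: a single series following a stationary AR(1) process with known coefficient `phi` and known mean, and exactly one missing value. It is not the multivariate model used in the paper; the function names `impute_ar1_gap` and `forecast_only` are hypothetical and chosen only for this illustration. For such a process, the conditional mean of the missing value given its two neighbours is the mean plus phi / (1 + phi^2) times the sum of the neighbours' deviations from the mean.

```python
def impute_ar1_gap(before, after, phi, mean=0.0):
    """Estimate one missing value of a stationary AR(1) series.

    Uses the observation just before and just after the gap.
    Assumes phi and mean are known (an assumption of this sketch).
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("phi must lie strictly between -1 and 1")
    # Weight given to each neighbour's deviation from the mean.
    weight = phi / (1.0 + phi ** 2)
    return mean + weight * ((before - mean) + (after - mean))


def forecast_only(before, phi, mean=0.0):
    """One-step-ahead AR(1) forecast, ignoring the later observation."""
    return mean + phi * (before - mean)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(impute_ar1_gap(2.0, 4.0, phi=0.5))
    print(forecast_only(2.0, phi=0.5))
```

The plain forecast ignores the value observed after the gap, whereas the two-sided estimate is pulled toward it; this is the sense in which using data observed before and after the missing values yields more efficient estimates.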

Preliminary Data Analysis
Definition of concepts
Transformation of variables
Statistical Methodology
VAR model
Restricted forecasts
Forecasts in the original scale
Compatibility test
Restricted forecasts for the ESGRM
Building the VAR Model
Degree of differencing
Likelihood ratio tests
Model validation
Application of the Method
Edition of data
Edition and imputation of missing or inconsistent data
Simulation
Findings
Final Considerations