Abstract

This study evaluated the impact of three methods of handling missing data on the performance of the Phase I Hotelling’s T² multivariate control chart. Using a Monte Carlo simulation, we studied the average, median, and standard deviation of the run length for multivariate data imputed using mean substitution, regression imputation, and predictive mean matching at three levels of missingness (1%, 10%, and 25%) and three levels of correlation between variables (0.2, 0.4, and 0.8). We found that predictive mean matching yields average run length results comparable to those of the complete in-control data set at all levels of missingness and correlation, whereas the performance of mean substitution was adversely affected by high levels of missingness and by strong correlation between variables. Based on the simulation (multivariate normal data), we concluded that predictive mean matching is superior to both regression imputation and mean substitution as a method for imputing missing values for the analysis of the Phase I Hotelling’s T² control chart. Two applications are presented, using the Altenrhein wastewater treatment plant and olive oil data sets.
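To make the setup concrete, the following is a minimal sketch (not the authors' code) of the simplest of the three methods, mean substitution, followed by the Phase I Hotelling’s T² statistic for individual observations. The function names `mean_substitute` and `hotelling_t2`, the sample size, the number of variables, the random seed, and the missing-completely-at-random mechanism are all assumptions chosen for illustration; only the 10% missingness level and the 0.4 correlation are taken from the abstract.

```python
import numpy as np

def mean_substitute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)        # column means, ignoring NaNs
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def hotelling_t2(X):
    """Phase I T² for each row, using the sample mean and covariance of X itself."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))   # unbiased sample covariance
    # Quadratic form (x_i - xbar)' S^{-1} (x_i - xbar) for every row i
    return np.einsum("ij,jk,ik->i", centered, S_inv, centered)

# Illustrative settings (assumed): 3 variables, 50 observations, correlation 0.4
rng = np.random.default_rng(0)
p, m, rho = 3, 50, 0.4
cov = np.full((p, p), rho) + (1 - rho) * np.eye(p)   # unit variances, equal correlation
X_full = rng.multivariate_normal(np.zeros(p), cov, size=m)

# Delete about 10% of the entries completely at random (assumed mechanism)
X_miss = X_full.copy()
X_miss[rng.random(X_miss.shape) < 0.10] = np.nan

X_imp = mean_substitute(X_miss)
t2 = hotelling_t2(X_imp)
```

Because every imputed entry sits exactly at its column mean, mean substitution shrinks the estimated variances and covariances. That is one plausible reason its chart performance degrades as missingness and correlation increase, and it is what regression imputation and predictive mean matching try to avoid by using the other variables to inform the imputed values.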

