Abstract

In chemometrics and other areas of data analysis, many studies focus on developing new methods for statistical inference and prediction. The properties of new methods must be documented, and simulated data are often used for this purpose. However, there are few standard approaches to simulating data. In this paper we propose a transparent and versatile method for simulating response and predictor data from a multiple linear regression model, which we hope may serve as a standard tool for simulating linear model data. The approach uses the principle of a relevant subspace for prediction, known from both Partial Least Squares and envelope models, and is essentially a re-parametrization of the random-x regression model. It also allows the analyst to define a subset of relevant observable predictor variables that span the relevant latent subspace, which is useful for exploring variable selection methods. The data properties are controlled by a small set of input parameters specified by the analyst. The approach can be used to simulate a wide variety of data with varying properties in order to compare statistical methods. The method has been implemented in an R package, and its use is illustrated by examples.
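
To make the idea of a relevant subspace concrete, the following is a minimal sketch in Python rather than the R package mentioned above, and it is not the authors' implementation. It draws independent latent components, lets only a chosen subset of them carry signal for the response, and rotates the components into observable predictors. The function names (`random_orthogonal`, `simulate_linear_data`), the parameter names, the exponentially decaying eigenvalues, and the equal split of the explained variance across relevant components are all assumptions made for illustration.

```python
import math
import random


def random_orthogonal(p, rng):
    """Build a p x p orthogonal matrix by Gram-Schmidt on Gaussian vectors.

    Returned as a list of p orthonormal column vectors.
    """
    cols = []
    while len(cols) < p:
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for c in cols:
            proj = sum(a * b for a, b in zip(v, c))
            v = [a - proj * b for a, b in zip(v, c)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip the (very unlikely) degenerate draw
            cols.append([a / norm for a in v])
    return cols


def simulate_linear_data(n, p, relevant, r2, decay=0.5, seed=0):
    """Simulate (X, y, beta) with Var(y) = 1 and population R^2 = r2.

    relevant: 0-based indices of latent components that carry signal.
    decay:    eigenvalue j of Cov(x) is exp(-decay * j) (assumed form).
    r2:       must lie in [0, 1].
    """
    rng = random.Random(seed)
    lam = [math.exp(-decay * j) for j in range(p)]
    rot = random_orthogonal(p, rng)

    # Coefficients in latent coordinates: zero outside the relevant
    # subspace; each relevant component explains r2 / m of Var(y)
    # (an illustrative assumption, not taken from the paper).
    m = len(relevant)
    alpha = [0.0] * p
    for j in relevant:
        alpha[j] = math.sqrt(r2 / (m * lam[j]))

    # Coefficients for the observable predictors: beta = R alpha.
    beta = [sum(rot[j][i] * alpha[j] for j in range(p)) for i in range(p)]

    noise_sd = math.sqrt(1.0 - r2)
    X, y = [], []
    for _ in range(n):
        # Latent components z_j ~ N(0, lam_j), independent.
        z = [rng.gauss(0.0, math.sqrt(lam[j])) for j in range(p)]
        # Observable predictors x = R z.
        x = [sum(rot[j][i] * z[j] for j in range(p)) for i in range(p)]
        signal = sum(alpha[j] * z[j] for j in range(p))
        X.append(x)
        y.append(signal + rng.gauss(0.0, noise_sd))
    return X, y, beta


if __name__ == "__main__":
    X, y, beta = simulate_linear_data(n=100, p=10, relevant=[0, 2], r2=0.7)
    print(len(X), len(X[0]), len(y))
```

In this sketch a handful of inputs (sample size, number of predictors, positions of the relevant components, the target R², and the eigenvalue decay) determine the properties of the simulated data, which mirrors the small set of input parameters described above. Placing the relevant components at small eigenvalues, or increasing `decay`, would be one way to produce data that are harder to predict.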
