Abstract

In recent years, research into cluster-weighted models has been intense. However, estimating the covariance matrix of the maximum likelihood estimator under a cluster-weighted model is still an open issue. Here, an approach is developed in which information-based estimators of such a covariance matrix are obtained from the incomplete-data log-likelihood of the multivariate Gaussian linear cluster-weighted model. To this end, analytical expressions for the score vector and Hessian matrix are provided. Three estimators of the asymptotic covariance matrix of the maximum likelihood estimator, based on the score vector and Hessian matrix, are introduced. The performance of these estimators is evaluated numerically on simulated datasets and compared with that of a bootstrap-based estimator; their usefulness is illustrated through a study evaluating the link between tourism flows and attendance at museums and monuments in two Italian regions.

Highlights

  • Cluster-weighted models constitute an approach to regression analysis with random covariates in which supervised and unsupervised learning methods are jointly exploited (Hastie et al., 2009)

  • In order to evaluate the properties of Cov1(θ), Cov2(θ) and Cov3(θ) in comparison with the estimators based on the parametric bootstrap and the approach implemented in flexCWM, five Monte Carlo studies have been performed

  • As already mentioned in the “Introduction”, three information-based estimators of the covariance matrix of the maximum likelihood (ML) estimator for finite normal mixture models were developed by Boldea and Magnus (2009): two of them are based on the gradient vector and the Hessian matrix of the incomplete log-likelihood under a normal mixture model; the third estimator exploits the sandwich approach

Summary

Introduction

Cluster-weighted models constitute an approach to regression analysis with random covariates in which supervised (regression) and unsupervised (model-based cluster analysis) learning methods are jointly exploited (Hastie et al., 2009). The overall computational process associated with bootstrap techniques can become time-consuming and complex because of difficulties typically encountered when fitting finite mixture models (e.g. label-switching problems, possible convergence failures of the EM algorithm on the bootstrap samples). These inconveniences could be avoided through an approach in which the observed information matrix is obtained from the incomplete-data log-likelihood and employed to compute information-based estimators of the covariance matrix of the ML estimator. To make it possible to properly assess both the variability of, and the covariance between, the ML estimates of all the parameters under multivariate linear normal cluster-weighted models with a multivariate response, the gradient vector and second-order derivative matrix of the incomplete-data log-likelihood for these models are explicitly derived here. These results are used to obtain three estimators of the observed information matrix and of the covariance matrix of the ML estimator. Technical details and additional results from the analysis of simulated datasets can be found in a separate document provided as supplementary material.
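
As a reading aid, the generic textbook forms of information-based covariance estimators built from a score vector and a Hessian matrix can be sketched as follows. The notation is illustrative and is an assumption rather than the paper's own: f denotes the joint density of covariates and responses under the model, s_i the score contribution of observation i, and H the Hessian of the incomplete-data log-likelihood. The labels V_H, V_S and V_HS are hypothetical; this summary does not state how they map onto the three estimators proposed in the paper.

```latex
% Illustrative notation only; symbols and labels are assumptions, not the paper's.
\begin{align*}
  % Incomplete-data log-likelihood for a sample (x_i, y_i), i = 1, ..., n
  \ell(\theta) &= \sum_{i=1}^{n} \log f(x_i, y_i; \theta), \\
  % Score contribution of observation i and Hessian of the log-likelihood
  s_i(\theta) &= \frac{\partial \log f(x_i, y_i; \theta)}{\partial \theta},
  \qquad
  H(\theta) = \frac{\partial^{2} \ell(\theta)}{\partial \theta \, \partial \theta^{\top}}, \\
  % Hessian-based estimator, evaluated at the ML estimate
  \widehat{V}_{H} &= \bigl[-H(\hat{\theta})\bigr]^{-1}, \\
  % Score-based (outer-product) estimator
  \widehat{V}_{S} &= \Bigl[\sum_{i=1}^{n} s_i(\hat{\theta})\, s_i(\hat{\theta})^{\top}\Bigr]^{-1}, \\
  % Sandwich estimator combining the two
  \widehat{V}_{HS} &= \bigl[-H(\hat{\theta})\bigr]^{-1}
     \Bigl[\sum_{i=1}^{n} s_i(\hat{\theta})\, s_i(\hat{\theta})^{\top}\Bigr]
     \bigl[-H(\hat{\theta})\bigr]^{-1}.
\end{align*}
```

Under a correctly specified model the three forms estimate the same asymptotic covariance matrix; the sandwich form is the one usually preferred when the model may be misspecified.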

Score Vector and Hessian Matrix of Gaussian Linear Cluster-Weighted Models
Covariance Matrix Estimation of the ML Estimator
Numerical Study of the Properties of the Proposed Estimators
A Comparison with Some Estimators Under Normal Mixtures
Analysing Regional Tourism Data in Italy
Conclusions