Abstract

Multisample covariance estimation—that is, estimation of the covariance matrices associated with k distinct populations—is a classical problem in multivariate statistics. A common solution is to base estimation on the outcome of a test that these covariance matrices show some given pattern. Such a preliminary test may, for example, investigate whether or not the various covariance matrices are equal to each other (test of homogeneity), or whether or not they have common eigenvectors (test of common principal components), etc. Since it is usually unclear what the possible pattern might be, it is natural to consider a collection of such patterns, leading to a collection of preliminary tests, and to base estimation on the outcome of such a multiple testing rule. In the present work, we therefore study preliminary test estimation based on multiple tests. Since this is of interest also outside k-sample covariance estimation, we do so in a very general framework where it is only assumed that the sequence of models at hand is locally asymptotically normal. In this general setup, we define the proposed estimators and derive their asymptotic properties. We come back to k-sample covariance estimation to illustrate the asymptotic and finite-sample behaviors of our estimators. Finally, we treat a real data example that allows us to show their practical relevance in a supervised classification framework.

Highlights

  • The present paper is motivated by the problem of estimating the covariance matrices $\Sigma_1, \ldots, \Sigma_k$ associated with k distinct p-dimensional populations

  • Preliminary multiple-test estimators (PMTEs) dominate their competitors in the vicinity of the considered constraints

  • To demonstrate the practical relevance of PMTEs, we mainly focus on the multisample covariance estimation problem that motivated this work (Section 4)

Summary

Introduction

The present paper is motivated by the problem of estimating the covariance matrices $\Sigma_1, \ldots, \Sigma_k$ associated with k distinct p-dimensional populations. To provide an example in this k-sample covariance estimation framework, let us factorize the k covariance matrices as $\Sigma_\ell = \sigma_\ell^2 V_\ell := (\det \Sigma_\ell)^{1/p} \{\Sigma_\ell / (\det \Sigma_\ell)^{1/p}\}$, $\ell = 1, \ldots, k$, to emphasize their “scale” $\sigma_\ell$ and their “shape” $V_\ell$. With this notation, one may consider the constraints associated with the null hypotheses of scale homogeneity, $\mathcal{H}_0^{\mathrm{scale}}: \sigma_1^2 = \cdots = \sigma_k^2$, and of shape homogeneity, $\mathcal{H}_0^{\mathrm{shape}}: V_1 = \cdots = V_k$. The estimator in (1.2) is a preliminary test estimator (PTE) that involves two constraints, whose intersection is associated with the null hypothesis $\mathcal{H}_0^{\mathrm{cov}}$ of homogeneity of the k covariance matrices. Combining the outcomes of tests for the three null hypotheses allows one to define a three-constraint PMTE of the same nature as in (1.2). Such an estimator formalizes what practitioners would naturally do in the present k-sample covariance estimation setup.
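
The scale/shape factorization and the idea of letting test outcomes decide which components to pool can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the function names `scale_shape` and `pool_by_tests` are hypothetical, the test decisions are passed in as booleans rather than computed, and the pooling rules (geometric mean of scales, renormalized average of shapes) are simple placeholders, not the constrained estimators studied in the paper.

```python
import numpy as np


def scale_shape(sigma):
    """Factorize sigma as scale2 * V with scale2 = det(sigma)^(1/p) and det(V) = 1."""
    p = sigma.shape[0]
    sign, logdet = np.linalg.slogdet(sigma)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    scale2 = np.exp(logdet / p)
    return scale2, sigma / scale2


def pool_by_tests(sigmas, reject_scale, reject_shape):
    """Hypothetical test-then-pool rule (NOT the paper's estimator).

    sigmas: list of k unconstrained covariance estimates.
    reject_scale / reject_shape: outcomes of preliminary tests of scale and
    shape homogeneity, supplied by the caller (assumption).
    """
    parts = [scale_shape(s) for s in sigmas]
    scales2 = np.array([s2 for s2, _ in parts])
    shapes = [v for _, v in parts]
    k = len(sigmas)

    if not reject_scale:
        # Placeholder pooling: common scale = geometric mean of the k scales.
        scales2 = np.full(k, np.exp(np.mean(np.log(scales2))))
    if not reject_shape:
        # Placeholder pooling: average the shapes, then renormalize to det 1.
        _, pooled = scale_shape(np.mean(shapes, axis=0))
        shapes = [pooled] * k

    return [s2 * v for s2, v in zip(scales2, shapes)]


# Two 2-dimensional populations with different scales and shapes.
sigmas = [np.diag([4.0, 1.0]), np.diag([2.0, 8.0])]
scale2, shape = scale_shape(sigmas[0])
pooled = pool_by_tests(sigmas, reject_scale=False, reject_shape=False)
```

When neither test rejects, all returned matrices coincide, which corresponds to imposing homogeneity of the k covariance matrices; when both reject, the unconstrained estimates are returned unchanged.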

Assumptions
Asymptotic results
PMTE based on scale and shape constraints
Comparing PMTE vs single-constraint PTEs
CPC and homogeneity of eigenvalues: a real data example
Final comments
A Comparison with BIC-based model selection
B Technical details