Abstract

The maximum association between two multivariate variables X and Y is defined as the maximal value that a bivariate association measure between one-dimensional projections of X and Y can attain. Taking the Pearson correlation as the projection index results in the first canonical correlation coefficient. We propose to use more robust association measures, such as Spearman’s or Kendall’s rank correlation, or association measures derived from bivariate scatter matrices. We study the robustness of the proposed maximum association measures and of the corresponding estimators of the coefficients yielding the maximum association. In the important special case of Y being univariate, maximum rank correlation estimators yield regression estimators that are invariant against monotonic transformations of the response. We obtain asymptotic variances for this special case. It turns out that maximum rank correlation estimators combine good efficiency and robustness properties. Simulations and a real data example illustrate the robustness of these estimators and their power for handling nonlinear relationships. Supplementary materials for this article are available online.
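
The verbal definition in the abstract can be written out as a formula. The following is a sketch under assumed notation, not a quotation of the article's equation: X and Y are the two multivariate variables, R is the bivariate association measure used as projection index, and the unit-norm constraint on the weighting vectors is one common normalization chosen here for illustration.

```latex
\rho_R(X, Y) = \max_{\|\alpha\| = 1,\ \|\beta\| = 1} R\left(\alpha^\top X,\ \beta^\top Y\right),
\qquad
(\alpha_R, \beta_R) = \operatorname*{argmax}_{\|\alpha\| = 1,\ \|\beta\| = 1} R\left(\alpha^\top X,\ \beta^\top Y\right)
```

With R taken to be the Pearson correlation, the maximum is the first canonical correlation coefficient, as the abstract states.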

Highlights

  • Association between two univariate variables U and V can be measured in several ways

  • The maximum association measure we define in (1) is different from maximal correlation, since (i) we search for optimal linear combinations of the measured variables without transforming them, and (ii) we gain robustness by taking other choices for R than the Pearson correlation coefficient

  • For studying the robustness of the maximum association measures, we carry out influence calculations, showing that the projection index R should have a bounded and smooth influence function

Summary

Introduction

Association between two univariate variables U and V can be measured in several ways. For the maximal correlation between multivariate variables X and Y, the aim is to find optimal measurable transformations of the variables in X and Y such that the first canonical correlation coefficient is maximized. The maximum association measure we define in (1) is different from maximal correlation, since (i) we search for optimal linear combinations of the measured variables without transforming them, and (ii) we gain robustness by taking other choices for R than the Pearson correlation coefficient. This article studies a multivariate association measure and does not aim to provide a fully robustified version of canonical correlation analysis (CCA). A robust alternating regression technique has been used in Branco et al. (2005). They used a projection pursuit-based algorithm to estimate the canonical variates, and they compared the different approaches by means of simulation studies. A fast implementation of this algorithm is made available for the statistical computing environment R in the package ccaPP.
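
To make the idea of searching for an optimal linear combination concrete, here is a minimal Python sketch for the special case of a bivariate X and a univariate Y, with Spearman's rank correlation as the projection index R. It is an illustration only: it uses a plain grid search over directions, which is neither the article's alternate grid algorithm nor the ccaPP implementation, and all names (`ranks`, `spearman`, `max_spearman_direction`) as well as the simulated data are hypothetical. Because the objective depends on the response only through its ranks, applying a strictly increasing transformation to the response leaves the fitted direction unchanged.

```python
import math
import random


def ranks(values):
    """1-based ranks; assumes no ties, which holds for continuous data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r


def pearson(u, v):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def spearman(u, v):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(u), ranks(v))


def max_spearman_direction(x, y, n_angles=360):
    """Grid search over unit vectors (cos t, sin t) with t in [0, pi).

    Taking the absolute value covers the sign choice for the
    univariate y. Returns (maximal |Spearman correlation|, angle t).
    """
    best = (-1.0, 0.0)
    for k in range(n_angles):
        t = math.pi * k / n_angles
        a1, a2 = math.cos(t), math.sin(t)
        proj = [a1 * x1 + a2 * x2 for x1, x2 in x]
        r = abs(spearman(proj, y))
        if r > best[0]:
            best = (r, t)
    return best


# Simulated data (an assumption for illustration): the true direction is
# proportional to (1, 1), i.e. an angle of pi/4.
rng = random.Random(0)
x = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
y = [x1 + x2 + 0.3 * rng.gauss(0, 1) for x1, x2 in x]
y_exp = [math.exp(v) for v in y]  # strictly increasing transformation

fit_raw = max_spearman_direction(x, y)
fit_exp = max_spearman_direction(x, y_exp)
print("raw response:        ", fit_raw)
print("transformed response:", fit_exp)
```

The two fits coincide because the ranks of the response are the same before and after the transformation. A Pearson-based fit would not have this property in general.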

Definitions and Basic Properties
Pearson Correlation
Spearman and Kendall Correlation
M-Association Derived From a Bivariate M-Scatter Matrix
Fisher Consistency and Influence Functions
Asymptotic Variances
Alternate Grid Algorithm
Example
Simulation Experiments
Effect of a Nonlinear Monotonic Transformation
Effect of Contamination
Conclusions