The Deflation-Based FastICA Estimator: Statistical Analysis Revisited
This paper provides a rigorous statistical analysis of the deflation-based FastICA estimator, where the independent components (ICs) are extracted sequentially. The focus is on two aspects of the estimator: its robustness against outliers, as measured by the influence function (IF), and its asymptotic relative efficiency (ARE), as measured by the ratio of the asymptotic variances of the FastICA estimator and the optimal maximum likelihood estimator (MLE). The derived compact closed-form expression of the IF reveals the vulnerability of the FastICA estimator to outliers regardless of the nonlinearity used. A cautionary finding is that even a moderate outlier in certain directions can render the estimator deficient in the sense that its separation performance becomes worse than a plain guess. The IF allows the derivation of a compact closed-form expression for the asymptotic covariance matrix of the FastICA estimator and subsequently its AREs. The ARE figures calculated for some selected source distributions illustrate that the order in which the ICs are found is crucial, as the accuracy of the previously extracted components can strongly affect the accuracy of the successive deflation stages.
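For orientation, sequential (deflation-based) extraction can be sketched as follows. This is a minimal textbook-style illustration, not the paper's analysis: it assumes whitened data, the tanh nonlinearity, and Gram-Schmidt deflation, and the names `whiten` and `deflation_fastica`, the mixing matrix `A`, and the source distributions are all hypothetical choices made for the example.

```python
import numpy as np

def whiten(X):
    """Center X (n_samples x n_features) and transform it to identity covariance."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

def deflation_fastica(Z, n_components, max_iter=200, tol=1e-8, seed=0):
    """Estimate unmixing directions one at a time from whitened data Z."""
    rng = np.random.default_rng(seed)
    p = Z.shape[1]
    W = np.zeros((n_components, p))
    for k in range(n_components):
        w = rng.standard_normal(p)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            y = Z @ w
            g = np.tanh(y)              # nonlinearity (assumed choice)
            g_prime = 1.0 - g ** 2      # its derivative
            w_new = (Z * g[:, None]).mean(axis=0) - g_prime.mean() * w
            # Deflation step: remove the directions found in earlier stages,
            # so errors in those directions carry over to this stage.
            w_new -= W[:k].T @ (W[:k] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol  # sign flips allowed
            w = w_new
            if converged:
                break
        W[k] = w
    return W

# Toy demo: one sub-Gaussian and one super-Gaussian source, linearly mixed.
rng = np.random.default_rng(1)
n = 5000
S = np.column_stack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])  # hypothetical mixing matrix
Z = whiten(S @ A.T)
W = deflation_fastica(Z, n_components=2)
S_hat = Z @ W.T
# Absolute correlations between estimated and true sources.
corr = np.abs(np.corrcoef(S_hat.T, S.T)[:2, 2:])
```

Because each new direction is orthogonalized against the earlier ones, the rows of `W` are orthonormal by construction, which is also why the extraction order matters for accuracy.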
- Research Article
13
- 10.1080/10485250802447981
- Jan 1, 2009
- Journal of Nonparametric Statistics
Depth functions are increasingly being used in building nonparametric outlier detectors and in constructing useful nonparametric statistics such as depth-weighted L-statistics (DL-statistics). Robustness of a depth function is an essential property for such applications. Here, robustness of three key depth functions, spatial, simplicial, and generalised Tukey, is explored via the influence function (IF) approach. For all three depths, the IFs are derived and found to be bounded, an important robustness property, and are applied to evaluate two other robustness features, gross error sensitivity and local shift sensitivity. These IFs are also used as components of the IFs of associated DL-statistics, for which through a standard approach consistency and asymptotic normality are then derived. In turn, the asymptotic normality is applied to obtain asymptotic relative efficiencies (ARE). For spatial depth, two forms of weight function suggested in the recent literature are considered and AREs in comparison with the mean are obtained. For all three depths and one of these weight functions, finite sample REs are obtained by simulation under normal, contaminated normal, and heavy-tailed t distributions. As a technical tool of general interest, needed here with the simplicial depth, the IF of a general U-statistic is derived.
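As a small illustration of one of the depths named above, the sample spatial depth can be sketched in a few lines. This assumes the commonly used definition, one minus the norm of the average unit vector pointing from the data to the query point; the function name `spatial_depth` and the simulated data are hypothetical, and the snippet does not reproduce the paper's IF derivations.

```python
import numpy as np

def spatial_depth(x, X):
    """Sample spatial depth of x w.r.t. the rows of X (assumed definition)."""
    diffs = x - X
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0                       # skip data points equal to x
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.sum(axis=0) / len(X))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))

d_center = spatial_depth(np.zeros(2), X)          # deep point
d_far = spatial_depth(np.array([10.0, 10.0]), X)  # outlying point

# Adding one gross outlier shifts the average unit vector by at most
# 2 / (n + 1), so the depth of a fixed point barely changes.
X_contaminated = np.vstack([X, [1e6, 1e6]])
shift = abs(spatial_depth(np.zeros(2), X_contaminated) - d_center)
```

The small value of `shift` is a finite-sample counterpart of the boundedness property discussed in the abstract: a single observation contributes only a unit vector, however far away it lies.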
- Conference Article
10
- 10.1109/ssp.2009.5278485
- Aug 1, 2009
- 2009 IEEE/SP 15th Workshop on Statistical Signal Processing
This paper provides a rigorous statistical robustness analysis of the deflation-based FastICA estimator by deriving a compact closed-form expression for its influence function (IF). The IF reveals the vulnerability of the FastICA estimator to outliers regardless of the nonlinearity used. A cautionary finding is that even a moderate outlier in certain directions can render the estimator deficient, i.e. having a separation performance worse than a plain guess. Based on the IF, a novel compact closed-form expression of the asymptotic covariance matrix of the FastICA estimator is also derived.
- Research Article
65
- 10.1111/j.0006-341x.1999.00338.x
- Jun 1, 1999
- Biometrics
Misclassification of exposure variables is a common problem in epidemiologic studies. This paper compares the matrix method (Barron, 1977, Biometrics 33, 414-418; Greenland, 1988a, Statistics in Medicine 7, 745-757) and the inverse matrix method (Marshall, 1990, Journal of Clinical Epidemiology 43, 941-947) to the maximum likelihood estimator (MLE) that corrects the odds ratio for bias due to a misclassified binary covariate. Under the assumption of differential misclassification, the inverse matrix method is always more efficient than the matrix method; however, the efficiency depends strongly on the values of the sensitivity, specificity, baseline probability of exposure, the odds ratio, case-control ratio, and validation sampling fraction. In a study on sudden infant death syndrome (SIDS), an estimate of the asymptotic relative efficiency (ARE) of the inverse matrix estimate was 0.99, while the matrix method's ARE was 0.19. Under nondifferential misclassification, neither the matrix nor the inverse matrix estimator is uniformly more efficient than the other; the efficiencies again depend on the underlying parameters. In the SIDS data, the MLE was more efficient than the matrix method (ARE = 0.39). In a study investigating the effect of vitamin A intake on the incidence of breast cancer, the MLE was more efficient than the matrix method (ARE = 0.75).
- Front Matter
142
- 10.1016/j.ajo.2008.06.031
- Mar 25, 2009
- American Journal of Ophthalmology
Nonparametric vs Parametric Tests of Location in Biomedical Research
- Supplementary Content
- 10.1080/00949650410001660793
- Jan 1, 2005
- Journal of Statistical Computation and Simulation
A nonlinear discriminant rule may be estimated by maximum likelihood estimation using unclassified observations. The performance of a nonlinear discriminant function based on a sample from a mixture of two Weibull distributions, with parameters λ, θ1, θ2 and p, is examined. Asymptotic expansion and asymptotic expected values of probabilities of misclassification are presented. The asymptotic relative efficiencies (AREs) of mixture and classified discrimination procedures are evaluated and discussed for selected parameters. Computations show that for fixed λ and p, the ARE increases as Δ = |θ1 − θ2| increases. Furthermore, for fixed λ and Δ, the values of the ARE decrease as p varies from 0.2 to 0.8. On the other hand, for fixed p and Δ, the AREs for λ = 0.5 are close to those for λ = 2.
- Research Article
12
- 10.1109/tit.2013.2249182
- Jul 1, 2013
- IEEE Transactions on Information Theory
It is demonstrated that the sampling distributions of the maximum likelihood (ML) estimator and its Studentized statistic for the generalized Gaussian distribution do not pass the most powerful normality tests even for fairly large sample sizes. This disagreement with what the standard large sample ML theory predicts and the computational burden of having to deal with its associated polygamma functions motivate the consideration of a competing convexity-based estimator. The asymptotic normality of this estimator is derived. It is shown that the competing estimator is almost as efficient as the ML estimator and its asymptotic relative efficiency to the ML estimator is equal to 1 in the limit as the shape parameter approaches zero. More important, its asymptotic distribution admits an exact variance stabilizing transformation, whereas the asymptotic variance function of the ML estimator does not have a closed-form variance stabilizing transformation. The exact transformation is a composition of the inverse hyperbolic cotangent and square root functions. Besides stabilizing the variance, the inverse hyperbolic cotangent and square root transformation is remarkably effective for symmetrizing and normalizing the sampling distribution of the estimator and hence improving the standard normal approximation. Furthermore, this simple transformation provides a quite accurate approximation to the non-closed-form variance stabilizing transformation of the ML estimator.
- Book Chapter
3
- 10.1007/978-3-0348-8326-9_4
- Jan 1, 2001
Chemical concentration data are almost always left censored and often contain a few large outliers. This complicates the estimation of location and scale. To analyze such data sets, we propose a family of M-estimators for censored data, which include the maximum likelihood estimates of location and scale for censored t-distributions. Unlike the uncensored case, we note that the location M-estimators are not consistent under the censored normal model, and so a modification to them is introduced in order to obtain consistency at the censored normal model. Since a large class of M-estimators for censored data can be computed via an EM-algorithm, their computations are not considerably more complicated than the computations of the maximum likelihood estimates under the censored normal distribution. The asymptotic relative efficiency, influence function and simulations using contaminated censored normal distributions demonstrate the robustness and efficiency properties of the estimators. From these results we conclude that almost nothing is sacrificed but much is gained by using M-estimators, especially when a fair proportion of the data lies below the detection limit. Finally, our methods are applied to an example involving nitrate concentrations in well water. This example demonstrates the advantages of using M-estimators with redescending influence functions. Keywords: Detection Limit; Maximum Likelihood Estimate; Robust Estimation; Influence Function; Tobit Model.
- Research Article
7
- 10.1016/0898-1221(95)00072-7
- Jul 1, 1995
- Computers & Mathematics with Applications
The efficiency of a nonlinear discriminant function based on unclassified initial samples from a mixture of two Burr type XII populations
- Research Article
1
- 10.2174/1876527001809010026
- Dec 28, 2018
- The Open Statistics & Probability Journal
Introduction: The score statistic Z(θ) and the maximin efficient robust test statistic ZMERT are commonly used in genetic association studies, but to our knowledge there is no formal comparison of them. Methods: In this report, we compare the asymptotic behavior of Z(θ) and ZMERT by computing their asymptotic relative efficiencies (AREs) relative to each other. Four commonly used ARE measures, the Pitman ARE, Chernoff ARE, Hodges-Lehmann ARE and the Bahadur ARE, are considered. Some modifications of these methods are made to simplify the computations. We found that the Chernoff, Hodges-Lehmann and Bahadur AREs are suitable for our setting. Results and Conclusion: Based on our study, the efficiencies of the two test statistics vary with the criterion used, and with the parameter values under the same criterion, so each test has its advantages and disadvantages according to the criterion used and the parameters involved, which are described in the context. Numerical examples are given to illustrate the use of the two statistics in genetic association studies.
- Research Article
19
- 10.1016/j.jmva.2008.08.004
- Aug 24, 2008
- Journal of Multivariate Analysis
Signed-rank tests for location in the symmetric independent component model
- Research Article
6
- 10.1080/03610928508829045
- Jan 1, 1985
- Communications in Statistics - Theory and Methods
Pseudo maximum likelihood estimation (PML) for the Dirichlet-multinomial distribution is proposed and examined in this paper. The procedure is compared to that based on moments (MM) for its asymptotic relative efficiency (ARE) relative to the maximum likelihood estimate (ML). It is found that PML, requiring much less computational effort than ML and possessing considerably higher ARE than MM, constitutes a good compromise between ML and MM. PML is also found to have very high ARE when an estimate for the scale parameter in the Dirichlet-multinomial distribution is all that is needed.
- Research Article
11
- 10.1002/gepi.3
- Mar 12, 2001
- Genetic epidemiology
We compare the asymptotic relative efficiency (ARE) of different study designs for estimating gene and gene-environment interaction effects using matched case-control data. In the sampling schemes considered, cases are selected differentially based on their family history of disease. Controls are selected either from unrelated subjects or from among the case's unaffected siblings and cousins. Parameters are estimated using weighted conditional logistic regression, where the likelihood contributions for each subject are weighted by the fraction of cases sampled sharing the same family history. Results showed that, compared to random sampling, over-sampling cases with a positive family history increased the efficiency for estimating the main effect of a gene for sib-control designs (103-254% ARE) and decreased efficiency for cousin-control and population-control designs (68-94% ARE and 67-84% ARE, respectively). Population controls and random sampling of cases were most efficient for a recessive gene or a dominant gene with a relative risk less than 9. For estimating gene-environment interactions, over-sampling positive-family-history cases again led to increased efficiency using sib controls (111-180% ARE) and decreased efficiency using population controls (68-87% ARE). Using case-cousin pairs, the results differed based on the genetic model and the size of the interaction effect; biased sampling was only slightly more efficient than random sampling for large interaction effects under a dominant gene model (relative risk ratio = 8, 106% ARE). Overall, the most efficient study design for studying gene-environment interaction was the case-sib-control design with over-sampling of positive-family-history cases.
- Research Article
- 10.2307/2289320
- Sep 1, 1988
- Journal of the American Statistical Association
Hammersley (1950) considered, among other matters, the asymptotic relative efficiency (ARE) of the rounded sample median M_ε with respect to the rounded sample mean as an estimate of a Normal population mean restricted to a uniform grid of mesh size 2ε. This article extends Hammersley's work to a certain class of two-sided extended increasing failure rate (TEIFR) distributions for which the (grid-valued) population mean and median coincide, their common value designated as μ. The ARE of M_ε with respect to the rounded sample mean, as estimators of μ, is examined for our class via the theory of large deviations. The role of the TEIFR assumption is simply to ensure that the tails of the distribution of X_i − μ fall off quickly enough to make comparison of asymptotic probabilities of large (beyond ε) deviations of the location-normalized sample median M − μ and of the location-normalized sample mean relevant to the comparison of their asymptotic variances. Even within our somewhat narrow class, we find the ARE of M_ε with respect to the rounded sample mean surprisingly sensitive to distribution shape, as well as to grid mesh size and the actual definition of ARE. Among our findings is that, in the symmetric TEIFR class, the ARE of M_ε with respect to the rounded sample mean is continuous in ε at ε = 0 under a definition of ARE closely related to the commonly used limiting ratio of equivalent sample sizes, but it is not continuous at ε = 0 under Hammersley's definition of ARE. A related finding is that, within the TEIFR class, the asymptotic effective variance [in the sense of Bahadur (1960)] of the sample median M equals its asymptotic variance as usually defined. Another finding is that, in the case of the Laplace distribution, M_ε is asymptotically more efficient than the rounded sample mean, as estimator of the grid-valued population center μ, when the grid is fine (ε small), but it is asymptotically less efficient when the grid is coarse (ε large). All of these findings stem from the comparison of large-deviation rates that are equally relevant to the comparisons of asymptotic error rates of certain tests using the sample mean and M as test statistics, a matter mentioned briefly in the last section.
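The fine-grid limit of this comparison can be illustrated with a small Monte Carlo sketch. It deliberately ignores rounding (the unrounded case, ε = 0) and simply estimates the variance ratio of the sample mean to the sample median, whose classical asymptotic values are 2/π (about 0.64) under the Normal distribution and 2 under the Laplace distribution. The helper name `simulated_are`, the sample size, and the replication count are assumptions made for the example.

```python
import numpy as np

def simulated_are(sampler, n=401, reps=4000, seed=0):
    """Monte Carlo estimate of Var(sample mean) / Var(sample median)."""
    rng = np.random.default_rng(seed)
    samples = sampler(rng, (reps, n))          # one row per replication
    var_mean = samples.mean(axis=1).var()
    var_median = np.median(samples, axis=1).var()
    return var_mean / var_median

# Efficiency of the median relative to the mean (values above 1 favor the median).
are_normal = simulated_are(lambda rng, size: rng.standard_normal(size))
are_laplace = simulated_are(lambda rng, size: rng.laplace(size=size))
```

At finite sample sizes the estimates only approximate the limiting values, and the Laplace case in particular approaches its limit slowly, so the numbers should be read as indicative rather than exact.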
- Research Article
6
- 10.1016/s0047-259x(02)00062-3
- Mar 6, 2003
- Journal of Multivariate Analysis
Asymptotic relative Pitman efficiency in group models
- Research Article
37
- 10.1111/1467-9868.00344
- Aug 1, 2002
- Journal of the Royal Statistical Society Series B: Statistical Methodology
Summary: A new estimator of the regression parameters is introduced in a multivariate multiple-regression model in which both the vector of explanatory variables and the vector of response variables are assumed to be random. The affine equivariant estimate matrix is constructed using the sign covariance matrix (SCM) where the sign concept is based on Oja's criterion function. The influence function and asymptotic theory are developed to consider robustness and limiting efficiencies of the SCM regression estimate. The estimate is shown to be consistent with a limiting multinormal distribution. The influence function, as a function of the length of the contamination vector, is shown to be linear in elliptic cases; for the least squares (LS) estimate it is quadratic. The asymptotic relative efficiencies with respect to the LS estimate are given in the multivariate normal as well as the t-distribution cases. The SCM regression estimate is highly efficient in the multivariate normal case and, for heavy-tailed distributions, it performs better than the LS estimate. Simulations are used to consider finite sample efficiencies with similar results. The theory is illustrated with an example.