Theory Of U-statistics Research Articles

The change in area under the curve (∆AUC), the integrated discrimination improvement (IDI), and the net reclassification index (NRI) are commonly used measures of risk prediction model performance. Some authors have reported good validity of the associated methods for estimating their standard errors (SEs) and constructing confidence intervals, whereas others have questioned their performance. To address these issues, we unite the ∆AUC, IDI, and three versions of the NRI under the umbrella of the U-statistics family. We rigorously show that the asymptotic behavior of the ∆AUC, NRIs, and IDI fits the asymptotic distribution theory developed for U-statistics. We prove that the ∆AUC, NRIs, and IDI are asymptotically normal, unless they compare nested models under the null hypothesis. In the latter case, asymptotic normality and existing SE estimates cannot be applied to the ∆AUC, NRIs, or IDI. In the former case, SE formulas proposed in the literature are equivalent to SE formulas obtained from U-statistics theory if we ignore adjustment for estimated parameters. We use the Sukhatme-Randles-deWet condition to determine when adjustment for estimated parameters is necessary. We show that adjustment is not necessary for the SEs of the ∆AUC and two versions of the NRI when added predictor variables are significant and normally distributed. The SEs of the IDI and three-category NRI should always be adjusted for estimated parameters. These results allow us to define when existing formulas for SE estimates can be used and when resampling methods such as the bootstrap should be used instead when comparing nested models. We also use U-statistic theory to develop a new SE estimate of the ∆AUC. Copyright © 2017 John Wiley & Sons, Ltd.
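To make the U-statistic view of the AUC concrete, the following is a minimal sketch, assuming a single risk score and the standard two-sample (Mann-Whitney) kernel. It is not the new SE estimate of the ∆AUC mentioned in the abstract, and it makes no adjustment for estimated model parameters. The function name `auc_u_statistic` and the toy data are hypothetical, and the SE uses the familiar structural-component (DeLong-type) formula.

```python
import numpy as np

def auc_u_statistic(cases, controls):
    """Estimate the AUC as a two-sample U-statistic, with a DeLong-type SE.

    Illustrative assumption: kernel psi(x, y) = 1[x > y] + 0.5 * 1[x == y],
    where x is the score of a case (event) and y the score of a control.
    """
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)

    # Kernel evaluated on every (case, control) pair.
    diff = cases[:, None] - controls[None, :]
    kernel = (diff > 0).astype(float) + 0.5 * (diff == 0)

    # The U-statistic is the average of the kernel over all pairs.
    auc = kernel.mean()

    # Structural components: per-case and per-control kernel averages.
    v10 = kernel.mean(axis=1)
    v01 = kernel.mean(axis=0)

    # Variance estimate built from the two sets of components.
    var = v10.var(ddof=1) / len(cases) + v01.var(ddof=1) / len(controls)
    return auc, float(np.sqrt(var))

# Toy scores (hypothetical): one tie between a case and a control.
auc, se = auc_u_statistic([3, 4, 5], [1, 2, 3])
print(f"AUC = {auc:.4f}, SE = {se:.4f}")
```

For a difference of two AUCs computed on the same subjects, the same components would be needed for both models along with their covariance. As the abstract notes, the normal approximation breaks down when nested models are compared under the null hypothesis.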


We compare the performance of relative densities of two parameterized random geometric digraph families called proximity catch digraphs (PCDs) in testing bivariate spatial patterns. These PCD families are proportional edge (PE) and central similarity (CS) PCDs and are defined with proximity regions based on relative positions of data points from two classes. The relative densities of these PCDs were previously used as statistics for testing segregation and association patterns against complete spatial randomness. The relative density of a digraph, D, with n vertices (i.e., with order n) is the ratio of the number of arcs in D to the number of arcs in the complete symmetric digraph of the same order. When scaled properly, the relative density of a PCD is a U-statistic; hence, it is asymptotically normal by the standard central limit theory of U-statistics. The PE- and CS-PCDs are defined with an expansion parameter that determines the size or measure of the associated proximity regions. In this article, we extend the distribution of the relative density of CS-PCDs to expansion parameters larger than one, and compare the finite sample performance of the tests by Monte Carlo simulations and their asymptotic performance by Pitman asymptotic efficiency. We find the optimal expansion parameters of the PCDs for testing each alternative in finite samples and in the limit as the sample size tends to infinity. As a result of our comparisons, we demonstrate that in terms of empirical power (i.e., for finite samples) the relative density of the CS-PCD has better performance (which occurs for expansion parameter values larger than one) for the segregation alternative, while the relative density of the PE-PCD has better performance for the association alternative. The methods are also illustrated on a real-life data set from plant ecology.
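The definition of relative density above can be sketched in a few lines. This is an illustration only, assuming a digraph given as a list of arcs on vertices 0..n-1. It does not construct PE or CS proximity regions, and the function names and the symmetrized kernel are hypothetical choices made to show why the quantity has the form of a one-sample U-statistic of degree two.

```python
from itertools import combinations

def relative_density(n, arcs):
    """Number of arcs divided by n * (n - 1), the arc count of the
    complete symmetric digraph of order n. Self-loops are ignored."""
    arc_set = {(i, j) for i, j in arcs if i != j}
    return len(arc_set) / (n * (n - 1))

def relative_density_as_u_statistic(n, arcs):
    """Same quantity written as an average of a symmetric kernel over
    unordered vertex pairs: h(i, j) = (1[i -> j] + 1[j -> i]) / 2."""
    arc_set = {(i, j) for i, j in arcs if i != j}
    total = sum(
        0.5 * (((i, j) in arc_set) + ((j, i) in arc_set))
        for i, j in combinations(range(n), 2)
    )
    n_pairs = n * (n - 1) / 2
    return total / n_pairs

# Toy digraph (hypothetical): 4 vertices, 3 arcs.
arcs = [(0, 1), (1, 0), (2, 3)]
print(relative_density(4, arcs))
print(relative_density_as_u_statistic(4, arcs))
```

Both functions return the same value. In a PCD, the arc indicators would come from whether one data point falls in another's proximity region, which is what makes the kernel random.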



Articles published on Theory Of U-statistics

Asymptotic distribution of ∆AUC, NRIs, and IDI based on theory of U-statistics.

Inverse probability weighting estimation of the volume under the ROC surface in the presence of verification bias.

Testing homogeneity of several covariance matrices and multi-sample sphericity for high-dimensional data under non-normality

Rank-based kernel estimation of the area under the ROC curve

Adjusting for unmeasured confounding due to either of two crossed factors with a logistic regression model.

An Accurate Kernelized Energy Detection in Gaussian and non-Gaussian/Impulsive Noises

Tests for high-dimensional covariance matrices using the theory of U-statistics

Comparison of relative density of two random geometric digraph families in testing spatial clustering

Tests of Covariance Matrices for High Dimensional Multivariate Data Under Non Normality

On asymptotic distribution of sample central moments in normal-uniform distribution

Averaged shifted chi-square test

Resolving statistical uncertainty in correlation dimension estimation

The distribution of the relative arc density of a family of interval catch digraph based on uniform data

Asymptotic Distribution of Coefficients of Skewness and Kurtosis

Nonparametric statistical inference method for partial areas under receiver operating characteristic curves, with application to genomic studies

Association testing by haplotype-sharing methods applicable to whole-genome analysis

ROC Graphs for Assessing the Ability of a Diagnostic Marker to Detect Three Disease Classes with an Umbrella Ordering

Rank methods for the analysis of clustered data in diagnostic trials

Concordance probability and discriminatory power in proportional hazards regression

Feasibility of real-time calculation of correlation integral derived statistics applied to EEG time series
