- Research Article
- 10.1214/26-ejs2499
- Jan 1, 2026
- Electronic Journal of Statistics
- Nina Dörnemann
We propose a two-sample test for large-dimensional covariance matrices in generalized elliptical models. The test statistic is based on a U-statistic estimator of the squared Frobenius norm of the difference between the two population covariance matrices. This statistic was originally introduced by Li and Chen (2012) for the independent component model. As a key theoretical contribution, we establish a new central limit theorem for the U-statistics under elliptical data, valid under both the null and alternative hypotheses. This result enables asymptotic control of the test level and facilitates a power analysis. To the best of our knowledge, the proposed test is the first such method to be supported by theoretical guarantees for elliptical data. Our approach imposes only mild assumptions on the covariance matrices and requires neither sparsity nor explicit growth conditions on the dimension-to-sample-size ratio. We illustrate our theoretical findings through applications to both synthetic and real-world data.
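To make the quantity being estimated concrete, here is a minimal sketch of an unbiased estimator of the squared Frobenius norm ‖Σ1 − Σ2‖²_F. It is a simplification, not the statistic of Li and Chen (2012): it assumes both samples have known mean zero, so the mean-correction terms of the full U-statistic are omitted. The function names are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def trace_sq_estimate(X):
    """Estimate tr(Sigma^2) by averaging (x_i' x_j)^2 over pairs i != j.

    ASSUMPTION: rows of X are i.i.d. with mean zero, in which case
    E[(x_i' x_j)^2] = tr(Sigma^2) for i != j.
    """
    n = len(X)
    total = sum(dot(X[i], X[j]) ** 2 for i, j in combinations(range(n), 2))
    # Each unordered pair stands for two ordered pairs, hence the factor 2.
    return 2.0 * total / (n * (n - 1))


def trace_cross_estimate(X, Y):
    """Estimate tr(Sigma1 Sigma2) by averaging (x_i' y_j)^2 over all cross pairs."""
    total = sum(dot(x, y) ** 2 for x in X for y in Y)
    return total / (len(X) * len(Y))


def frobenius_diff_estimate(X, Y):
    """Estimate ||Sigma1 - Sigma2||_F^2 = tr(S1^2) + tr(S2^2) - 2 tr(S1 S2)."""
    return (trace_sq_estimate(X) + trace_sq_estimate(Y)
            - 2.0 * trace_cross_estimate(X, Y))


# Toy data: the first sample varies only along axis 0, the second only along axis 1.
X = [[1.0, 0.0], [1.0, 0.0]]
Y = [[0.0, 1.0], [0.0, 1.0]]
print(frobenius_diff_estimate(X, Y))  # 2.0
```

Because the estimator is unbiased rather than constrained to be non-negative, it can take negative values on small samples, which is expected behaviour for U-statistics of this kind.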
- Research Article
- 10.1214/26-ejs2501
- Jan 1, 2026
- Electronic Journal of Statistics
- Yuki Takazawa + 1 more
The inference of evolutionary histories is a central problem in evolutionary biology. The analysis of a sample of phylogenetic trees can be conducted in Billera–Holmes–Vogtmann tree space, which is a CAT(0) metric space of phylogenetic trees. The globally non-positively curved (CAT(0)) property of this space enables the extension of various statistical techniques. In the problem of nonparametric density estimation, two primary methods, kernel density estimation and log-concave maximum likelihood estimation, have been proposed, yet their theoretical properties remain largely unexplored. In this paper, we address this gap by proving the consistency of these estimators in a more general setting—CAT(0) orthant spaces, which include BHV tree space. We extend log-concave approximation techniques to this setting and establish consistency via the continuity of the log-concave projection map. We also modify the kernel density estimator to correct boundary bias and establish uniform consistency using empirical process theory.
- Research Article
- 10.1214/26-ejs2493
- Jan 1, 2026
- Electronic Journal of Statistics
- Asad Lodhia + 3 more
We study a notion of positivity of Gaussian directed acyclic graphical models corresponding to a non-negativity constraint on the coefficients of the associated structural equation model. We prove that this constraint is equivalent to the distribution being conditionally increasing in sequence (CIS), a well-known subclass of positively associated random variables. These distributions require knowledge of a permutation of the nodes, called a CIS ordering, for which the non-negativity constraint holds. We provide an algorithm and prove that, in the noiseless setting, a CIS ordering can be recovered efficiently when it exists. We extend this result to the noisy setting and provide assumptions for recovering the CIS orderings. In addition, we provide a characterization of Markov equivalence for CIS DAG models. Further, we show that when a CIS ordering is known, the corresponding class of Gaussians lies in a family of distributions in which maximum likelihood estimation is a convex problem.
- Research Article
- 10.1214/26-ejs2516
- Jan 1, 2026
- Electronic Journal of Statistics
- The Tien Mai
- Research Article
- 10.1214/26-ejs2486
- Jan 1, 2026
- Electronic Journal of Statistics
- Shixiang Liu + 3 more
- Research Article
- 10.1214/25-ejs2480
- Jan 1, 2026
- Electronic Journal of Statistics
- Yifan Li
- Research Article
- 10.1214/26-ejs2484
- Jan 1, 2026
- Electronic Journal of Statistics
- Christof Schötz + 1 more
We noisily observe solutions of an ordinary differential equation u̇ = f(u) at given times, where u lives in a d-dimensional state space. The model function f is unknown and belongs to a Hölder-type smoothness class with parameter β. For the nonparametric problem of estimating f, we provide lower bounds on the error in two complementary model specifications: the snake model with few, long observed solutions and the stubble model with many short ones. The lower bounds are minimax optimal in some settings. They depend on various parameters, which in the optimal asymptotic regime lead to the same rate for the squared error in both models, characterized by the exponent −2β/(2(β+1)+d) in the total number of observations n. To derive these results, we establish a master theorem for lower bounds in general nonparametric regression problems, which makes the proofs more comparable and may be a useful tool for future work.
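As a small worked illustration of the exponent −2β/(2(β+1)+d) quoted above, the following sketch evaluates it for a few smoothness levels β and dimensions d. The function name `rate_exponent` is hypothetical, and the sketch assumes the exponent is read as the power of n in the squared-error rate.

```python
def rate_exponent(beta, d):
    """Exponent -2*beta / (2*(beta + 1) + d) from the abstract.

    ASSUMPTION: beta > 0 is the smoothness parameter and d >= 1 is the
    dimension of the state space.
    """
    if beta <= 0 or d < 1:
        raise ValueError("expected beta > 0 and d >= 1")
    return -2.0 * beta / (2.0 * (beta + 1.0) + d)


# Smoother model functions (larger beta) give a faster rate; higher
# dimensions (larger d) give a slower one.
for beta, d in [(1, 1), (2, 1), (1, 3)]:
    print(beta, d, rate_exponent(beta, d))

print(rate_exponent(1, 1))  # -0.4
```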
- Research Article
- 10.1214/26-ejs2482
- Jan 1, 2026
- Electronic Journal of Statistics
- Olympio Hacquard + 2 more
We consider a binary supervised learning classification problem where instead of having data in a finite-dimensional Euclidean space, we observe measures on a compact space X. Formally, we observe data D_N = ((μ_1, Y_1), …, (μ_N, Y_N)), where μ_i is a measure on X and Y_i is a label in {0, 1}. Given a set F of base-classifiers on X, we build corresponding classifiers in the space of measures. We provide upper and lower bounds on the Rademacher complexity of this new class of classifiers that can be expressed simply in terms of corresponding quantities for the class F. If the measures μ_i are uniform over a finite set, this classification task boils down to a multi-instance learning problem. However, our approach allows more flexibility and diversity in the input data we can deal with. While this general framework has many possible applications, we particularly focus on classifying data via topological descriptors called persistence diagrams. These objects are discrete measures on R^2, where the coordinates of each point correspond to the range of scales at which a topological feature exists. We will present several classifiers on measures and show how they can heuristically and theoretically enable a good classification performance in various settings in the case of persistence diagrams.
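The abstract does not spell out how a base-classifier on X is turned into a classifier on measures, so the sketch below shows only one hypothetical construction: integrate the base-classifier against the measure and threshold the result. The representation of a persistence diagram as a list of ((birth, death), weight) pairs, and the names `integrate`, `lift`, and `long_lived`, are assumptions made for illustration.

```python
def integrate(measure, g):
    """Integrate g against a discrete measure given as (point, weight) pairs."""
    return sum(weight * g(point) for point, weight in measure)


def lift(base_classifier, threshold):
    """Turn a {0, 1}-valued base-classifier on points into a classifier on measures.

    HYPOTHETICAL aggregation rule: predict 1 when the mass that the measure
    puts on the region {base_classifier = 1} reaches `threshold`.
    """
    def classify(measure):
        return 1 if integrate(measure, base_classifier) >= threshold else 0
    return classify


def long_lived(point, min_persistence=0.5):
    """Base-classifier on R^2: 1 if the feature persists over a long range of scales."""
    birth, death = point
    return 1 if death - birth >= min_persistence else 0


# Two toy persistence diagrams, each point carrying unit mass.
diagram_a = [((0.0, 1.0), 1.0), ((0.2, 0.3), 1.0)]  # one long-lived feature
diagram_b = [((0.1, 0.2), 1.0), ((0.4, 0.5), 1.0)]  # only short-lived features

clf = lift(long_lived, threshold=1.0)
print(clf(diagram_a), clf(diagram_b))  # 1 0
```

When every diagram carries unit weights, as here, the rule reduces to counting the points that the base-classifier accepts, which matches the multi-instance reading mentioned in the abstract.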
- Research Article
- 10.1214/26-ejs2485
- Jan 1, 2026
- Electronic Journal of Statistics
- Patrick Bastian + 1 more
In many change point problems it is reasonable to assume that, compared to a benchmark at a given time point t_0, the properties of the observed stochastic process change gradually over time for t > t_0. Often these gradual changes are not of interest as long as they are small (nonrelevant); the question is rather whether the deviation of the process from its state at time t_0, measured by an appropriate metric, exceeds a given threshold of practical significance (relevant change). In this paper we develop a novel and powerful change point analysis for detecting such deviations in a sequence of gradually varying means, which is compared with the average mean from a previous time period. Current approaches to this problem suffer from low power, rely on the selection of smoothing parameters, and require a rather regular (smooth) development of the means. We develop a multiscale procedure that alleviates all these issues, validate it theoretically, and demonstrate its good finite-sample performance on both synthetic and real data.
- Research Article
- 10.1214/26-ejs2513
- Jan 1, 2026
- Electronic Journal of Statistics
- Rina Foygel Barber + 1 more