Sharp-SSL: Selective high-dimensional axis-aligned random projections for semi-supervised learning

Tengyao Wang, Edgar Dobriban, Milana Gataric, Richard J Samworth

doi:10.1080/01621459.2024.2340792

Abstract

We propose a new method for high-dimensional semi-supervised learning problems based on the careful aggregation of the results of a low-dimensional procedure applied to many axis-aligned random projections of the data. Our primary goal is to identify important variables for distinguishing between the classes; existing low-dimensional methods can then be applied for final class assignment. To this end, we score projections according to their class-distinguishing ability; for instance, motivated by a generalized Rayleigh quotient, we can compute the traces of estimated whitened between-class covariance matrices on the projected data. This enables us to assign an importance weight to each variable for a given projection, and to select our signal variables by aggregating these weights over high-scoring projections. Our theory shows that the resulting Sharp-SSL algorithm is able to recover the signal coordinates with high probability when we aggregate over sufficiently many random projections and when the base procedure estimates the diagonal entries of the whitened between-class covariance matrix sufficiently well. For the Gaussian EM base procedure, we provide a new analysis of its performance in semi-supervised settings that controls the parameter estimation error in terms of the proportion of labeled data in the sample. Numerical results on both simulated data and a real colon tumor dataset support the excellent empirical performance of the method.
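The scoring and aggregation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several simplifying assumptions: the base procedure is replaced by a plug-in estimate that uses labeled data only (a semi-supervised procedure such as Gaussian EM would take its place), "high-scoring" is interpreted as the top fraction of projections by score, and the names `whitened_between_class_diag`, `select_variables`, `ridge`, `top_fraction` and `n_select` are hypothetical.

```python
import numpy as np


def whitened_between_class_diag(X, y, ridge=1e-6):
    """Diagonal of Sw^{-1/2} Sb Sw^{-1/2} on the given (projected) data.

    Stand-in base procedure (assumption): plug-in estimates from labeled
    data only, with a small ridge term to keep Sw invertible.
    """
    n, d = X.shape
    overall_mean = X.mean(axis=0)
    Sb = np.zeros((d, d))  # between-class covariance
    Sw = np.zeros((d, d))  # pooled within-class covariance
    for c in np.unique(y):
        Xc = X[y == c]
        class_mean = Xc.mean(axis=0)
        diff = class_mean - overall_mean
        Sb += (len(Xc) / n) * np.outer(diff, diff)
        centered = Xc - class_mean
        Sw += centered.T @ centered / n
    Sw += ridge * np.eye(d)
    # Symmetric inverse square root of Sw via its eigendecomposition.
    evals, evecs = np.linalg.eigh(Sw)
    Sw_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return np.diag(Sw_inv_sqrt @ Sb @ Sw_inv_sqrt)


def select_variables(X, y, d=3, n_projections=500, top_fraction=0.1,
                     n_select=5, seed=0):
    """Aggregate per-variable weights over high-scoring axis-aligned projections."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    subsets, weights, scores = [], [], []
    for _ in range(n_projections):
        # An axis-aligned projection keeps a random subset of d coordinates.
        S = rng.choice(p, size=d, replace=False)
        w = whitened_between_class_diag(X[:, S], y)
        subsets.append(S)
        weights.append(w)
        scores.append(w.sum())  # trace of the whitened between-class matrix
    n_keep = max(1, int(top_fraction * n_projections))
    keep = np.argsort(scores)[::-1][:n_keep]
    importance = np.zeros(p)
    for i in keep:
        importance[subsets[i]] += weights[i]
    selected = np.argsort(importance)[::-1][:n_select]
    return selected, importance
```

Calling `select_variables(X, y, n_select=k)` on an `(n, p)` data matrix with integer class labels returns the indices of the `k` variables with the largest aggregated weight, together with the full importance vector. A low-dimensional classifier could then be fitted on the selected columns only.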

Journal: Journal of the American Statistical Association
Publication Date: Apr 8, 2024
License type: CC BY 4.0

