Semi-supervised inference: General theory and estimation of means

Anru Zhang, T Tony Cai, Lawrence D Brown

doi:10.1214/18-aos1756

Abstract

We propose a general semi-supervised inference framework focused on the estimation of the population mean. As usual in semi-supervised settings, there exists an unlabeled sample of covariate vectors and a labeled sample consisting of covariate vectors along with real-valued responses (“labels”). Otherwise, the formulation is “assumption-lean” in that no major conditions are imposed on the statistical or functional form of the data. We consider both the ideal semi-supervised setting, where infinitely many unlabeled samples are available, and the ordinary semi-supervised setting, in which only a finite number of unlabeled samples is available. Estimators are proposed along with corresponding confidence intervals for the population mean. Theoretical analyses of both the asymptotic distribution and the $\ell_{2}$-risk of the proposed procedures are given. Surprisingly, the proposed estimators, based on a simple form of the least squares method, outperform the ordinary sample mean. The simple, transparent form of the estimator lends confidence to the perception that its asymptotic improvement over the ordinary sample mean nearly holds even for moderate-size samples. The method is further extended to a nonparametric setting, in which the oracle rate can be achieved asymptotically. The proposed estimators are further illustrated by simulation studies and a real data example involving estimation of the homeless population.
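The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a least-squares-based semi-supervised mean estimate could look in the ordinary setting with finitely many unlabeled covariates. It assumes a regression-adjusted form: the labeled sample mean of the responses, corrected by the fitted slopes times the gap between the labeled covariate mean and the covariate mean pooled over labeled and unlabeled data. The function name `semi_supervised_mean` and its arguments are hypothetical and do not come from the paper.

```python
import numpy as np


def semi_supervised_mean(X_lab, y, X_unlab):
    """Regression-adjusted estimate of E[Y] (illustrative sketch, assumed form).

    X_lab   : (n, p) covariates of the labeled sample
    y       : (n,)   responses ("labels") of the labeled sample
    X_unlab : (m, p) covariates of the unlabeled sample
    """
    X_lab = np.asarray(X_lab, dtype=float)
    y = np.asarray(y, dtype=float)
    X_unlab = np.asarray(X_unlab, dtype=float)

    # Ordinary least squares of y on X_lab with an intercept column.
    design = np.column_stack([np.ones(len(y)), X_lab])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    slopes = coef[1:]  # drop the intercept

    # Covariate mean pooled over labeled and unlabeled observations.
    mu_all = np.vstack([X_lab, X_unlab]).mean(axis=0)

    # Shift the labeled sample mean by the fitted linear adjustment.
    return y.mean() - slopes @ (X_lab.mean(axis=0) - mu_all)


if __name__ == "__main__":
    # Toy data with an exactly linear response y = 2 + 3x (made up for illustration).
    X_lab = [[0.0], [1.0], [2.0]]
    y = [2.0, 5.0, 8.0]
    X_unlab = [[3.0], [4.0], [5.0]]
    print(semi_supervised_mean(X_lab, y, X_unlab))
```

When the unlabeled covariates have the same mean as the labeled ones, the adjustment term vanishes and the sketch reduces to the ordinary sample mean; the correction only matters when the pooled covariate mean differs from the labeled one.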

Journal: The Annals of Statistics
Publication Date: Oct 1, 2019
Citations: 34

Similar Papers

Hybrid estimators for mean aboveground carbon per unit area
Ronald E Mcroberts ... James A Westfall
Forest Ecology and Management | VOL. 378
19 Jul 2016

Estimation of the offspring mean in a controlled branching process with a random control function
T.N Sriram ... I Del Puerto
Stochastic Processes and their Applications | VOL. 117
27 Nov 2006

Prediction of Brain Connectivity Map in Resting-State fMRI Data Using Shrinkage Estimator.
Atiye Nazari ... Elham Faghihzadeh
Basic and Clinical Neuroscience Journal | VOL. 10
30 Oct 2018

Inference on order restricted means of inverse Gaussian populations under heteroscedasticity
Anjana Mondal ... Somesh Kumar
Computational Statistics & Data Analysis | VOL. 194
23 Feb 2024
