Confidence intervals for the means of the selected populations

Claudio Fuentes, Martin T Wells, George Casella

doi:10.1214/17-ejs1374

Abstract

Consider an experiment in which $p$ independent populations $\pi_{i}$ with corresponding unknown means $\theta_{i}$ are available, and suppose that for every $1\leq i\leq p$, we can obtain a sample $X_{i1},\ldots,X_{in}$ from $\pi_{i}$. In this context, researchers are sometimes interested in selecting the populations that yield the largest sample means as a result of the experiment, and then estimating the corresponding population means $\theta_{i}$. In this paper, we present a frequentist approach to the problem and discuss how to construct simultaneous confidence intervals for the means of the $k$ selected populations, assuming that the populations $\pi_{i}$ are independent and normally distributed with a common variance $\sigma^{2}$. The method, based on the minimization of the coverage probability, obtains confidence intervals that attain the nominal coverage probability for any $p$ and $k$, taking into account the selection procedure.
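To see why the selection step matters, the following is a minimal simulation sketch, not the authors' construction. It assumes a known $\sigma$, equal true means (an illustrative configuration chosen here), and unadjusted per-population z-intervals; the function name `naive_coverage` and its parameters are hypothetical. It estimates how often the naive intervals cover the true means of the $k$ populations with the largest sample means.

```python
import random
from statistics import NormalDist


def naive_coverage(p=10, k=1, n=5, sigma=1.0, alpha=0.05, reps=20000, seed=0):
    """Monte Carlo estimate of the simultaneous coverage of naive z-intervals
    for the k populations with the largest sample means.

    Assumption (illustrative only): all p true means are equal to 0.
    """
    rng = random.Random(seed)
    se = sigma / n ** 0.5                       # standard error of each sample mean
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * se
    theta = [0.0] * p                           # assumed true means
    covered = 0
    for _ in range(reps):
        # The mean of n draws from N(theta_i, sigma^2) is N(theta_i, sigma^2 / n),
        # so the sample means are drawn directly.
        xbar = [rng.gauss(t, se) for t in theta]
        # Select the indices of the k largest sample means.
        selected = sorted(range(p), key=lambda i: xbar[i], reverse=True)[:k]
        # Count the replication only if every selected mean is covered.
        if all(abs(xbar[i] - theta[i]) <= half_width for i in selected):
            covered += 1
    return covered / reps


if __name__ == "__main__":
    print("no selection (p=1, k=1):", naive_coverage(p=1, k=1))
    print("best of ten (p=10, k=1):", naive_coverage(p=10, k=1))
```

With no selection the estimate should sit near the nominal 0.95. When the best of ten equal-mean populations is selected, the naive interval covers only if the largest of ten standardized sample means falls within $\pm 1.96$, which happens with probability close to $0.975^{10} \approx 0.78$. Intervals that account for the selection procedure, as the abstract describes, are meant to close this gap.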

Highlights

Given a set of $p$ available features, researchers must often determine which one is the best, or rank them according to certain prespecified criteria.
Gupta and coauthors pioneered the subset selection approach, in which a subset of populations is selected with a guaranteed minimum probability $P^{*}$ of containing the population with the largest mean [see 15]. Note that both of these approaches are mainly concerned with correctly selecting the population with the largest mean rather than with estimating the selected mean. This second problem has been widely discussed in the literature, and in the following two sections we present a brief summary of the main findings, giving separate consideration to point estimation and interval estimation procedures.
Dahiya [12] addresses this problem for the case of two normal populations and proposes estimators that perform better in terms of mean squared error (MSE).

Summary

Introduction

Given a set of $p$ available features, researchers must often determine which one is the best, or rank them according to certain prespecified criteria. For instance, researchers may be interested in determining which treatment is more efficient in fighting a certain disease, or in ranking the level of gene expression in a genomics experiment. Problems of this type are commonly referred to as ranking and selection problems, and specific solutions and methods have been proposed in the literature since the second half of the 20th century, with a start that is usually traced back to the pathbreaking works of Bechhofer [2] and Gupta & Sobel [16]. Gupta and coauthors pioneered the subset selection approach, in which a subset of populations is selected with a guaranteed minimum probability $P^{*}$ of containing the population with the largest mean [see 15]. Note that both of these approaches are mainly concerned with correctly selecting the population with the largest mean rather than with estimating the selected mean. This second problem has been widely discussed in the literature, and in the following two sections we present a brief summary of the main findings, giving separate consideration to point estimation and interval estimation procedures.

Point estimation

Interval estimation

Coverage probability results

Selecting the best population

Selecting the top k populations

Post-selection confidence intervals

The unknown variance case

Numerical studies

Discussion

Lemma in Theorem 1

Findings

Proof of Theorem 2


Journal: Electronic Journal of Statistics	Publication Date: Jan 1, 2018
Citations: 17	License type: cc-by
