Abstract

Let (X, Y) be a random variable consisting of an observed feature vector X and an unobserved class label Y ∈ {1, 2, . . . , L} with unknown joint distribution. In addition, let D be a training data set consisting of n completely observed independent copies of (X, Y). Instead of providing point predictors (classifiers) for Y, we compute for each b ∈ {1, 2, . . . , L} a p value π_b(X, D) for the null hypothesis that Y = b, treating Y temporarily as a fixed parameter, i.e., we construct a prediction region for Y with a certain confidence. The advantages of this approach over more traditional ones are reviewed briefly. In principle, any reasonable classifier can be modified to yield nonparametric p values. We describe the R package pvclass, which computes nonparametric p values for the potential class memberships of new observations as well as cross-validated p values for the training data. Additionally, it provides graphical displays and quantitative analyses of the p values.

Highlights

  • Let (X, Y ) be a pair of random variables, consisting of an observed feature vector X with values in a feature space X and an unobserved class label Y ∈ Y := {1, 2, . . . , L} with L ≥ 2 possible values

  • In the sequel we provide a brief introduction to the particular paradigm of p values as introduced by Dümbgen, Igl, and Munk (2008)

  • It is closely related to Neyman-Pearson classification, see Scott (2007), Zhao, Feng, Wang, and Tong (2015) and the references cited therein


Summary

Introduction

Let (X, Y) be a pair of random variables, consisting of an observed feature vector X with values in a feature space X and an unobserved class label Y ∈ Y := {1, 2, . . . , L} with L ≥ 2 possible values. Our aim is inference about Y with a given confidence, based on X and certain training data. In the sequel we provide a brief introduction to the particular paradigm of p values as introduced by Dümbgen, Igl, and Munk (2008). It is closely related to Neyman-Pearson classification; see Scott (2007), Zhao, Feng, Wang, and Tong (2015), and the references cited therein.
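The core idea can be sketched as follows: for each candidate class b, one ranks an "atypicality" statistic of the new observation among the class-b training points; the resulting rank-based p value is small when the observation looks unusual for class b, and the set {b : π_b > α} forms a prediction region for Y with approximate confidence 1 − α. The pvclass package itself is written in R; the sketch below is a simplified Python illustration using a hypothetical distance-to-class-mean statistic (the actual package supports more refined statistics, and exact validity requires the exchangeability construction of Dümbgen, Igl, and Munk (2008), which re-fits the statistic with the new observation included).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-class training data (purely illustrative; not from pvclass).
n_per_class = 50
X_train = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(n_per_class, 2)),   # class 1
    rng.normal(loc=3.0, scale=1.0, size=(n_per_class, 2)),   # class 2
])
y_train = np.repeat([1, 2], n_per_class)

def p_values(x_new, X_train, y_train, classes=(1, 2)):
    """For each candidate class b, rank an atypicality statistic of x_new
    among the class-b training points. A large statistic means strong
    evidence against Y = b, so the p value is the (smoothed) fraction of
    class-b training points at least as atypical as x_new."""
    pv = {}
    for b in classes:
        Xb = X_train[y_train == b]
        mu = Xb.mean(axis=0)
        t_new = np.linalg.norm(x_new - mu)            # statistic for the new point
        t_train = np.linalg.norm(Xb - mu, axis=1)     # statistics for class-b points
        pv[b] = (1 + np.sum(t_train >= t_new)) / (len(Xb) + 1)
    return pv

x_new = np.array([0.2, -0.1])                 # lies near the class-1 cluster
pv = p_values(x_new, X_train, y_train)
region = [b for b in pv if pv[b] > 0.05]      # prediction region at confidence 0.95
```

Here the prediction region typically contains only class 1, since x_new is atypical for class 2; note that the region may contain several classes (ambiguous cases) or, for outlying observations, none at all, which is precisely the extra information p values offer over a point classifier.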

From classifiers to p values
Example
Optimal p values as benchmark
Training data and nonparametric p values
Cross-validated p values and ROC functions
Data example buerk
Choices of test statistics
Plug-in estimator for standard model
Nearest neighbors and weighted nearest neighbors
Penalized multicategory logistic regression
Implementation and main functions
Classify new observations
Cross-validated p values
Choice of tuning parameters
Numerical examples
Findings
Relation to other classifiers and packages