Error control and Neyman–Pearson classification with buffered probability and support vectors
- Book Chapter
2
- 10.1007/978-3-319-41279-5_4
- Jan 1, 2016
The Neyman–Pearson (NP) classification paradigm addresses an important binary classification problem where users want to minimize the type II error while controlling the type I error under some specified level α, usually a small number. This problem arises in many genomic applications involving binary classification tasks. The terminology Neyman–Pearson classification paradigm arises from its connection to the Neyman–Pearson paradigm in hypothesis testing. The NP paradigm is applicable when one type of error (e.g., type I error) is far more important than the other type (e.g., type II error), and users have a specific target bound for the former. In this chapter, we review the NP classification literature, with a focus on genomic applications as well as our contributions to NP classification theory and algorithms. We also provide simulation examples and a genomic case study to demonstrate how to use the NP classification algorithm in practice.
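Stated as an optimization problem (the standard formulation of the paradigm, not a result specific to this chapter), an NP classifier φ solves

$$\min_{\phi}\; R_1(\phi) = \Pr\big(\phi(X)=0 \mid Y=1\big) \quad \text{subject to} \quad R_0(\phi) = \Pr\big(\phi(X)=1 \mid Y=0\big) \le \alpha,$$

where $R_0$ and $R_1$ denote the type I and type II errors, respectively.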
- Research Article
- 10.1093/imaiai/iaae010
- Apr 1, 2024
- Information and Inference: A Journal of the IMA
We propose a universal classifier for binary Neyman–Pearson classification where the null distribution is known, while only a training sequence is available for the alternative distribution. The proposed classifier interpolates between Hoeffding's classifier and the likelihood ratio test and attains the same error probability prefactor as the likelihood ratio test, i.e. the same prefactor as if both distributions were known. In addition, like Hoeffding's universal hypothesis test, the proposed classifier is shown to attain the optimal error exponent tradeoff attained by the likelihood ratio test whenever the ratio of training samples to observation samples exceeds a certain value. We propose a lower bound and an upper bound on the optimal training-to-observation ratio. In addition, we propose a sequential classifier that attains the optimal error exponent tradeoff.
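For reference, the two benchmarks named in the abstract have the following standard forms (the interpolated classifier itself is not reproduced here): on an observation sequence $x^n$, the likelihood ratio test and Hoeffding's universal test decide in favour of the alternative when, respectively,

$$\frac{1}{n}\sum_{i=1}^{n}\log\frac{P_1(x_i)}{P_0(x_i)} \ge \gamma \qquad \text{and} \qquad D\big(\hat{P}_{x^n}\,\|\,P_0\big) \ge \gamma',$$

where $\hat{P}_{x^n}$ is the empirical distribution of the observations and $D(\cdot\|\cdot)$ is the Kullback–Leibler divergence; Hoeffding's test requires only the null distribution $P_0$.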
- Research Article
59
- 10.1109/tit.2007.901152
- Aug 1, 2007
- IEEE Transactions on Information Theory
In the Neyman-Pearson (NP) classification paradigm, the goal is to learn a classifier from labeled training data such that the probability of a false negative is minimized while the probability of a false positive is below a user-specified level α ∈ (0, 1). This work addresses the question of how to evaluate and compare classifiers in the NP setting. Simply reporting false positives and false negatives leaves some ambiguity about which classifier is best. Unlike conventional classification, however, there is no natural performance measure for NP classification. We cannot reject classifiers whose false positive rate exceeds α since, among other reasons, the false positive rate must be estimated from data and hence is not known with certainty. We propose two families of performance measures for evaluating and comparing classifiers and suggest one criterion in particular for practical use. We then present general learning rules that satisfy performance guarantees with respect to these criteria. As in conventional classification, the notion of uniform convergence plays a central role and leads to finite sample bounds, oracle inequalities, consistency, and rates of convergence. The proposed performance measures are also applicable to the problem of anomaly prediction.
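The paper's specific performance measures are not reproduced here; as a minimal illustration of the evaluation problem the abstract describes, the sketch below computes empirical type I and type II errors on held-out data together with one plausible combined score that penalizes any excess over α (an illustrative choice, not the published criterion).

```python
import numpy as np

def np_errors(y_true, y_pred, alpha=0.05):
    """Empirical type I / type II errors plus one illustrative combined score.

    Convention: class 0 = null/negative, class 1 = alternative/positive.
    The combined score (excess over alpha penalized at rate 1/alpha, added to
    the type II error) is an illustrative choice, not the paper's criterion.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    r0_hat = np.mean(y_pred[y_true == 0] == 1)   # empirical type I error (false positive rate)
    r1_hat = np.mean(y_pred[y_true == 1] == 0)   # empirical type II error (false negative rate)
    score = max(r0_hat - alpha, 0.0) / alpha + r1_hat
    return r0_hat, r1_hat, score

# Example: compare two classifiers on the same test set by their scores.
# print(np_errors(y_test, clf_a.predict(X_test)), np_errors(y_test, clf_b.predict(X_test)))
```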
- Research Article
76
- 10.1016/s0893-6080(03)00086-8
- May 13, 2003
- Neural Networks
Radial basis function neural networks for nonlinear Fisher discrimination and Neyman–Pearson classification
- Research Article
80
- 10.1126/sciadv.aao1659
- Feb 2, 2018
- Science Advances
In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II error (that is, the conditional probability of misclassifying a class 1 observation as class 0) while enforcing an upper bound, α, on the type I error. Despite its century-long history in hypothesis testing, the NP paradigm has not been well recognized and implemented in classification schemes. Common practices that directly limit the empirical type I error to no more than α do not satisfy the type I error control objective, because the resulting classifiers are likely to have type I errors much larger than α; as a result, the NP paradigm has not been properly implemented in practice. We develop the first umbrella algorithm that implements the NP paradigm for all scoring-type classification methods, such as logistic regression, support vector machines, and random forests. Powered by this algorithm, we propose a novel graphical tool for NP classification methods: NP receiver operating characteristic (NP-ROC) bands, motivated by the popular ROC curves. NP-ROC bands help choose α in a data-adaptive way and compare different NP classifiers. We demonstrate the use and properties of the NP umbrella algorithm and NP-ROC bands, available in the R package nproc, through simulation and real data studies.
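The full umbrella algorithm is given in the paper and the nproc package; the sketch below only illustrates its core order-statistic idea under assumed simplifications: an arbitrary scoring classifier, a held-out sample of class-0 scores, and a binomial tail bound on the probability that the chosen threshold yields a type I error above α.

```python
import numpy as np
from scipy.stats import binom

def np_threshold(scores_class0, alpha=0.05, delta=0.05):
    """Pick a threshold from held-out class-0 scores so that, with probability
    at least 1 - delta, the classifier 1{score > threshold} has type I error
    at most alpha.

    Uses the bound: for the k-th smallest of n i.i.d. class-0 scores,
    P(type I error > alpha) <= P(Binomial(n, 1 - alpha) >= k)
                             = P(Binomial(n, alpha) <= n - k).
    Simplified sketch of the order-statistic idea; see the paper and the
    nproc R package for the full umbrella algorithm.
    """
    s = np.sort(np.asarray(scores_class0, dtype=float))
    n = len(s)
    for k in range(1, n + 1):
        violation = binom.cdf(n - k, n, alpha)   # decreases as k grows
        if violation <= delta:
            return s[k - 1]                      # smallest qualifying order statistic
    raise ValueError("sample too small to certify the (alpha, delta) guarantee")
```

Choosing the smallest qualifying order statistic keeps the threshold, and hence the type II error, as low as the high-probability type I guarantee allows.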
- Research Article
3
- 10.1080/02664763.2019.1701636
- Dec 18, 2019
- Journal of applied statistics
The Akaike Information Criterion (AIC) and related information criteria are powerful and increasingly popular tools for comparing multiple, non-nested models without the specification of a null model. However, existing procedures for information-theoretic model selection do not provide explicit and uniform control over error rates for the choice between models, a key feature of classical hypothesis testing. We show how to extend notions of Type-I and Type-II error to more than two models without requiring a null. We then present the Error Control for Information Criteria (ECIC) method, a bootstrap approach to controlling Type-I error using Difference of Goodness of Fit (DGOF) distributions. We apply ECIC to empirical and simulated data in time series and regression contexts to illustrate its value for parametric Neyman–Pearson classification. An R package implementing the bootstrap method is publicly available.
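As a rough illustration of the general bootstrap-calibration idea described in the abstract (not the ECIC algorithm itself; the callables, the use of AIC differences, and the parametric bootstrap under the simpler model are all assumptions of this sketch):

```python
import numpy as np

def bootstrap_aic_cutoff(fit_null, fit_alt, simulate_null, data,
                         alpha=0.05, B=1000, seed=None):
    """Calibrate an AIC-difference cutoff so the simpler model is rejected
    with Type-I error roughly alpha.

    Illustrative parametric-bootstrap sketch, not the ECIC procedure itself.
    fit_null(data) / fit_alt(data) return fitted models exposing an .aic
    attribute; simulate_null(model, rng) draws a dataset from the fitted
    simpler model. All three callables are assumptions of this sketch.
    """
    rng = np.random.default_rng(seed)
    m0 = fit_null(data)
    observed = m0.aic - fit_alt(data).aic        # positive values favour the alternative
    boot = np.empty(B)
    for b in range(B):
        d = simulate_null(m0, rng)               # data generated under the simpler model
        boot[b] = fit_null(d).aic - fit_alt(d).aic
    cutoff = np.quantile(boot, 1.0 - alpha)      # null (1 - alpha) quantile of the difference
    return observed, cutoff, observed > cutoff
```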
- Conference Article
- 10.1109/aike52691.2021.00019
- Dec 1, 2021
There is a growing need for asymmetric error control in medical disease diagnosis because the cost of a false negative may greatly exceed the cost of a false positive. This paper aims to extend the asymmetric error control achieved using the Neyman-Pearson (NP) Lemma in binary classification to multiclass classification. The NP oracle inequalities for binary classes are not immediately applicable to multiclass NP classification, which motivates a multi-step procedure that extends the algorithm to multiple classes. First, a hierarchical order of misclassification severity is maintained for the classes and the most critical classes are identified. Second, NP methods are applied to classify the most critical class versus the rest, and then to the next most severe class in the list versus the rest, until a label is assigned. This approach is used to construct a novel tree-based classifier that enables asymmetric error control for multiclass classification and is evaluated on a cardiac arrhythmia dataset, where missing certain types of arrhythmia has more severe consequences than missing others. The results show that we are able to control the number of false negatives for the most critical classes in the multiclass classification problem.
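A schematic of the sequential one-versus-rest construction described in the abstract might look as follows; the binary NP routine fit_np_binary, the per-class bounds alphas, and the treatment of each critical class as the error-controlled class of its subproblem are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def fit_np_tree(X, y, severity_order, alphas, fit_np_binary):
    """Sequential one-vs-rest NP classifiers, most critical class first.

    severity_order: class labels ordered from most to least critical.
    alphas[c]: bound on the probability of missing class c, treated here as
    the controlled error of the c-vs-rest subproblem.
    fit_np_binary(X, y_binary, alpha) -> callable giving 0/1 predictions;
    an assumed binary NP routine (e.g. an umbrella-style classifier).
    """
    X, y = np.asarray(X), np.asarray(y)
    mask = np.ones(len(y), dtype=bool)            # training points still in play
    stages = []
    for c in severity_order[:-1]:
        y_bin = (y[mask] == c).astype(int)        # class c vs. the rest at this stage
        stages.append((c, fit_np_binary(X[mask], y_bin, alphas[c])))
        mask[mask] = y[mask] != c                 # drop class-c points for later stages
    return stages, severity_order[-1]             # last class serves as the fallback label

def predict_np_tree(x, stages, fallback):
    # Walk the severity hierarchy; the first stage that claims x assigns its label.
    for c, clf in stages:
        if clf(x) == 1:
            return c
    return fallback
```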
- Research Article
- 10.1002/for.3280
- Apr 14, 2025
- Journal of Forecasting
A deep learning binary classifier is proposed to test if asset returns follow martingale difference sequences. The Neyman–Pearson classification paradigm is applied to control the type I error of the test. In Monte Carlo simulations, I find that this approach has better power properties than variance ratio and portmanteau tests against several alternative processes. I apply this procedure to a large set of exchange rate returns and find that it detects several potential deviations from the martingale difference hypothesis that the conventional statistical tests fail to capture.
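For reference, the martingale difference hypothesis under test is the standard condition on returns $r_t$,

$$H_0:\; \mathbb{E}\big[r_t \mid \mathcal{F}_{t-1}\big] = 0 \quad \text{for all } t,$$

where $\mathcal{F}_{t-1}$ is the information set through time $t-1$; under the NP framing, the probability of falsely rejecting $H_0$ is held below the prescribed α.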
- Research Article
7
- 10.1007/s12532-021-00214-w
- Feb 3, 2022
- Mathematical Programming Computation
Stochastic gradient methods (SGMs) have been widely used for solving stochastic optimization problems. A majority of existing works assume no constraints or easy-to-project constraints. In this paper, we consider convex stochastic optimization problems with expectation constraints. For these problems, it is often extremely expensive to perform projection onto the feasible set. Several SGMs in the literature can be applied to solve the expectation-constrained stochastic problems. We propose a novel primal-dual type SGM based on the Lagrangian function. Different from existing methods, our method incorporates an adaptiveness technique to speed up convergence. At each iteration, our method queries an unbiased stochastic subgradient of the Lagrangian function, and then it updates the primal variables by an adaptive-SGM update and the dual variables by a vanilla-SGM update. We show that the proposed method has a convergence rate of $$O(1/\sqrt{k})$$ in terms of the objective error and the constraint violation. Although the convergence rate is the same as those of existing SGMs, we observe significantly faster convergence than an existing non-adaptive primal-dual SGM and a primal SGM on solving the Neyman–Pearson classification and quadratically constrained quadratic programs. Furthermore, we modify the proposed method to solve convex–concave stochastic minimax problems, for which we perform adaptive-SGM updates to both primal and dual variables. A convergence rate of $$O(1/\sqrt{k})$$ is also established for the modified method for solving minimax problems in terms of the primal-dual gap. Our code has been released at https://github.com/RPI-OPT/APriD.
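A bare-bones illustration of the primal-dual idea for an expectation-constrained problem min E[f(x)] subject to E[g(x)] ≤ 0 is sketched below, with an AMSGrad-style adaptive primal step and a plain dual step; it is a generic sketch, not the released APriD code. In the NP classification application, f would be a surrogate for the type II error and g a surrogate type I error minus α.

```python
import numpy as np

def primal_dual_sgm(grad_f, grad_g, g_val, x0, steps=1000,
                    eta_x=0.01, eta_z=0.01, beta1=0.9, beta2=0.99, seed=None):
    """Adaptive primal / vanilla dual stochastic gradient sketch for
    min E[f(x)] subject to E[g(x)] <= 0, using L(x, z) = f(x) + z * g(x).

    grad_f(x, rng), grad_g(x, rng): unbiased stochastic (sub)gradients.
    g_val(x, rng): unbiased stochastic estimate of the constraint value.
    Generic illustration only, not the released APriD implementation.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    z = 0.0                                         # dual variable, kept nonnegative
    m = np.zeros_like(x)                            # first-moment estimate for the primal step
    v = np.zeros_like(x)                            # second-moment estimate
    vhat = np.zeros_like(x)                         # running max of v (AMSGrad-style)
    for _ in range(steps):
        gx = grad_f(x, rng) + z * grad_g(x, rng)    # stochastic gradient of L in x
        m = beta1 * m + (1 - beta1) * gx
        v = beta2 * v + (1 - beta2) * gx ** 2
        vhat = np.maximum(vhat, v)
        x = x - eta_x * m / (np.sqrt(vhat) + 1e-8)  # adaptive primal update
        z = max(0.0, z + eta_z * g_val(x, rng))     # vanilla projected dual ascent
    return x, z
```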
- Research Article
129
- 10.1016/s0925-2312(98)00038-1
- Nov 1, 1998
- Neurocomputing
Predicting bankruptcies with the self-organizing map
- Research Article
- 10.12694/scpe.v7i2.365
- Jan 3, 2001
- Scalable Computing Practice and Experience
Analysis of Stochastic Automata Networks (SAN) is a well-established approach for modeling the behaviour of computing networks and systems, particularly parallel systems. The transient study of performance measures leads to time and space complexity problems as well as error control of the numerical results. SAN theory offers advantages such as avoiding building the entire infinitesimal generator and mitigating the time complexity problem thanks to tensor algebra properties. The aim of this study is the computation of the transient state probability vector of SAN models. We first select and modify the (stable) uniformization method in order to compute that vector in a sequential way. We also propose a new efficient algorithm to compute the product of a vector by a tensor sum of matrices. Then, we study the contribution of parallelism in coping with the increasing execution time of stiff models by developing a parallel uniformization algorithm. The latter algorithm is efficient and can process, within a reasonable computing time, systems with more than one million states and large mission time values.
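As a reminder of what the (sequential) uniformization method computes, the dense sketch below obtains the transient vector π(t) from a generator Q and initial distribution π(0); it deliberately ignores the tensor/Kronecker structure that SAN solvers exploit to avoid forming Q, and it does not address the numerical issues that arise for stiff models.

```python
import numpy as np

def uniformization(Q, pi0, t, tol=1e-10):
    """Transient distribution pi(t) of a CTMC with generator Q via uniformization.

    Dense illustration only: SAN solvers avoid forming Q explicitly and use
    tensor-algebra vector-descriptor products instead. For stiff models
    (large Lambda * t) the Poisson weights need careful scaling.
    """
    Q = np.asarray(Q, dtype=float)
    lam = float(np.max(-np.diag(Q)))        # uniformization rate >= largest exit rate
    P = np.eye(Q.shape[0]) + Q / lam        # DTMC kernel P = I + Q / Lambda
    v = np.array(pi0, dtype=float)          # holds pi(0) @ P^k as k grows
    weight = np.exp(-lam * t)               # Poisson(Lambda * t) weight for k = 0
    out, acc, k = weight * v, weight, 0
    while 1.0 - acc > tol:                  # truncate once nearly all weight is spent
        k += 1
        v = v @ P
        weight *= lam * t / k               # next Poisson weight
        out += weight * v
        acc += weight
    return out
```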