Neyman-Pearson classification algorithms and NP receiver operating characteristics

  • Xin Tong, Yang Feng, Jingyi Jessica Li
  • Science Advances
An umbrella algorithm and a graphical tool for asymmetric error control in binary classification. In many binary classification applications, such as disease diagnosis and spam detection, practitioners commonly face the need to limit type I error (that is, the conditional probability of misclassifying a class 0 observation as class 1) so that it remains below a desired threshold. To address this need, the Neyman-Pearson (NP) classification paradigm is a natural choice; it minimizes type II… 
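As a rough illustration of type I error control with high probability, the sketch below thresholds held-out class 0 scores at an order statistic chosen through a binomial tail bound. The truncated abstract above does not spell out this rule, so the function names (`np_threshold_rank`, `np_threshold`), the tolerance `delta`, and the convention that larger scores favor class 1 are all assumptions made for illustration.

```python
from math import comb


def np_threshold_rank(n, alpha=0.05, delta=0.05):
    """Smallest 1-indexed rank k whose violation bound is <= delta, else None.

    Assumes n i.i.d. class 0 scores with a continuous distribution. If the
    threshold is the k-th smallest score, the probability that the type I
    error exceeds alpha is sum_{j=k}^{n} C(n, j) (1 - alpha)^j alpha^(n - j).
    """
    for k in range(1, n + 1):
        bound = sum(
            comb(n, j) * (1 - alpha) ** j * alpha ** (n - j)
            for j in range(k, n + 1)
        )
        if bound <= delta:
            return k
    return None  # too few class 0 scores for this (alpha, delta) pair


def np_threshold(class0_scores, alpha=0.05, delta=0.05):
    """Score threshold; predict class 1 when a new score exceeds it."""
    scores = sorted(class0_scores)
    k = np_threshold_rank(len(scores), alpha, delta)
    if k is None:
        raise ValueError("not enough class 0 scores for this alpha and delta")
    return scores[k - 1]
```

With `alpha = delta = 0.05`, this bound needs at least 59 held-out class 0 scores before any rank qualifies, because `0.95 ** 58` is still above 0.05.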

Non-splitting Neyman-Pearson Classifiers

Leveraging a canonical linear discriminant analysis model, this work derives a quantitative CLT for a certain functional of quadratic forms of the inverses of the sample and population covariance matrices and develops, for the first time, NP classifiers that do not split the training sample.

Neyman-Pearson Classification Via Context Trees

  • Basarbatu Can, H. Özkan
  • Computer Science
    2020 28th Signal Processing and Communications Applications Conference (SIU)
  • 2020
An NP classification method that solves nonlinear problems via context trees in an online manner, achieving an average 66% increase in the area under the ROC curve along with precise control over the desired type I error, compared with algorithms that do not use context trees and can only solve linear problems.

Bridging Cost-sensitive and Neyman-Pearson Paradigms for Asymmetric Binary Classification

The methodological connections between the cost-sensitive and Neyman-Pearson paradigms are studied for the first time, and the TUBE-CS algorithm is developed to bridge the two paradigms from the perspective of controlling the population type I error.

Active Learning for Online Nonlinear Neyman-Pearson Classification

  • Basarbatu Can, H. Özkan
  • Computer Science
    2022 30th Signal Processing and Communications Applications Conference (SIU)
  • 2022
An active learning method for online context tree based ensemble NP classifiers that prioritizes training samples that have high uncertainty (greater than a constant threshold) among different classifiers of the ensemble model is proposed.

Asymmetric error control under imperfect supervision: a label-noise-adjusted Neyman-Pearson umbrella algorithm

This work proposes the first theory-backed algorithm that adapts most state-of-the-art classification methods to training label noise under the Neyman-Pearson classification paradigm; the resulting classifiers not only control the type I error with high probability under the desired level but also improve power.

Neyman-Pearson Multi-class Classification via Cost-sensitive Learning

This work studies the multi-class NP problem by connecting it to the cost-sensitive (CS) problem and proposes two algorithms; it is believed to be the first work to solve the multi-class NP problem via cost-sensitive learning techniques with theoretical guarantees.

Imbalanced classification: an objective-oriented review

An objective-oriented review of the common resampling techniques for binary classification under imbalanced class sizes is provided; the take-away message is that, with imbalanced data, one should usually consider all combinations of resampling techniques and base classification methods.

Imbalanced classification: A paradigm‐based review

A paradigm‐based review of the common resampling techniques for binary classification under imbalanced class sizes, which considers the classical paradigm that minimizes the overall classification error, the cost‐sensitive learning paradigm, and the Neyman–Pearson paradigm.
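The first two paradigms named above can be contrasted through the cutoff each one places on an estimated posterior probability of class 1. The sketch below is a minimal illustration under the assumption that such a posterior estimate is available; the function names and cost values are hypothetical and are not taken from the review.

```python
def classical_threshold():
    """Classical paradigm: minimize overall error, so cut the posterior at 0.5."""
    return 0.5


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Cost-sensitive paradigm: predict class 1 when the expected cost of doing so,
    cost_fp * (1 - posterior), is below the expected cost of predicting class 0,
    cost_fn * posterior. Solving for the posterior gives this cutoff.
    """
    return cost_fp / (cost_fp + cost_fn)


def classify(posterior, threshold):
    """Return 1 when the estimated P(Y = 1 | x) exceeds the threshold, else 0."""
    return int(posterior > threshold)
```

Equal costs recover the classical cutoff of 0.5, while making a false positive nine times as costly as a false negative raises the cutoff to 0.9. The NP paradigm instead picks the cutoff from data so that the type I error stays under a user-specified level.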

Neyman-Pearson Criterion (NPC): A Model Selection Criterion for Asymmetric Binary Classification

A real data case study of breast cancer suggests that the Neyman-Pearson criterion is a practical criterion that leads to the discovery of novel gene markers with both high sensitivity and specificity for breast cancer diagnosis.

Hierarchical Neyman-Pearson Classification for Prioritizing Severe Disease Categories in COVID-19 Patient Data

This work proposes a hierarchical NP (H-NP) framework and an umbrella algorithm that generally adapts to popular classification methods and controls the under-diagnosis errors with high probability on an integrated collection of single-cell RNA-seq datasets for 740 patients.



A survey on Neyman‐Pearson classification and suggestions for future research

Though NP classification has the potential to be an important subfield in the classification literature, it has not received much attention in the statistics and machine learning communities.

Neyman-Pearson Classification under High-Dimensional Settings

This article is the first attempt to construct classifiers with guaranteed theoretical performance under the NP paradigm in high-dimensional settings; it uses a plug-in approach to build NP-type classifiers for Naive Bayes models.

A plug-in approach to Neyman-Pearson classification

  • Xin Tong
  • Computer Science
    J. Mach. Learn. Res.
  • 2013
This paper proposes two related plug-in classifiers which amount to thresholding respectively the class conditional density ratio and the regression function and derives oracle inequalities that can be viewed as finite sample versions of risk bounds.
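To make the idea of thresholding a class-conditional density ratio concrete, here is a minimal one-dimensional sketch. It is not the estimator from the paper: the Gaussian density fits, the empirical-quantile cutoff on held-out class 0 points, and the names `fit_log_density_ratio` and `plug_in_classifier` are all assumptions chosen to keep the example short.

```python
import math
from statistics import NormalDist


def fit_log_density_ratio(class0_train, class1_train):
    """Return x -> log(p1(x) / p0(x)) using 1-D Gaussian fits (an assumption)."""
    d0 = NormalDist.from_samples(class0_train)
    d1 = NormalDist.from_samples(class1_train)

    def log_pdf(dist, x):
        z = (x - dist.mean) / dist.stdev
        return -0.5 * z * z - math.log(dist.stdev) - 0.5 * math.log(2 * math.pi)

    return lambda x: log_pdf(d1, x) - log_pdf(d0, x)


def plug_in_classifier(log_ratio, class0_holdout, alpha):
    """Classify as 1 when the log ratio exceeds a cutoff set on held-out class 0.

    The cutoff is the empirical (1 - alpha) quantile of the held-out class 0
    log ratios, so at most a fraction alpha of those points are labeled 1.
    """
    ratios = sorted(log_ratio(x) for x in class0_holdout)
    idx = math.ceil((1 - alpha) * len(ratios)) - 1
    idx = max(0, min(len(ratios) - 1, idx))
    cutoff = ratios[idx]
    return lambda x: int(log_ratio(x) > cutoff)
```

Working with the log ratio avoids dividing by a density that underflows to zero far from the class 0 mean. The empirical-quantile cutoff only bounds the type I error on the held-out points; it carries no high-probability guarantee for the population error.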

Comparison and Design of Neyman-Pearson Classifiers

This work proposes two families of performance measures for evaluating and comparing classifiers, suggests one criterion in particular for practical use, and presents general learning rules that satisfy performance guarantees with respect to these criteria.

Genomic Applications of the Neyman–Pearson Classification Paradigm

This chapter reviews the NP classification literature, with a focus on the genomic applications as well as the contribution to the NP classification theory and algorithms.

On Generalizable Low False-Positive Learning Using Asymmetric Support Vector Machines

This paper proposes the notion of Asymmetric Support Vector Machine (ASVM), an asymmetric extension of the SVM that employs a new objective that models the imbalance between the costs of false predictions from different classes in a novel way such that user tolerance on false-positive rate can be explicitly specified.

SMOTE: Synthetic Minority Over-sampling Technique

A combination of over-sampling the minority (abnormal) class and under-sampling the majority class can achieve better classifier performance (in ROC space); the method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
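The minority over-sampling step can be sketched as follows: each synthetic point is an interpolation between a minority sample and one of its k nearest minority neighbors. This is a simplified illustration, not a reference implementation; the function name `smote_oversample`, its parameters, and the use of plain tuples and squared Euclidean distance are assumptions.

```python
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbors.

    minority is a list of equal-length numeric tuples. Each synthetic point
    lies on the segment between a randomly chosen minority point and one of
    its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority points by squared distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbor = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbor)))
    return synthetic
```

Because every synthetic point is a convex combination of two existing minority points, the new samples stay inside the region already spanned by the minority class.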

Neyman-Pearson Classification, Convexity and Stochastic Constraints

The Neyman-Pearson paradigm is implemented to deal with asymmetric errors in binary classification with a convex loss and a new classifier is obtained by solving an optimization problem with an empirical objective and an empirical constraint.

A Neyman-Pearson approach to statistical learning

This paper investigates an extension of NP theory to situations in which one has no knowledge of the underlying distributions except for a collection of independent and identically distributed (i.i.d.) training examples from each hypothesis and demonstrates that several concepts from statistical learning theory have counterparts in the NP context.