• Corpus ID: 15685462

Threshold Choice Methods: the Missing Link

@article{HernndezOrallo2011ThresholdCM,
  title={Threshold Choice Methods: the Missing Link},
  author={Jos{\'e} Hern{\'a}ndez-Orallo and Peter A. Flach and C. Ferri},
  journal={ArXiv},
  year={2011},
  volume={abs/1112.2640}
}
Many performance metrics have been introduced in the literature for the evaluation of classification performance, each of them with different origins and areas of application. These metrics include accuracy, macro-accuracy, area under the ROC curve or the ROC convex hull, the mean absolute error and the Brier score or mean squared error (with its decomposition into refinement and calibration). One way of understanding the relation among these metrics is by means of variable operating conditions… 
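
As a point of reference (not part of the paper), the following minimal sketch computes several of the metrics named in the abstract for a binary probabilistic classifier, using plain NumPy; the labels, scores, and the 0.5 threshold are illustrative.

```python
# Minimal sketch (not from the paper): computing several of the metrics the
# abstract mentions for a probabilistic binary classifier, using plain NumPy.
# `y` are true labels in {0, 1}; `p` are predicted probabilities of class 1.
import numpy as np

def evaluate(y, p, threshold=0.5):
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    y_hat = (p >= threshold).astype(float)

    accuracy = np.mean(y_hat == y)          # complement of 0/1 loss
    mae = np.mean(np.abs(p - y))            # mean absolute error
    brier = np.mean((p - y) ** 2)           # Brier score (mean squared error)

    # AUC as the probability that a random positive is ranked above a
    # random negative (ties count as 1/2).
    pos, neg = p[y == 1], p[y == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    auc = greater + 0.5 * ties

    return {"accuracy": accuracy, "MAE": mae, "Brier": brier, "AUC": auc}

if __name__ == "__main__":
    y = [0, 0, 1, 1, 1, 0]
    p = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
    print(evaluate(y, p))
```
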
3 Citations

Machine Learning - The Art and Science of Algorithms that Make Sense of Data

TLDR
Peter Flach's clear, example-based approach begins by discussing how a spam filter works, which gives an immediate introduction to machine learning in action, with a minimum of technical fuss.

Quantifier Guided Aggregation for the Veracity Assessment of Online Reviews

TLDR
A multi-criteria decision-making approach based on both the assessment of multiple criteria and the use of aggregation operators, with the aim of obtaining a veracity score for each review, making it possible to detect fake reviews.
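
The paper's specific operators are not reproduced on this page; as a generic illustration of quantifier-guided aggregation in the OWA sense, the sketch below derives weights from a fuzzy quantifier and applies them to sorted per-criterion scores. The quantifier exponent and the criteria named in the comment are assumptions.

```python
# Generic illustration (not the paper's exact method): quantifier-guided OWA
# aggregation of per-criterion scores into a single veracity score.
import numpy as np

def quantifier(r, a=2.0):
    # A regular increasing monotone quantifier Q(r) = r**a; a > 1 behaves like "most".
    return r ** a

def owa(scores, a=2.0):
    scores = np.sort(np.asarray(scores, dtype=float))[::-1]   # descending order
    n = len(scores)
    i = np.arange(1, n + 1)
    weights = quantifier(i / n, a) - quantifier((i - 1) / n, a)  # sum to 1
    return float(np.dot(weights, scores))

# Hypothetical per-review criteria scores in [0, 1]
# (e.g. reviewer history, linguistic cues, rating deviation).
print(owa([0.9, 0.6, 0.3], a=2.0))
```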

Model selection and assessment for multi-species occupancy models.

TLDR
Several different contemporary Bayesian hierarchical approaches for checking and validating multi-species occupancy models are examined and applied to a freshwater aquatic study system in Colorado, USA, to better understand the diversity and distributions of plains fishes.

References

SHOWING 1-10 OF 53 REFERENCES

An experimental comparison of performance measures for classification

Modifying ROC Curves to Incorporate Predicted Probabilities

TLDR
A new probabilistic version of AUC is introduced, called pAUC, which evaluates ranking performance, but also takes the magnitude of the probabilities into account.

Measuring classifier performance: a coherent alternative to the area under the ROC curve

  • D. Hand
  • Computer Science
    Machine Learning
  • 2009
TLDR
A simple, valid alternative to the AUC is proposed, and the AUC's property of being fundamentally incoherent in terms of misclassification costs is explored in detail.

A Coherent Interpretation of AUC as a Measure of Aggregated Classification Performance

TLDR
Should one wish to consider only optimal thresholds, it is demonstrated that a simple and more intuitive alternative to Hand's H measure is already available in the form of the area under the cost curve, whose weighting over operating conditions is uniform and hence model-independent.
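
In outline, and with one common parameterisation (the exact normalisations differ between papers), both measures average the minimum attainable loss over operating conditions and differ only in the weight placed on the cost proportion c:

```latex
% Sketch of the relationship (conventions and normalisations vary between papers).
% c in [0,1] is the cost proportion, pi_+ and pi_- the class priors,
% FNR(t) and FPR(t) the error rates at decision threshold t.
\[
  L(t;c) \;=\; c\,\pi_{+}\,\mathrm{FNR}(t) \;+\; (1-c)\,\pi_{-}\,\mathrm{FPR}(t)
\]
\[
  \text{area under the cost curve:}\quad
  \int_{0}^{1} \min_{t} L(t;c)\,\mathrm{d}c
  \qquad\text{(uniform, model-independent weight on } c\text{)}
\]
\[
  \text{Hand's } H\text{:}\quad
  \int_{0}^{1} \min_{t} L(t;c)\, w(c)\,\mathrm{d}c,
  \qquad w = \mathrm{Beta}(2,2)\ \text{(up to normalisation)}
\]
```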

A Survey on Graphical Methods for Classification Predictive Performance Evaluation

TLDR
This paper surveys various graphical methods often used for predictive performance evaluation and presents these methods in the same framework to shed some light on deciding which methods are more suitable to use in different situations.

Improving Accuracy and Cost of Two-class and Multi-class Probabilistic Classifiers Using ROC Curves

TLDR
A hill-climbing approach that adjusts the weights for each class in a pre-defined order is proposed, leading to significant improvements in accuracy over the naive Bayes classifier.
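
The exact procedure is not reproduced here; a rough, illustrative sketch of the idea is to greedily adjust a per-class weight applied to the classifier's probability estimates, keeping a change only when held-out accuracy improves.

```python
# Rough sketch (illustrative, not the paper's exact algorithm): greedy
# hill-climbing over per-class weights applied to predicted probabilities.
import numpy as np

def weighted_predict(proba, weights):
    # proba: (n_samples, n_classes) probability estimates; weights: (n_classes,)
    return np.argmax(proba * weights, axis=1)

def hill_climb_weights(proba, y, step=0.1, sweeps=5):
    n_classes = proba.shape[1]
    weights = np.ones(n_classes)
    best_acc = np.mean(weighted_predict(proba, weights) == y)
    for _ in range(sweeps):
        for k in range(n_classes):              # classes in a pre-defined order
            for delta in (+step, -step):
                trial = weights.copy()
                trial[k] = max(trial[k] + delta, 1e-6)
                acc = np.mean(weighted_predict(proba, trial) == y)
                if acc > best_acc:
                    weights, best_acc = trial, acc
    return weights, best_acc
```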

Properties and Benefits of Calibrated Classifiers

TLDR
It is proved that a well-calibrated classifier provides bounds on the Bayes error, that calibrating a classifier is guaranteed not to decrease classification accuracy, and that the calibration procedure yields the threshold or thresholds on the decision rule that minimize the classification error.
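
A small numerical illustration of the last point (illustrative code, not from the paper): with calibrated probabilities, thresholding at c_FP / (c_FP + c_FN) minimises expected cost, which reduces to 0.5 under equal costs.

```python
# Sketch: with calibrated probabilities, thresholding at
# c_FP / (c_FP + c_FN) minimises expected cost (0.5 for plain error).
import numpy as np

def expected_cost(y, p, threshold, c_fp=1.0, c_fn=1.0):
    y_hat = (np.asarray(p) >= threshold).astype(int)
    fp = np.sum((y_hat == 1) & (np.asarray(y) == 0))
    fn = np.sum((y_hat == 0) & (np.asarray(y) == 1))
    return (c_fp * fp + c_fn * fn) / len(y)

rng = np.random.default_rng(0)
p = rng.uniform(size=100_000)          # calibrated by construction:
y = (rng.uniform(size=p.size) < p)     # P(y = 1 | p) = p
c_fp, c_fn = 1.0, 4.0                  # assumed costs: false negatives 4x as costly

thresholds = np.linspace(0.01, 0.99, 99)
costs = [expected_cost(y, p, t, c_fp, c_fn) for t in thresholds]
print("best threshold ~", thresholds[int(np.argmin(costs))],
      "theory:", c_fp / (c_fp + c_fn))
```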

Comparing classifiers when the misallocation costs are uncertain

Quantifying counts and costs via classification

  • George Forman
  • Computer Science
    Data Mining and Knowledge Discovery
  • 2008
TLDR
The paper describes a variety of methods and evaluates them with a suitable methodology, revealing which methods give reliable estimates when training data is scarce, the testing class distribution differs widely from training, and the positive class is rare, e.g., 1% positives.
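
One simple correction in this line of work, often called adjusted count, rescales the observed rate of positive predictions using the classifier's true and false positive rates estimated from training data; the sketch and the numbers below are illustrative.

```python
# Sketch of an adjusted-count style correction for quantification:
# estimate the positive prevalence in a test set from the raw rate of
# positive predictions, corrected by tpr/fpr measured via cross-validation.
import numpy as np

def adjusted_count(pred_test, tpr, fpr):
    # pred_test: 0/1 predictions on the (unlabelled) test set.
    raw_rate = np.mean(pred_test)
    if tpr - fpr <= 0:
        return raw_rate                      # classifier is uninformative
    return float(np.clip((raw_rate - fpr) / (tpr - fpr), 0.0, 1.0))

# Hypothetical numbers: the classifier flags 6% of test items as positive,
# with tpr = 0.80 and fpr = 0.05 estimated on training data.
print(adjusted_count(np.r_[np.ones(60), np.zeros(940)], tpr=0.80, fpr=0.05))
# -> roughly (0.06 - 0.05) / 0.75, i.e. about 1.3% positives
```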

The Foundations of Cost-Sensitive Learning

TLDR
It is argued that changing the balance of negative and positive training examples has little effect on the classifiers produced by standard Bayesian and decision tree learning methods, and the recommended way of applying one of these methods is to learn a classifier from the training set and then to compute optimal decisions explicitly using the probability estimates given by the classifier.
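
The optimal decisions mentioned here have a simple closed form. Writing c(i, j) for the cost of predicting class i when the true class is j, and assuming zero cost for correct decisions, the rule applied to a probability estimate p = P(y = 1 | x) is:

```latex
% Predict class 1 whenever its expected cost is no larger than that of class 0:
% (1 - p)\,c(1,0) \le p\,c(0,1)
\[
  \hat{y}(x) = 1
  \quad\Longleftrightarrow\quad
  p \;\ge\; p^{*} \;=\; \frac{c(1,0)}{c(1,0) + c(0,1)}
\]
```
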
...