Most classification algorithms expect the frequency of examples from each class to be roughly the same. However, this is rarely the case for real-world data, where very often the class probability distribution is non-uniform (or imbalanced). For these applications, the main problem is usually the fact that the costs of misclassifying examples belonging to…

- Dragos D. Margineantu, Stephen Bay, Philip Chan, Terran Lane
- SIGKDD Explorations
- 2005

For many applications, data mining systems are required to detect anomalous (abnormal, unmodeled, or unexpected) observations. This has so far proven to be a difficult challenge because anomalies are usually considered to be "non-normal" observations, where "normality" is typically defined by very complex concepts. Because of these and other reasons, there…

Many machine learning applications require classifiers that minimize an asymmetric cost function rather than the misclassification rate, and several recent papers have addressed this problem. However, these papers have either applied no statistical testing or have applied statistical methods that are not appropriate for the cost-sensitive setting. Without…

- Dragos D. Margineantu
- IJCAI
- 2005

For many classification tasks a large number of instances available for training are unlabeled and the cost associated with the labeling process varies over the input space. Meanwhile, virtually all these problems require classifiers that minimize a non-uniform loss function associated with the classification decisions (rather than the accuracy or number of…

- Dragos D. Margineantu
- ECML
- 2002

Decision tree models typically give good classification decisions but poor probability estimates. In many applications, it is important to have good probability estimates as well. This paper introduces a new algorithm, Bagged Lazy Option Trees (BLOTs), for constructing decision trees and compares it to an alternative, Bagged Probability Estimation Trees…

This paper addresses two cost-sensitive learning methodology issues. First, we ask the question of whether Bagging is always an appropriate procedure to compute accurate class-probability estimates for cost-sensitive classification. Second, we will point the reader to a potential source of erroneous results in the most common procedure of evaluating…

Many machine learning applications require classifiers that minimize an asymmetric loss function rather than the raw misclassification rate. We study methods for modifying C4.5 to incorporate arbitrary loss matrices. One way to incorporate loss information into C4.5 is to manipulate the weights assigned to the examples from different classes. For 2-class…
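
The example-weighting idea mentioned in the last abstract can be illustrated with a small sketch. This is not the paper's method, whose details are truncated here. It assumes one common heuristic: each class is weighted in proportion to the total loss of misclassifying it (the row sum of the loss matrix), and the weights are rescaled so they sum to the number of training examples. The function name `class_weights_from_loss` and the nested-dictionary layout of the loss matrix are hypothetical.

```python
from collections import Counter


def class_weights_from_loss(loss, labels):
    """Return one weight per class from a loss matrix (illustrative heuristic).

    loss[t][p] is the assumed cost of predicting class p when the true
    class is t. labels is the list of training labels.
    """
    counts = Counter(labels)
    n = len(labels)
    # Raw weight of a class: total cost of misclassifying it (row sum,
    # skipping the diagonal entry for a correct prediction).
    raw = {t: sum(c for p, c in loss[t].items() if p != t) for t in counts}
    # Rescale so the weights, summed over all training examples, equal n.
    total = sum(raw[t] * counts[t] for t in counts)
    return {t: raw[t] * n / total for t in counts}


# Hypothetical 2-class problem: missing a "pos" costs 9 times a false alarm.
labels = ["neg"] * 9 + ["pos"]
loss = {"neg": {"neg": 0, "pos": 1}, "pos": {"neg": 9, "pos": 0}}
weights = class_weights_from_loss(loss, labels)
```

With these assumed inputs the single "pos" example carries as much total weight as the nine "neg" examples combined, which is how a weight-aware tree learner would be steered toward the costlier class.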