- Stephen D. Bay, Mark Schwabacher
- KDD
- 2003

Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in…
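The nested-loop idea in the excerpt above can be illustrated with a minimal sketch. This is not the authors' implementation: the excerpt is truncated, so the outlier score (average Euclidean distance to the `k` nearest neighbors), the random scan order, the cutoff-based pruning, and the names `top_outliers`, `k`, `n_outliers`, and `seed` are all assumptions made for illustration.

```python
import math
import random


def top_outliers(points, k=2, n_outliers=1, seed=0):
    """Return the n_outliers highest (score, index) pairs, best first.

    Assumed score: average Euclidean distance to the k nearest neighbors.
    Assumes len(points) > k.
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # assumption: compare against neighbors in random order

    top = []      # best outliers found so far, as (score, index)
    cutoff = 0.0  # score of the weakest outlier currently kept

    for i, p in enumerate(points):
        nearest = []  # the k smallest distances seen so far for point i
        pruned = False
        for j in order:
            if j == i:
                continue
            d = math.dist(p, points[j])
            if len(nearest) < k:
                nearest.append(d)
                nearest.sort()
            elif d < nearest[-1]:
                nearest[-1] = d
                nearest.sort()
            # With k neighbors known, the running score is an upper bound on
            # the final score. If it is already below the cutoff, point i
            # cannot be a top outlier, so the inner loop stops early.
            if len(nearest) == k and sum(nearest) / k < cutoff:
                pruned = True
                break
        if pruned:
            continue
        top.append((sum(nearest) / k, i))
        top.sort(reverse=True)
        top = top[:n_outliers]
        if len(top) == n_outliers:
            cutoff = top[-1][0]
    return top
```

In this sketch the worst case is still quadratic, since a point that is never pruned is compared against every other point. Any savings come from the early `break`, which fires more often as the cutoff rises.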

- Stephen D. Bay, Michael J. Pazzani
- Data Mining and Knowledge Discovery
- 2001

A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 through 1998. We present the problem of mining contrast sets: conjunctions of attributes and values that…

- Stephen D. Bay, Michael J. Pazzani
- KDD
- 1999

A fundamental task in data analysis is understanding the differences between several contrasting groups. These groups can represent different classes of objects, such as male or female students, or the same group over time, e.g. freshman students in 1993 versus 1998. We present the problem of mining contrast-sets: conjunctions of attributes and values that di…

- Stephen D. Bay
- ICML
- 1998

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, many combining methods do not improve the nearest neighbor classifier. In…

- Stephen D. Bay
- Intell. Data Anal.
- 1999

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging, Boosting, or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, these combining methods do not improve the nearest neighbor…

- Stephen D. Bay, Dennis F. Kibler, Michael J. Pazzani, Padhraic Smyth
- SIGKDD Explorations
- 2000

- Stephen D. Bay
- Knowledge and Information Systems
- 2001

Many algorithms in data mining can be formulated as a set-mining problem where the goal is to find conjunctions (or disjunctions) of terms that meet user-specified constraints. Set-mining techniques have been largely designed for categorical or discrete data where variables can only take on a fixed number of values. However, many datasets also contain…

- Stephen D. Bay, Michael J. Pazzani
- ICML
- 2000

- Stephen D. Bay
- KDD
- 2000

Many algorithms in data mining can be formulated as a set mining problem where the goal is to find conjunctions (or disjunctions) of terms that meet user specified constraints. Set mining techniques have been largely designed for categorical or discrete data where variables can only take on a fixed number of values. However, many data sets also contain…

- Stephen D. Bay, Jeff Shrager, Andrew Pohorille, Pat Langley
- Journal of Biomedical Informatics
- 2002

Discovering the complex regulatory networks that govern mRNA expression is an important but difficult problem. Many current approaches use only expression data from microarrays to infer the likely network structure. However, this ignores much existing knowledge because for a given organism and system under study, a biologist may already have a partial model…