
We address the problem of finding a subset of features that allows a supervised induction algorithm to induce small high-accuracy concepts. We examine notions of relevance and irrelevance, and show that the definitions used in the machine learning literature do not adequately partition the features into useful categories of relevance. We present definitions… (More)
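The relevance distinction this abstract points to can be seen in a toy example: in an XOR concept, each feature is individually useless as a predictor, yet the pair is fully informative. A minimal sketch (illustrative, not from the paper):

```python
# Hypothetical illustration: in an XOR concept, each feature alone is
# uncorrelated with the label, yet both features together determine it.
from itertools import product

data = [(x1, x2, x1 ^ x2) for x1, x2 in product([0, 1], repeat=2)]

# Knowing either feature alone leaves the label at 50/50.
for i in (0, 1):
    agree = sum(1 for row in data if row[i] == row[2])
    print(f"feature {i + 1} matches label in {agree}/4 cases")  # 2/4 each

# Yet the pair of features determines the label exactly.
assert all((r[0] ^ r[1]) == r[2] for r in data)
```

This is the kind of case where a feature is relevant only in combination with others, which single-feature notions of relevance fail to capture.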

We present MLC++, a library of C++ classes and tools for supervised Machine Learning. While MLC++ provides general learning algorithms that can be used by end users, the main objective is to provide researchers and experts with a wide variety of tools that can accelerate algorithm development, increase software reliability, provide comparison tools, and… (More)

The most popular delayed reinforcement learning technique, Q-learning (Watkins 1989), estimates the future reward expected from executing each action in every state. If these estimates are correct, then an agent can use them to select the action with maximal expected future reward in each state, and thus perform optimally. Watkins has proved that Q-learning… (More)
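The Q-learning update the abstract refers to can be sketched in a few lines; the toy chain environment and every parameter value below are illustrative assumptions, not from the paper:

```python
# Minimal tabular Q-learning sketch (Watkins' update rule). The chain
# environment, learning rate, discount, and exploration rate are all
# illustrative assumptions chosen for this sketch.
import random

random.seed(0)

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    # Actions: 0 = left, 1 = right; reward 1 only on reaching the last state.
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            if random.random() < eps:                      # explore
                a = random.randrange(2)
            else:                                          # exploit
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Watkins' update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

Q = q_learning()
```

If the estimates converge, the greedy action (moving right, toward the reward) has the higher estimated value in every non-terminal state, which is the sense in which correct estimates yield optimal behaviour.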

As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this… (More)
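The trade-off the abstract raises, sampling for speed versus quality of the mined result, can be illustrated with a toy estimate; the synthetic data and sample size below are assumptions for illustration:

```python
# Illustrative sketch: check whether a small random sample supports the
# same conclusion as the full data by comparing an estimate computed on
# the sample against the database-wide value. Data are synthetic.
import random

random.seed(0)
database = [random.gauss(10.0, 2.0) for _ in range(100_000)]  # stand-in for a large table

sample = random.sample(database, 1_000)  # a 1% sample

full_mean = sum(database) / len(database)
sample_mean = sum(sample) / len(sample)

# The sample estimate is already close; the open question the paper
# addresses is when such a sample is "acceptable" for mining.
assert abs(full_mean - sample_mean) < 0.5
```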

- Ron Kohavi, George H John
- 1998

In the feature subset selection problem, a learning algorithm is faced with the problem of selecting a relevant subset of features upon which to focus its attention, while ignoring the rest. To achieve the best possible performance with a particular learning algorithm on a particular training set, a feature subset selection method should consider how the… (More)
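The wrapper idea described here, scoring feature subsets by the accuracy of the learning algorithm itself rather than by a separate filter measure, can be sketched as greedy forward selection; the synthetic data and the 1-nearest-neighbour learner below are illustrative assumptions:

```python
# Hedged sketch of wrapper-style forward feature selection: each candidate
# subset is scored by the accuracy of the target learner itself (here a
# 1-nearest-neighbour classifier on synthetic data).
import random

random.seed(1)

def make_data(n=200):
    rows = []
    for _ in range(n):
        x = [random.random() for _ in range(4)]
        label = int(x[0] + x[1] > 1.0)      # only features 0 and 1 matter
        rows.append((x, label))
    return rows

def accuracy(train, test, feats):
    correct = 0
    for x, y in test:
        nearest = min(train, key=lambda t: sum((t[0][f] - x[f]) ** 2 for f in feats))
        correct += nearest[1] == y
    return correct / len(test)

data = make_data()
train, test = data[:150], data[150:]

selected, remaining, best = [], [0, 1, 2, 3], 0.0
while remaining:
    f, score = max(((f, accuracy(train, test, selected + [f])) for f in remaining),
                   key=lambda p: p[1])
    if score <= best:        # stop when no candidate improves accuracy
        break
    selected.append(f)
    remaining.remove(f)
    best = score
```

Because subsets are scored with the learner that will actually be deployed, the search naturally accounts for the learner's biases, at the cost of repeated training runs.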

- George H John, Pat Langley
- 1995

When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for… (More)
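The non-parametric alternative the abstract alludes to can be sketched with a Gaussian kernel density estimate in place of a single Gaussian fit; the rule-of-thumb bandwidth and the bimodal sample below are assumptions for illustration:

```python
# Minimal kernel density sketch: estimate a continuous density by placing
# a Gaussian kernel on each sample. Bandwidth uses a Silverman-style
# rule of thumb, an assumed choice for this illustration.
import math

def gaussian_kde(samples):
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n) or 1.0
    h = 1.06 * std * n ** -0.2           # rule-of-thumb bandwidth
    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / h) ** 2)
                   for s in samples) / (n * h * math.sqrt(2 * math.pi))
    return density

# A bimodal sample: a single Gaussian would smear the two modes together,
# while the kernel estimate keeps both peaks above the valley between them.
samples = [-2.0 + 0.1 * i for i in range(11)] + [3.0 + 0.1 * i for i in range(11)]
pdf = gaussian_kde(samples)
assert pdf(-1.5) > pdf(1.0) and pdf(3.5) > pdf(1.0)
```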

Finding and removing outliers is an important problem in data mining. Errors in large databases can be extremely common, so an important property of a data mining algorithm is robustness with respect to errors in the database. Most sophisticated methods in machine learning address this problem to some extent, but not fully, and can be improved by… (More)
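One simple robustness device in this spirit, not necessarily the paper's own method, is to flag records that lie far from the median in median-absolute-deviation units before mining the rest:

```python
# Hedged sketch of a robust outlier filter: flag values whose distance
# from the median, scaled by the median absolute deviation (MAD),
# exceeds a threshold. The threshold value is an assumed choice.
def mad_outliers(values, threshold=3.5):
    ordered = sorted(values)
    n = len(ordered)
    median = (ordered[n // 2] + ordered[(n - 1) // 2]) / 2
    mad = sorted(abs(v - median) for v in values)[n // 2] or 1.0
    return [v for v in values if abs(v - median) / mad > threshold]

readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 97.0]   # one corrupted entry
assert mad_outliers(readings) == [97.0]
```

Median-based statistics are used here precisely because, unlike the mean and standard deviation, they are not themselves dragged toward the errors they are meant to detect.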