Robert C. Holte

This article reports an empirical investigation of the accuracy of rules that classify examples on the basis of a single attribute. On most datasets studied, the best of these very simple rules is as accurate as the rules induced by the majority of machine learning systems. The article explores the implications of this finding for machine learning research…
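To make the idea of a single-attribute rule concrete, here is a minimal Python sketch. It is not the article's own procedure: it assumes nominal attributes only, with no handling of continuous or missing values, and the names `one_rule` and `predict` and the toy data are hypothetical, invented for illustration.

```python
from collections import Counter, defaultdict

def one_rule(rows, labels):
    """Pick the single attribute whose value -> majority-class rule
    makes the fewest errors on the training data.

    Returns (attribute_index, rule_dict, default_class, training_accuracy).
    Assumption: every attribute is nominal (hashable values).
    """
    # Fallback class for attribute values never seen in training.
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for a in range(len(rows[0])):
        # Count the class labels observed for each value of attribute a.
        counts = defaultdict(Counter)
        for row, y in zip(rows, labels):
            counts[row[a]][y] += 1
        # Each value predicts its most frequent class.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        correct = sum(c[rule[v]] for v, c in counts.items())
        if best is None or correct > best[3]:
            best = (a, rule, default, correct)
    a, rule, default, correct = best
    return a, rule, default, correct / len(labels)

def predict(model, row):
    """Classify one example with a model returned by one_rule."""
    a, rule, default, _ = model
    return rule.get(row[a], default)

# Made-up toy data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
        ("rainy", "mild"), ("overcast", "hot")]
labels = ["no", "no", "yes", "yes", "yes"]
model = one_rule(rows, labels)
```

On this toy data the chosen attribute is the first one (outlook), since its per-value majority classes match every training label. Ties between attributes are broken by keeping the first one encountered, which is an arbitrary choice of this sketch.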
During a project examining the use of machine learning techniques for oil spill detection, we encountered several essential questions that we believe deserve the attention of the research community. We use our particular case study to illustrate such issues as problem formulation, selection of evaluation measures, and data preparation. We relate these…
The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed. Several abstraction techniques are combined to represent the game of 2-player Texas Hold’em, having size O(10^18), using closely related models each having size O(10^7). Despite the reduction in size by a factor of 100 billion, the resulting models…
This paper takes a new look at two sampling schemes commonly used to adapt machine learning algorithms to imbalanced classes and misclassification costs. It uses a performance analysis technique called cost curves to explore the interaction of over- and under-sampling with the decision tree learner C4.5. C4.5 was chosen as, when combined with one of the sampling…
This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they…
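As a rough illustration of what a cost curve plots, the Python sketch below uses the usual formulation of the technique, which the excerpt above does not spell out: each classifier becomes a straight line of normalized expected cost against a "probability cost" value that folds the class distribution and the misclassification costs into one number in [0, 1]. The function names and the example error rates are hypothetical.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """Combine the positive-class prior and the two misclassification
    costs into a single x-axis value in [0, 1]."""
    num = p_pos * cost_fn
    return num / (num + (1.0 - p_pos) * cost_fp)

def normalized_expected_cost(fpr, fnr, pc):
    """Cost line of one classifier: equals fpr at pc = 0 and fnr at pc = 1."""
    return fnr * pc + fpr * (1.0 - pc)

def lower_envelope(classifiers, grid):
    """Best achievable cost at each grid point, given (fpr, fnr) pairs."""
    return [min(normalized_expected_cost(fpr, fnr, pc)
                for fpr, fnr in classifiers)
            for pc in grid]

# Hypothetical operating points: the two trivial classifiers
# (always negative, always positive) and one learned classifier.
classifiers = [(0.0, 1.0), (1.0, 0.0), (0.2, 0.4)]
grid = [i / 10 for i in range(11)]
envelope = lower_envelope(classifiers, grid)
```

With equal costs, `probability_cost` reduces to the positive-class prior and the y-axis reduces to the error rate, which matches the "error rate or expected cost" reading in the abstract.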
Ideally, definitions induced from examples should consist of all, and only, disjuncts that are meaningful (e.g., as measured by a statistical significance test) and have a low error rate. Existing inductive systems create definitions that are ideal with regard to large disjuncts, but far from ideal with regard to small disjuncts, where a small (large)…
We exhibit a theoretically founded algorithm T2 for agnostic PAC-learning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate the performance of this learning algorithm T2 on 15 common “real-world” datasets, and show that for most of these datasets T2 provides simple decision trees with…
Building a high-performance poker-playing program is a challenging project. The best program to date, PsOpti, uses game theory to solve a simplified version of the game. Although the program plays reasonably well, it is oblivious to the opponent’s weaknesses and biases. Modeling the opponent to exploit predictability is critical to success at poker. This…
This paper presents a new perspective on the traditional AI task of problem solving and the techniques of abstraction and refinement. The new perspective is based on the well-known, but little-exploited, relation between problem solving and the task of finding a path in a graph between two given nodes. The graph-oriented view of abstraction suggests two new…