• Publications
  • Influence
Supervised and Unsupervised Discretization of Continuous Features
Binning, an unsupervised discretization method, is compared to entropy-based and purity-based methods, which are supervised algorithms, and it is found that the performance of the Naive-Bayes algorithm significantly improved when features were discretized using an entropy- based method. Expand
A Bayesian Approach to Filtering Junk E-Mail
This work examines methods for the automated construction of filters to eliminate such unwanted messages from a user’s mail stream, and shows the efficacy of such filters in a real world usage scenario, arguing that this technology is mature enough for deployment. Expand
Deep Knowledge Tracing
The utility of using Recurrent Neural Networks to model student learning and the learned model can be used for intelligent curriculum design and allows straightforward interpretation and discovery of structure in student tasks are explored. Expand
Toward Optimal Feature Selection
An efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion is given, showing that the algorithm effectively handles datasets with a very large number of features. Expand
Hierarchically Classifying Documents Using Very Few Words
This work proposes an approach that utilizes the hierarchical topic structure to decompose the classification task into a set of simpler problems, one at each node in the classification tree, which can be solved accurately by focusing only on a very small set of features, those relevant to the task at hand. Expand
Inductive learning algorithms and representations for text categorization
A comparison of the effectiveness of five different automatic learning algorithms for text categorization in terms of learning speed, realtime classification speed, and classification accuracy is compared. Expand
Learning Limited Dependence Bayesian Classifiers
  • M. Sahami
  • Mathematics, Computer Science
  • KDD
  • 2 August 1996
A framework for characterizing Bayesian classification methods is presented and a general induction algorithm is presented that allows for traversal of this spectrum depending on the available computational power for carrying out induction and its application in a number of domains with different properties. Expand
A web-based kernel function for measuring the similarity of short text snippets
This paper defines a similarity kernel function, mathematically analyze some of its properties, and provides examples of its efficacy, and shows the use of this kernel function in a large-scale system for suggesting related queries to search engine users. Expand
Text Mining: Classification, Clustering, and Applications
This book examines methods to automatically cluster and classify text documents and applies these methods in a variety of areas, including adaptive information filtering, information distillation, and text search. Expand
Autonomously Generating Hints by Inferring Problem Solving Policies
This paper autonomously generate hints for the Code.org `Hour of Code,' (which is to the best of the authors' knowledge the largest online course to date) using historical student data, and discovers that this statistic is highly predictive of a student's future success. Expand