Bhavani Raskutti

There are many practical applications where learning from single-class examples is either the only possible solution or has a distinct performance advantage. The first case occurs when obtaining examples of a second class is difficult, e.g., classifying sites of "interest" based on web accesses. The second situation is exemplified by the gene knock-out...
In this paper, we outline the main steps leading to the development of the winning solution for Task 2 of KDD Cup 2002 (Yeast Gene Regulation Prediction). Our unusual solution was a pair of linear classifiers in high-dimensional space (∼14,000 dimensions), developed with just 38 and 84 training examples, respectively, all belonging to the target class only. The...
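As an illustration of how a linear scorer can be built from target-class examples alone, the sketch below uses a simple centroid rule: the weight vector is the normalized mean of the positive examples, and candidates are ranked by their dot product with it. This is an assumed stand-in technique, not the method described in the paper; the function names and the toy three-dimensional data are hypothetical.

```python
import math

def train_centroid(positives):
    """Return a unit-length weight vector: the mean of the positive examples.

    Assumption: a centroid scorer stands in for the (unspecified) linear
    classifier; only target-class examples are used.
    """
    dim = len(positives[0])
    mean = [sum(x[i] for x in positives) / len(positives) for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in mean)) or 1.0
    return [v / norm for v in mean]

def score(w, x):
    """Linear score: dot product of the weight vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

# Hypothetical toy data: three positive examples in three dimensions.
positives = [[1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
w = train_centroid(positives)

# Rank unlabeled candidates by similarity to the target-class centroid.
candidates = {"a": [1.0, 0.0, 1.0], "b": [0.0, 1.0, 0.0]}
ranked = sorted(candidates, key=lambda k: score(w, candidates[k]), reverse=True)
print(ranked)
```

Because no negative examples are available, the output is best read as a ranking rather than a hard decision; any threshold on the score would have to be chosen separately.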
Analysis of naturally occurring information-seeking dialogues indicates that they usually consist of a number of distinct discourse segments, such as a greeting segment, a request issued by a user, an optional clarification segment, a transfer of information segment, and a final closing segment. The clarification interaction is often initiated by the...
This paper presents a mechanism which infers a user's plans from his/her utterances by directing the inference process towards the more likely interpretations of a speaker's statements among many possible interpretations. Our mechanism uses Bayesian theory of probability to assess the likelihood of an interpretation, and it complements this assessment by...
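The idea of ranking candidate interpretations by probability can be sketched with Bayes' rule: the posterior of each interpretation is proportional to its prior times the likelihood of the utterance under it. The interpretation names and the numbers below are hypothetical, and the sketch covers only the generic Bayesian step, not the paper's full mechanism.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over a finite set of candidate interpretations.

    priors[h]      : P(h), prior probability of interpretation h
    likelihoods[h] : P(utterance | h)
    Returns P(h | utterance) for every h.
    """
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: v / total for h, v in unnormalized.items()}

# Hypothetical interpretations of a single user utterance.
priors = {"book_flight": 0.5, "check_schedule": 0.5}
likelihoods = {"book_flight": 0.2, "check_schedule": 0.6}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)  # the most likely interpretation
print(best, post)
```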
The increasing availability of a large number of interactive multi-media information services means that users have a large and diverse collection of choices open to them. This diversity and choice may present navigation difficulties to users which can dissuade them from using such services. One method of assisting users to navigate through large...
An important problem in clustering is how to decide what is the best set of clusters for a given data set, in terms of both the number of clusters and the membership of those clusters. In this paper we develop four criteria for measuring the quality of different sets of clusters. These criteria are designed so that different criteria prefer cluster sets...
We investigate two seemingly incompatible approaches for improving document retrieval performance in the context of question answering: query expansion and query reduction. Queries are expanded by generating lexical paraphrases. Syntactic, semantic and corpus-based frequency information is used in this process. Queries are reduced by removing words that may...
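The two operations can be sketched side by side. In this minimal example, expansion adds lexical paraphrases from a small lookup table and reduction drops words found in a fixed list. Both the paraphrase table and the removal list are hypothetical; the abstract does not state which words are removed, so the fixed list is purely an assumption for illustration.

```python
# Hypothetical paraphrase table; a real system would derive paraphrases
# from syntactic, semantic and corpus-based frequency information.
PARAPHRASES = {"buy": ["purchase"], "car": ["automobile", "vehicle"]}

# Hypothetical removal list; the actual removal criterion is an assumption.
LOW_VALUE = {"where", "can", "i", "a", "the", "to"}

def expand_query(terms):
    """Append known paraphrases after each original term."""
    out = []
    for term in terms:
        out.append(term)
        out.extend(PARAPHRASES.get(term, []))
    return out

def reduce_query(terms):
    """Drop terms that appear in the removal list."""
    return [t for t in terms if t not in LOW_VALUE]

query = "where can i buy a car".split()
reduced = reduce_query(query)
expanded = expand_query(reduced)
print(reduced)
print(expanded)
```

Applying reduction before expansion, as done here, is one possible ordering; the two steps are independent functions and can be combined either way.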