This paper presents a mechanism that infers a user's plans from his/her utterances by directing the inference process towards the more likely interpretations of a speaker's statements among the many possible interpretations. Our mechanism uses Bayesian probability theory to assess the likelihood of an interpretation, and it complements this assessment by …
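As a rough illustration of the Bayesian step only (not the paper's plan-inference mechanism), the sketch below ranks hypothetical candidate interpretations of an utterance by their posterior probability under Bayes' rule; every prior and likelihood is a made-up placeholder.

# Minimal sketch: rank candidate interpretations by posterior probability.
# The interpretation names and probabilities are illustrative only.

def rank_interpretations(priors, likelihoods):
    """Return interpretations sorted by posterior P(interp | utterance).

    priors      -- dict: interpretation -> P(interp)
    likelihoods -- dict: interpretation -> P(utterance | interp)
    """
    evidence = sum(priors[i] * likelihoods[i] for i in priors)
    posteriors = {i: priors[i] * likelihoods[i] / evidence for i in priors}
    return sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical example: two readings of "book the flight plan".
priors = {"plan_trip": 0.7, "reserve_document": 0.3}
likelihoods = {"plan_trip": 0.6, "reserve_document": 0.2}
print(rank_interpretations(priors, likelihoods))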
The Lexical Access Problem consists of determining the intended sequence of words corresponding to an input sequence of phonemes (basic speech sounds) that come from a low-level phoneme recognizer. In this paper we present an information-theoretic approach based on the Minimum Message Length Criterion for solving the Lexical Access …
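The sketch below illustrates the two-part message-length idea in its simplest form: each candidate word sequence for an observed phoneme string is scored by -log2 P(words) - log2 P(phonemes | words), and the sequence with the shortest total message is chosen. It is not the paper's lexical-access algorithm, and the candidate sequences and probabilities are placeholders.

import math

# Two-part message length: length(H) + length(D | H)
#   = -log2 P(words) - log2 P(phonemes | words)
def message_length(p_words, p_phonemes_given_words):
    return -math.log2(p_words) - math.log2(p_phonemes_given_words)

candidates = {
    # word sequence: (prior prob of the words, prob of the phonemes given them)
    "recognise speech":   (1e-6, 0.30),
    "wreck a nice beach": (1e-8, 0.35),
}
for words, (p_w, p_d) in candidates.items():
    print(f"{words!r}: {message_length(p_w, p_d):.1f} bits")
best = min(candidates, key=lambda w: message_length(*candidates[w]))
print("chosen:", best)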
We investigate the effect of paraphrase generation on document retrieval performance. Specifically, we describe experiments where three information sources are used to generate lexical paraphrases of queries posed to the Internet. These information sources are: WordNet, a Webster-based thesaurus, and a combination of Webster and WordNet. Corpus-based …
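Of the three information sources, WordNet is the one readily scriptable, so the sketch below shows one plausible way to generate lexical paraphrases of a query from WordNet synonyms via NLTK. It illustrates the general idea rather than the paper's paraphrase generator; the helper wordnet_paraphrases and the example query are hypothetical.

import itertools
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # one-off corpus download

def wordnet_paraphrases(query, max_synonyms=2):
    """Return query variants where each word may be replaced by a WordNet synonym."""
    alternatives = []
    for word in query.split():
        synonyms = {word}
        for synset in wn.synsets(word):
            synonyms.update(l.replace("_", " ") for l in synset.lemma_names())
        alternatives.append(sorted(synonyms)[: max_synonyms + 1])
    return [" ".join(combo) for combo in itertools.product(*alternatives)]

print(wordnet_paraphrases("cheap flight"))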
In this paper, we present a new co-training strategy that makes use of unlabelled data. It trains two predictors in parallel, with each predictor labelling the unlabelled data for training the other predictor in the next round. Both predictors are support vector machines, one trained using data from the original feature space, the other trained with new …
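A simplified sketch of the co-training loop described above, using scikit-learn SVMs: in each round, every predictor labels the unlabelled examples it is most confident about, and those examples extend the other predictor's training set. Splitting the feature vector in half to obtain the two views, and the data set itself, are assumptions made purely for illustration.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
labelled = np.arange(30)               # small labelled seed set
unlabelled = np.arange(30, 300)
views = [X[:, :10], X[:, 10:]]         # two feature views (illustrative split)
train_idx = [list(labelled), list(labelled)]
train_y = [list(y[labelled]), list(y[labelled])]

for _ in range(5):                      # a few co-training rounds
    clfs = [SVC(probability=True).fit(views[v][train_idx[v]], train_y[v])
            for v in range(2)]
    for v in range(2):
        if len(unlabelled) == 0:
            break
        proba = clfs[v].predict_proba(views[v][unlabelled])
        pick = np.argsort(proba.max(axis=1))[-5:]   # most confident examples
        other = 1 - v
        for i in pick:
            train_idx[other].append(unlabelled[i])
            train_y[other].append(int(clfs[v].classes_[proba[i].argmax()]))
        unlabelled = np.delete(unlabelled, pick)

print("final training-set sizes:", [len(t) for t in train_idx])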
In this paper, we outline the main steps leading to the development of the winning solution for Task 2 of KDD Cup 2002 (Yeast Gene Regulation Prediction). Our unusual solution was a pair of linear classifiers in high-dimensional space (∼14,000), developed with just 38 and 84 training examples, respectively, all belonging to the target class only. The …
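The winning solution itself is not reproduced here; the sketch below only illustrates the setting the abstract describes, fitting a linear one-class model (scikit-learn's OneClassSVM, not necessarily the authors' classifier) in a ~14,000-dimensional space from a few dozen positive-only examples, on random placeholder data.

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_pos = rng.normal(loc=0.5, scale=1.0, size=(38, 14000))   # positive examples only
clf = OneClassSVM(kernel="linear", nu=0.1).fit(X_pos)

X_new = rng.normal(loc=0.0, scale=1.0, size=(5, 14000))     # unseen cases
print(clf.decision_function(X_new))   # higher score = more like the target class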