We introduce a new kind of language model, which models whole sentences or utterances directly using the Maximum Entropy paradigm. The new model is conceptually simpler, and more naturally suited to modeling whole-sentence phenomena, than the conditional ME models proposed to date. By avoiding the chain rule, the model treats each sentence or utterance as a ...
BACKGROUND: Mathematical and computational models provide valuable tools that help public health planners evaluate competing health interventions, especially for novel circumstances that cannot be examined through observational or controlled studies, such as pandemic influenza. The spread of diseases like influenza depends on the mixing patterns within ...
Imagine trying to build a system to identify people, locations and organizations, or other arbitrary types, in a human language you are not familiar with. If we knew what kinds of words represent the classes people, locations and organizations, by examining enough text data they occur in, we could learn to recognize the contexts they occur in. And if we ...
Abstract: In the developing world, critical information, such as in the field of healthcare, can often ...
Acknowledgements: I would like to thank my advisor Alan Black for all his support and dedication (without him this thesis would not have been possible); Kenji Sagae for the insightful discussions about this thesis and, most importantly, for his patience and support; Guy Lebanon and Christian Monson, LTI colleagues, for the discussion about unsupervised ...
This paper introduces lattice-based language models, a new language modeling paradigm. These models construct multi-dimensional hierarchies of partitions and select the most promising partitions to generate the estimated distributions. We discuss a specific two-dimensional lattice and propose two primary features to measure the usefulness of each node: ...
We describe our ongoing efforts at adaptive statistical language modeling. Central to our approach is the Maximum Entropy (ME) Principle, allowing us to combine evidence from multiple sources, such as long-distance triggers and conventional short-distance trigrams. Given consistent statistical evidence, a unique ME solution is guaranteed to exist, and an ...
Information retrieval (IR) research has reached a point where it is appropriate to assess progress and to define a research agenda for the next five to ten years. This report summarizes a discussion of IR research challenges that took place at a recent workshop. The attendees of the workshop considered information retrieval research in a range of areas ...
This paper reports recent efforts to apply the speaker-independent SPHINX-II system to the DARPA Wall Street Journal continuous speech recognition task. In SPHINX-II, we incorporated additional dynamic and speaker-normalized features, replaced discrete models with sex-dependent semi-continuous hidden Markov models, augmented within-word triphones with ...