Robert L. Mercer

Learn More
We describe a series o,f five statistical models o,f the translation process and give algorithms,for estimating the parameters o,f these models given a set o,f pairs o,f sentences that are translations o,f one another. We define a concept o,f word-by-word alignment between such pairs o,f sentences. For any given pair of such sentences each o,f our models(More)
We address the problem of predicting a word from previous words in a sample of text In particular we discuss n gram models based on classes of words We also discuss several statistical algorithms for assigning words to classes based on the frequency of their co occurrence with other words We nd that we are able to extract classes that have the avor of(More)
The field of machine translation is almost as old as the modern digital computer. In 1949 Warren Weaver suggested that the problem be attacked with statistical methods and ideas from information theory, an area which he, Claude Shannon, and others were developing at the time (Weaver 1949). Although researchers quickly abandoned this approach, advancing(More)
Speech recognition is formulated as a problem of maximum likelihood decoding. This formulation requires statistical models of the speech production process. In this paper, we describe a number of statistical models for use in speech recognition. We give special attention to determining the parameters for such models from sparse data. We also describe two(More)
We present an estimate of an upper bound of 1.75 bits for the entropy of characters in printed English, obtained by constructing a word trigram model and then computing the cross-entropy between this model and a balanced sample of English text. We suggest the well-known and widely available Brown Corpus of printed English as a standard against which to(More)
We describe a statistical technique for assigning senses to words. An instance of a word is assigned a sense by asking a question about the context in which the word appears. The question is constructed to have high mutual information with the translation of that instance in another language. When we incorporated this method of assigning senses into our(More)
An approach to automatic translation is outlined that utilizes technklues of statistical inl'ormatiml extraction from large data bases. The method is based on the availability of pairs of large corresponding texts that are translations of each other. In our case, the iexts are in English and French. Fundamental to the technique is a complex glossary of(More)
We describe a generative probabilistic model of natural language, which we call HBG, that takes advantage of detailed linguistic information to resolve ambiguity. HBG incorporates lexical, syntactic, semantic, and structural information from the parse tree into the disambiguation process in a novel way. We use a corpus of bracketed sentences, called a(More)