Peter F. Brown

We describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another. We define a concept of word-by-word alignment between such pairs of sentences. For any given pair of such sentences, each of our models ...
We address the problem of predicting a word from previous words in a sample of text. In particular, we discuss n-gram models based on classes of words. We also discuss several statistical algorithms for assigning words to classes based on the frequency of their co-occurrence with other words. We find that we are able to extract classes that have the flavor of ...
The field of machine translation is almost as old as the modern digital computer. In 1949 Warren Weaver suggested that the problem be attacked with statistical methods and ideas from information theory, an area which he, Claude Shannon, and others were developing at the time (Weaver 1949). Although researchers quickly abandoned this approach, advancing ...
An approach to automatic translation is outlined that utilizes techniques of statistical information extraction from large data bases. The method is based on the availability of pairs of large corresponding texts that are translations of each other. In our case, the texts are in English and French. Fundamental to the technique is a complex glossary of ...
We present an estimate of an upper bound of 1.75 bits for the entropy of characters in printed English, obtained by constructing a word trigram model and then computing the cross-entropy between this model and a balanced sample of English text. We suggest the well-known and widely available Brown Corpus of printed English as a standard against which to ...
We describe a statistical technique for assigning senses to words. An instance of a word is assigned a sense by asking a question about the context in which the word appears. The question is constructed to have high mutual information with the translation of that instance in another language. When we incorporated this method of assigning senses into our ...