A Statistical Approach to Machine Translation

Abstract

In this paper, we present a statistical approach to machine translation. We describe the application of our approach to translation from French to English and give preliminary results.

The field of machine translation is almost as old as the modern digital computer. In 1949 Warren Weaver suggested that the problem be attacked with statistical methods and ideas from information theory, an area which he, Claude Shannon, and others were developing at the time (Weaver 1949). Although researchers quickly abandoned this approach, advancing numerous theoretical objections, we believe that the true obstacles lay in the relative impotence of the available computers and the dearth of machine-readable text from which to gather the statistics vital to such an attack.

Today, computers are five orders of magnitude faster than they were in 1950 and have hundreds of millions of bytes of storage. Large, machine-readable corpora are readily available. Statistical methods have proven their value in automatic speech recognition (Bahl et al. 1983) and have recently been applied to lexicography (Sharman et al. 1988). We feel that it is time to give them a chance in machine translation.

The job of a translator is to render in one language the meaning expressed by a passage of text in another language. This task is not always straightforward. For example, the translation of a word may depend on words quite far from it. Some English translators of Proust's seven-volume work A la Recherche du Temps Perdu have striven to make the first word of the first volume the same as the last word of the last volume because the French original begins and ends with the same word (Bernstein 1988). Thus, in its most highly developed form, translation involves a careful study of the original text and may even encompass a detailed analysis of the author's life and circumstances. We, of course, do not hope to reach these pinnacles of the translator's art.

In this paper, we consider only the translation of individual sentences. Usually, there are many acceptable translations of a particular sentence, the choice among them being largely a matter of taste. We take the view that every sentence in one language is a possible translation of any sentence in the other. We assign to every pair of sentences (S, T) a probability, Pr(T|S), to be interpreted as the probability that a translator will produce T in the target language when presented …
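To make the interpretation of Pr(T|S) concrete, the following minimal sketch estimates it by relative frequency from a list of observed (S, T) pairs. The function name `conditional_translation_probs` and the toy sentence pairs are hypothetical and serve only as illustration; they are not taken from the paper. A plain relative-frequency estimate also assigns probability zero to unseen pairs, whereas the text above treats every sentence as a possible translation of any other, so a realistic model would need some form of smoothing.

```python
from collections import Counter


def conditional_translation_probs(observations):
    """Estimate Pr(T|S) by relative frequency from observed (S, T) pairs.

    Hypothetical helper for illustration: each observation records one
    target sentence T produced for a source sentence S.
    """
    pair_counts = Counter(observations)
    source_counts = Counter(s for s, _ in observations)
    return {
        (s, t): count / source_counts[s]
        for (s, t), count in pair_counts.items()
    }


# Toy data (assumed): four renderings of one French sentence.
observations = [
    ("Le chat dort.", "The cat sleeps."),
    ("Le chat dort.", "The cat sleeps."),
    ("Le chat dort.", "The cat is sleeping."),
    ("Le chat dort.", "The cat is asleep."),
]

probs = conditional_translation_probs(observations)
for (source, target), p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"Pr({target!r} | {source!r}) = {p:.2f}")
```

For a fixed S, the estimated values sum to one over the observed targets, which matches the reading of Pr(T|S) as a distribution over the translations a translator might produce.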

References

Howard's Way. The New York Times Magazine

  • R Bernstein
  • 1988

The Computational Analysis of English: A Corpus-Based Approach

  • R G Garside, G N Leech, G R Sampson
  • 1987

Lexicographic Evidence. Dictionaries, Lexicography and Language Learning

  • J M Sinclair
  • 1985

Hidden Markov Analysis: An Introduction

  • J D Ferguson
  • 1980

Interpolated Estimation of Markov Source Parameters from Sparse Data

  • F Jelinek, R L Mercer
  • 1980

Stochastic Modeling for Automatic Speech Understanding. Speech Recognition

  • J K Baker
  • 1979