A Log-Linear Model for Unsupervised Text Normalization

Yi Yang and Jacob Eisenstein
We present a unified unsupervised statistical model for text normalization. The relationship between standard and non-standard tokens is characterized by a log-linear model, permitting arbitrary features. The weights of these features are trained in a maximum-likelihood framework, employing a novel sequential Monte Carlo training algorithm to overcome the large label space, which would be impractical for traditional dynamic programming solutions. This model is implemented in a normalization…
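As a rough illustration of what "characterized by a log-linear model, permitting arbitrary features" can mean, the sketch below scores candidate standard tokens for one non-standard token as p(target | source) proportional to exp(w · f(source, target)). It is a minimal sketch under stated assumptions, not the authors' model: the feature function `features`, the weights, and the candidate list are all hypothetical, the weights are set by hand rather than learned by maximum likelihood, and the sequential Monte Carlo training mentioned in the abstract is not shown.

```python
import math

def features(source, target):
    """Hypothetical features of a (non-standard, standard) token pair.

    These are illustrative only; the abstract does not list the paper's features.
    """
    same_first = 1.0 if source[:1] == target[:1] else 0.0   # same first character
    length_gap = float(abs(len(source) - len(target)))      # difference in length
    shared = float(len(set(source) & set(target)))          # distinct shared characters
    return [same_first, length_gap, shared]

def log_linear_probs(source, candidates, weights):
    """Return p(target | source) proportional to exp(weights . features)."""
    scores = [
        sum(w * f for w, f in zip(weights, features(source, t)))
        for t in candidates
    ]
    top = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)                            # normalizer over the candidate set
    return {t: e / z for t, e in zip(candidates, exps)}

# Hand-picked weights (an assumption, not learned values).
weights = [1.0, -0.1, 0.5]
probs = log_linear_probs("tmrw", ["tomorrow", "tomato", "arrow"], weights)
best = max(probs, key=probs.get)
```

Because the normalizer sums over every candidate, an exact computation becomes expensive when the label space is large, which is the difficulty the abstract says the training algorithm is meant to overcome.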
Semantic Scholar estimates that this paper has 74 citations.

