• Publications
One billion word benchmark for measuring progress in statistical language modeling
TLDR
We propose a new benchmark corpus for measuring progress in statistical language modeling, intended for evaluating novel language modeling techniques and comparing their contribution when combined with other advanced techniques.
  • 757
  • 107
Structured language modeling
TLDR
This paper presents an attempt at using the syntactic structure in natural language for improved language models for speech recognition.
  • 306
  • 27
Adaptation of maximum entropy capitalizer: Little data can help a lot
TLDR
A novel technique for maximum “a posteriori” (MAP) adaptation of maximum entropy (MaxEnt) and maximum entropy Markov models (MEMM) is presented.
  • 248
  • 23
“Your Word is my Command”: Google Search by Voice: A Case Study
TLDR
An important goal at Google is to make spoken access ubiquitously available.
  • 202
  • 16
Exploiting Syntactic Structure for Natural Language Modeling
TLDR
The thesis presents an attempt at using the syntactic structure in natural language for improved language models for speech recognition using an original probabilistic parameterization of a shift-reduce parser.
  • 59
  • 13
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
TLDR
Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models.
  • 92
  • 11
Tagged Back-Translation
TLDR
We propose a simpler alternative to noised-beam decoding during back-translation, consisting of tagging back-translated source sentences with an extra token.
  • 69
  • 11
Exploiting Syntactic Structure for Language Modeling
TLDR
The paper presents a language model that develops syntactic structure and uses it to extract meaningful information from the word history, thus enabling the use of long distance dependencies.
  • 202
  • 10
Position Specific Posterior Lattices for Indexing Speech
TLDR
The paper presents the Position Specific Posterior Lattice, a novel representation of automatic speech recognition lattices that naturally lends itself to efficient indexing of position information and subsequent relevance ranking of spoken documents using proximity.
  • 91
  • 10
Recognition performance of a structured language model
TLDR
A new language model for speech recognition inspired by linguistic analysis is presented.
  • 51
  • 8