Lexicon Adaptation for Broadcast News Transcription

Nicola Bertoldi and Marcello Federico
This paper presents a technique for dynamically extending the language model lexicon of an Italian broadcast news transcription system. New words are selected day by day from contemporary news available on the Internet, according to a strategy that tries to minimize the out-of-vocabulary rate of the language model. Phonetic transcriptions of the new words are generated automatically with a software tool developed in-house. Experiments, performed with the ITC-irst 62K-word baseline system, show…
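To make the idea of out-of-vocabulary (OOV) driven word selection concrete, here is a minimal sketch. It is not the authors' selection strategy, which the abstract does not spell out; it assumes a simple frequency-based heuristic in which the most frequent unseen words from recent news text are added to the lexicon. The function names, the `budget` parameter, and the toy Italian data are all hypothetical.

```python
from collections import Counter


def oov_rate(tokens, lexicon):
    """Fraction of running tokens that are not covered by the lexicon."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in lexicon) / len(tokens)


def select_new_words(recent_tokens, lexicon, budget):
    """Assumed heuristic: pick the `budget` most frequent out-of-lexicon words."""
    counts = Counter(t for t in recent_tokens if t not in lexicon)
    return [word for word, _ in counts.most_common(budget)]


# Toy data, invented for illustration only.
lexicon = {"il", "governo", "ha", "approvato", "la", "legge"}
recent_news = (
    "il governo ha approvato la finanziaria la finanziaria "
    "passa al senato il senato vota"
).split()
todays_broadcast = "il senato ha approvato la finanziaria".split()

before = oov_rate(todays_broadcast, lexicon)
added = select_new_words(recent_news, lexicon, budget=2)
after = oov_rate(todays_broadcast, lexicon | set(added))

print(f"OOV rate before: {before:.3f}, after: {after:.3f}, added: {added}")
```

A real system would also need pronunciations and language model probabilities for the added words, which this sketch leaves out.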