• Corpus ID: 238857036

Continual learning using lattice-free MMI for speech recognition

Hossein Hadian, Arseniy Gorin
Continual learning (CL), or domain expansion, recently became a popular topic for automatic speech recognition (ASR) acoustic modeling because practical systems have to be updated frequently in order to work robustly on types of speech not observed during initial training. While sequential adaptation allows tuning a system to a new domain, it may result in performance degradation on the old domains due to catastrophic forgetting. In this work we explore regularization-based CL for neural… 
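Regularization-based CL augments the new-domain training loss with a penalty that keeps parameters close to the values learned on the old domains. A minimal sketch of this general form (the simple L2 anchor and all names here are illustrative, not the paper's exact LF-MMI formulation):

```python
import numpy as np

def cl_regularized_loss(new_domain_loss, theta, theta_old, lam=1.0):
    """Generic regularization-based CL objective: the loss on the
    new domain plus an L2 penalty anchoring the parameters to the
    values learned on the old domain(s), weighted by lam."""
    theta = np.asarray(theta, dtype=float)
    theta_old = np.asarray(theta_old, dtype=float)
    penalty = np.sum((theta - theta_old) ** 2)
    return new_domain_loss + lam * penalty
```

With `lam = 0` this reduces to plain fine-tuning (maximal forgetting risk); larger `lam` trades new-domain fit for retention of the old domains.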

Domain Expansion in DNN-Based Acoustic Models for Robust Speech Recognition
This study explores several domain expansion techniques which exploit only the data of the new domain to build a stronger model for all domains and evaluates these techniques in an accent adaptation task in which a DNN acoustic model is adapted to three different English accents.
Continual Learning for Multi-Dialect Acoustic Models
This work demonstrates that by using loss functions that mitigate catastrophic forgetting, sequential transfer learning can be used to train multi-dialect acoustic models that narrow the WER gap between the best (combined training) and worst (fine-tuning) case by up to 65%.
Purely Sequence-Trained Neural Networks for ASR Based on Lattice-Free MMI
A method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training is described, using the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI.
Toward Domain-Invariant Speech Recognition via Large Scale Training
This work explores the idea of building a single domain-invariant model for varied use-cases by combining large-scale training data from multiple application domains, and shows that by using as little as 10 hours of data from a new domain, an adapted domain-invariant model can match the performance of a domain-specific model trained from scratch using 70 times as much data.
Continual Learning in Automatic Speech Recognition
This work emulates continual learning observed in real life, where new training data are used for gradual improvement of an Automatic Speech Recognizer trained on old domains, and appears to yield a slight advantage over offline multi-condition training.
Learning without Forgetting
  • Zhizhong Li, Derek Hoiem
  • Computer Science, Mathematics
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2018
This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaptation techniques.
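The core of Learning without Forgetting is a knowledge-distillation term: on new-task data, the new model is penalized for drifting from the old model's softened output distribution. A minimal NumPy sketch of such a distillation term (function names and the temperature value are illustrative):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def lwf_distillation_loss(new_logits, old_logits, T=2.0):
    """LwF-style distillation term: cross-entropy between the old
    model's softened outputs (targets) and the new model's softened
    outputs on the same input."""
    p_old = softmax(old_logits, T)
    p_new = softmax(new_logits, T)
    return -np.sum(p_old * np.log(p_new + 1e-12))
```

The loss is minimized (down to the entropy of the old model's distribution) when the new model reproduces the old model's outputs exactly, which is what preserves the original capabilities without access to old-task data.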
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network
SpeechStew is a speech recognition model that is trained on a combination of various publicly available speech recognition datasets: AMI, Broadcast News, Common Voice, LibriSpeech, Switchboard/Fisher, Tedlium, and Wall Street Journal, and it is demonstrated that SpeechStew learns powerful transfer learning representations.
Memory Efficient Experience Replay for Streaming Learning
It is found that full rehearsal can eliminate catastrophic forgetting in a variety of streaming learning settings, with ExStream performing well using far less memory and computation.
Overcoming catastrophic forgetting in neural networks
It is shown that it is possible to overcome this limitation of connectionist models and train networks that can maintain expertise on tasks they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
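The "selective slowing" described above is Elastic Weight Consolidation (EWC): a quadratic penalty weighted per-parameter by an importance estimate (typically the diagonal of the Fisher information). A minimal sketch, with the Fisher values assumed to be precomputed:

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """EWC penalty: (lam/2) * sum_i F_i * (theta_i - theta*_i)^2.
    Parameters with large Fisher values F_i (important for old tasks)
    are strongly anchored to their old values theta*_i; parameters
    with F_i near zero remain free to move for the new task."""
    theta = np.asarray(theta, dtype=float)
    theta_star = np.asarray(theta_star, dtype=float)
    fisher = np.asarray(fisher, dtype=float)
    return 0.5 * lam * np.sum(fisher * (theta - theta_star) ** 2)
```

During new-task training this penalty is simply added to the task loss, so the gradient pushes important weights back toward their old values while leaving unimportant ones nearly unconstrained.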
Continual Lifelong Learning with Neural Networks: A Review
This review critically summarizes the main challenges linked to lifelong learning for artificial learning systems and compares existing neural network approaches that alleviate, to different extents, catastrophic forgetting.