Dual Language Models for Code Mixed Speech Recognition

@inproceedings{Garg2018DualLM,
  title={Dual Language Models for Code Mixed Speech Recognition},
  author={S. Garg and Tanmay Parekh and Preethi Jyothi},
  booktitle={INTERSPEECH},
  year={2018}
}
In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text. Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models. We propose a novel technique called dual language models, which involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two. We… 
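
The abstract is truncated here, so the authors' exact construction is not shown. As a rough illustration of the general idea only (two monolingual language models joined by a probabilistic switch), the following toy sketch may help. Everything in it is an assumption made for illustration: the add-alpha bigram models, the single fixed `p_switch` parameter, the restart from `<s>` after a switch, and the tiny made-up training sentences.

```python
import math
from collections import defaultdict


def train_bigram(sentences, vocab, alpha=1.0):
    """Return prob(word, prev): an add-alpha smoothed bigram model.

    Outcomes are the words in `vocab` plus the end token "</s>".
    """
    n_outcomes = len(vocab) + 1
    counts = defaultdict(lambda: defaultdict(float))
    for sent in sentences:
        prev = "<s>"
        for word in list(sent) + ["</s>"]:
            counts[prev][word] += 1.0
            prev = word

    def prob(word, prev):
        row = counts.get(prev, {})
        total = sum(row.values()) + alpha * n_outcomes
        return (row.get(word, 0.0) + alpha) / total

    return prob


def dual_lm_logprob(sentence, lang_of, models, p_switch=0.2):
    """Log-probability of a code-switched sentence under the toy dual model.

    Assumption (not from the paper): a fixed switch probability, and the
    newly entered language's model restarts from "<s>" after each switch.
    The mass for "switch, then end immediately" is left unassigned, so this
    sketch is not exactly normalized.
    """
    if not sentence:
        raise ValueError("sentence must contain at least one word")
    logp = 0.0
    prev, cur = "<s>", None
    for word in list(sentence) + ["</s>"]:
        # The end token is scored by the language we are currently in.
        lang = cur if word == "</s>" else lang_of[word]
        if cur is None:      # first word: choose a starting language uniformly
            p = (1.0 / len(models)) * models[lang](word, "<s>")
        elif lang == cur:    # stay in the same language
            p = (1.0 - p_switch) * models[lang](word, prev)
        else:                # switch: restart the other model from "<s>"
            p = p_switch * models[lang](word, "<s>")
        logp += math.log(p)
        prev, cur = word, lang
    return logp


# Hypothetical toy corpora for two languages labeled "A" and "B".
lang_a = [["i", "want", "tea"], ["i", "want", "tea", "now"]]
lang_b = [["mujhe", "chai", "chahiye"], ["abhi", "chai", "chahiye"]]
vocab_a = {w for s in lang_a for w in s}
vocab_b = {w for s in lang_b for w in s}
lang_of = {w: "A" for w in vocab_a}
lang_of.update({w: "B" for w in vocab_b})
models = {"A": train_bigram(lang_a, vocab_a), "B": train_bigram(lang_b, vocab_b)}

mixed = ["i", "want", "chai", "chahiye"]
score = dual_lm_logprob(mixed, lang_of, models, p_switch=0.2)
print(f"log P(mixed sentence) = {score:.3f}")
```

In this sketch each monolingual model only ever sees contexts from its own language, which is the appeal of the approach: the two models can be trained on monolingual text, and only the switching behavior needs code-switched data.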

Language Informed Modeling of Code-Switched Text
TLDR
It is hypothesized that encoding language information strengthens a language model by helping it learn code-switching points, and it is demonstrated that the highest-performing model achieves a test perplexity of 19.52 on the CS corpus that was collected and processed.
Translating Code-Switched Texts From Bilingual Speakers
TLDR
This project aims to create a model best suited for code-switching translation tasks. It shows that the LID-Translation Model Pipeline performs better than a direct translation pipeline, and that the Direct-Translation Bilingual Model performs better than regular translation models fine-tuned on non-bilingual datasets.
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training
TLDR
An ASR-motivated evaluation setup is proposed which is decoupled from an ASR system and the choice of vocabulary, and this setup lends itself to a discriminative training approach, which is demonstrated to work better than generative language modeling.
Machine Translation on a Parallel Code-Switched Corpus
TLDR
Several methods for translating a code-switched corpus are examined: conventional statistical machine translation, end-to-end neural machine translation, and multitask learning, which the author proposes as a unique approach.
Code-Switching Language Modeling with Bilingual Word Embeddings: A Case Study for Egyptian Arabic-English
TLDR
While all representations improve the CS LM, this work proposes an innovative but simple approach that jointly learns bilingual word representations without the use of any parallel data, relying only on monolingual data and a small amount of CS data.
Code-switched Language Models Using Dual RNNs and Same-Source Pretraining
TLDR
A novel recurrent neural network unit with dual components that focus on each language in the code-switched text separately is proposed, along with pretraining the LM using synthetic text from a generative model estimated using the training data.
Code-switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation
TLDR
By utilizing a generative adversarial network, an unsupervised method is proposed that can generate intra-sentential code-switching sentences from monolingual sentences, and it is shown that the generated code-switching sentences improve the performance of code-switched language models.
Training Code-Switching Language Model with Monolingual Data
TLDR
This paper constrains and normalizes the output projection matrix in an RNN-based language model so that the embeddings of different languages are close to each other, and uses unsupervised bilingual word translation to analyze whether semantically equivalent words in different languages are mapped together.
