Corpus ID: 239016668

Intent Classification Using Pre-Trained Embeddings For Low Resource Languages

@article{Yadav2021IntentCU,
  title={Intent Classification Using Pre-Trained Embeddings For Low Resource Languages},
  author={Hemant Yadav and Akshat Gupta and Sai Krishna Rallabandi and Alan W. Black and Rajiv Ratn Shah},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.09264}
}
Building Spoken Language Understanding (SLU) systems that do not rely on language specific Automatic Speech Recognition (ASR) is an important yet less explored problem in language processing. In this paper, we present a comparative study aimed at employing a pre-trained acoustic model to perform SLU in low resource scenarios. Specifically, we use three different embeddings extracted using Allosaurus, a pre-trained universal phone decoder: (1) Phone (2) Panphone, and (3) Allo embeddings. These… 
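The abstract describes classifying intents from the output of a universal phone recognizer rather than a language-specific ASR system. The general idea can be sketched with a toy example; everything below is illustrative only. The phone strings, intent labels, and the nearest-centroid classifier over phone-bigram counts are assumptions for demonstration, not the authors' models, embeddings, or data.

```python
from collections import Counter

# Hypothetical phone-sequence utterances (e.g., as might be produced by a
# universal phone recognizer such as Allosaurus), paired with intent labels.
train = [
    ("k a n c e l", "cancel"),
    ("k a n s l̩", "cancel"),
    ("b a l a n s", "balance"),
    ("b æ l ə n s", "balance"),
]

def bigram_counts(phones):
    """Bag-of-phone-bigrams feature vector for a space-separated phone string."""
    toks = phones.split()
    return Counter(zip(toks, toks[1:]))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

# Build one centroid (summed bigram counts) per intent.
centroids = {}
for phones, intent in train:
    centroids.setdefault(intent, Counter()).update(bigram_counts(phones))

def classify(phones):
    """Assign the intent whose centroid is most similar to the utterance."""
    feats = bigram_counts(phones)
    return max(centroids, key=lambda i: cosine(feats, centroids[i]))

print(classify("k a n s e l"))  # → cancel
```

Because the features are phone n-grams rather than words, a classifier like this needs no language-specific lexicon, which is the appeal of phone-based pipelines for low-resource languages; the paper's actual systems use learned embeddings and trained classifiers rather than this hand-rolled similarity.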

Citations

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
TLDR
This paper fills that gap with a comprehensive literature survey of the publicly available Sinhala natural language tools and research, so that researchers working in this field can better utilize their peers' contributions.

References

Showing 1-10 of 13 references
Universal Phone Recognition with a Multilingual Allophone System
TLDR
This work proposes a joint model of both language-independent phone and language-dependent phoneme distributions that can build a (nearly) universal phone recognizer which, when combined with PHOIBLE, a large, manually curated database of phone inventories, can be customized into 2,000 language-dependent recognizers.
Sinhala and Tamil Speech Intent Identification From English Phoneme Based ASR
TLDR
This paper uses a pre-trained English ASR model to generate phoneme probability features, and uses these features to identify the intents of utterances expressed in Sinhala and Tamil, for which only a rather small speech dataset is available.
Speech-Language Pre-Training for End-to-End Spoken Language Understanding
  • Yao Qian, Ximo Bian, Michael Zeng
  • Computer Science
    ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
TLDR
The proposed unified speech-language pre-trained model (SLP) is continually enhanced on limited labeled data from a target domain using a conditional masked language model (MLM) objective, and can thus effectively generate a sequence of intent, slot type, and slot value for a given input speech at inference time.
Speech Model Pre-training for End-to-End Spoken Language Understanding
TLDR
A method is proposed to reduce the data requirements of end-to-end SLU by first pre-training the model to predict words and phonemes, thus learning good features for SLU; it improves performance both when the full dataset is used for training and when only a small subset is used.
Deep Speech: Scaling up end-to-end speech recognition
TLDR
Deep Speech, a state-of-the-art speech recognition system developed using end-to-end deep learning, outperforms previously published results on the widely studied Switchboard Hub5'00, achieving 16.0% error on the full test set.
Domain Specific Intent Classification of Sinhala Speech Data
TLDR
The proposed solution is the first of its kind for domain-specific intent classification for the Sinhala language, utilizing a feed-forward neural network with backpropagation; system performance is evaluated using the recognition accuracy of the speech queries.
PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors
TLDR
PanPhon, a database relating over 5,000 IPA segments to 21 subsegmental articulatory features, is presented; its phonological features are shown to outperform character-based models and to boost performance in various NER-related tasks.
Mere account mein kitna balance hai? - On building voice enabled Banking Services for Multilingual Communities
TLDR
This work investigates various training strategies for building speech based intent recognition systems and presents the results using a Naive Bayes classifier on approximate acoustic phone units using the Allosaurus library.
Intent Recognition and Unsupervised Slot Identification for Low-Resourced Spoken Dialog Systems
TLDR
An acoustics-based SLU system is presented that converts speech to a phonetic transcription using a universal phone recognition system, together with a word-free natural language understanding module that performs intent recognition and slot identification on these phonetic transcriptions.
Acoustics Based Intent Recognition Using Discovered Phonetic Units for Low Resource Languages
TLDR
A novel acoustics based intent recognition system that uses discovered phonetic units for intent classification and performs multilingual training of the intent classifier and shows improved cross-lingual transfer and zero-shot performance on an unknown language within the same language family.