Constructing sub-word units for spoken term detection
@article{Heerden2017ConstructingSU, title={Constructing sub-word units for spoken term detection}, author={Charl Johannes van Heerden and Damianos G. Karakos and Karthik Narasimhan and Marelie Hattingh Davel and Richard M. Schwartz}, journal={2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, year={2017}, pages={5780-5784} }
Spoken term detection, especially of out-of-vocabulary (OOV) keywords, benefits from the use of sub-word systems. We experiment with different language-independent approaches to sub-word unit generation, generating both syllable-like and morpheme-like units, and demonstrate how the performance of syllable-like units can be improved by artificially increasing the number of unique units. The effect of unit choice is empirically evaluated using the eight languages from the 2016 IARPA BABEL…
14 Citations
On the Use of Grapheme Models for Searching in Large Spoken Archives
- Linguistics
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
This paper explores the possibility of using grapheme-based word and sub-word models for spoken term detection (STD) and achieves STD performance comparable to that of phoneme-based models, without the additional burden of G2P conversion.
Induced Inflection-Set Keyword Search in Speech
- Economics
- SIGMORPHON
- 2020
This work provides a recipe and evaluation set for the community to use as an extrinsic measure of the performance of inflection generation approaches and indicates how lexeme-set search performance changes with the number of hypothesized inflections.
Spoken Term Detection and Relevance Score Estimation Using Dot-Product of Pronunciation Embeddings
- Computer Science
- Interspeech
- 2021
A novel approach to spoken term detection (STD) in large spoken archives using deep LSTM networks. It builds on the previous approach of using Siamese neural networks for STD and naturally extends it to directly localize a spoken term and estimate its relevance score.
Deep LSTM Spoken Term Detection using Wav2Vec 2.0 Recognizer
- Computer Science
- INTERSPEECH
- 2022
A bootstrapping approach that allows the transfer of the knowledge contained in traditional pronunciation vocabulary of DNN-HMM hybrid ASR into the context of grapheme-based Wav2Vec in the task of spoken term detection over a large set of spoken documents is described.
ALBAYZIN 2018 spoken term detection evaluation: a multi-domain international evaluation in Spanish
- Computer Science
- EURASIP J. Audio Speech Music. Process.
- 2019
The results suggest that STD is still an unsolved task and that performance is highly sensitive to changes in the data domain.
SPEECH KEYWORD SPOTTING SYSTEM
- Education, Economics
- 2017
In this paper we describe the 2016 BBN conversational telephone speech keyword spotting system, the culmination of four years of research and development under the IARPA Babel program. The system was…
ODSQA: Open-Domain Spoken Question Answering Dataset
- Computer Science
- 2018 IEEE Spoken Language Technology Workshop (SLT)
- 2018
This paper releases Open-Domain Spoken Question Answering Dataset (ODSQA), the largest real SQA dataset, and finds that ASR errors have catastrophic impact on SQA, and that data augmentation on text-based QA training examples can improve SQA.
The 2016 BBN Georgian telephone speech keyword spotting system
- Education, Economics
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2017
This paper describes the 2016 BBN conversational telephone speech keyword spotting system, the culmination of four years of research and development under the IARPA Babel program, and presents the technological breakthroughs in building top-performing keyword spotting processing systems for new languages.
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation
- Computer Science
- ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
This work proposes to mitigate the ASR errors by aligning the mismatch between ASR hypotheses and their corresponding reference transcriptions by applying an adversarial model to this domain adaptation task.
References
Cross-word sub-word units for low-resource keyword spotting
- Economics, Education
- SLTU
- 2014
This work investigates the use of sub-word lexical units for the detection of out-of-vocabulary (OOV) keywords in the keyword spotting task and demonstrates that cross-word sub-word units achieve similar performance on OOV keywords to other types of sub-word units, but can be combined with them to produce further gains.
Using Pronunciation-Based Morphological Subword Units to Improve OOV Handling in Keyword Search
- Linguistics
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2016
This paper systematically investigates morphology-based subword modeling approaches on seven low-resource languages and shows that using morphological subword units (morphs) in speech recognition decoding is substantially better than expanding word-decoded lattices into subword units, including phones, syllables and morphs.
Towards using hybrid word and fragment units for vocabulary independent LVCSR systems
- Computer Science
- INTERSPEECH
- 2009
It is shown that a hybrid system which combines words and data-driven, variable-length sub-word units has better phone accuracy than word-only systems and is better at detecting out-of-vocabulary (OOV) terms and representing them phonetically.
Comparing decoding strategies for subword-based keyword spotting in low-resourced languages
- Education, Computer Science
- INTERSPEECH
- 2014
This paper investigates the use of subword lexical units for keyword spotting and finds that ignoring word boundaries improves the detection of OOV keywords without significantly impacting in-vocabulary keyword detection.
Subword and phonetic search for detecting out-of-vocabulary keywords
- Computer Science
- INTERSPEECH
- 2014
The syllable units are the best of the subword units for OOV keyword detection using fuzzy phonetic search, and these methods combine very well, sometimes resulting in ATWV scores for OOV terms which are not too far below those of IV terms.
A new method for OOV detection using hybrid word/fragment system
- Computer Science
- 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2009
A new method for detecting regions with out-of-vocabulary words in the output of a large vocabulary continuous speech recognition (LVCSR) system that outperforms existing methods published in the literature.
Subword speech recognition for detection of unseen words
- Computer Science
- INTERSPEECH
- 2012
Experiments show that the proposed subword recognizer outperforms other subword systems in terms of phonetic keyword search accuracy measured on queries that consist of words not present in the training data.
Improvements on transducing syllable lattice to word lattice for keyword search
- Linguistics
- 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2015
This paper presents a weighted finite state transducer (WFST) based syllable decoding and transduction method for keyword search (KWS) and compares it in detail with sub-word search and phone confusion methods.
Analysis of keyword spotting performance across IARPA babel languages
- Computer Science, Education
- 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2017
This work demonstrates that ATWV is keyword dependent, and that this must be accounted for in any cross-language analysis, and shows that while performance across languages does not track with any particular feature of the language, it is correlated with inter-annotator agreement.