ODSQA: Open-Domain Spoken Question Answering Dataset
@article{Lee2018ODSQAOS,
  title={ODSQA: Open-Domain Spoken Question Answering Dataset},
  author={Chia-Hsuan Lee and Shang-Ming Wang and Huan-Cheng Chang and Hung-yi Lee},
  journal={2018 IEEE Spoken Language Technology Workshop (SLT)},
  year={2018},
  pages={949-956}
}
Reading comprehension by machine has been widely studied, but machine comprehension of spoken content is still a less investigated problem. In this paper, we release the Open-Domain Spoken Question Answering Dataset (ODSQA) with more than three thousand questions. To the best of our knowledge, this is the largest real SQA dataset. On this dataset, we found that ASR errors have a catastrophic impact on SQA. To mitigate the effect of ASR errors, subword units are involved, which brings consistent…
40 Citations
Knowledge Distillation for Improved Accuracy in Spoken Question Answering
- Computer Science · ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2021
This work devises a training strategy to perform knowledge distillation (KD) from spoken documents and written counterparts to improve the performance of the student model by reducing the misalignment between automatic and manual transcripts.
Improving Spoken Question Answering Using Contextualized Word Representation
- Computer Science · ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2020
This paper proposes using contextualized word representations to mitigate the effects of ASR errors and pretraining on existing textual QA datasets to mitigate the data scarcity issue.
DUAL: Discrete Spoken Unit Adaptive Learning for Textless Spoken Question Answering
- Computer Science · INTERSPEECH
- 2022
Discrete Spoken Unit Adaptive Learning (DUAL) is proposed, leveraging unlabeled data for pre-training and being fine-tuned on the SQA downstream task; empirically, it yields results comparable to those obtained by cascading ASR and a text QA model and is robust to real-world data.
MRD-Net: Multi-Modal Residual Knowledge Distillation for Spoken Question Answering
- Computer Science · IJCAI
- 2021
A novel multi-modal residual knowledge distillation method (MRD-Net) is proposed, which further distills knowledge at the acoustic level from the audio-assistant (Audio-A) and uses a simple yet effective attention mechanism to adaptively leverage audio-text features as the new deep attention knowledge to boost the network performance.
Mitigating Noisy Inputs for Question Answering
- Computer Science · INTERSPEECH
- 2019
This work investigates and mitigates the effects of noise from Automatic Speech Recognition systems on two factoid Question Answering (QA) tasks; the mitigation techniques are empirically shown to improve the accuracy of downstream neural QA systems.
End-to-end Spoken Conversational Question Answering: Task, Dataset and Model
- Computer Science · NAACL-HLT
- 2022
A novel data distillation approach, DDNet, is proposed, which effectively ingests cross-modal information to achieve fine-grained representations of the speech and language modalities to ease the process of knowledge transfer.
Mitigating the Impact of Speech Recognition Errors on Spoken Question Answering by Adversarial Domain Adaptation
- Computer Science · ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
This work proposes to mitigate the ASR errors by aligning the mismatch between ASR hypotheses and their corresponding reference transcriptions by applying an adversarial model to this domain adaptation task.
Contextualized Attention-based Knowledge Transfer for Spoken Conversational Question Answering
- Computer Science · Interspeech
- 2021
CADNet is proposed, a novel contextualized attention-based distillation approach, which applies both cross-attention and self-attention to obtain ASR-robust contextualized embedding representations of the passage and dialogue history for performance improvements on SCQA.
A Review on Techniques for Improving the Performances of Spoken Question Answering Systems
- Computer Science
- 2021
Various techniques to mitigate the effects of ASR errors, and to increase the accuracy of the predicted answers are discussed.
An Initial Investigation of Non-Native Spoken Question-Answering
- Computer Science · ArXiv
- 2021
It is found that there is an approximately linear relationship between ASR errors and the SQA assessment scores but grammar mismatches have minimal impact.
References
Reading Wikipedia to Answer Open-Domain Questions
- Computer Science · ACL
- 2017
This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs, indicating that both modules are highly competitive with respect to existing counterparts.
Learning to Paraphrase for Question Answering
- Computer Science · EMNLP
- 2017
This paper presents a general framework which learns felicitous paraphrases for various QA tasks and shows that this framework consistently improves performance, achieving competitive results despite the use of simple QA models.
SQuAD: 100,000+ Questions for Machine Comprehension of Text
- Computer Science · EMNLP
- 2016
A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).
NewsQA: A Machine Comprehension Dataset
- Computer Science · Rep4NLP@ACL
- 2017
NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs, is presented and analysis confirms that NewsQA demands abilities beyond simple word matching and recognizing textual entailment.
DRCD: a Chinese Machine Reading Comprehension Dataset
- Computer Science · ArXiv
- 2018
DRCD (Delta Reading Comprehension Dataset), an open domain traditional Chinese machine reading comprehension (MRC) dataset, is introduced, which can be a source dataset in transfer learning.
Supervised and Unsupervised Transfer Learning for Question Answering
- Computer Science · NAACL
- 2018
The performance of both models on a TOEFL listening comprehension test and MCTest is significantly improved via a simple transfer learning technique from MovieQA, which achieves the state-of-the-art on all target datasets.
Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension
- Computer Science · INTERSPEECH
- 2018
On the new listening comprehension task, it is found that speech recognition errors have catastrophic impact on machine comprehension, and several approaches are proposed to mitigate the impact.
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text
- Computer Science · EMNLP
- 2013
MCTest is presented, a freely available set of stories and associated questions intended for research on the machine comprehension of text that requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension.
Factoid Question Answering for Spoken Documents
- Computer Science
- 2012
This work explores, for the first time, which techniques can be robustly adapted from the usual QA on written documents to the more difficult spoken documents scenario, studies new information retrieval (IR) techniques designed for speech, and utilizes several levels of linguistic information for the speech-based QA task.
(Almost) Zero-Shot Cross-Lingual Spoken Language Understanding
- Computer Science, Linguistics · 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2018
Different approaches to train a SLU component with little supervision for two new languages - Hindi and Turkish are examined, and it is shown that with only a few hundred labeled examples the authors can surpass the approaches proposed in the literature.