The TechQA Dataset

@article{Castelli2020TheTD,
  title={The TechQA Dataset},
  author={V. Castelli and Rishav Chakravarti and Saswati Dana and Anthony Ferritto and Radu Florian and M. Franz and Dinesh Garg and Dinesh Khandelwal and J. Scott McCarley and Mike McCawley and Mohamed Nasr and Lin Pan and Cezar Pendus and J. Pitrelli and Saurabh Pujar and S. Roukos and Andrzej Sakrajda and Avirup Sil and Rosario A. Uceda-Sosa and T. Ward and Rong Zhang},
  journal={ArXiv},
  year={2020},
  volume={abs/1911.02984}
}

We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 dev, and 490 evaluation question/answer pairs -- thus reflecting the cost of creating large labeled datasets with actual data.

Paper Mentions

Technical Question Answering across Tasks and Domains
A Neural Question Answering System for Basic Questions about Subroutines
Towards building a Robust Industry-scale Question Answering System
A Technical Question Answering System with Transfer Learning
Multi-Stage Pretraining for Low-Resource Domain Adaptation
