Multi-class Hierarchical Question Classification for Multiple Choice Science Exams

@article{Xu2019MulticlassHQ,
  title={Multi-class Hierarchical Question Classification for Multiple Choice Science Exams},
  author={Dongfang Xu and Peter Jansen and Jaycie Martin and Zhengnan Xie and Vikas Yadav and Harish Tayyar Madabushi and Oyvind Tafjord and Peter Clark},
  journal={ArXiv},
  year={2019},
  volume={abs/1908.05441}
}
Prior work has demonstrated that question classification (QC), recognizing the problem domain of a question, can help answer it more accurately. However, developing strong QC algorithms has been hindered by the limited size and complexity of annotated data available. To address this, we present the largest challenge dataset for QC, containing 7,787 science exam questions paired with detailed classification labels from a fine-grained hierarchical taxonomy of 406 problem domains. We then show…


Key Quantitative Results

  • We empirically demonstrate large performance gains of +0.12 MAP (+13.5% P@1) on science exam question classification using a BERT-based model over five previous state-of-the-art methods, while improving performance on two biomedical question datasets by 4-5%.
  • BERT-QC reaches 84.9% accuracy on this dataset, an increase of +4.5% over the best previous model.
  • In this work we generate the most fine-grained challenge dataset for question classification, using complex and syntactically diverse questions, and show gains of up to 12% are possible with our question classification model across datasets in open, science, and medical domains.
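The MAP and P@1 figures quoted above are standard ranking metrics: P@1 is the fraction of questions whose top-ranked label is correct, and MAP averages the per-question average precision over the ranked label list. A minimal sketch of how such scores are computed (function names are illustrative, not taken from the paper's code):

```python
def precision_at_1(ranked_labels, gold_labels):
    """True if the top-ranked predicted label is a gold label."""
    return ranked_labels[0] in gold_labels

def average_precision(ranked_labels, gold_labels):
    """Average precision of one ranked label list against a gold set."""
    hits, score = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label in gold_labels:
            hits += 1
            score += hits / rank  # precision at this correct hit
    return score / max(len(gold_labels), 1)

def evaluate(all_rankings, all_gold):
    """Corpus-level MAP and P@1 over parallel lists of predictions/golds."""
    n = len(all_gold)
    map_score = sum(average_precision(r, g)
                    for r, g in zip(all_rankings, all_gold)) / n
    p_at_1 = sum(precision_at_1(r, g)
                 for r, g in zip(all_rankings, all_gold)) / n
    return map_score, p_at_1
```

For example, if one question's gold label is ranked second and another's is ranked first, this yields MAP = 0.75 and P@1 = 0.5.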

