Neural Domain Adaptation for Biomedical Question Answering

Georg Wiese, Dirk Weissenborn, Mariana Neves
Factoid question answering (QA) has recently benefited from the development of deep learning (DL) systems. Neural network models outperform traditional approaches in domains where large datasets exist, such as SQuAD (ca. 100,000 questions) for Wikipedia articles. However, these systems have not yet been applied to QA in more specific domains, such as biomedicine, because datasets are generally too small to train a DL system from scratch. For example, the BioASQ dataset for biomedical QA… 


How Context or Knowledge Can Benefit Healthcare Question Answering?

A new joint model is developed to incorporate both context and knowledge embeddings into neural ranking architectures, achieving state-of-the-art performance on both the HealthQA and NFCorpus datasets.

External features enriched model for biomedical question answering

Experiments show that external lexical and syntactic features can improve a pre-trained language model's performance on the biomedical question answering task.

Hierarchical Question-Aware Context Learning with Augmented Data for Biomedical Question Answering

This paper proposes a Hierarchical Question-Aware Context Learning (HQACL) model for the biomedical QA task, built from multi-level attention, which outperforms the best recent solution and achieves a new state of the art.

Semantically Corroborating Neural Attention for Biomedical Question Answering

This paper addresses the problem in the context of factoid and summarization question types, using a variety of deep learning and semantic methods, including various architectures, transfer learning, biomedical named entity recognition and corroboration of semantic evidence.

Transferability of Natural Language Inference to Biomedical Question Answering

This paper observes that BioBERT trained on an NLI dataset obtains better performance on yes/no, factoid, and list type questions than in a previous challenge, and presents a sequential transfer learning method that performed strongly in the 8th BioASQ Challenge (Phase B).

CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering

This work proposes a simple yet effective framework, CliniQG4QA, which leverages question generation (QG) to synthesize QA pairs on new clinical contexts and boost QA models without requiring manual annotations. It also introduces a seq2seq-based question phrase prediction (QPP) module that can be combined with most existing QG models to diversify the generated questions.

Pre-trained Language Model for Biomedical Question Answering

This paper investigates the performance of BioBERT, a pre-trained biomedical language model, in answering biomedical questions including factoid, list, and yes/no type questions.

Factoid Question Answering with Distant Supervision

Experimental results show that a model trained solely on data generated via distant supervision and mined paraphrases can answer real-world questions with an accuracy of 49.34%.

Biomedical Question Answering: A Comprehensive Review

This work comprehensively investigates prior BQA approaches, classifying them into six major methodologies (open-domain, knowledge base, information retrieval, machine reading comprehension, question entailment, and visual QA), four content topics (scientific, clinical, consumer health, and examination), and five answer formats (yes/no, extraction, generation, multi-choice, and retrieval).

Improving End-to-End Biomedical Question Answering System

This paper employs a BM25-based document retriever, a BERT-based neural ranker, and an answer extraction stage built on the BioBERT pre-trained language model to address the question answering task, achieving competitive results on BioASQ8b.



Learning to Answer Biomedical Questions: OAQA at BioASQ 4B

The system extends the Yang et al. (2015) system and integrates additional biomedical and general-purpose NLP annotators, machine learning modules for search result scoring, collective answer reranking, and yes/no answer prediction.

Making Neural QA as Simple as Possible but not Simpler

This work proposes a simple heuristic that guides the development of neural baseline systems for the extractive QA task, and finds two ingredients necessary for a high-performing neural QA system: awareness of question words while processing the context, and a composition function, such as a recurrent neural network, that goes beyond simple bag-of-words modeling.

An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition

Overall, BioASQ helped obtain a unified view of how techniques from text classification, semantic indexing, document and passage retrieval, question answering, and text summarization can be combined to allow biomedical experts to obtain concise, user-understandable answers to questions reflecting their real information needs.

Machine Comprehension Using Match-LSTM and Answer Pointer

This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015), to constrain the output tokens to come from the input sequences.

Dynamic Coattention Networks For Question Answering

The Dynamic Coattention Network (DCN) for question answering first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both, then a dynamic pointing decoder iterates over potential answer spans to recover from initial local maxima corresponding to incorrect answers.

SQuAD: 100,000+ Questions for Machine Comprehension of Text

A strong logistic regression model is built, which achieves an F1 score of 51.0%, a significant improvement over a simple baseline (20%).

Representation Stability as a Regularizer for Improved Text Analytics Transfer Learning

Surprisingly, it is found that first distilling a human-made rule-based sentiment engine into a recurrent neural network, and then integrating that knowledge with the target task data, leads to a substantial gain in generalization performance.

Domain-Adversarial Training of Neural Networks

A new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions; it can be implemented in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer.
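The gradient reversal layer that this approach introduces is simple to state: it acts as the identity in the forward pass and multiplies the gradient by a negative factor in the backward pass, so the feature extractor learns to confuse the domain classifier. A minimal manual-backprop sketch (the class and parameter names are illustrative, not taken from the paper's code):

```python
import numpy as np

class GradientReversal:
    """Identity on the forward pass; scales the incoming gradient by
    -lam on the backward pass, as in domain-adversarial training."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        # Features pass through unchanged.
        return x

    def backward(self, grad_output):
        # Reversed gradient pushes the feature extractor to *maximize*
        # the domain classifier's loss, encouraging domain-invariant features.
        return -self.lam * grad_output

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
assert np.allclose(grl.forward(x), x)
assert np.allclose(grl.backward(np.ones(3)), -0.5 * np.ones(3))
```

In a full model this layer sits between the shared feature extractor and the domain classifier, while the task head receives the unreversed gradients.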

Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification

This work extends to sentiment classification the recently-proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline.

Bidirectional Attention Flow for Machine Comprehension

The BiDAF network is introduced: a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bidirectional attention flow mechanism to obtain a query-aware context representation without early summarization.
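The bidirectional attention idea can be sketched with NumPy: context-to-query attention lets each context position attend over query words, while query-to-context attention highlights the context words most relevant to any query word; both are fused into a query-aware representation. This is a simplified illustration (dot-product similarity stands in for the paper's trainable similarity function):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U):
    """H: (T, d) context encodings, U: (J, d) query encodings.
    Returns a (T, 4d) query-aware context representation."""
    S = H @ U.T                        # (T, J) similarity matrix
    # Context-to-query: each context position attends over query words.
    a = softmax(S, axis=1)             # (T, J)
    U_tilde = a @ U                    # (T, d) attended query vectors
    # Query-to-context: weight context words by their best query match.
    b = softmax(S.max(axis=1))         # (T,)
    h_tilde = b @ H                    # (d,) attended context vector
    H_tilde = np.tile(h_tilde, (H.shape[0], 1))  # broadcast over T
    # Fuse, as in the paper's G layer, without early summarization.
    return np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)

rng = np.random.default_rng(0)
G = bidaf_attention(rng.normal(size=(5, 8)), rng.normal(size=(3, 8)))
assert G.shape == (5, 32)
```

Note that the output keeps one vector per context position, so downstream layers can still point at individual tokens when extracting an answer span.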