WTMED at MEDIQA 2019: A Hybrid Approach to Biomedical Natural Language Inference

@inproceedings{Wu2019WTMEDAM,
  title={WTMED at MEDIQA 2019: A Hybrid Approach to Biomedical Natural Language Inference},
  author={Zhaofeng Wu and Yan Song and Sicong Huang and Yuanhe Tian and Fei Xia},
  booktitle={BioNLP@ACL},
  year={2019}
}
Natural language inference (NLI) is challenging, especially when it is applied to technical domains such as biomedicine. In this paper, we propose a hybrid approach to biomedical NLI in which different types of information are exploited for the task. Our base model includes a pre-trained text encoder as the core component, plus a syntax encoder and a feature encoder to capture syntactic and domain-specific information. We then combine the outputs of different base models to form more…
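The abstract names the base model's three components but the fusion step is truncated; the following is a minimal sketch in PyTorch, assuming the three encoder outputs are simply concatenated before a linear classifier. The stand-in encoders, dimensions, and fusion-by-concatenation are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

# Minimal sketch of the hybrid base model described in the abstract.
# Encoder choices, dimensions, and fusion by concatenation are assumptions.
class HybridNLIModel(nn.Module):
    def __init__(self, text_dim=768, syntax_dim=128, feature_dim=32, num_labels=3):
        super().__init__()
        # Stand-ins for the three components named in the abstract; in practice
        # the text encoder would be a pre-trained model such as a transformer.
        self.text_encoder = nn.Linear(text_dim, text_dim)
        self.syntax_encoder = nn.LSTM(syntax_dim, syntax_dim, batch_first=True)
        self.feature_encoder = nn.Linear(feature_dim, feature_dim)
        self.classifier = nn.Linear(text_dim + syntax_dim + feature_dim, num_labels)

    def forward(self, text_repr, syntax_seq, features):
        t = self.text_encoder(text_repr)               # (batch, text_dim)
        _, (h, _) = self.syntax_encoder(syntax_seq)    # final hidden state: (1, batch, syntax_dim)
        f = self.feature_encoder(features)             # (batch, feature_dim)
        fused = torch.cat([t, h.squeeze(0), f], dim=-1)  # concatenate the three views
        return self.classifier(fused)                  # entailment / neutral / contradiction logits

model = HybridNLIModel()
logits = model(torch.randn(2, 768), torch.randn(2, 10, 128), torch.randn(2, 32))
print(logits.shape)  # torch.Size([2, 3])

An ensemble, per the truncated final sentence, would then combine the logits of several such base models, e.g., by averaging or voting.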

Citations

Surf at MEDIQA 2019: Improving Performance of Natural Language Inference in the Clinical Domain by Adopting Pre-trained Language Model
TLDR
This work employs word/subword-level models that adopt large-scale data-driven methods, such as pre-trained language models and transfer learning, to analyze clinical-domain text, and demonstrates the superiority of the proposed methods by achieving 90.6% accuracy on the medical-domain natural language inference task.
Improving biomedical named entity recognition with syntactic information
TLDR
The experimental results on six English benchmark datasets demonstrate that automatically processed syntactic information can be a useful resource for BioNER, and that the proposed method with key-value memory networks (KVMN) can appropriately leverage such information to improve model performance.
Probing Pre-Trained Language Models for Disease Knowledge
TLDR
This paper introduces DisKnE, a new benchmark for Disease Knowledge Evaluation, in which each positive MedNLI example is annotated with the types of medical reasoning needed and negative examples are created by adversarially corrupting the positive ones.
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering
TLDR
The shared task is motivated by a need to develop relevant methods, techniques and gold standards for inference and entailment in the medical domain, and to apply them to improve domain-specific information retrieval and question answering systems.
Applying Natural Language Inference or Question Entailment for Crowdsourcing More Data
TLDR
A novel way to crowdsource data for patient question answering: recognizing question entailment (RQE) is used to locate answered questions similar to unanswered ones, and natural language inference (NLI) validates whether the similar questions' answers can be inferred from the corresponding unanswered questions.
Learning with Latent Structures in Natural Language Processing: A Survey
TLDR
This work surveys three main families of methods for learning with latent structures: surrogate gradients, continuous relaxation, and marginal likelihood maximization via sampling, which incorporate better inductive biases for improved end-task performance and better interpretability.
Improving Factual Completeness and Consistency of Image-to-Text Radiology Report Generation
TLDR
This work introduces two new simple rewards to encourage the generation of factually complete and consistent radiology reports: one that encourages the system to generate radiology domain entities consistent with the reference, and one that uses natural language inference to encourage these entities to be described in inferentially consistent ways.
Dependency-driven Relation Extraction with Attentive Graph Convolutional Networks
TLDR
In this approach, an attention mechanism over graph convolutional networks is applied to the contextual words in the dependency tree obtained from an off-the-shelf dependency parser, distinguishing the importance of different word dependencies.
A survey on textual entailment based question answering

References

SHOWING 1-10 OF 55 REFERENCES
Lessons from Natural Language Inference in the Clinical Domain
TLDR
This work introduces MedNLI, a dataset for natural language inference (NLI) annotated by doctors and grounded in the medical history of patients, and presents strategies for leveraging transfer learning with open-domain datasets and for incorporating domain knowledge from external data and lexical sources.
Incorporating Domain Knowledge into Natural Language Inference on Clinical Texts
TLDR
A new module that incorporates medical concept definitions into the classic enhanced sequential inference model (ESIM): for each word it first extracts the most relevant medical concept, if one exists, then encodes that concept's definition with a bidirectional long short-term memory network (BiLSTM) to obtain domain-specific definition representations, and finally attends these definition representations over the vanilla word embeddings.
Publicly Available Clinical BERT Embeddings
TLDR
This work explores and releases two BERT models for clinical text: one for generic clinical text and another for discharge summaries specifically, and demonstrates that using a domain-specific model yields performance improvements on 3/5 clinical NLP tasks, establishing a new state-of-the-art on the MedNLI dataset.
Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering
TLDR
The shared task is motivated by a need to develop relevant methods, techniques and gold standards for inference and entailment in the medical domain, and to apply them to improve domain-specific information retrieval and question answering systems.
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference
TLDR
A novel dependent reading bidirectional LSTM network (DR-BiLSTM) is proposed to efficiently model the relationship between a premise and a hypothesis during encoding and inference in the natural language inference (NLI) task.
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
TLDR
This article introduces BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), a domain-specific language representation model pre-trained on large-scale biomedical corpora that largely outperforms BERT and previous state-of-the-art models on a variety of biomedical text mining tasks.
Neural Natural Language Inference Models Enhanced with External Knowledge
TLDR
This paper enriches state-of-the-art neural natural language inference models with external knowledge and demonstrates that the proposed models achieve state-of-the-art performance on the SNLI and MultiNLI datasets.
ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
TLDR
ScispaCy, a new Python library and set of models for practical biomedical/scientific text processing that heavily leverages the spaCy library, is described; the paper details the performance of two packages of models released in scispaCy and demonstrates their robustness on several tasks and datasets.
Multi-Task Deep Neural Networks for Natural Language Understanding
TLDR
A Multi-Task Deep Neural Network (MT-DNN) for learning representations across multiple natural language understanding (NLU) tasks that allows domain adaptation with substantially fewer in-domain labels than the pre-trained BERT representations.
Enhanced LSTM for Natural Language Inference
TLDR
This paper presents a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference dataset, and demonstrates that carefully designed sequential inference models based on chain LSTMs can outperform all previous models.