Inferring Which Medical Treatments Work from Reports of Clinical Trials

@article{Lehman2019InferringWM,
  title={Inferring Which Medical Treatments Work from Reports of Clinical Trials},
  author={Eric P. Lehman and Jay DeYoung and Regina Barzilay and Byron C. Wallace},
  journal={ArXiv},
  year={2019},
  volume={abs/1904.01606}
}
How do we know if a particular medical treatment actually works? Ideally one would consult all available evidence from relevant clinical trials. Unfortunately, such results are primarily disseminated in natural language scientific articles, imposing substantial burden on those trying to make sense of them. In this paper, we present a new task and corpus for making this unstructured published scientific evidence actionable. The task entails inferring reported findings from a full-text article… 

Evidence Inference 2.0: More Data, Better Models
TLDR
Additional annotations are collected to expand the Evidence Inference dataset by 25%, provide stronger baseline models, systematically inspect the errors that these make, and probe dataset quality.
What Does the Evidence Say? Models to Help Make Sense of the Biomedical Literature
TLDR
Work is highlighted on developing tasks, corpora, and models to support semi-automated evidence retrieval and extraction that can consume articles describing clinical trials and automatically extract from these key clinical variables and findings, and estimate their reliability.
Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations
TLDR
This work considers the end-to-end task of extracting treatments and outcomes from full-text articles describing clinical trials and inferring the reported results for the former with respect to the latter, and proposes a new method motivated by how trial results are typically presented that outperforms these purely data-driven baselines.
Predicting Clinical Trial Results by Implicit Evidence Integration
TLDR
This work introduces a novel Clinical Trial Result Prediction (CTRP) task, pre-trains a model to predict the disentangled results from such implicit evidence, and fine-tunes the model with limited data on the downstream datasets.
Semi-Automated evidence synthesis in health psychology: current methods and future prospects
TLDR
How semi-automation via machine learning and natural language processing methods may help researchers and practitioners to review evidence more efficiently is discussed.
Generating (Factual?) Narrative Summaries of RCTs: Experiments with Neural Multi-Document Summarization
TLDR
This work evaluates modern neural models for abstractive summarization of relevant article abstracts from systematic reviews previously conducted by members of the Cochrane collaboration, and proposes a new method for automatically evaluating the factuality of generated narrative evidence syntheses using models that infer the directionality of reported findings.
Predicting Intervention Approval in Clinical Trials through Multi-Document Summarization
TLDR
This study proposes a new method to predict the effectiveness of an intervention in a clinical trial which relies on generating an informative summary from multiple documents available in the literature about the intervention under study.
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time
TLDR
Trialstreamer, a living database of clinical trial reports, is introduced, with the evidence extraction component described, which extracts from biomedical abstracts key pieces of information that clinicians need when appraising the literature, and also the relations between these.
A clinical trials corpus annotated with UMLS entities to enhance the access to evidence-based medicine
TLDR
This resource is adequate for experiments with state-of-the-art approaches to biomedical named entity recognition and is generalizable to other languages with similar available sources.

References

SHOWING 1-10 OF 29 REFERENCES
Automatic Summarization of Results from Clinical Trials
TLDR
A novel method for automatically creating EBM-oriented summaries from research abstracts of randomized controlled trials (RCTs) is presented, which extracts descriptions of the treatment groups and outcomes, as well as various associated quantities, and then calculates summary statistics.
Show Me Your Evidence - an Automatic Method for Context Dependent Evidence Detection
TLDR
This work proposes the task of automatically detecting evidence from unstructured text that supports a given claim and suggests a system architecture based on supervised learning to address the evidence detection task.
ExaCT: automatic extraction of clinical trial characteristics from journal publications
TLDR
An automatic information extraction system that assists users with locating and extracting key trial characteristics from full-text journal articles reporting on randomized controlled trials (RCTs) and can be extended to handle other characteristics and document types.
BioCause: Annotating and analysing causality in the biomedical domain
TLDR
Augmenting named entity and event annotations with information about causal discourse relations could benefit the development of more sophisticated IE systems and further influence the development of multiple tasks, such as enabling textual inference to detect entailments, discovering new facts and providing new hypotheses for experimental work.
Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide
TLDR
This work trained and optimized support vector machine and convolutional neural network models on the titles and abstracts of the Cochrane Crowd RCT set and evaluated the models on an external dataset (Clinical Hedges), allowing direct comparison with traditional database search filters.
Identifying Comparative Claim Sentences in Full-Text Scientific Articles
TLDR
A set of semantic and syntactic features that characterize a sentence are introduced and then it is demonstrated how those features can be used in three different classifiers: Naive Bayes (NB), a Support Vector Machine (SVM) and a Bayesian network (BN).
Distributional Semantics Resources for Biomedical Text Processing
TLDR
This study introduces the first set of such language resources created from analysis of the entire available biomedical literature, including a dataset of all 1- to 5-grams and their probabilities in these texts and new models of word semantics.
Question Answering in Webclopedia
TLDR
The QA Typology contains 94 nodes, of which 47 are leaf nodes; each Typology node has been annotated with examples and typical patterns of expression of both Question and Answer, as indicated in Figure 3.
A large annotated corpus for learning natural language inference
TLDR
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Coarse-to-Fine Question Answering for Long Documents
TLDR
A framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models is presented and sentence selection is treated as a latent variable trained jointly from the answer only using reinforcement learning.