• Corpus ID: 219636467

Information Extraction of Clinical Trial Eligibility Criteria

Yitong Tseo, Markku I. Salkola, Ahmed Ahmed Mohamed, Anuj Kumar, Freddy Abnousi

Clinical trials predicate subject eligibility on a diversity of criteria ranging from patient demographics to food allergies. Trials post their requirements as semantically complex, unstructured free text. Formalizing trial criteria to a computer-interpretable syntax would facilitate eligibility determination. In this paper, we investigate an information extraction (IE) approach for grounding criteria from trials in ClinicalTrials.gov to a shared knowledge base. We frame the problem as a…

ITTC @ TREC 2021 Clinical Trials Track

This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in

Clinical Trial Information Extraction with BERT

This work trained named entity recognition models to extract eligibility criteria entities by fine-tuning a set of pre-trained BERT models and compared the performance of CT-BERT with recent baseline methods including attention-based BiLSTM and Criteria2Query.

Attention-Based LSTM Network for COVID-19 Clinical Trial Parsing

This study applies a deep learning approach to extract eligibility criteria variables from COVID-19 trials to enable quantitative analysis of trial design and optimization and demonstrates that Att-BiLSTM is an effective approach for eligibility criteria parsing.

How the clinical research community responded to the COVID-19 pandemic: An analysis of the COVID-19 clinical studies in ClinicalTrials.gov

A careful examination of the registered COVID-19 clinical studies can identify the research gaps and inform future COVID-19 trial design towards balanced internal validity and generalizability.

Transformer-based named entity recognition for parsing clinical trial eligibility criteria

With promising NER results, further investigations on building a reliable natural language processing (NLP)-assisted pipeline for automated electronic screening are needed.

A review of research on eligibility criteria for clinical trials

The purpose of this paper is to systematically sort out and analyze the cutting-edge research on the eligibility criteria of clinical trials, and investigate the research status of eligibility criteria for clinical trials on academic platforms such as arXiv and NIH.

How can natural language processing help model informed drug development?: a review

Challenges such as reproducibility, explainability, fairness, limited data, limited language-support, and security need to be overcome to ensure wider adoption of NLP in MIDD landscape.

Machine Learning Prediction of Clinical Trial Operational Efficiency

A machine learning model is developed to predict clinical trial operational efficiency using a novel dataset from Roche containing over 2,000 clinical trials across 20 years and multiple disease areas, demonstrating that operational efficiency can be predicted robustly using trial features.

Combining human and machine intelligence for clinical trial eligibility querying

Criteria2Query was developed to enable real-time user intervention for criteria selection and simplification, parsing error correction, and concept mapping; its features for engaging domain experts and overcoming the limitations of automated machine output are shown to be useful and user-friendly.



Benchmarking Ontologies: Bigger or Better?

A family of metrics that describe the breadth and depth with which an ontology represents its knowledge domain is introduced, and the approach is shown to capture the quality of ontological representation and to guide efforts to narrow the breach between ontology and collective discourse within a domain.

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

This article introduces BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining), a domain-specific language representation model pre-trained on large-scale biomedical corpora that largely outperforms BERT and previous state-of-the-art models in a variety of biomedical text mining tasks.

The ClinicalTrials.gov results database--update and key issues.

The structure and contents of the results database are summarized, an update of relevant policies is provided, and ways the data can be used to gain insight into the state of clinical research are shown.

Information Extraction Applications for Clinical Trials: A Survey

This paper reviews applications and methods used for IE and analyzes which applications have already been used for clinical trials and which, although not yet used, could easily be adapted for them.

Criteria2Query: a natural language interface to clinical databases for cohort definition

Criteria2Query is a natural language interface that facilitates human-computer collaboration for cohort definition and execution using clinical databases and supports fully automated and interactive modes for autonomous data-driven cohort definition by researchers with minimal human effort.

Measures of the Amount of Ecologic Association Between Species

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

Empirical results over the BioCreative V CDR corpus show that the use of either type of character-level word embeddings in conjunction with the BiLSTM-CRF models leads to comparable state-of-the-art performance; however, the models using CNN-based character-level word embeddings have a computational performance advantage.

The Data Gap in the EHR for Clinical Research Eligibility Screening

40% of common eligibility criteria concepts were not even defined in the concept space in the EHR dataset for a cohort of Alzheimer’s Disease patients, indicating a significant data gap may impede EHR-based eligibility screening.

When will clinical trials finally reflect diversity?

An analysis of drug studies shows that most participants are white, even though trials are being done in more countries, reveal Todd C. Knepper and Howard L. McLeod.

EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria

EliXR-TIME is developed, a frame-based representation designed to support semantic annotation for temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations, and it is concluded that this knowledge representation can facilitate semantic annotation of the temporal expressions of eligibility criteria.