Extracting a Knowledge Base of Mechanisms from COVID-19 Papers
@article{Amini2021ExtractingAK,
  title={Extracting a Knowledge Base of Mechanisms from COVID-19 Papers},
  author={Aida Amini and Tom Hope and David Wadden and Madeleine van Zuylen and Eric Horvitz and Roy Schwartz and Hannaneh Hajishirzi},
  journal={ArXiv},
  year={2021},
  volume={abs/2010.03824}
}
The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge. We pursue the construction of a knowledge base (KB) of mechanisms—a fundamental concept across the sciences, which encompasses activities, functions and causal relations, ranging from cellular processes to economic impacts. We extract this information from the natural language of scientific papers by developing a broad…
16 Citations
A Search Engine for Discovery of Scientific Challenges and Directions
- Computer Science, AAAI
- 2022
A novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery on a large corpus of interdisciplinary work relating to the COVID-19 pandemic, ranging from biomedicine to areas such as AI and economics.
Constructing public health evidence knowledge graph for decision-making support from COVID-19 literature of modelling study
- Computer Science, Journal of Safety Science and Resilience
- 2021
Extracting a Knowledge Base of COVID-19 Events from Social Media
- Computer Science
- 2020
A manually annotated corpus of 10,000 tweets containing public reports of five COVID-19 events, including positive and negative tests, deaths, denied access to testing, claimed cures and preventions, shows that it can support fine-tuning BERT-based classifiers to automatically extract publicly reported events and help track the spread of a new disease.
Data Models for Annotating Biomedical Scholarly Publications: the Case of CORD-19
- Computer Science
- 2022
This systematic review provides an analysis of the data models that have been applied to semantic annotation projects for the scholarly publications available in the CORD-19 dataset, an open database of the full texts of scholarly publications about COVID-19.
Queries related to COVID-19: a more effective retrieval through finetuned ALBERT with BM25L question answering system
- Computer Science
- 2021
A fine-tuned ALBERT-based QA system, combined with the Okapi BM25 (Best Match 25) ranking function and its variant BM25L for context retrieval, achieved high scores on benchmark data sets such as SQuAD for answers to COVID-19 questions.
CovRelex: A COVID-19 Retrieval System with Relation Extraction
- Computer Science, EACL
- 2021
CovRelex is a scientific paper retrieval system that targets entities and relations via relation extraction on COVID-19 scientific papers, aimed at helping users efficiently acquire knowledge across the large number of rapidly published COVID-19 papers.
A Computational Inflection for Scientific Discovery
- Computer Science, ArXiv
- 2022
The confluence of societal and computational trends suggests that computer science is poised to ignite a revolution in the scientific process itself.
SciCo: Hierarchical Cross-Document Coreference for Scientific Concepts
- Computer Science, AKBC
- 2021
This work presents a new task of hierarchical cross-document coreference resolution (CDCR) for concepts in scientific papers, with the goal of jointly inferring coreference clusters and the hierarchy between them, and creates SciCo, an expert-annotated dataset for this task.
DiSCoMaT: Distantly Supervised Composition Extraction from Tables in Materials Science Articles
- Computer Science, ArXiv
- 2022
This work observes that materials science researchers organize similar compositions in a wide variety of table styles, necessitating an intelligent model for table understanding and composition extraction, and presents DiSCoMaT, a strong baseline geared towards this task, which outperforms recent table processing architectures by significant margins.
Predicting Informativeness of Semantic Triples
- Computer Science, RANLP
- 2021
This work uses full texts of biomedical publications to create a training corpus of informative and important semantic triples, based on the notion that the main contributions of an article are summarized in its abstract, and suggests that an importance ranking for semantic triples could also be generated.
References
SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search
- Computer Science, bioRxiv
- 2020
SciSight is presented, a system for exploratory search of COVID-19 research integrating two key capabilities: first, exploring associations between biomedical facets automatically extracted from papers; second, combining textual and network information to search and visualize groups of researchers and their ties.
CORD-19: The COVID-19 Open Research Dataset
- Computer Science, NLPCOVID19
- 2020
The mechanics of dataset construction are described, highlighting challenges and key design decisions, along with an overview of how CORD-19 has been used and several shared tasks built around the dataset.
COVID-19 Knowledge Graph: Accelerating Information Retrieval and Discovery for Scientific Literature
- Computer Science, KNLP
- 2020
This work presents the COVID-19 Knowledge Graph (CKG), a heterogeneous graph for extracting and visualizing complex relationships between COVID-19 scientific articles, and proposes a document similarity engine that leverages low-dimensional graph embeddings from the CKG with semantic embeddings for similar article retrieval.
Information Mining for COVID-19 Research From a Large Volume of Scientific Literature
- Medicine, ArXiv
- 2020
A graph-based model is developed using abstracts of 10,683 scientific articles to find key information on three topics: transmission, drug types, and genome research related to coronavirus to expedite and recommend new and alternative directions for COVID-19 research.
COVID-SEE: Scientific Evidence Explorer for COVID-19 Related Research
- Computer Science, ArXiv
- 2020
COVID-SEE augments search by providing a visual overview supporting exploration of a collection to identify key articles of interest, and builds on several distinct text analysis and natural language processing methods to structure and organise information in publications.
Separating Wheat from Chaff: Joining Biomedical Knowledge and Patient Data for Repurposing Medications
- Medicine, AAAI
- 2019
We present a system that jointly harnesses large-scale electronic health records data and a concept graph mined from the medical literature to guide drug repurposing—the process of applying known…
Literome: PubMed-scale genomic knowledge base in the cloud
- Biology, Bioinform.
- 2014
The Literome project has developed an automatic curation system to extract genomic knowledge from PubMed articles and made this knowledge available in the cloud with a Web site to facilitate browsing, searching and reasoning.
Constructing a semantic predication gold standard from the biomedical literature
- Computer Science, BMC Bioinformatics
- 2011
A multi-phase gold standard annotation study in which 500 sentences, randomly selected from MEDLINE abstracts on a wide range of biomedical topics, are annotated with 1371 semantic predications; increasing agreement in the main annotation phase indicates that an acceptable level of agreement can be achieved over multiple iterations.
COVID-19 SignSym – A fast adaptation of general clinical NLP tools to identify and normalize COVID-19 signs and symptoms to OMOP common data model
- Medicine, ArXiv
- 2020
An automated tool is built that can extract signs/symptoms and their eight attributes (body location, severity, temporal expression, subject, condition, uncertainty, negation, and course) from clinical text and will provide fundamental support for the secondary use of EHRs, thus accelerating global research on COVID-19.
BioCreative V CDR task corpus: a resource for chemical disease relation extraction
- Biology, Database J. Biol. Databases Curation
- 2016
The BC5CDR corpus was successfully used for the BioCreative V challenge tasks and should serve as a valuable resource for the text-mining research community.