Interactive Extractive Search over Biomedical Corpora
@article{TaubTabib2020InteractiveES,
  title={Interactive Extractive Search over Biomedical Corpora},
  author={Hillel Taub-Tabib and Micah Shlain and Shoval Sadde and Dan Lahav and Matan Eyal and Yaara Cohen and Yoav Goldberg},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.04148}
}
We present a system that allows life-science researchers to search a linguistically annotated corpus of scientific texts using patterns over dependency graphs, as well as using patterns over token sequences and a powerful variant of boolean keyword queries. In contrast to previous attempts at dependency-based search, we introduce a light-weight query language that does not require the user to know the details of the underlying linguistic representations, and instead to query the corpus by…
18 Citations
Extractive Search for Analysis of Biomedical Texts
- Computer Science, SIGIR
- 2022
This work presents a two-stage system that creates custom datasets using a powerful mix of keyword and syntactic matching, and then returns lists of related words, which are used in downstream biomedical work.
A Search Engine for Discovery of Scientific Challenges and Directions
- Computer Science, AAAI
- 2022
A novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery on a large corpus of interdisciplinary work relating to the COVID-19 pandemic, ranging from biomedicine to areas such as AI and economics.
CSFCube - A Test Collection of Computer Science Research Articles for Faceted Query by Example
- Computer Science, NeurIPS Datasets and Benchmarks
- 2021
This work introduces the task of faceted Query by Example in which users can also specify a finer grained aspect in addition to the input query document, and describes an expert annotated test collection to evaluate models trained to perform this task.
Text mining approaches for dealing with the rapidly expanding literature on COVID-19
- Computer Science, Briefings Bioinform.
- 2021
This review discusses the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19, and lists 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature.
Neural Extractive Search
- Computer Science, ACL
- 2021
The goals of this paper are to concisely introduce the extractive-search paradigm and to demonstrate a prototype neural retrieval system for extractive search, along with its benefits and potential.
Hybrid Search based Enhanced Named Entity Annotation Tool
- Computer Science, ISEC
- 2022
A novel hybrid search-based enhanced annotation tool that provides an easy-to-use GUI and several search modes to accelerate the annotation exercise and provides faster annotation than typical annotators and comparable performance with state-of-the-art tools.
Rapid Knowledgebase Construction and Hypotheses Generation Using Extractive Literature Search
- Computer Science, bioRxiv
- 2022
This work presents a methodology and a supporting tool to allow individual researchers or small teams, without background in bio-curation or computer science, to mine the scientific literature and construct ad-hoc, personalized, and literature-anchored knowledgebases, that are tailored around their specific research interests and support their scientific goals.
Past and future uses of text mining in ecology and evolution
- Computer Science, Proceedings of the Royal Society B
- 2022
Applying computational tools from text mining and NLP will increase the efficiency of data synthesis, improve the reproducibility of literature reviews, formalize analyses of research biases and knowledge gaps, and promote data-driven discovery of patterns across ecology and evolutionary biology.
A Computational Inflection for Scientific Discovery
- Computer Science, ArXiv
- 2022
The confluence of societal and computational trends suggests that computer science is poised to ignite a revolution in the scientific process itself.
Evolutionary Algorithm Based Summarization for Analyzing COVID-19 Medical Reports
- Computer Science, Understanding COVID-19: The Role of Computational Intelligence
- 2021
This chapter tries to extract important information about COVID-19 from the available text documents, such as research papers, articles, journals, reports, and other publications, and its performance is evaluated by comparing it with a few of the related state-of-the-art methods.
References
Syntactic Search by Example
- Computer Science, ACL
- 2020
A light-weight query language is introduced that does not require the user to know the details of the underlying syntactic representations, and instead to query the corpus by providing an example sentence coupled with simple markup.
Exploratory Relation Extraction in Large Text Corpora
- Computer Science, COLING
- 2014
This paper proposes and demonstrates Exploratory Relation Extraction, a novel approach to identifying and extracting relations from large text corpora based on user-driven and data-guided incremental exploration and presents an interactive workflow that allows users to build extractors based on entity types and human-readable extraction patterns derived from subtrees in dependency trees.
Attending to All Mention Pairs for Full Abstract Biological Relation Extraction
- Computer Science, AKBC@NIPS
- 2017
This work proposes a model to consider all mention and entity pairs simultaneously in order to make a prediction, which achieves the state of the art on the Biocreative V Chemical Disease Relation dataset for models without KB resources, outperforming ensembles of models which use hand-crafted features and additional linguistic resources.
pyBART: Evidence-based Syntactic Transformations for IE
- Computer Science, ACL
- 2020
This work introduces a broad-coverage, data-driven and linguistically sound set of transformations, that makes event-structure and many lexical relations explicit, and presents pyBART, an easy-to-use open-source Python library for converting English UD trees either to Enhanced UD graphs or to the authors' representation.
Odinson: A Fast Rule-based Information Extraction Framework
- Computer Science, LREC
- 2020
Odinson is a rule-based information extraction framework that couples a simple yet powerful pattern language, able to operate over multiple representations of text, with a runtime system that operates in near real time to guarantee the rapid matching of patterns.
The NLM Medical Text Indexer System for Indexing Biomedical Literature
- Computer Science, BioASQ@CLEF
- 2013
An overview of MTI’s functionality, performance, and its evolution over the years is provided.
ExaCT: automatic extraction of clinical trial characteristics from journal publications
- Computer Science, BMC Medical Informatics Decis. Mak.
- 2010
An automatic information extraction system that assists users with locating and extracting key trial characteristics from full-text journal articles reporting on randomized controlled trials (RCTs) and can be extended to handle other characteristics and document types.
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
- Computer Science, TACL
- 2017
A general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction is explored, demonstrating its effectiveness with both conventional supervised learning and distant supervision.
ScispaCy: Fast and Robust Models for Biomedical Natural Language Processing
- Computer Science, BioNLP@ACL
- 2019
ScispaCy, a new Python library and set of models for practical biomedical/scientific text processing that heavily leverages the spaCy library, is described; the authors detail the performance of two packages of models released in scispaCy and demonstrate their robustness on several tasks and datasets.
LIVIVO – the Vertical Search Engine for Life Sciences
- Computer Science, Datenbank-Spektrum
- 2016
Future work will focus on the exploitation of life science ontologies and on the employment of NLP technologies in order to improve query expansion, filters in faceted search, and concept based relevancy rankings in LIVIVO.