RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond
TLDR
The semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments are described.
Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review
TLDR
A comprehensive overview of the development and uptake of NLP methods applied to free-text clinical notes related to chronic diseases is provided, including the investigation of challenges faced by NLP methodologies in understanding clinical narratives.
Entity recognition in the biomedical domain using a hybrid approach
TLDR
A high-recall, high-precision approach for the extraction of biomedical entities from scientific articles by combining a dictionary-based entity recognizer with a machine-learning classifier, which achieves the best results so far in this particular task.
Attempto Controlled English: A Knowledge Representation Language Readable by Humans and Machines
TLDR
This paper presents the language ACE in a nutshell, and gives an overview of the discourse representation structures derived from ACE texts, a variant of first-order logic.
TX Task: Automatic Detection of Focus Organisms in Biomedical Publications
TLDR
This paper presents an approach to the detection and disambiguation of focus organisms, i.e. the organisms that are the subject of the research described in a scientific paper; these can then be used for the disambiguation of other entities.
The gene normalization task in BioCreative III
TLDR
Evaluating teams using the gold standard and inferred ground truth shows that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible, and shows measures of comparative performance between teams.
Mining relations in the GENIA corpus
TLDR
This paper proposes a method for discovery of interactions between genes and proteins from the scientific literature, based on a complete syntactic analysis of the corpus, and reports on preliminary results.
OGER++: hybrid multi-type entity recognition
TLDR
Combining knowledge-based and data-driven components allows creating a system with competitive performance in biomedical text mining, which combines a dictionary-based annotator with a corpus-based disambiguation component.
FACILE: Description of the NE System Used for MUC-7
TLDR
The paper concludes with an evaluation of the internal structure of the system used in the MUC-7 NE task and some speculation on the extent to which the approach adopted is susceptible to further improvement.
Exploiting Paraphrases in a Question Answering System
We present a Question Answering system for technical domains which makes an intelligent use of paraphrases to increase the likelihood of finding the answer to the user's question. The system
...