Evaluation of BioCreAtIvE assessment of task 2

@article{Blaschke2005EvaluationOB,
  title={Evaluation of {BioCreAtIvE} assessment of task 2},
  author={Christian Blaschke and Eduardo Andr{\'e}s Le{\'o}n and Martin Krallinger and Alfonso Valencia},
  journal={BMC Bioinformatics},
  year={2005},
  volume={6},
  pages={S16}
}
Background: Molecular biology has accumulated substantial amounts of data concerning the functions of genes and proteins. Information relating to functional descriptions is generally extracted manually from textual data and stored in biological databases to build up annotations for large collections of gene products. These annotation databases are crucial for interpreting large-scale analyses that use bioinformatics or experimental techniques. Due to the growing accumulation of…
Overview of BioCreAtIvE: critical assessment of information extraction for biology
TLDR
The first BioCreAtIvE assessment provided state-of-the-art performance results for a basic task (gene name finding and normalization), where the best systems achieved a balanced 80% precision / recall or better, which potentially makes them suitable for real applications in biology.
Evaluation of text-mining systems for biology: overview of the Second BioCreative community challenge
TLDR
A common characteristic observed in all three tasks was that the combination of system outputs could yield better results than any single system, including the development of the first text-mining meta-server.
A sentence sliding window approach to extract protein annotations from biomedical articles
TLDR
The "sentence sliding window" approach proposed here was found to efficiently extract text fragments from full text articles containing annotations on proteins, providing the highest number of correctly predicted annotations.
GOTA: GO term annotation of biomedical literature
TLDR
GOTA implements a flexible model for GO annotation of biomedical literature that uses only information readily available from public repositories and is easily expandable to handle novel sources of information.
Online assessment of protein interaction information extraction systems
Background. Manual database (DB) curation efforts extracting protein-protein interactions (PPIs) from publications are not able to cover the entire scientific literature on those interactions.
Overview of the gene ontology task at BioCreative IV
TLDR
The state of the art in automatically mining GO terms from literature has improved over the past decade while much progress is still needed for computer-assisted GO curation.
A robust data-driven approach for gene ontology annotation
  • Yanpeng Li, Hong Yu
  • Computer Science, Medicine
  • Database J. Biol. Databases Curation
  • 2014
TLDR
Two components are presented: a binary classifier that identifies evidence sentences using the reference distance estimator (RDE), a recently proposed semi-supervised learning method that learns new features from around 10 million unlabeled sentences, and a filtering method based on high-level GO classes that substantially improved performance.
BioCreAtIvE Task 1A: gene mention finding evaluation
TLDR
The 80% plus F-measure results are good, but still somewhat lag the best scores achieved in some other domains such as newswire, due in part to the complexity and length of gene names, compared to person or organization names in newswire.
Analysis of biological processes and diseases using text mining approaches.
TLDR
An overview of disease-centric and gene-centric literature mining methods for linking genes to phenotypic and genotypic aspects and recent efforts for finding biomarkers through text mining and for gene list analysis and prioritization are discussed.
Annotation of protein residues based on a literature analysis: cross-validation against UniProtKb
TLDR
This work introduces a system that identifies protein residues in MEDLINE abstracts and annotates them with features extracted from the surrounding text. It extends existing systems in that a wider range of residue entities is considered and features of residues are extracted as annotations.

References

Learning Statistical Models for Annotating Proteins with Function Information using Biomedical Text
TLDR
This work built a system to automatically annotate a given protein with codes from the Gene Ontology using the text of an article from the biomedical literature as evidence, and observed that the Naïve Bayes models were effective in filtering and ranking the initially hypothesized annotations.
An evaluation of GO annotation retrieval for BioCreAtIvE and GOA
TLDR
A biological perspective on the evaluation is given, how the GOA team annotates GO using literature is explained, and some suggestions to improve the precision of future text-retrieval and extraction techniques are offered.
A sentence sliding window approach to extract protein annotations from biomedical articles
TLDR
The "sentence sliding window" approach proposed here was found to efficiently extract text fragments from full text articles containing annotations on proteins, providing the highest number of correctly predicted annotations.
BioCreAtIvE Task 1A: gene mention finding evaluation
TLDR
The 80% plus F-measure results are good, but still somewhat lag the best scores achieved in some other domains such as newswire, due in part to the complexity and length of gene names, compared to person or organization names in newswire.
Overview of BioCreAtIvE task 1B: normalized gene lists
TLDR
This assessment demonstrates that multiple groups were able to perform a real biological task across a range of organisms, and holds out promise that the technology can provide partial automation of the curation process in the near future.
Finding genomic ontology terms in text using evidence content
TLDR
It is concluded that an automatic annotation system can effectively use the method introduced for identifying biological properties in unstructured text to find Gene Ontology annotations and their evidence in a set of articles.
Protein annotation as term categorization in the gene ontology using word proximity networks
TLDR
The evaluation results indicate that the method for expanding words associated with GO nodes is quite powerful; it was able to successfully select appropriate evidence text for a given annotation in 38% of Task 2.1 queries.
Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup
TLDR
A Challenge Evaluation task was created for the Knowledge Discovery and Data Mining (KDD) Challenge Cup, in which 18 participating groups provided systems that flagged articles for curation, based on whether the article contained experimental evidence for gene expression products.
Mining protein function from text using term-based support vector machines
TLDR
The initial results suggest that the supervised machine-learning approach to mining protein function predictions from text can also mine annotations from text even when an explicit statement relating a protein to a GO term is absent.
Data-poor categorization and passage retrieval for Gene Ontology Annotation in Swiss-Prot
TLDR
The designed combination of retrieval and natural language processing methods achieved very competitive performance and suggests that the overall strategy could benefit a large class of information extraction tasks, especially when training data are missing.