Text-mining and information-retrieval services for molecular biology

@article{Krallinger2005TextminingAI,
  title={Text-mining and information-retrieval services for molecular biology},
  author={Martin Krallinger and Alfonso Valencia},
  journal={Genome Biology},
  year={2005},
  volume={6},
  pages={224}
}
Text-mining in molecular biology - defined as the automatic extraction of information about genes, proteins and their functional relationships from text documents - has emerged as a hybrid discipline on the edges of the fields of information science, bioinformatics and computational linguistics. A range of text-mining applications have been developed recently that will improve access to knowledge for biologists and database annotators. 
Text Mining for Interpreting Gene
TLDR
A new phase of the text-mining process is described that uncovers interesting term correlations, identifies genomic terms during curation, identifies biological relations, and helps biologists analyze complex problems.
Data and text mining Gene symbol disambiguation using knowledge-based profiles
Motivation: The ambiguity of biomedical entities, particularly of gene symbols, is a big challenge for text-mining systems in the biomedical domain. Existing knowledge sources, such as Entrez Gene...
Linking genes to literature: text mining, information extraction, and retrieval applications for biology
TLDR
This review presents a general introduction to the main characteristics and applications of currently available text-mining systems for life sciences in terms of the type of biological information demands being addressed; the level of information granularity of both user queries and results; and the features and methods commonly exploited by these applications.
Literature mining for the biologist: from information retrieval to biological discovery
TLDR
This work states that literature mining is also becoming useful for both hypothesis generation and biological discovery; however, the latter will require the integration of literature and high-throughput data, which should encourage close collaborations between biologists and computational linguists.
Seeking a New Biology through Text Mining
TLDR
Text mining, the use of computational tools to enhance the human ability to parse and understand complex text, is of much interest in biomedical journals.
Full-Text Mining: Linking Practice, Protocols and Articles in Biological Research
TLDR
This article outlines the approach to full-text mining of biological protocols and the subsequent linking of these with metrics of scientific quality, and highlights the elements of full-text mining that could benefit from the attention of the wider text mining community.
Development of Text Mining Tools for Information Retrieval from Patents
TLDR
In this work, a patent pipeline was developed and integrated into @Note2, an open-source computational framework for BioTM, which allows further BioTM tools to be run over the patent documents, including Information Extraction processes such as Named Entity Recognition or Relation Extraction.
Integrating Text Mining into the MGI Biocuration Workflow
TLDR
Mouse Genome Informatics proves the potential for the further incorporation of semi-automated processes into the curation of the biomedical literature by providing an overview of its pilot projects with NCBO's Open Biomedical Annotator and Fraunhofer SCAI's ProMiner.
Integrating text mining into the MGI biocuration workflow
TLDR
Mouse Genome Informatics proves the potential for the further incorporation of semi-automated processes into the curation of the biomedical literature by discussing the search process, performance metrics and success criteria, and how a short list of potential text mining tools was identified.
Dependency-Based Relation Mining for Biomedical Literature
TLDR
Techniques for the automatic detection of relationships among domain entities mentioned in the biomedical literature are described, based on the adaptive selection of candidate interaction sentences, which are then parsed using the authors' own dependency parser.

References

SHOWING 1-10 OF 90 REFERENCES
MedMiner: an Internet text-mining tool for biomedical information, with application to gene expression profiling.
TLDR
MedMiner is an Internet-based hypertext program that filters and organizes large amounts of textual and structured information returned from public search engines like GeneCards and PubMed, and can be used to organize the information returned from any arbitrary PubMed search.
The Frame-Based Module of the SUISEKI Information Extraction System
TLDR
SUISEKI, an information extraction system that takes an intermediate view of the problem by requiring the two names to be in a frame that indicates a direct or indirect interaction between them, is developed.
Tagging gene and protein names in biomedical text
TLDR
This work proposes to approach the detection of gene and protein names in scientific abstracts as part-of-speech tagging, the most basic form of linguistic corpus annotation, and demonstrates that this method can be applied to large sets of MEDLINE abstracts, without the need for special conditions or human experts to predetermine relevant subsets.
PreBIND and Textomy – mining the biomedical literature for protein-protein interactions using a support vector machine
TLDR
This work presents an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND.
Biomedical Named Entity Recognition using Conditional Random Fields and Rich Feature Sets
As the wealth of biomedical knowledge in the form of literature increases, there is a rising need for effective natural language processing tools to assist in organizing, curating, and retrieving...
iProLINK: an integrated protein resource for literature mining
TLDR
The goal of iProLINK is to provide curated data sources that can be utilized for text mining research in the areas of bibliography mapping, annotation extraction, protein named entity recognition, and protein ontology development.
MedBlast: searching articles related to a biological sequence
TLDR
A new literature-mining tool, MedBlast, is developed, which uses natural language processing techniques to retrieve the articles related to a given sequence.
Textpresso: An Ontology-Based Information Retrieval and Extraction System for Biological Literature
TLDR
Extraction of particular biological facts can be accelerated significantly by ontologies, with Textpresso automatically performing nearly as well as expert curators to identify sentences; in searches for two uniquely named genes and an interaction term, the ontology confers a 3-fold increase of search efficiency.
Kinase pathway database: an integrated protein-kinase and NLP-based protein-interaction resource.
TLDR
The Kinase Pathway Database, an integrated database involving major completely sequenced eukaryotes, is developed, which contains the classification of protein kinases and their functional conservation, ortholog tables among species, protein-protein, protein-gene, and protein-compound interaction data, domain information, and structural information.
GENIA corpus - a semantically annotated corpus for bio-textmining
MOTIVATION: Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this...