A literature network of human genes for high-throughput analysis of gene expression
@article{Jenssen2001ALN, title={A literature network of human genes for high-throughput analysis of gene expression}, author={Tor-Kristian Jenssen and Astrid L{\ae}greid and Jan Komorowski and Eivind Hovig}, journal={Nature Genetics}, year={2001}, volume={28}, pages={21--28} }
We have carried out automated extraction of explicit and implicit biomedical knowledge from publicly available gene and text databases to create a gene-to-gene co-citation network for 13,712 named human genes by automated analysis of titles and abstracts in over 10 million MEDLINE records. The associations between genes have been annotated by linking genes to terms from the medical subject heading (MeSH) index and terms from the gene ontology (GO) database. The extracted database and…
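The co-citation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `cocitation_counts`, the naive whitespace tokenization, and the three toy records are all assumptions made up for illustration. It only shows the core step of counting how many records mention both genes of a pair.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(records, gene_symbols):
    """Count, for each unordered gene pair, the records that mention both genes.

    Hypothetical helper for illustration only; real gene-name recognition
    (synonyms, ambiguous symbols) is far more involved than this.
    """
    symbols = {s.upper() for s in gene_symbols}
    counts = Counter()
    for text in records:
        # Naive tokenization: split on whitespace, strip punctuation, upper-case.
        tokens = {tok.strip(".,;:()[]").upper() for tok in text.split()}
        # Sort so each pair has one canonical (alphabetical) orientation.
        mentioned = sorted(tokens & symbols)
        for pair in combinations(mentioned, 2):
            counts[pair] += 1
    return counts


# Toy stand-ins for titles/abstracts (invented, not real MEDLINE records).
records = [
    "TP53 and MDM2 interact in the DNA damage response.",
    "MDM2 regulates TP53 stability; BRCA1 is also involved.",
    "BRCA1 mutations in breast cancer.",
]
network = cocitation_counts(records, ["TP53", "MDM2", "BRCA1"])
for (gene_a, gene_b), weight in sorted(network.items()):
    print(gene_a, gene_b, weight)
```

Each key of `network` is an edge of the gene-to-gene graph and its count is the edge weight; annotating edges with MeSH or GO terms, as the abstract describes, would be a separate step not shown here.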
654 Citations
Global mapping of gene/protein interactions in PubMed abstracts: A framework and an experiment with P53 interactions
- Computer Science
- J. Biomed. Informatics
- 2007
A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks
- Biology
- BMC Systems Biology
- 2013
The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks.
Automatic construction of gene relation networks using text mining and gene expression data
- Computer Science, Biology
- Medical informatics and the Internet in medicine
- 2004
The main outcome of this project is the implementation of a software system that provides clinicians and researchers with a tool that supports the analysis of microarray gene expression data by mapping known relationships from the biomedical literature to local gene expression experiments.
Finding Functionally Related Genes by Local and Global Analysis of MEDLINE Abstracts
- Biology
- 2004
A textual analysis of documents associated with pairs of genes is presented, and it is described how this approach can be used to discover and annotate functional relationships among genes.
Combining evidence, biomedical literature and statistical dependence: new insights for functional annotation of gene sets
- Biology
- BMC Bioinformatics
- 2005
An original functional annotation method based on a combination of evidence and literature that overcomes the weaknesses and the limitations of each approach and is more informative than either separate approach.
Literature-aided interpretation of gene expression data with the weighted global test
- Biology
- Briefings Bioinform.
- 2011
It is illustrated that their comprehensive scope aids the interpretation of data from domains poorly covered by GO or alternative databases, and allows for the linking of gene expression with diseases, drugs, tissues and other types of concepts. Literature mining tools are powerful additions to the toolbox for the interpretation of high-throughput genomics data.
CoPub: a literature-based keyword enrichment tool for microarray data analysis
- Economics, Biology
- Nucleic Acids Res.
- 2008
CoPub is a publicly available tool that uses the information in the Medline database for the biological interpretation of microarray data, providing detailed insight into the relationships between genes and keywords and revealing the most influential genes as highly connected hubs.
Large-scale protein annotation through gene ontology.
- Biology
- Genome research
- 2002
The development of GO Engine, a computational platform for GO annotation, and analysis of the resultant GO annotations of human proteins are reported, which centered on sequence homology with GO-annotated proteins and protein domain analysis.
CoPub Mapper: mining MEDLINE based on search term co-publication
- Education
- BMC Bioinformatics
- 2004
The CoPub Mapper program allows for quick and versatile querying of co-published genes and keywords and can be successfully used to cluster predefined groups of genes and microarray data.
Inferring pathways from gene lists using a literature-derived network of biological relationships
- Computer Science, Biology
- Bioinform.
- 2005
A heuristic algorithm and a scoring function that work well both on simulated data and on data from known pathways are presented and it is found that the method works on reasonably complex curated networks containing approximately 9000 biological entities (genes and metabolites), and approximately 30,000 biological relationships.
37 References
Genes, Themes, and Microarrays: Using Information Retrieval for Large-Scale Gene Analysis
- Biology
- ISMB
- 2000
A new approach for utilizing the literature in order to establish functional relationships among genes on a genome-wide scale is developed, based on revealing coherent themes within the literature using a similarity-based search in document space.
Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in Medline abstracts.
- Computer Science
- Pacific Symposium on Biocomputing
- 2000
A prototype system for retrieving and visualizing information from literature and genomic databases using gene names, which is a tool for efficiently exploring the biomedical information landscape and may act as an inference network.
Cluster analysis and display of genome-wide expression patterns
- Biology
- 1999
A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding that the standard correlation coefficient conforms well to the intuitive biological notion of what it means for two genes to be “coexpressed”.
EDGAR: extraction of drugs, genes and relations from the biomedical literature.
- Computer Science
- Pacific Symposium on Biocomputing
- 2000
The mechanisms for automatically generating assertions about drugs and genes relevant to cancer, and a simple application, conceptual clustering of documents, are reported.
Identifying the Interaction between Genes and Gene Products Based on Frequently Seen Verbs in Medline Abstracts.
- Linguistics
- Genome informatics. Workshop on Genome Informatics
- 1998
We have selected the most frequently seen verbs from raw texts made up of 1-million-words of Medline abstracts, and we were able to identify (or bracket) noun phrases contained in the corpus, with a…
Gene Ontology: tool for the unification of biology
- Biology
- Nature Genetics
- 2000
The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Automatic Extraction of Biological Information from Scientific Text: Protein-Protein Interactions
- Computer Science, Biology
- ISMB
- 1999
The basic design of a system for automatic detection of protein-protein interactions extracted from scientific abstracts is described and the feasibility of developing a fully automated system able to describe networks of protein interactions with sufficient accuracy is demonstrated.
Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures.
- Biology
- Pacific Symposium on Biocomputing
- 2000
This paper describes how an information extraction system designed to participate in the MUC exercises has been modified for two bioinformatics applications: EMPathIE, concerned with enzyme and metabolic pathways; and PASTA, concerned with protein structure.
Automatic Annotation for Biological Sequences by Extraction of Keywords from MEDLINE Abstracts: Development of a Prototype System
- Computer Science
- ISMB
- 1997
A prototype for the automatic annotation of functional characteristics in protein families able to extract biological information directly from scientific literature in the form of MEDLINE abstracts is developed.