A literature network of human genes for high-throughput analysis of gene expression

@article{Jenssen2001ALN,
  title={A literature network of human genes for high-throughput analysis of gene expression},
  author={Tor-Kristian Jenssen and Astrid L{\ae}greid and Jan Komorowski and Eivind Hovig},
  journal={Nature Genetics},
  year={2001},
  volume={28},
  pages={21--28}
}
We have carried out automated extraction of explicit and implicit biomedical knowledge from publicly available gene and text databases to create a gene-to-gene co-citation network for 13,712 named human genes by automated analysis of titles and abstracts in over 10 million MEDLINE records. The associations between genes have been annotated by linking genes to terms from the medical subject heading (MeSH) index and terms from the gene ontology (GO) database. The extracted database and… 
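The co-citation idea in the abstract can be illustrated with a minimal sketch: two genes are linked whenever their names appear in the same title or abstract, and the link weight is the number of records in which they co-occur. The function name `cocitation_network`, the exact-match tokenisation, and the toy records below are assumptions made for illustration; the paper's actual gene-name matching, synonym handling, and weighting are not described in this excerpt.

```python
from collections import Counter
from itertools import combinations
import re


def cocitation_network(records, gene_symbols):
    """Count, for each gene pair, the records mentioning both genes.

    Assumption: a gene is "mentioned" if its symbol appears as an exact,
    case-sensitive token. Real systems need synonym and ambiguity handling.
    """
    symbols = set(gene_symbols)
    edge_counts = Counter()
    for text in records:
        # Split the record into alphanumeric tokens and keep known symbols.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text))
        mentioned = sorted(tokens & symbols)
        # Every unordered pair of co-mentioned genes gains one count.
        for gene_a, gene_b in combinations(mentioned, 2):
            edge_counts[(gene_a, gene_b)] += 1
    return edge_counts


# Toy records standing in for MEDLINE titles and abstracts (hypothetical).
records = [
    "TP53 and MDM2 interact in the DNA damage response.",
    "MDM2 regulates TP53 stability.",
    "BRCA1 is mutated in breast cancer.",
]
network = cocitation_network(records, ["TP53", "MDM2", "BRCA1"])
```

Here `network` maps alphabetically ordered gene pairs to co-occurrence counts, so edges can be thresholded or ranked before annotating them with MeSH or GO terms.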
A genome-wide MeSH-based literature mining system predicts implicit gene-to-gene relationships and networks
TLDR
The GenoMesh algorithm and web program provide the first genome-wide, MeSH-based literature mining system that effectively predicts implicit gene-gene interaction relationships and networks in a genome-wide scope.
Automatic construction of gene relation networks using text mining and gene expression data
TLDR
The main outcome of this project is the implementation of a software system that provides clinicians and researchers with a tool that supports the analysis of microarray gene expression data by mapping known relationships from the biomedical literature to local gene expression experiments.
Finding Functionally Related Genes by Local and Global Analysis of MEDLINE Abstracts
TLDR
A textual analysis of documents associated with pairs of genes is presented, and it is described how this approach can be used to discover and annotate functional relationships among genes.
Combining evidence, biomedical literature and statistical dependence: new insights for functional annotation of gene sets
TLDR
An original functional annotation method based on a combination of evidence and literature that overcomes the weaknesses and the limitations of each approach and is more informative than either separate approach.
Literature-aided interpretation of gene expression data with the weighted global test
TLDR
It is illustrated that their comprehensive scope aids the interpretation of data from domains poorly covered by GO or alternative databases, and allows for the linking of gene expression with diseases, drugs, tissues and other types of concepts. Literature mining tools are powerful additions to the toolbox for the interpretation of high-throughput genomics data.
CoPub: a literature-based keyword enrichment tool for microarray data analysis
TLDR
A publicly available tool called CoPub that uses the information in the Medline database for the biological interpretation of microarray data, providing detailed insight into the relationships between genes and keywords, and revealing the most influential genes as highly connected hubs.
Large-scale protein annotation through gene ontology.
TLDR
The development of GO Engine, a computational platform for GO annotation, and analysis of the resultant GO annotations of human proteins are reported, which centered on sequence homology with GO-annotated proteins and protein domain analysis.
CoPub Mapper: mining MEDLINE based on search term co-publication
TLDR
The CoPub Mapper program allows for quick and versatile querying of co-published genes and keywords and can be successfully used to cluster predefined groups of genes and microarray data.
Inferring pathways from gene lists using a literature-derived network of biological relationships
TLDR
A heuristic algorithm and a scoring function that work well both on simulated data and on data from known pathways are presented and it is found that the method works on reasonably complex curated networks containing approximately 9000 biological entities (genes and metabolites), and approximately 30,000 biological relationships.

References

SHOWING 1-10 OF 37 REFERENCES
Genes, Themes, and Microarrays: Using Information Retrieval for Large-Scale Gene Analysis
TLDR
A new approach for utilizing the literature in order to establish functional relationships among genes on a genome-wide scale is developed, based on revealing coherent themes within the literature using a similarity-based search in document space.
Biobibliometrics: information retrieval and visualization from co-occurrences of gene names in Medline abstracts.
  • B. Stapley, G. Benoît
  • Computer Science
    Pacific Symposium on Biocomputing
  • 2000
TLDR
A prototype system for retrieving and visualizing information from literature and genomic databases using gene names, which is a tool for efficiently exploring the biomedical information landscape and may act as an inference network.
Cluster analysis and display of genome-wide expression patterns
TLDR
A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding that the standard correlation coefficient conforms well to the intuitive biological notion of what it means for two genes to be "coexpressed".
EDGAR: extraction of drugs, genes and relations from the biomedical literature.
TLDR
The mechanisms for automatically generating assertions about drugs and genes relevant to cancer, and a simple application, conceptual clustering of documents, are reported.
Identifying the Interaction between Genes and Gene Products Based on Frequently Seen Verbs in Medline Abstracts.
We have selected the most frequently seen verbs from raw texts made up of 1-million-words of Medline abstracts, and we were able to identify (or bracket) noun phrases contained in the corpus, with a
Gene Ontology: tool for the unification of biology
TLDR
The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Automatic Extraction of Biological Information from Scientific Text: Protein-Protein Interactions
TLDR
The basic design of a system for automatic detection of protein-protein interactions extracted from scientific abstracts is described and the feasibility of developing a fully automated system able to describe networks of protein interactions with sufficient accuracy is demonstrated.
Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures.
TLDR
This paper describes how an information extraction system designed to participate in the MUC exercises has been modified for two bioinformatics applications: EMPathIE, concerned with enzyme and metabolic pathways; and PASTA, concerned with protein structure.
Automatic Annotation for Biological Sequences by Extraction of Keywords from MEDLINE Abstracts: Development of a Prototype System
TLDR
A prototype for the automatic annotation of functional characteristics in protein families able to extract biological information directly from scientific literature in the form of MEDLINE abstracts is developed.