Exploring the human genome with functional maps.

@article{Huttenhower2009ExploringTH,
  title={Exploring the human genome with functional maps.},
  author={Curtis Huttenhower and Erin M. Haley and Matthew A. Hibbs and Vanessa Dumeaux and Daniel R Barrett and Hilary A. Coller and Olga G. Troyanskaya},
  journal={Genome research},
  year={2009},
  volume={19},
  number={6},
  pages={1093--1106}
}
Human genomic data of many types are readily available, but the complexity and scale of human molecular biology make it difficult to integrate this body of data, understand it from a systems level, and apply it to the study of specific pathways or genetic disorders. An investigator could best explore a particular protein, pathway, or disease if given a functional map summarizing the data and interactions most relevant to his or her area of interest. Using a regularized Bayesian integration… 

Probabilistic analysis of the human transcriptome with side information

Novel computational strategies have been developed to investigate a key functional layer of genetic information, the human transcriptome, which regulates the function of living cells through protein synthesis, and to provide new insights into cell-biological networks, cancer mechanisms and other aspects of genome function.

GIANT 2.0: genome-scale integrated analysis of gene networks in tissues

The NetWAS approach available through the server uses tissue-specific/cell-type networks predicted by GIANT2 to re-prioritize statistical associations from genome-wide association studies (GWAS) and identify disease-associated genes.

Gene networks in Drosophila melanogaster: integrating experimental data to predict gene function

The first genome-wide functional gene network in D. melanogaster is constructed by integrating most of the available, comprehensive sets of genetic interaction, protein-protein interaction, and microarray expression data and it is shown that this approach is a means of inferring annotations on a class of genes that cannot be annotated based solely on sequence similarity.

Simultaneous Genome-Wide Inference of Physical, Genetic, Regulatory, and Functional Pathway Components

This work describes methodology for simultaneously predicting specific types of biomolecular interactions using high-throughput genomic data and results in a comprehensive compendium of whole-genome networks for yeast, derived from ∼3,500 experimental conditions and describing 30 interaction types.

Construction and Functional Analysis of Human Genetic Interaction Networks with Genome-wide Association Data

This study demonstrated that the constructed genetic interaction networks are supported by functional evidence from independent biological databases, and the network can be used to discover pairs of compensatory gene modules (between-pathway models) in their joint association with a disease phenotype.

Using Functional Linkage Gene Networks to Study Human Diseases

In this chapter, the existence of functional association for genes working in a common biological process or implicated in a common disease is demonstrated, and approaches to construct the functional linkage gene network (FLN) based on genomic and proteomic data integration are reviewed.

Understanding multicellular function and disease with human tissue-specific networks

NetWAS is introduced, which combines genes with nominally significant genome-wide association study (GWAS) P values and tissue-specific networks to identify disease-gene associations more accurately than GWAS alone.

Targeted analyses of very large genome-wide data collections

This work uses the massive public repositories to quantify the tissue-specific signals in gene expression profiles, characterize distinctive molecular features of human diseases, deconvolve the latent cell-type-specific factors in mixed clinical samples, and automatically integrate heterogeneous data sources in the context of a specific genome-wide dataset.

Integrative networks illuminate biological factors underlying gene-disease associations

Continued improvements in data and algorithms are expected to strengthen integrative networks, allowing them to provide more detailed and mechanistic predictions about the context-specific genetic etiology of common diseases.

References

Assessing the functional structure of genomic data

This work analyzes the functional structure of Saccharomyces cerevisiae datasets from over 950 publications in the context of over 140 biological processes and uncovers subtle gene expression similarities in three otherwise disparate microarray datasets due to a shared strain background.

Computational modeling of the Plasmodium falciparum interactome reveals protein function on a genome-wide scale.

In an attempt to infer protein function on a genome-wide scale, this work computationally models the P. falciparum interactome, elucidating local and global functional relationships between gene products, and superimposes this map on the genomes of three apicomplexan pathogens, describing relationships between these organisms based on retained functional linkages.

A scalable method for integration and functional analysis of multiple microarray datasets

This work presents Microarray Experiment Functional Integration Technology (MEFIT), a scalable Bayesian framework for predicting functional relationships from integrated microarray datasets and predicts these functional relationships within the context of specific biological processes.

Probabilistic model of the human protein-protein interaction network

It is demonstrated that a probabilistic analysis integrating model organism interactome data, protein domain data, genome-wide gene expression data and functional annotation data predicts nearly 40,000 protein-protein interactions in humans—a result comparable to those obtained with experimental and computational approaches in model organisms.

Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles

It is demonstrated how the GSEA method yields insights into several cancer-related data sets, including leukemia and lung cancer, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer.

KEGG for linking genomes to life and the environment

KEGG PATHWAY is now supplemented with a new global map of metabolic pathways, which is essentially a combined map of about 120 existing pathway maps, and the KEGG resource is being expanded to suit the needs for practical applications.

Integrating genotypic and expression data in a segregating mouse population to identify 5-lipoxygenase as a susceptibility gene for obesity and bone traits

This work details a hybrid procedure for mapping loci involved in complex traits that leverages the strengths of forward and reverse genetic approaches, identifies 5-lipoxygenase as underlying previously identified quantitative trait loci in an F2 cross between strains C57BL/6J and DBA/2J, and shows that it has pleiotropic effects on body fat, lipid levels and bone density.

STRING 7—recent developments in the integration and prediction of protein interactions

Although primarily developed for protein interaction analysis, the resource has also been successfully applied to comparative genomics, phylogenetics and network studies, which are all facilitated by programmatic access to the database backend and the availability of compact download files.

Exploring the functional landscape of gene expression: directed search of large microarray compendia

This work designs a context-sensitive search algorithm that provides the ability for biological researchers to explore the totality of existing microarray data in a manner useful for drawing conclusions and formulating hypotheses, which it believes is invaluable for the research community.