Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

@article{Huang2009BioinformaticsET,
  title={Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists},
  author={Da Wei Huang and Brad T. Sherman and Richard A. Lempicki},
  journal={Nucleic Acids Research},
  year={2009},
  volume={37},
  pages={1--13}
}
Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in… 
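
As an illustration of what gene-annotation enrichment analysis computes, one widely used approach is an over-representation test based on the hypergeometric distribution. The sketch below is a minimal example under stated assumptions, not the method of any particular tool covered by the paper; the function name, parameter names and the counts in the usage example are all hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(list_hits, list_size, background_hits, background_size):
    """Upper-tail hypergeometric P-value, P(X >= list_hits).

    list_hits        -- genes in the user's list annotated with the term
    list_size        -- total genes in the user's list
    background_hits  -- genes in the background annotated with the term
    background_size  -- total genes in the background

    All names are illustrative; real tools differ in how they choose
    the background and in how they correct for multiple testing.
    """
    max_hits = min(list_size, background_hits)
    # Number of ways to draw a list of this size from the background.
    total = comb(background_size, list_size)
    # Sum the probability mass for every outcome at least as extreme
    # as the observed number of annotated genes.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 40 of 300 listed genes carry the annotation,
# versus 400 of 20000 genes in the background.
p_value = hypergeom_enrichment_p(40, 300, 400, 20000)
print(f"P-value: {p_value:.3e}")
```

A small P-value suggests the annotation term is over-represented in the list relative to the background. In practice many terms are tested at once, so a multiple-testing correction would normally follow.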

A Review on Bioinformatics Enrichment Analysis Tools Towards Functional Analysis of High Throughput Gene Set Data
TLDR
This paper reviews 35 bioinformatics enrichment tools and 5 gene set databases that are currently available in the field, including descriptions of these tools and databases, to help tool developers and users gain a broad view and a better understanding of bioinformatics enrichment tools.
BIOINFORMATICS TOOLS FOR GENE LIST ANALYSIS
TLDR
This review is meant to help tool developers better understand the needs of end users; it looks at the currently available gene list analysis tools, their strengths and weaknesses, and offers suggestions for their improvement.
PhenoFam-gene set enrichment analysis through protein structural information
TLDR
PhenoFam provides a user-friendly, easily accessible web interface to perform GSEA based on high-throughput data sets and structural-functional protein information, and therefore aids in functional annotation of genes.
GeneSCF: a real-time based functional enrichment tool with support for multiple organisms
TLDR
A command-line tool is designed that can predict the functionally relevant biological information for a set of genes in a real-time updated manner, handling information from more than 4000 organisms drawn from prominent, freely available functional databases such as KEGG, Reactome and Gene Ontology.
Bioinformatics Tools for the Analysis of Gene-Phenotype Relationships Coupled with a Next Generation ChIP-Sequencing Data Analysis Pipeline
TLDR
A gene prioritization algorithm linking genes to non-disease phenotypes described by meaningful keywords was developed and can be used to process candidate genetic targets of a transcription factor produced by a computational pipeline for ChIP-Seq data analysis.
Extracting Biological Meaning from Large Gene Lists with DAVID
TLDR
This unit will describe step‐by‐step procedures to use DAVID tools, as well as a brief rationale and key parameters in the DAVID analysis.
An integrated analysis tool reveals intrinsic biases in gene set enrichment
TLDR
YAAT extends standard enrichment analyses, using a combination of co-expression data and profiles of phylogenetic conservation, to identify groups of functionally-related genes and additionally allows class clustering, providing inference of functional links between groups of genes.
WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013
TLDR
By integrating functional categories derived from centrally and publicly curated databases as well as computational analyses, WebGestalt has significantly increased the coverage of functional categories in various biological contexts, leading to a total of 78 612 functional categories.
GENEASE: real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization
TLDR
GENEASE is a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants.

References

SHOWING 1-10 OF 102 REFERENCES
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources
TLDR
By following this protocol, investigators are able to gain an in-depth understanding of the biological themes in lists of genes that are enriched in genome-scale studies.
GFINDer: Genome Function INtegrated Discoverer through dynamic annotation, statistical analysis, and mining
TLDR
Genome Function INtegrated Discoverer (GFINDer) is a web server that automatically provides large-scale lists of user-classified genes with functional profiles biologically characterizing the different gene classes in the list, aiding better interpretation of microarray experiment results.
Ontological analysis of gene expression data: current tools, limitations, and open problems
TLDR
A detailed comparison of the capabilities of 14 ontological analysis tools is presented using the following criteria: scope of the analysis, visualization capabilities, statistical model used, correction for multiple comparisons, reference microarrays available, installation issues and sources of annotation data.
DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists
TLDR
The expanded DAVID Knowledgebase now integrates almost all major and well-known public bioinformatics resources centralized by the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of diverse gene/protein identifiers and annotation terms from a variety of public bioinformatics databases.
Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL
TLDR
The exact null distribution for testing enrichment of GO classes is derived by relaxing the independence assumption using well-known statistical theory, and it is argued that the independence assumption is not detrimental.
g:Profiler—a web-based toolset for functional profiling of gene lists from large-scale experiments
TLDR
g:Profiler has a simple, user-friendly web interface with powerful visualisation for capturing Gene Ontology, pathway, or transcription factor binding site enrichments down to individual gene levels.
GoSurfer: a graphical interactive tool for comparative analysis of large gene sets in Gene Ontology space.
TLDR
GoSurfer is an easy-to-use graphical exploration tool with built-in statistical features that allow a rapid assessment of the biological functions represented in large gene sets, and can be downloaded for noncommercial use.
GOEAST: a web-based software toolkit for Gene Ontology enrichment analysis
TLDR
An easy-to-use web-based toolkit that identifies statistically overrepresented GO terms within given gene sets and allows cross comparison of the GO enrichment status of multiple experiments to identify functional correlations among them.
GO-2D: identifying 2-dimensional cellular-localized functional modules in Gene Ontology
TLDR
Applications of GO-2D to the analyses of two human cancer datasets show that very specific disease-relevant processes can be identified by using cellular location information, demonstrating that a 2-dimensional approach, complementary to the current 1-dimensional approach, is powerful for finding modules highly relevant to diseases.
GeneTrail—advanced gene set enrichment analysis
TLDR
GeneTrail's statistics module includes a novel dynamic-programming algorithm that improves the P-value computation of GSEA methods considerably and is freely accessible at http://genetrail.uni-sb.de.