The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis

@article{Debrabant2017TheNH,
  title={The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis},
  author={Birgit Debrabant},
  journal={Bioinformatics},
  year={2017},
  volume={33},
  pages={1271--1277}
}
Motivation: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis. This is a major handicap to the interpretation of results obtained from a gene set analysis. Results: This work presents a hierarchical statistical model based on the notion of dependence measures…

Simultaneous Enrichment Analysis of all Possible Gene-sets: Unifying Self-Contained and Competitive Methods
TLDR
This work introduces simultaneous enrichment analysis (SEA), a new approach for analysis of feature sets in genomics and other omics based on a new unified null hypothesis, which includes the self-contained and competitive null hypotheses as special cases.
Computational analyses of mechanism of action (MoA): data, methods and integration
TLDR
This review discusses compound-specific data such as -omics, cell morphology and bioactivity data, as well as commonly used supplementary prior knowledge such as network and pathway data, and provides information on databases where this data can be accessed.
Development and Validation of a lncRNAs-Based Nomogram for Survival Prediction in Neuroblastoma Patients
TLDR
The lncRNA-based LASSO risk score is a promising prognostic tool for predicting the survival of patients with neuroblastoma, and the nomogram combining the lncRNAs and clinical parameters allows for accurate risk assessment in guiding clinical management.
Clinical Significance of TRMT6 in Hepatocellular Carcinoma: A Bioinformatics-Based Study
TLDR
TRMT6 was upregulated in HCC tissues, and higher TRMT6 expression levels were correlated with reduced OS and RFS in patients with primary HCC, suggesting it might be a promising prognostic biomarker for poor clinical outcomes in primary HCC patients.
Expression of TUSC3 and its prognostic significance in colorectal cancer.
  • Y. Zhu, M. Dong
  • Biology, Medicine
    Pathology, research and practice
  • 2018
Expression of miR‐486‐5p and its significance in lung squamous cell carcinoma
Lung squamous cell carcinoma (LUSC) is one of the main histological types of lung cancer with high mortality. The role of microRNA‐486‐5p in LUSC remains unclear. In the current study, the aim was to
Bioinformatics Analysis Identifies Potential Ferroptosis Key Genes in the Pathogenesis of Intracerebral Hemorrhage
TLDR
The results of this study indicated that the MAPK1-related mRNA–miRNA–lncRNA interaction chain could be potentially employed as a biomarker of the inception and progression of ferroptosis after cerebral hemorrhage.
The tumor suppressor NOR1 suppresses cell growth, invasiveness, and tumorigenicity in glioma.
TLDR
It is demonstrated that the NOR1 protein level was decreased in glioma tissue samples as compared to normal tissue, and suggested for the first time that NOR1 suppresses glioma progression via modulating FOXR2 expression.

References

Gene set analysis methods: statistical models and methodological differences
TLDR
This article discusses four models of statistical experiment explicitly or implicitly assumed by most if not all currently available methods of gene set analysis, and recommends a group of methods that provide biologically interpretable results in a statistically sound way.
A general modular framework for gene set enrichment analysis
TLDR
This framework provides a meta-theory of gene set analysis that not only helps to gain a better understanding of the relative merits of each embedded approach but also facilitates a principled comparison and offers insights into the relative interplay of the methods.
Gene set analysis for GWAS: assessing the use of modified Kolmogorov-Smirnov statistics
TLDR
It is shown that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the corresponding test can be considered to infer the classical self-contained null hypothesis.
Analyzing gene expression data in terms of gene sets: methodological issues
TLDR
It is argued that methods that competitively test each gene set against the rest of the genes create an unnecessary rift between single gene testing and gene set testing.
Gene set analysis of SNP data: benefits, challenges, and future directions
TLDR
An overview of gene set analysis is provided, highlighting the key challenges, potential solutions, and directions for ongoing research.
i-GSEA4GWAS: a web server for identification of pathways/gene sets associated with traits by applying an improved gene set enrichment analysis to genome-wide association study
TLDR
To provide researchers an open platform to analyze GWAS data, the i-GSEA4GWAS (improved GSEA for GWAS) web server is developed and aims to provide new insights in complex disease studies.
GSEA-SNP: applying gene set enrichment analysis to SNP data from genome-wide association studies
TLDR
The resulting GSEA-SNP method rests on the assumption that SNPs underlying a disease phenotype are enriched in genes constituting a signaling pathway or those with a common regulation, and may facilitate the identification of disease-associated SNPs and pathways, as well as the understanding of the underlying biological mechanisms.
SNP-based pathway enrichment analysis for genome-wide association studies
TLDR
The SNP-based pathway enrichment method described here offers a new alternative approach for analysing GWAS data, and is able to identify statistically significant pathways, and importantly, pathways that can be replicated in large genetically distinct samples.
Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles
TLDR
It is demonstrated how the GSEA method yields insights into several cancer-related data sets, including leukemia and lung cancer, where single-gene analysis finds little similarity between two independent studies of patient survival in lung cancer.