Network-Guided Biomarker Discovery

Chloé-Agathe Azencott
Identifying measurable genetic indicators (or biomarkers) of a specific condition of a biological system is a key element of precision medicine. Indeed, it makes it possible to tailor diagnostic, prognostic, and treatment choices to the individual characteristics of a patient. In machine learning terms, biomarker discovery can be framed as a feature selection problem on whole-genome data sets. However, classical feature selection methods are usually underpowered to process these data sets, which contain orders…
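To make the feature selection framing concrete, here is a minimal sketch of a classical univariate filter: each feature is scored by its absolute Pearson correlation with the phenotype, and the top k are kept. This is not the network-guided approach of the paper, only an illustration of the baseline framing. The toy data (features coded 0/1/2, two planted causal features, Gaussian noise) and all function names are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature carries no signal
    return cov / math.sqrt(vx * vy)


def select_top_k(X, y, k):
    """Return the indices of the k features with the largest |correlation| with y."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scores.append(abs(pearson(column, y)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def make_toy_data(n_samples=200, n_features=50, causal=(3, 17), seed=0):
    """Hypothetical toy data: features take values 0, 1 or 2 (an assumed coding),
    and the phenotype is the sum of the causal features plus Gaussian noise."""
    rng = random.Random(seed)
    X = [[rng.choice((0, 1, 2)) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [sum(row[j] for j in causal) + rng.gauss(0, 0.5) for row in X]
    return X, y


X, y = make_toy_data()
selected = select_top_k(X, y, k=2)
print("selected feature indices:", sorted(selected))
```

With many more features than samples, as in whole-genome data, such univariate scores become noisy, which is the lack of power the abstract refers to.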
Combining network-guided GWAS to discover susceptibility mechanisms for breast cancer
Six network methods that identify subnetworks with high association scores to a phenotype are studied, showing the pertinence of network-based analyses for tackling known issues with GWAS, namely the lack of statistical power and of interpretable solutions.
Boosting GWAS using biological networks: A study on susceptibility to familial breast cancer
Six network methods were selected and applied to GENESIS, a nationwide French study on familial breast cancer. The network methods recovered more interpretable results than a standard GWAS, and the heterogeneity of their solutions was addressed by computing a consensus.
Identification of a gene signature for discriminating metastatic from primary melanoma using a molecular interaction network approach
A new network-based computational pipeline, combined with a machine learning method, is configured to mine publicly available transcriptomic data from melanoma patient samples; it identifies the most influential, differentially expressed nodes in metastatic as compared to primary melanoma.
Rank-based Molecular Prognosis and Network-guided Biomarker Discovery for Breast Cancer. (Pronostic moléculaire basé sur l'ordre des gènes et découverte de biomarqueurs guidé par des réseaux pour le cancer du sein)
This thesis follows two lines of work that address two major challenges in genomic data analysis for breast cancer prognosis from a machine learning standpoint: rank-based approaches for improved molecular prognosis and network-guided approaches for enhanced biomarker discovery.
Biological networks and GWAS: comparing and combining network methods to understand the genetics of familial breast cancer susceptibility in the GENESIS study
Six network methods using gene scores from GENESIS, a genome-wide association study on French women with non-BRCA familial breast cancer, are studied, showing how network methods help overcome the lack of statistical power of GWAS and improve their interpretation.
Network Regularization in Imaging Genetics Improves Prediction Performances and Model Interpretability on Alzheimer’s Disease
  • N. Guigui, C. Philippe, V. Frouin
  • Computer Science, Biology
    2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)
  • 2019
This work combines structural MRI with genetic data structured by prior knowledge of interactions in a Canonical Correlation Analysis (CCA) model with graph regularization, which results in improved prediction performance and a more interpretable model.


The Influence of Feature Selection Methods on Accuracy, Stability and Interpretability of Molecular Signatures
It is observed that the feature selection method has a significant influence on the accuracy, stability and interpretability of signatures, and a simple Student's t-test seems to provide the best results.
Stable Feature Selection for Biomarker Discovery
Network-guided regression for detecting associations between DNA methylation and gene expression
Network-sparse Reduced-Rank Regression (NsRRR), a multivariate regression framework capable of using prior biological knowledge expressed as gene interaction networks to guide the search for associations between gene expression and DNA methylation signatures, is proposed.
Insights into colon cancer etiology via a regularized approach to gene set analysis of GWAS data.
Pathway and network-based analysis of genome-wide association studies in multiple sclerosis
A pathway-oriented analysis of two GWAS in MS takes into account all SNPs with nominal evidence of association (P < 0.05) and reports for the first time the potential involvement of neural pathways in MS susceptibility.
Statistical Estimation of Correlated Genome Associations to a Quantitative Trait Network
This study proposes a new statistical framework called graph-guided fused lasso (GFlasso) to directly and effectively incorporate the correlation structure of multiple quantitative traits such as clinical metrics and gene expressions in association analysis.
A Network-Based Approach to Prioritize Results from Genome-Wide Association Studies
Network Interface Miner for Multigenic Interactions (NIMMI), a network-based method, efficiently combines GWAS data with human protein-protein interaction data, translating GWAS findings into biological hypotheses.
Fast Identification of Biological Pathways Associated with a Quantitative Trait Using Group Lasso with Overlaps
In a comparison study with an alternative pathways method based on univariate SNP statistics, this method demonstrates high sensitivity and specificity for the detection of important pathways, showing the greatest relative gains in performance where marginal SNP effect sizes are small.
Evaluation of Feature Ranking Ensembles for High-Dimensional Biomedical Data: A Case Study
In a case study of 429 samples of exhaled air from smokers, 83% of whom suffer from COPD, the t-statistic was rated the best among the 16 feature rankers, outperforming the currently favoured SVM ranker.
Identification of genes associated with multiple cancers via integrative analysis
Mc.TGD (Multi-cancer Threshold Gradient Descent), an integrative analysis approach capable of analyzing multiple microarray studies on different cancers, is proposed; it is the first regularized approach to conduct "two-dimensional" selection of genes with joint effects on cancer development.