Bayesian identification of protein differential expression in multi-group isobaric labelled mass spectrometry data

  • Howsun Jow, Richard J. Boys, Darren J. Wilkinson
  • Published 24 July 2014
  • Computer Science, Biology
  • Statistical Applications in Genetics and Molecular Biology, pages 531-551
Abstract In this paper we develop a Bayesian statistical inference approach to the unified analysis of isobaric labelled MS/MS proteomic data across multiple experiments. An explicit probabilistic model of the log-intensity of the isobaric labels’ reporter ions across multiple pre-defined groups and experiments is developed. This is then used to develop a full Bayesian statistical methodology for the identification of differentially expressed proteins, with respect to a control group, across… 
A comparative study of analysis methods in quantitative label free proteomics
Correlation between different mass spectrometry instruments was assessed and found to yield high r values, especially at the protein level. The correlation also improved after abundance thresholds were applied, whereas the effect of applying score thresholds was unpredictable.
Bayesian Methods for Proteomic Biomarker Discovery
An introduction to Bayesian inference is provided and some of the advantages of using a Bayesian framework are demonstrated, including how Bayesian methods have been used previously in proteomics and other areas of bioinformatics.
Challenges and Opportunities for Bayesian Statistics in Proteomics.
This review gives a walk-through of the development of a Bayesian model for dynamic organic orthogonal phase-separation data, demonstrating their potential power, alongside the challenges posed by adopting this new statistical framework.
Multi-Omic Analysis Reveals Disruption of Cholesterol Homeostasis by Cannabidiol in Human Cell Lines
CBD treatment induced apoptosis in a dose-dependent manner in multiple human cell lines, which was rescued by inhibition of cholesterol synthesis, and potentiated by compounds that disrupt cholesterol trafficking and storage.
Bayesian hierarchical modelling for inferring genetic interactions in yeast
This work introduces Bayesian hierarchical models of population growth rates and genetic interactions that better reflect QFA experimental design than current approaches, and models population dynamics and genetic interaction simultaneously, thereby avoiding passing information between models via a univariate fitness summary.
Selecting Random Effect Components in a Sparse Hierarchical Bayesian Model for Identifying Antigenic Variability
WAIC is combined with the SABRE method, and its ability to approximate Bayesian cross-validation performance in correctly selecting random effect components is analysed.
Computational Intelligence Methods for Bioinformatics and Biostatistics
This note evaluates the properties and performance of a censored median regression estimator, as presented in literature by different authors in the context of support vector regression, and compares its performance on simulated and real data in the one-sample case, with the Kaplan-Meier estimator and an inverse probability weighted estimator.
A sparse hierarchical Bayesian model for detecting relevant antigenic sites in virus evolution
A sparse hierarchical Bayesian model for detecting relevant antigenic sites in virus evolution (SABRE) is presented, which can account for the experimental variability in the data and predict antigenic variability; the method is shown to outperform alternative established methods.


A hybrid approach to protein differential expression in mass spectrometry-based proteomics
This work outlines a statistical method for protein differential expression, based on a simple Binomial likelihood, that enables the selection of proteins not typically amenable to quantitative analysis, and presents an analysis protocol that combines quantitative and presence/absence analysis of a given dataset in a principled way.
A statistical framework for protein quantitation in bottom-up MS-based proteomics
A statistical model is presented that carefully accounts for informative missingness in peak intensities and allows unbiased, model-based, protein-level estimation and inference; the model is applicable to both label-based and label-free quantitation experiments.
Statistical methods for quantitative mass spectrometry proteomic experiments with labeling
This work describes use of fundamental concepts of statistical experimental design in the labeling framework in order to minimize variability and avoid biases, and demonstrates how to export data in the format that is most efficient for statistical analysis.
Statistical analysis of relative labeled mass spectrometry data from complex samples using ANOVA.
Statistical tools enable unified analysis of data from multiple global proteomic experiments, producing unbiased estimates of normalization terms despite the missing data problem inherent in these
Precise protein quantification based on peptide quantification using iTRAQ™
Quant is shown to generate results that are consistent with those produced by ProQuant™, thus validating both systems, and a lognormal distribution is shown to fit the data of mass spectrometry based relative peptide quantification.
Multiplexed Protein Quantitation in Saccharomyces cerevisiae Using Amine-reactive Isobaric Tagging Reagents
It is found that inactivation of Upf1p and Xrn1p causes common as well as unique effects on protein expression, and the use of 4-fold multiplexing is demonstrated to enable relative protein measurements simultaneously with determination of absolute levels of a target protein using synthetic isobaric peptide standards.
A Bayesian based functional mixed-effects model for analysis of LC-MS data
A Bayesian multilevel functional mixed-effects model with group specific random-effects for analysis of liquid chromatography-mass spectrometry data allows alignment of LC-MS spectra with respect to both retention time and mass-to-charge ratio.
Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics.
Central tendency, linear regression, locally weighted regression, and quantile techniques were investigated for normalization of peptide abundance measurements obtained from high-throughput liquid
Normalization and missing value imputation for label-free LC-MS analysis
Several approaches to normalization and dealing with missing values for shotgun proteomic data are discussed, some initially developed for microarray data and some developed specifically for mass spectrometry-based data.
BGX: a fully Bayesian integrated approach to the analysis of Affymetrix GeneChip data.
The models presented represent the first building block for integrated Bayesian analysis of Affymetrix GeneChip data and take into account additive as well as multiplicative error; gene expression levels are estimated using perfect match and a fraction of mismatch probes and are modeled on the log scale.