Probabilistic partial least squares model: Identifiability, estimation and application

  title={Probabilistic partial least squares model: Identifiability, estimation and application},
  author={Said el Bouhaddani and Hae-Won Uh and Caroline Hayward and Geurt Jongbloed and Jeanine J. Houwing-Duistermaat},
  journal={J. Multivar. Anal.},

On some limitations of probabilistic models for dimension‐reduction: Illustration in the case of probabilistic formulations of partial least squares

Partial least squares (PLS) refers to a class of dimension‐reduction techniques aiming at the identification of two sets of components with maximal covariance, to model the relationship between two

On some limitations of probabilistic models for dimension-reduction: illustration in the case of one particular probabilistic formulation of PLS.

Partial Least Squares (PLS) refers to a class of dimension-reduction techniques aiming at the identification of two sets of components with maximal covariance, in order to model the relationship

Statistical integration of heterogeneous omics data: Probabilistic two‐way partial least squares (PO2PLS)

A global test for the relationship between two datasets is proposed, specifically addressing the high dimensionality, and its asymptotic distribution is derived.

Statistical Integration of Heterogeneous Data with PO2PLS

A general framework, probabilistic two-way partial least squares (PO2PLS), is proposed. It models the relationship between two datasets using joint and data-specific latent variables and outperforms alternatives in feature selection and prediction.

Statistical integration of two omics datasets using GO2PLS

GO2PLS integrates two omics datasets to help understand the underlying system that involves both omics levels. It incorporates external group information and performs group selection, yielding a small subset of features that best explains the relationship between the two omics datasets and improves interpretability.

Statistical Integration of Multiple Omics Datasets Using GO2PLS

The simulation study showed that introducing sparsity improved feature selection performance, and that incorporating group structures increased the precision and power of the feature selection procedure.

Joint Modeling of An Outcome Variable and Integrated Omic Datasets Using GLM-PO2PLS

This article extends dimension reduction methods which model the joint part of omics to a novel method that jointly models an outcome variable with omics and shows that the model provides more insight by jointly considering methylation and glycomics.

Monitoring of Industrial Processes via Non-stationary Probabilistic Slow Feature Analysis Machine Learning Algorithm

The proposed NS-PSFA algorithm outperforms the CVA and PCA methods in non-stationary applications such as a continuous stirred tank reactor (CSTR) and a three-phase flow industrial process.

Statistical method for modeling sequencing data from different technologies in longitudinal studies with application to Huntington disease

For one out of 14 genes, the initial significant result could be replicated with both technologies using data from both time points, but statistical efficiency is lost due to disagreement between the two technologies, measurement error when predicting gene expressions, and the need to include additional parameters to account for possible differences.

Mapping Particle Size and Soil Organic Matter in Tropical Soil Based on Hyperspectral Imaging and Non-Imaging Sensors

This study successfully generated, from the imaging sensor, large-scale and detailed predicted soil maps for particle size and SOM, which are important in the management of tropical soils.



A Unifying Tool for Linear Multivariate Statistical Methods: The RV‐Coefficient

Consider two data matrices on the same sample of n individuals, X(p x n), Y(q x n). From these matrices, geometrical representations of the sample are obtained as two configurations of n points, in

Sparse meta-analysis with high-dimensional data.

Sparse meta-analysis (SMA) is proposed, in which variable selection for meta-analysis is based solely on summary statistics and the effect sizes of each covariate are allowed to vary among studies. The SMA enjoys the oracle property if the estimated covariance matrix of the parameter estimators from each study is available.

A two-step PLS inspired method for linear prediction with group effect

In this article, we consider prediction of a univariate response from background data. The data may have a near-collinear structure and additionally group effects are assumed to exist. A two-step

Simultaneous Envelopes for Multivariate Linear Regression

A likelihood-based objective function is used for estimating envelopes and then algorithms for estimation of a simultaneous envelope as well as for basic Grassmann manifold optimization are proposed.

Partial least squares regression and projection on latent structure regression (PLS Regression)

Partial least squares (PLS) regression (a.k.a. projection on latent structures) is a recent technique that combines features from and generalizes principal component analysis (PCA) and multiple

Bootstrapping principal component regression models

Bootstrap methods can be used as an alternative for cross‐validation in regression procedures such as principal component regression (PCR). Several bootstrap methods for the estimation of prediction

Overview and Recent Advances in Partial Least Squares

Partial Least Squares is a wide class of methods for modeling relations between sets of observed variables by means of latent variables as well as dimension reduction techniques and modeling tools.

Probabilistic partial least squares regression for quantitative analysis of Raman spectra

A probabilistic PLSR (PPLSR) model and an Expectation Maximisation (EM) algorithm for estimating its parameters are proposed, providing a foundation for developing future Bayesian nonparametric models.