Regularization-free principal curve estimation

Samuel Gerber and Ross T. Whitaker
J. Mach. Learn. Res.
Principal curves and manifolds provide a framework to formulate manifold learning within a statistical context. Principal curves define the notion of a curve passing through the middle of a distribution. While the intuition is clear, the formal definition leads to some technical and practical difficulties. In particular, principal curves are saddle points of the mean-squared projection distance, which poses severe challenges for estimation and model selection. This paper demonstrates that the… 

Multiple Penalized Principal Curves: Analysis and Computation

This work considers an objective functional whose minimizers are a regularization of principal curves, introduces a new functional that allows for multiple curves, proves the existence of minimizers, and investigates their properties.

Manifold unwrapping using density ridges

Research on manifold learning within a density ridge estimation framework has shown great potential in recent work for both estimation and de-noising of manifolds, building on the intuitive and

Principal manifold estimation via model complexity selection

  • Kun Meng, A. Eloyan
  • Computer Science
    Journal of the Royal Statistical Society. Series B, Statistical methodology
  • 2021
A novel method for model complexity selection to avoid overfitting, eliminate the effects of outliers and improve the computation speed is proposed and applied to estimate tumour surfaces and interiors in a lung cancer study.

Regularization of Mixture Models for Robust Principal Graph Learning

A regularized version of Mixture Models is proposed to learn a principal graph from a distribution of D-dimensional data points. The underlying structure is assumed to be a graph that acts as a topological prior for the Gaussian clusters, which turns the problem into a maximum a posteriori estimation.

Principal Curves on Riemannian Manifolds

  • Søren Hauberg
  • Mathematics
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2016
It is argued that, instead of generalizing linear Euclidean models, it is more fruitful to generalize non-linear Euclidean models, and the classic Principal Curves of Hastie & Stuetzle are extended to data residing on a complete Riemannian manifold.

Robust nonlinear principal components

A predictive approach is proposed in which a spline curve is fit by minimizing a residual M-scale; it is almost as good as other proposals for row-wise contamination and is better for element-wise contamination.

On principal curves with a length constraint

  • S. Delattre, A. Fischer
  • Mathematics
    Annales de l'Institut Henri Poincaré, Probabilités et Statistiques
  • 2020
Principal curves are defined as parametric curves passing through the "middle" of a probability distribution in R^d. In addition to the original definition based on self-consistency, several points

Density ridge manifold traversal

A novel manifold traversal algorithm based on geodesics within the density ridge approach is introduced, executed in a subspace capturing the intrinsic dimensionality of the data using dimensionality reduction techniques such as principal component analysis or kernel entropy component analysis.


Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these Riemannian models are familiar-looking, they are

Machine Learning using Principal Manifolds and Mode Seeking

This thesis presents a fast and exact kernel density derivative estimator, a novel algorithm for manifold unwrapping based on tracing the gradient flow along a manifold estimated using density derivatives, and a novel framework for robust mode seeking based on ensemble clustering and resampling techniques.



Extremal properties of principal curves in the plane

Principal curves were introduced to formalize the notion of "a curve passing through the middle of a dataset." Vaguely speaking, a curve is said to pass through the middle of a dataset if every point

Parameter Selection for Principal Curves

  • G. Biau, A. Fischer
  • Mathematics, Computer Science
    IEEE Transactions on Information Theory
  • 2012
This paper considers the principal curve problem from an empirical risk minimization perspective and addresses the parameter selection issue through model selection via penalization. It offers oracle inequalities and implements the proposed approach to recover the hidden structures in both simulated and real-life data.

Dimensionality reduction and principal surfaces via Kernel Map Manifolds

A manifold learning approach to dimensionality reduction is presented that explicitly models the manifold as a mapping from low- to high-dimensional space; the extremal points converge to principal surfaces as the number of data points to learn from increases.

A Unified Model for Probabilistic Principal Surfaces

A unified covariance model is introduced that implements the probabilistic principal surface (PPS), and it is shown in two different comparisons that the PPS outperforms the GTM under identical parameter settings.

Principal Curves

Principal curves are smooth one-dimensional curves that pass through the middle of a p-dimensional data set, providing a nonlinear summary of the data. They are nonparametric, and their shape is

Principal surfaces from unsupervised kernel regression

This work proposes a nonparametric approach to learning of principal surfaces based on an unsupervised formulation of the Nadaraya-Watson kernel regression estimator, which allows for a convenient incorporation of nonlinear spectral methods for parameter initialization, beyond classical initializations based on linear PCA.

Learning and Design of Principal Curves

This work defines principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. This definition makes it possible to theoretically analyze principal curve learning from training data, and it also leads to a new practical construction.

Regularized Principal Manifolds

An algorithm for finding principal manifolds that can be regularized in a variety of ways is proposed, and bounds on the covering numbers are given that yield nearly optimal learning rates for certain types of regularization operators.

A global geometric framework for nonlinear dimensionality reduction.

An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.

Principal curves revisited

A principal curve (Hastie and Stuetzle, 1989) is a smooth curve passing through the ‘middle’ of a distribution or data cloud, and is a generalization of linear principal components. We give an