Robust Topological Inference in the Presence of Outliers

Siddharth Vishwanath, Bharath K. Sriperumbudur, Kenji Fukumizu, Satoshi Kuriki
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology, a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of…


Robust Topological Inference: Distance To a Measure and Kernel Distance

The distance-to-a-measure (DTM) and the kernel distance, introduced by Phillips et al. (2014), are smooth functions that provide useful topological information while remaining robust to noise and outliers.
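To make the robustness claim concrete, here is a minimal sketch of the standard empirical DTM: the root mean squared distance from a query point to its k nearest sample points. The function names `empirical_dtm` and `distance_to_set`, the toy circle sample, and the choice k = 3 are illustrative assumptions, not taken from any of the listed papers.

```python
import math


def distance_to_set(points, query):
    """Ordinary distance function: distance from `query` to the nearest sample point."""
    return min(math.dist(query, p) for p in points)


def empirical_dtm(points, query, k):
    """Empirical distance-to-a-measure at `query`.

    Root of the average squared distance to the k nearest sample points,
    which corresponds to a mass parameter m = k / len(points).
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    sq_dists = sorted(math.dist(query, p) ** 2 for p in points)
    return math.sqrt(sum(sq_dists[:k]) / k)


# Toy data (hypothetical): 8 points on the unit circle plus one outlier at the center.
circle = [
    (math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8))
    for i in range(8)
]
sample = circle + [(0.0, 0.0)]
center = (0.0, 0.0)

d_plain = distance_to_set(sample, center)   # collapses to 0 because of the outlier
d_dtm = empirical_dtm(sample, center, k=3)  # averages over 3 neighbors instead
print(d_plain, d_dtm)
```

In this toy sample the single outlier drives the ordinary distance function at the center to zero, filling in the hole of the circle, whereas the DTM stays well away from zero because it averages over several neighbors.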

Geometric Inference for Probability Measures

By replacing compact subsets with measures, this work introduces a notion of distance function to a probability distribution in ℝ^d and shows that offsets of sampled shapes can be reconstructed with topological guarantees even in the presence of outliers.

Robust Persistence Diagrams using Reproducing Kernels

This work develops a framework for constructing robust persistence diagrams from superlevel filtrations of robust density estimators built with reproducing kernels. Using an analogue of the influence function on the space of persistence diagrams, the resulting diagrams are shown to be less sensitive to outliers.

Efficient and robust persistent homology for measures

MONK - Outlier-Robust Mean Embedding Estimation by Median-of-Means

This paper shows how the recently emerged principle of median-of-means can be used to design estimators for kernel mean embedding and MMD with excessive resistance properties to outliers, and optimal sub-Gaussian deviation bounds under mild assumptions.
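The median-of-means principle itself is easy to illustrate in the scalar case: split the data into blocks, average each block, and report the median of the block means. The sketch below is only that scalar illustration, not the kernel mean embedding or MMD estimator from the paper; the function name `median_of_means`, the deterministic strided block assignment, and the toy data are assumptions made for the example.

```python
import statistics


def median_of_means(values, n_blocks):
    """Scalar median-of-means: median of the per-block averages.

    Blocks are formed by striding through `values`; in practice the data
    would usually be shuffled before being split into blocks.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    blocks = [values[i::n_blocks] for i in range(n_blocks)]
    block_means = [statistics.mean(block) for block in blocks]
    return statistics.median(block_means)


# Toy data (hypothetical): nine clean observations and one gross outlier.
values = [1.0] * 9 + [1000.0]

plain_mean = statistics.mean(values)           # dragged far from 1 by the outlier
mom_estimate = median_of_means(values, 5)      # outlier contaminates only one block
print(plain_mean, mom_estimate)
```

A single outlier can corrupt at most one block mean, so the median over blocks ignores it as long as fewer than half of the blocks are contaminated.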

A roadmap for the computation of persistent homology

A friendly introduction to PH is given, the pipeline for the computation of PH is navigated with an eye towards applications, and a range of synthetic and real-world data sets are used to evaluate currently available open-source implementations for the computations of PH.

Subsampling Methods for Persistent Homology

This work proposes computing the persistent homology of several subsamples of the data and combining the resulting estimates. It proves that the subsampling approach carries stable topological information while greatly reducing computational complexity.

An Introduction to Topological Data Analysis: Fundamental and Practical Aspects for Data Scientists

This paper is a brief introduction, through a few selected topics, to fundamental and practical aspects of TDA for non-experts.

DTM-based filtrations

A new family of filtrations, built on top of point clouds in Euclidean space, is introduced. These filtrations are more robust to noise and outliers and rely on the notion of distance-to-measure functions.

Confidence sets for persistence diagrams

This paper derives confidence sets that allow us to separate topological signal from topological noise, and brings some statistical ideas to persistent homology.