
Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates

  • George H. Chen
  • Published in ICML, 13 May 2019
  • Computer Science, Mathematics
We establish the first nonasymptotic error bounds for Kaplan-Meier-based nearest neighbor and kernel survival probability estimators where feature vectors reside in metric spaces. Our bounds imply rates of strong consistency for these nonparametric estimators and, up to a log factor, match an existing lower bound for conditional CDF estimation. Our proof strategy also yields nonasymptotic guarantees for nearest neighbor and kernel variants of the Nelson-Aalen cumulative hazards estimator. We… 
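The estimator family studied in the abstract can be illustrated with a minimal sketch: for a query feature vector, restrict attention to its k nearest neighbors and fit a Kaplan-Meier curve on their observed times and censoring indicators. This is a simplified illustration (Euclidean distance, unweighted neighbors), not the paper's exact estimator; the function name `knn_kaplan_meier` and its arguments are hypothetical.

```python
import numpy as np

def knn_kaplan_meier(X_train, times, events, x_query, k, t_grid):
    """Kaplan-Meier survival curve fit on the k nearest neighbors of x_query.

    times  : observed times, min(survival time, censoring time)
    events : 1 if the event was observed, 0 if right-censored
    t_grid : times at which to evaluate the estimated survival function
    """
    # Euclidean distances from the query to all training points
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn = np.argsort(dists)[:k]
    t_nn, d_nn = times[nn], events[nn]

    # Product-limit estimator over the neighbors' distinct event times
    surv = np.ones_like(t_grid, dtype=float)
    for j, t in enumerate(t_grid):
        s = 1.0
        for u in np.unique(t_nn[d_nn == 1]):
            if u <= t:
                at_risk = np.sum(t_nn >= u)
                deaths = np.sum((t_nn == u) & (d_nn == 1))
                if at_risk > 0:
                    s *= 1.0 - deaths / at_risk
        surv[j] = s
    return surv
```

A kernel variant would replace the hard k-neighbor cutoff with kernel weights in the at-risk and death counts.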

Figures and Tables from this paper

Deep Kernel Survival Analysis and Subject-Specific Survival Time Prediction Intervals

The experiments show that the neural kernel survival estimators are competitive with a variety of existing survival analysis methods, and that their prediction intervals can help compare different methods' uncertainties, even for estimators that do not use kernels.

Survival Kernets: Scalable and Interpretable Deep Kernel Survival Analysis with an Accuracy Guarantee

A new deep kernel survival model called a survival kernet is proposed; it scales to large datasets in a manner amenable to both model interpretation and theoretical analysis, and a finite-sample error bound on predicted survival distributions is established that is optimal up to a log factor.

Deep Survival Machines: Fully Parametric Survival Regression and Representation Learning for Censored Data With Competing Risks

We describe a new approach to estimating relative risks in time-to-event prediction problems with censored data in a fully parametric manner. Our approach does not require making strong assumptions

Metaparametric Neural Networks for Survival Analysis

The metaparametric neural network framework is presented; it encompasses existing survival analysis methods and enables their extension to address the aforementioned issues. It outperforms current state-of-the-art methods in capturing nonlinearities and identifying temporal patterns, yielding more accurate overall estimates while placing no restrictions on the underlying function structure.

Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

This paper makes an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models through the concordance index on two different datasets (PBC and GBCSG2).
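The concordance index used for that comparison can be sketched in a few lines. The version below is Harrell's C-index for right-censored data, a standard formulation rather than code from the paper: a pair is comparable when the earlier observed time is an event, concordant when the subject with the earlier event time has the higher predicted risk, and ties in risk count as one half.

```python
import numpy as np

def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times       : observed times
    events      : 1 if the event was observed, 0 if right-censored
    risk_scores : higher score = predicted higher risk (earlier event)
    """
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable pair: i's time is earlier and i's event was observed
            if times[i] < times[j] and events[i] == 1:
                den += 1
                if risk_scores[i] > risk_scores[j]:
                    num += 1.0          # concordant
                elif risk_scores[i] == risk_scores[j]:
                    num += 0.5          # tied risk
    return num / den
```

A perfect risk ranking gives 1.0, a reversed ranking 0.0, and random scores about 0.5.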

(Decision and regression) tree ensemble based kernels for regression and classification

The results support the tree ensemble based kernels as a valuable addition to the practitioner's toolbox and outline future lines of research for kernels furnished by Bayesian counterparts of the frequentist tree ensembles.



A K‐nearest neighbors survival probability prediction method

We introduce a nonparametric survival prediction method for right‐censored data. The method generates a survival curve prediction by constructing a (weighted) Kaplan–Meier estimator using the

Uniform strong convergence results for the conditional Kaplan-Meier estimator and its quantiles

We consider a fixed design model in which the responses are possibly right censored. The aim of this paper is to establish some important almost sure convergence properties of the Kaplan-Meier type

Rates of Convergence for Nearest Neighbor Classification

This work analyzes the behavior of nearest neighbor classification in metric spaces and provides finite-sample, distribution-dependent rates of convergence under minimal assumptions; under the Tsybakov margin condition, the convergence rate of nearest neighbor classification matches recently established lower bounds for nonparametric classification.


We consider the nonparametric kernel estimation of the conditional cumulative distribution function given a functional covariate. Given the bias-variance trade-off of the risk, we first propose a

Bandwidth selection in kernel density estimation: Oracle inequalities and adaptive minimax optimality

The proposed selection rule yields an estimator that is minimax adaptive over a scale of anisotropic Nikol'skii classes; the main technical tools used in the derivations are uniform bounds on the L_s-norms of empirical processes developed recently by Goldenshluger and Lepski.


It is shown that the simplest kind of trees is complete in D-dimensional space if the number of terminal nodes T is greater than D, and that the AdaBoost minimization algorithm gives an ensemble converging to the Bayes risk.

k*-Nearest Neighbors: From Global to Local

This paper offers a simple approach to locally weighted regression/classification, where the bias-variance tradeoff is made explicit and the applicability is demonstrated on several datasets, showing superior performance over standard locally weighted methods.
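Locally weighted k-NN prediction of the kind this paper builds on can be sketched with standard inverse-distance weighting. This is the generic scheme, not the paper's k* weight optimization, and the function name `weighted_knn_regress` is hypothetical; it only illustrates how nearer neighbors receive larger weight in the local estimate.

```python
import numpy as np

def weighted_knn_regress(X_train, y_train, x_query, k, eps=1e-12):
    """Distance-weighted k-NN regression: nearer neighbors get larger weight.

    Uses inverse-distance weights over the k nearest neighbors; eps guards
    against division by zero when the query coincides with a training point.
    """
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nn = np.argsort(dists)[:k]
    w = 1.0 / (dists[nn] + eps)
    return np.sum(w * y_train[nn]) / np.sum(w)
```

With k equal to the training set size this approaches a global weighted average; with small k it is a local estimate, which is the bias-variance tradeoff the paper makes explicit.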


Let P(X > t) = F(t), P(Y > t) = G(t), and P(Z > t) = H(t), where H(t) = F(t)G(t). An important problem of survival analysis is the estimation of the distribution function F. Recently the properties of two
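The identity H(t) = F(t)G(t) follows from independence of the lifetime X and the censoring time Y when Z = min(X, Y): the observed time exceeds t exactly when both X and Y do. A quick Monte Carlo check under assumed exponential distributions (the distributions and parameters here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Independent exponential lifetime X and censoring time Y
x = rng.exponential(scale=2.0, size=n)   # F(t) = P(X > t) = exp(-t/2)
y = rng.exponential(scale=3.0, size=n)   # G(t) = P(Y > t) = exp(-t/3)
z = np.minimum(x, y)                     # observed time Z = min(X, Y)

t = 1.0
H_emp = np.mean(z > t)                      # empirical P(Z > t)
H_theory = np.exp(-t / 2) * np.exp(-t / 3)  # F(t) * G(t)
```

The empirical survival fraction of Z agrees with the product F(t)G(t) up to Monte Carlo error.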

Classification in general finite dimensional spaces with the k-nearest neighbor rule

Two necessary and sufficient conditions to obtain uniform consistency rates of classification are identified, and sharp estimates are derived in the case of the k-nearest neighbor rule.

k-NN Regression Adapts to Local Intrinsic Dimension

The k-NN regression is shown to be adaptive to intrinsic dimension, and it is established that the minimax rate does not depend on a particular choice of metric space or distribution; rather, this minimax rate holds for any metric space and doubling measure.