On the ERM Principle with Networked Data

@article{Wang2017OnTE,
  title={On the ERM Principle with Networked Data},
  author={Yuanhong Wang and Yuyi Wang and Xingwu Liu and Juhua Pu},
  journal={ArXiv},
  year={2017},
  volume={abs/1711.04297}
}
Networked data, in which every training example involves two objects and may share some common objects with other examples, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d. assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results on this problem only deal with the…
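
As a concrete illustration of the setting, here is a hedged sketch, not code from the paper: training examples are modeled as edges over shared objects, and a fractional matching assigns each example a weight so that no object carries more than unit total weight, in the spirit of the weighting schemes discussed in the references below. The toy edge list and the use of scipy.optimize.linprog are my own assumptions.

```python
import numpy as np
from scipy.optimize import linprog

# Each training example is a pair of objects; examples that share an
# object are dependent (they are incident edges in the object graph).
examples = [(0, 1), (0, 2), (1, 2), (2, 3)]
n_objects = 4

# Fractional matching LP: maximize the total weight of examples subject
# to every object carrying at most unit weight, with 0 <= w_e <= 1.
# linprog minimizes, so the objective is negated.
c = -np.ones(len(examples))
A = np.zeros((n_objects, len(examples)))
for e, (u, v) in enumerate(examples):
    A[u, e] = 1.0
    A[v, e] = 1.0
res = linprog(c, A_ub=A, b_ub=np.ones(n_objects),
              bounds=[(0, 1)] * len(examples))

weights = res.x      # per-example weights forming a fractional matching
nu_star = -res.fun   # fractional matching number ("effective sample size")
print(weights, nu_star)
```

The optimal value, the fractional matching number, plays the role of an effective sample size in the concentration results cited under References.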

Citations

Generalization Bounds for Knowledge Graph Embedding (Trained by Maximum Likelihood)

The results provide an explanation for why knowledge graph embedding methods work, much as classical learning theory explains learning from i.i.d. data.

References


Learning from Networked Examples

This work shows that the classic approach of ignoring the dependencies between networked examples can harm the accuracy of statistics, and then considers alternatives, which lead to novel concentration inequalities.
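
Schematically, and hedged as my reconstruction rather than the paper's exact statement, such a concentration inequality has the flavor of Hoeffding's bound with the sample size $n$ replaced by the fractional matching number $\nu^*(G)$ of the example hypergraph $G$ (examples as hyperedges over shared objects), for the average $\hat{\mathbb{E}}_w[f]$ weighted by a maximum fractional matching and $f(Z_i) \in [a, b]$:

$$\Pr\Big(\big|\hat{\mathbb{E}}_w[f] - \mathbb{E}[f]\big| \ge \epsilon\Big) \le 2\exp\!\left(-\frac{2\,\nu^*(G)\,\epsilon^2}{(b-a)^2}\right).$$

When no examples share objects, $\nu^*(G) = n$ and the classical i.i.d. bound is recovered.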

Ranking and empirical minimization of U-statistics

This paper formulates the ranking problem in a rigorous statistical framework, establishes in particular a tail inequality for degenerate U-processes, and applies it to show that fast rates of convergence may be achieved under specific noise assumptions, just as in classification.
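
For concreteness, a brief sketch (the function name and toy data are mine) of the object these results study: the empirical ranking risk is an order-two U-statistic, averaging a pairwise 0-1 loss over all pairs of examples and counting the pairs that a scoring function orders discordantly with their labels.

```python
from itertools import combinations

def empirical_ranking_risk(scores, labels):
    """Order-two U-statistic: fraction of discordant pairs, i.e. pairs
    where the score ordering disagrees with the label ordering."""
    pairs = list(combinations(range(len(scores)), 2))
    discordant = sum(
        1 for i, j in pairs
        if (scores[i] - scores[j]) * (labels[i] - labels[j]) < 0
    )
    return discordant / len(pairs)

print(empirical_ranking_risk([0.9, 0.2, 0.4], [1, 1, 0]))  # 0.333...
```

Because every pair of examples enters the average, the summands are dependent, which is why tail inequalities for U-processes, rather than standard i.i.d. tools, are needed.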

Chromatic PAC-Bayes Bounds for Non-IID Data

This work proposes the first, to the best of the authors' knowledge, PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies, and shows how the results can be used to derive bounds for ranking statistics and for classifiers trained on data distributed according to a stationary β-mixing process.
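
The "chromatic" idea can be sketched in a few lines, hedged as an illustration rather than the paper's construction: properly color the dependency graph (nodes are examples, edges join dependent ones) so that each color class is an independent, i.i.d.-like subset to which standard bounds apply; the bounds themselves are stated in terms of the fractional chromatic number. The toy graph and the greedy networkx coloring are my stand-ins.

```python
import networkx as nx

# Dependency graph: nodes are training examples, edges join examples
# that are interdependent (e.g., because they share an object).
G = nx.Graph([(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])

# A proper coloring partitions the examples into independent sets.
coloring = nx.greedy_color(G, strategy="largest_first")
classes = {}
for node, color in coloring.items():
    classes.setdefault(color, []).append(node)

for color, nodes in sorted(classes.items()):
    print(f"independent subset {color}: {nodes}")
```

Each printed subset contains mutually non-adjacent examples, so a standard PAC-Bayes bound can be applied per subset and the results combined across color classes.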

Robust Unsupervised Feature Selection on Networked Data

A robust unsupervised feature selection framework for networked data, NetFS, is proposed, which embeds latent representation learning into feature selection; the embedding helps mitigate the negative effects of noisy links on learning latent representations, while good latent representations in turn help extract more meaningful features.

Risk bounds for statistical learning

A general theorem providing upper bounds on the risk of an empirical risk minimizer (ERM) when the classification rules belong to some VC class under margin conditions is proposed, and the optimality of these bounds in a minimax sense is discussed.
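
One standard form of such a result, paraphrased and hedged rather than quoted from the paper: under the Mammen-Tsybakov margin condition with exponent $\alpha \in [0, 1]$, ERM over a VC class of dimension $V$ satisfies, up to constants and logarithmic factors,

$$\mathbb{E}\big[R(\hat{f}_n)\big] - R^* \lesssim \left(\frac{V}{n}\right)^{\frac{1}{2-\alpha}},$$

interpolating between the slow rate $n^{-1/2}$ (no margin assumption, $\alpha = 0$) and the fast rate $n^{-1}$ ($\alpha = 1$).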

On Ranking and Generalization Bounds

W. Rejchel, Journal of Machine Learning Research, 2012

This paper considers ranking estimators that minimize the empirical convex risk and proves generalization bounds for the excess risk of such estimators with rates that are faster than 1/√n.

Classification in Networked Data: a Toolkit and a Univariate Case Study

The results demonstrate that very simple network-classification models perform quite well, well enough that they should be used regularly as baseline classifiers for studies of learning with networked data.
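
A minimal sketch of such a baseline, in the spirit of the weighted-vote relational neighbor (wvRN) classifier studied in that paper; the toy graph and labels are my assumptions.

```python
import networkx as nx

G = nx.Graph([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)])
labels = {0: 1, 1: 1, 4: 0}  # known labels; nodes 2 and 3 are unlabeled

def wvrn_score(G, labels, node):
    """Estimate P(label = 1) as the mean label of labeled neighbors;
    fall back to 0.5 when no neighbor is labeled."""
    votes = [labels[v] for v in G.neighbors(node) if v in labels]
    return sum(votes) / len(votes) if votes else 0.5

for node in (2, 3):
    print(node, wvrn_score(G, labels, node))  # 2 -> 1.0, 3 -> 0.0
```

Despite using nothing but neighbor labels, this kind of model is the "very simple" baseline the study recommends.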

Optimal aggregation of classifiers in statistical learning

The main result of the paper concerns the optimal aggregation of classifiers: a classifier is constructed that automatically adapts both to the complexity and to the margin, and attains the optimal fast rates up to a logarithmic factor.

Statistical inference on graphs

The problem of graph inference, or graph reconstruction, is to predict the presence or absence of edges between a given set of points known to form the vertices of a graph; the graph is modeled as random, with a probability distribution that possibly depends on its size.

Generalization error bounds for classifiers trained with interdependent data

A general framework is proposed to study the generalization properties of binary classifiers trained with data that may be dependent but are deterministically generated from a sample of independent examples.
...