# On the ERM Principle with Networked Data

@article{Wang2017OnTE, title={On the ERM Principle with Networked Data}, author={Yuanhong Wang and Yuyi Wang and Xingwu Liu and Juhua Pu}, journal={ArXiv}, year={2017}, volume={abs/1711.04297} }

Networked data, in which every training example involves two objects and may share some common objects with others, is used in many machine learning tasks such as learning to rank and link prediction. A challenge of learning from networked examples is that target values are not known for some pairs of objects. In this case, neither the classical i.i.d. assumption nor techniques based on complete U-statistics can be used. Most existing theoretical results on this problem only deal with the…
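To make the setting concrete, here is a minimal sketch of what "sharing common objects" means for pairwise examples. The toy data and the helper names (`shares_object`, `dependent_pairs`) are hypothetical illustrations and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy data: each training example is a pair of objects,
# i.e. an edge in a graph whose vertices are the objects.
examples = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]


def shares_object(e1, e2):
    """Return True if two examples have at least one object in common."""
    return bool(set(e1) & set(e2))


def dependent_pairs(examples):
    """List index pairs (i, j), i < j, of examples that share an object."""
    return [
        (i, j)
        for (i, e1), (j, e2) in combinations(enumerate(examples), 2)
        if shares_object(e1, e2)
    ]


# Examples 0, 1 and 2 overlap pairwise; example 3 is disjoint from the rest.
print(dependent_pairs(examples))
```

Examples that overlap in this way cannot be treated as independent draws, which is why the i.i.d. assumption mentioned in the abstract does not apply.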

## One Citation

### Generalization Bounds for Knowledge Graph Embedding (Trained by Maximum Likelihood)

- Computer Science
- 2019

The results provide an explanation for why knowledge graph embedding methods work, much as classical learning theory results do for learning from i.i.d. data.

## References

### Learning from Networked Examples

- Computer Science
- ALT
- 2017

This work shows that the classic approach of ignoring the dependence between networked examples can harm the accuracy of statistics, and then considers alternatives, which lead to novel concentration inequalities.

### Ranking and empirical minimization of U-statistics

- Computer Science
- 2006

This paper formulates the ranking problem in a rigorous statistical framework, establishes in particular a tail inequality for degenerate U-processes, and applies it to show that fast rates of convergence may be achieved under specific noise assumptions, just as in classification.

### Chromatic PAC-Bayes Bounds for Non-IID Data

- Computer Science
- AISTATS
- 2009

This work proposes what are, to the best of the authors' knowledge, the first PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies, and shows how the results can be used to derive bounds for ranking statistics and for classifiers trained on data distributed according to a stationary β-mixing process.

### Robust Unsupervised Feature Selection on Networked Data

- Computer Science
- SDM
- 2016

A robust unsupervised feature selection framework for networked data, NetFS, is proposed. It embeds latent representation learning into feature selection, which helps mitigate the negative effects of noisy links when learning latent representations, while good latent representations in turn help extract more meaningful features.

### Risk bounds for statistical learning

- Computer Science
- 2007

A general theorem is proposed that provides upper bounds for the risk of an empirical risk minimizer (ERM) when the classification rules belong to some VC-class under margin conditions, and the optimality of these bounds in a minimax sense is discussed.

### On Ranking and Generalization Bounds

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2012

This paper considers ranking estimators that minimize the empirical convex risk and proves generalization bounds for the excess risk of such estimators with rates that are faster than 1/√n.

### Classification in Networked Data: a Toolkit and a Univariate Case Study

- Computer Science
- J. Mach. Learn. Res.
- 2007

The results demonstrate that very simple network-classification models perform quite well, well enough that they should be used regularly as baseline classifiers for studies of learning with networked data.

### Optimal aggregation of classifiers in statistical learning

- Computer Science, Mathematics
- 2003

The main result of the paper concerns optimal aggregation of classifiers: a classifier that automatically adapts both to the complexity and to the margin, and attains the optimal fast rates, up to a logarithmic factor.

### Statistical inference on graphs

- Computer Science, Mathematics
- 2006

The problem of graph inference, or graph reconstruction, is to predict the presence or absence of edges between a set of given points known to form the vertices of a graph. The graph is taken to be random, with a probability distribution that possibly depends on the size of the graph.

### Generalization error bounds for classifiers trained with interdependent data

- Computer Science
- NIPS
- 2005

A general framework is proposed for studying the generalization properties of binary classifiers trained on data that may be dependent but are deterministically generated from a sample of independent examples.