# Semi-supervised Learning by Entropy Minimization

@inproceedings{Grandvalet2004SemisupervisedLB, title={Semi-supervised Learning by Entropy Minimization}, author={Yves Grandvalet and Yoshua Bengio}, booktitle={CAP}, year={2004} }

We consider the semi-supervised learning problem, where a decision rule is to be learned from labeled and unlabeled data. [...] The method challenges mixture models when the data are sampled from the distribution class spanned by the generative model. The performances are definitely in favor of minimum entropy regularization when generative models are misspecified, and the weighting of unlabeled data provides robustness to the violation of the "cluster assumption". Finally, we also illustrate that…
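As a rough illustration of minimum entropy regularization with a weight on the unlabeled data, the sketch below adds a weighted entropy penalty on unlabeled predictions to a standard cross-entropy loss on labeled data. It assumes a softmax classifier and averages each term over its examples; the function names, the `lam` weight, and the `eps` smoothing constant are hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy_regularized_loss(logits_labeled, labels, logits_unlabeled,
                             lam=0.1, eps=1e-12):
    """Cross-entropy on labeled data plus lam times the mean prediction
    entropy on unlabeled data (illustrative sketch, assumed form)."""
    p_l = softmax(logits_labeled)
    # Negative log-likelihood of the true class, averaged over labeled examples.
    nll = -np.log(p_l[np.arange(len(labels)), labels] + eps).mean()
    p_u = softmax(logits_unlabeled)
    # Shannon entropy of each unlabeled prediction, averaged over examples.
    ent = -(p_u * np.log(p_u + eps)).sum(axis=1).mean()
    return nll + lam * ent, nll, ent

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits_l = rng.normal(size=(5, 3))   # 5 labeled examples, 3 classes
    y = rng.integers(0, 3, size=5)       # their class labels
    logits_u = rng.normal(size=(20, 3))  # 20 unlabeled examples
    total, nll, ent = entropy_regularized_loss(logits_l, y, logits_u, lam=0.5)
    print(f"total={total:.4f} nll={nll:.4f} entropy={ent:.4f}")
```

Setting `lam` to zero recovers purely supervised training, while larger values push unlabeled predictions toward confident (low-entropy) outputs, which corresponds to placing the decision boundary away from dense regions of unlabeled points.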

## 1,137 Citations

Semi-Supervised Learning via Regularized Boosting Working on Multiple Semi-Supervised Assumptions

- Mathematics, Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2011

This paper proposes a novel cost functional consisting of the margin cost on labeled data and the regularization penalty on unlabeled data based on three fundamental semi-supervised assumptions and demonstrates that the algorithm yields favorable results for benchmark and real-world classification tasks in comparison to state-of-the-art semi-supervised learning algorithms, including newly developed boosting algorithms.

High Order Regularization for Semi-Supervised Learning of Structured Output Problems

- Mathematics, Computer Science
- ICML
- 2014

A new max-margin framework for semi-supervised structured output learning is proposed that allows the use of powerful discrete optimization algorithms and high order regularizers defined directly on model predictions for the unlabeled examples.

Beyond the Low-density Separation Principle: A Novel Approach to Semi-supervised Learning

- Computer Science, Mathematics
- ArXiv
- 2016

A novel approach to semi-supervised learning that does not require such restrictive assumptions is proposed, which is to combine learning from positive and negative data (standard supervised learning) and learning from positive and unlabeled data (PU learning).

SERAPH: Semi-supervised Metric Learning Paradigm with Hyper Sparsity

- Computer Science, Mathematics
- ArXiv
- 2011

A general information-theoretic approach called Seraph (SEmi-supervised metRic leArning Paradigm with Hyper-sparsity) for metric learning that does not rely upon the manifold assumption and is regularized by encouraging a low-rank projection induced from the metric.

Information-Theoretic Semi-Supervised Metric Learning via Entropy Regularization

- Medicine, Computer Science
- Neural Computation
- 2014

A general information-theoretic approach to semi-supervised metric learning called SERAPH (SEmi-supervised metRic leArning Paradigm with Hypersparsity) that does not rely on the manifold assumption; SERAPH is regularized with the trace norm to encourage low-dimensional projections associated with the distance metric.

Mutual exclusivity loss for semi-supervised deep learning

- Computer Science, Mathematics
- 2016 IEEE International Conference on Image Processing (ICIP)
- 2016

An unsupervised regularization term is proposed that explicitly forces the classifier's prediction for multiple classes to be mutually-exclusive and effectively guides the decision boundary to lie on the low density space between the manifolds corresponding to different classes of data.

Asymptotic Bayes Risk for Gaussian Mixture in a Semi-Supervised Setting

- Computer Science, Mathematics
- 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP)
- 2019

This paper computes analytically the gap between the best fully-supervised approach on labeled data and the best semi-supervised approach using both labeled and unlabeled data, in a simple high-dimensional Gaussian mixture model.

Subspace Regularization: A New Semi-supervised Learning Method

- Mathematics, Computer Science
- ECML/PKDD
- 2009

This work introduces into semi-supervised learning the classic low-dimensionality embedding assumption, stating that most geometric information of high dimensional data is embedded in a low dimensional manifold.

Semi-supervised learning via manifold regularization

- Mathematics
- 2012

This paper proposes a novel graph-based transductive learning algorithm based on manifold regularization. First, the manifold regularization was introduced to probabilistic discriminant…

A Rate Distortion Approach for Semi-Supervised Conditional Random Fields

- Mathematics, Computer Science
- NIPS
- 2009

This work proposes a novel information theoretic approach for semi-supervised learning of conditional random fields that defines a training objective to combine the conditional likelihood on labeled data and the mutual information on unlabeled data using the rate distortion theory in information theory.

## References

On Information Regularization

- Computer Science, Mathematics
- UAI
- 2003

The work extends Szummer and Jaakkola's information regularization to multiple dimensions, providing a regularizer independent of the covering of the space used in the derivation, and shows in addition how the information regularizer can be used as a measure of complexity of the classification task with unlabeled data and proves a relevant sample-complexity bound.

Semi Supervised Logistic Regression

- Computer Science
- ECAI
- 2002

A new semi-supervised algorithm that relies on a discriminative approach to semi-supervised learning rather than a generative approach, which can be interpreted as an instance of the Classification Expectation Maximization algorithm.

Learning with Local and Global Consistency

- Computer Science, Mathematics
- NIPS
- 2003

A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points.

Information Regularization with Partially Labeled Data

- Mathematics, Computer Science
- NIPS
- 2002

A regularization approach to linking the marginal and the conditional in a general way is formulated and the regularization penalty measures the information that is implied about the labels over covering regions.

Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions

- Computer Science
- ICML
- 2003

An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.

Learning with Multiple Labels

- Computer Science
- NIPS
- 2002

This paper proposes a novel discriminative approach for handling the ambiguity of class labels in the training examples and shows that the approach is able to find the correct label among the set of candidate labels and actually achieve performance close to the case when each training instance is given a single correct label.

Analyzing the effectiveness and applicability of co-training

- Computer Science
- CIKM '00
- 2000

It is demonstrated that when learning from labeled and unlabeled data, algorithms explicitly leveraging a natural independent split of the features outperform algorithms that do not and may out-perform algorithms not using a split.

Restricted Bayes Optimal Classifiers

- Computer Science
- AAAI/IAAI
- 2000

This paper investigates two particular instantiations of the notion of restricted Bayes optimal classifiers, and shows that the first uses a non-parametric density estimator — Parzen Windows with Gaussian kernels — and hyperplane decision boundaries and is asymptotically equivalent to a maximal margin hyperplane classifier, a highly successful discriminative classifier.

Semi-Supervised Support Vector Machines

- Mathematics, Computer Science
- NIPS
- 1998

A general S3VM model is proposed that minimizes both the misclassification error and the function capacity based on all the available data that can be converted to a mixed-integer program and then solved exactly using integer programming.

Special Invited Paper. Additive logistic regression: A statistical view of boosting

- Mathematics
- 2000

Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data…