A survey on semi-supervised learning
@article{vanEngelen2019ASO, title={A survey on semi-supervised learning}, author={Jesper E. van Engelen and Holger H. Hoos}, journal={Machine Learning}, year={2019}, volume={109}, pages={373--440} }
Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at…
692 Citations
Boosting the Performance of Semi-Supervised Learning with Unsupervised Clustering
- Computer Science
- ArXiv
- 2020
It is shown that ignoring the labels altogether for whole epochs intermittently during training can significantly improve performance in the small sample regime, and the method's efficacy in boosting several state-of-the-art SSL algorithms is demonstrated.
An Efficient Approach to Select Instances in Self-Training and Co-Training Semi-supervised Methods
- Computer Science
- IEEE Access
- 2021
Three methods are proposed for automating the labeling of unlabeled instances in semi-supervised learning, and all three perform better than the original self-training and co-training methods in most analysed cases.
Classification of acoustical signals by combining active learning strategies with semi-supervised learning schemes
- Computer Science
- Neural Computing and Applications
- 2021
Under the proposed combinatory framework, which operates on training sets of small cardinality, the results show the benefits of adopting such semi-automated approaches, both in the predictive correctness achieved with reduced resource consumption and in the smoothness of learning convergence.
On tuning a mean-field model for semi-supervised classification
- Computer Science
- ArXiv
- 2022
This work focuses on the task of transduction with a mean-field approximation to the Potts model and proposes a tuning approach based on a novel parameter γ that allows NMF to outperform other approaches in datasets with fewer classes.
Dealing With Multipositive Unlabeled Learning Combining Metric Learning and Deep Clustering
- Computer Science
- IEEE Access
- 2022
Experimental evaluations on real-world benchmarks against recent MPUL competitors demonstrate that the proposed framework achieves state-of-the-art performance, supporting the validity of the proposed approach.
Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification
- Computer Science
- ArXiv
- 2022
A (hierarchical) multi-label classification method based on semi-supervised learning of predictive clustering trees that preserves interpretability and reduces the time complexity of classical tree-based models is proposed.
DEVELOPMENT AND COMPARATIVE ANALYSIS OF SEMI-SUPERVISED LEARNING ALGORITHMS ON A SMALL AMOUNT OF LABELED DATA
- Computer Science
- Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies
- 2021
It was shown that even small amounts of labeled data allow the use of semi-supervised learning, and that the proposed modifications improve accuracy and algorithm performance, as demonstrated in experiments.
A review of various semi-supervised learning models with a deep learning and memory approach
- Computer Science
- Iran J. Comput. Sci.
- 2019
Memory-based neural networks are a new class of neural network models that can be used in this area, exploiting memory to increase the effect of semi-supervised learning.
Active learning for hierarchical multi-label classification
- Computer Science
- Data Mining and Knowledge Discovery
- 2020
A public framework containing baseline and state-of-the-art algorithms suitable for hierarchical multi-label classification is provided, and a new algorithm, Hierarchical Query-By-Committee (H-QBC), is proposed and validated on datasets from different domains.
Dash: Semi-Supervised Learning with Dynamic Thresholding
- Computer Science
- ICML
- 2021
The proposed approach, Dash, is adaptive in its selection of unlabeled data and comes with a theoretical guarantee; its convergence rate is established from the viewpoint of non-convex optimization.
References
Introduction to Semi-Supervised Learning
- Computer Science
- Introduction to Semi-Supervised Learning
- 2009
This introductory book presents some popular semi-supervised learning models, including self-training, mixture models, co-training and multiview learning, graph-based methods, and semi-supervised support vector machines, and discusses their basic mathematical formulation.
SemiBoost: Boosting for Semi-Supervised Learning
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2009
A boosting framework for semi-supervised learning, termed SemiBoost, that improves the performance of several commonly used supervised learning algorithms given a large number of unlabeled examples, and is comparable to state-of-the-art semi-supervised learning algorithms.
Semi-Supervised Learning
- Computer Science
- 2006
This first comprehensive overview of semi-supervised learning presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research.
Enhancing Supervised Learning with Unlabeled Data
- Computer Science
- ICML
- 2000
A new semi-supervised learning method called co-learning is designed to use unlabeled data to enhance standard supervised learning algorithms, leveraging the fact that they have different representations of the hypotheses and are likely to detect different patterns in labeled data.
Semi-Supervised Learning via Regularized Boosting Working on Multiple Semi-Supervised Assumptions
- Computer Science
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- 2011
This paper proposes a novel cost functional consisting of the margin cost on labeled data and the regularization penalty on unlabeled data, based on three fundamental semi-supervised assumptions, and demonstrates that the algorithm yields favorable results for benchmark and real-world classification tasks in comparison to state-of-the-art semi-supervised learning algorithms, including newly developed boosting algorithms.
Semi-supervised learning with graphs
- Computer Science
- 2005
A series of novel semi-supervised learning approaches is presented, arising from a graph representation in which labeled and unlabeled instances are represented as vertices and edges encode the similarity between instances.
Graph-Based Semi-Supervised Learning
- Computer Science
- Graph-Based Semi-Supervised Learning
- 2014
This synthesis lecture focuses on graph-based SSL algorithms (e.g., label propagation methods), which have been shown to outperform the state-of-the-art in many applications in speech processing, computer vision, natural language processing, and other areas of Artificial Intelligence.
Semi-supervised classification trees
- Computer Science
- Journal of Intelligent Information Systems
- 2017
A semi-supervised classification tree induction algorithm that can exploit both the labelled and unlabeled data, while preserving all of the appealing characteristics of standard supervised decision trees: being non-parametric, efficient, having good predictive performance and producing readily interpretable models.
Semi-Supervised Random Forests
- Computer Science
- 2009 IEEE 12th International Conference on Computer Vision
- 2009
This work develops a novel multi-class margin definition for the unlabeled data, and proposes a control mechanism based on the out-of-bag error, which prevents the algorithm from degrading if the unlabeled data is not useful for the task.
Semi-Supervised Regression with Co-Training
- Computer Science
- IJCAI
- 2005
COREG is proposed as a co-training style semi-supervised regression algorithm, and experiments show that it can effectively exploit unlabeled data to improve regression estimates.