Label Propagation through Linear Neighborhoods

  • Fei Wang, Changshui Zhang
  • IEEE Transactions on Knowledge and Data Engineering
In many practical data mining applications, such as text classification, unlabeled training examples are readily available, but labeled ones are fairly expensive to obtain. Semi-supervised learning algorithms have therefore attracted considerable interest from the data mining and machine learning communities. In recent years, graph-based semi-supervised learning has become one of the most active research areas in the semi-supervised learning community. In this paper, a novel graph-based semi… 
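The truncated abstract describes propagating labels through weights obtained by linearly reconstructing each point from its nearest neighbors. A rough numpy sketch of that idea follows; the clipping step is a crude stand-in for the paper's nonnegativity constraint, and `k`, `reg`, and `alpha` are illustrative choices, not the paper's values:

```python
import numpy as np

def lnp_predict(X, Y, k=2, reg=1e-3, alpha=0.9):
    """Sketch of linear-neighborhood label propagation.

    X : (n, d) data points; Y : (n, c) one-hot labels (zero rows = unlabeled).
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]       # k nearest neighbors, self excluded
        Z = X[nbrs] - X[i]                  # neighbors centered on x_i
        G = Z @ Z.T + reg * np.eye(k)       # regularized local Gram matrix
        w = np.linalg.solve(G, np.ones(k))  # sum-to-one least-squares weights
        w = np.maximum(w, 0.0)              # crude nonnegativity projection
        W[i, nbrs] = w / w.sum()            # renormalize to sum to one
    # Closed-form propagation: F = (1 - alpha) (I - alpha W)^{-1} Y
    F = np.linalg.solve(np.eye(n) - alpha * W, (1 - alpha) * Y)
    return F.argmax(axis=1)
```

Because each row of W sums to one and alpha < 1, the matrix I − αW is strictly diagonally dominant, so the closed-form solve is well defined.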

Greedy multi-class label propagation

  • H. Cecotti
  • Computer Science
    2015 International Joint Conference on Neural Networks (IJCNN)
  • 2015
A greedy graph-based semi-supervised learning (GGSL) approach is proposed for multi-class classification problems. It assumes that nearby points share the same label, starts with a small neighborhood in which a reliable decision can be made, and iterates with larger neighborhoods when more examples are needed to determine the label of a point.

Graph-Based Semi-Supervised Learning

This synthesis lecture focuses on graph-based SSL algorithms (e.g., label propagation methods), which have been shown to outperform the state-of-the-art in many applications in speech processing, computer vision, natural language processing, and other areas of Artificial Intelligence.

Random Walk in Feature-Sample Networks for Semi-supervised Classification

  • F. Verri, Liang Zhao
  • Computer Science
    2016 5th Brazilian Conference on Intelligent Systems (BRACIS)
  • 2016
A technique is proposed to grade the positive-class pertinence level of each sample; the grades are then interpreted to classify the unlabeled samples. This formulation allows the technique to be extended to several learning problems, including online learning and dimensionality reduction.

Graph Learning on Millions of Data in Seconds: Label Propagation Acceleration on Graph Using Data Distribution

This paper proposes a new method, Data Distribution Based Graph Learning (DDGL), for semi-supervised learning on large-scale data; it achieves fast and effective label propagation and supports incremental learning.

Pick Your Neighborhood - Improving Labels and Neighborhood Structure for Label Propagation

Graph-based methods are very popular in semi-supervised learning due to their well founded theoretical background, intuitive interpretation of local neighborhood structure, and strong performance on

Generalized Label Propagation

This work reformulates the label propagation algorithm as a minimum energy control problem that embraces traditional label propagation as a special case and applies the formulation to benchmark data sets and the Yelp challenge data set, showing promising results.

Network-Based Semi-Supervised Learning

This chapter presents network-based algorithms that run in the semi-supervised learning scheme, and shows that different techniques apply different criteria in their label diffusion processes, generating, as a result, distinct outcomes.

Label Propagation Through Optimal Transport

The proposed approach, Optimal Transport Propagation (OTP), performs in an incremental process, label propagation through the edges of a complete bipartite edge-weighted graph, whose affinity matrix is constructed from the optimal transport plan between empirical measures defined on labeled and unlabeled data.

Semi-supervised learning on closed set lattices

A learning algorithm called SELF (SEmi-supervised Learning via FCA) is presented. It performs as a multiclass classifier and a label ranker for mixed-type data containing both discrete and continuous variables, a setting that only a few learning algorithms, such as decision tree-based classifiers, can handle directly.

How about utilizing ordinal information from the distribution of unlabeled data

In order to make the proposed semi-supervised ordinal regression method more applicable to problems with large-scale labeled data, a kernel-based dual coordinate descent algorithm is put forward to efficiently solve SOSVM.

Learning from Labeled and Unlabeled Data Using Random Walks

This work investigates an algorithm, based on random walks and spectral graph theory, that can substantially benefit from large amounts of unlabeled data and demonstrates clear superiority over supervised learning methods; the analysis sheds light on the key steps of the algorithm.

Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions

An approach to semi-supervised learning is proposed that is based on a Gaussian random field model, and methods to incorporate class priors and the predictions of classifiers obtained by supervised learning are discussed.
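The harmonic-function solution underlying this approach has a compact closed form: clamp the labeled points and solve a linear system in the graph Laplacian for the unlabeled ones. A minimal numpy sketch, assuming the affinity matrix W is given (the dense solve is illustrative only):

```python
import numpy as np

def harmonic_solution(W, f_l, labeled_idx):
    """Closed-form harmonic-function SSL sketch.

    W : (n, n) symmetric non-negative affinity matrix.
    f_l : (l, c) one-hot labels for the points in labeled_idx.
    Returns the unlabeled indices and their soft labels.
    """
    n = W.shape[0]
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    D = np.diag(W.sum(axis=1))
    L = D - W  # unnormalized graph Laplacian
    # Harmonic solution: f_u = L_uu^{-1} W_ul f_l
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_u = np.linalg.solve(L_uu, W_ul @ f_l)
    return unlabeled_idx, f_u
```

The solution is "harmonic" in that each unlabeled point's value equals the weighted average of its neighbors' values, with the labeled points held fixed.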

Hyperparameter and Kernel Learning for Graph Based Semi-Supervised Classification

A Bayesian framework is presented for learning hyperparameters for graph-based semi-supervised classification; it shows that the posterior mean can be written in terms of the kernel matrix, yielding a Bayesian classifier for new points.

Efficient Non-Parametric Function Induction in Semi-Supervised Learning

The proposed non-parametric algorithms, which provide an estimated continuous label for the given unlabeled examples, are extended to function induction algorithms that minimize a regularization criterion applied to an out-of-sample example; the resulting predictor turns out to have the form of a Parzen windows regressor.

Text Classification from Labeled and Unlabeled Documents using EM

This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents, and presents two extensions to the algorithm that improve classification accuracy under these conditions.

Learning from Labeled and Unlabeled Data using Graph Mincuts

An algorithm based on finding minimum cuts in graphs, that uses pairwise relationships among the examples in order to learn from both labeled and unlabeled data is considered.

Combining labeled and unlabeled data with co-training

A PAC-style analysis is provided for a problem setting motivated by the task of learning to classify web pages, in which the description of each example can be partitioned into two distinct views, allowing inexpensive unlabeled data to augment a much smaller set of labeled examples.

Tikhonov regularization and semi-supervised learning on large graphs

Using the notion of algorithmic stability, bounds on the generalization error are derived and related to the structural invariants of the graph and a framework for regularization is developed parallel to Tikhonov regularization on continuous spaces.

Partially labeled classification with Markov random walks

This work combines a limited number of labeled examples with a Markov random walk representation over the unlabeled examples and develops and compares several estimation criteria/algorithms suited to this representation.

Learning from labeled and unlabeled data with label propagation

A simple iterative algorithm is proposed that propagates labels through the dataset along high-density areas defined by the unlabeled data; its solution and its connection to several other algorithms are analyzed.
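The iterative scheme this entry describes can be sketched in a few lines of numpy. This is a simplified illustration rather than the paper's exact procedure; the row-normalized transition matrix and the clamping of labeled points are the standard ingredients:

```python
import numpy as np

def label_propagation(W, y_labeled, labeled_idx, n_classes,
                      n_iter=200, tol=1e-6):
    """Iterative label propagation sketch.

    W           : (n, n) symmetric non-negative affinity matrix
    y_labeled   : class indices for the labeled points
    labeled_idx : positions of the labeled points in the data
    """
    n = W.shape[0]
    # Row-normalize affinities into a transition matrix.
    P = W / W.sum(axis=1, keepdims=True)

    # One-hot label matrix; unlabeled rows start at zero.
    F = np.zeros((n, n_classes))
    F[labeled_idx, y_labeled] = 1.0
    clamp = F[labeled_idx].copy()

    for _ in range(n_iter):
        F_new = P @ F                 # diffuse labels one step
        F_new[labeled_idx] = clamp    # clamp the labeled points
        if np.abs(F_new - F).max() < tol:
            F = F_new
            break
        F = F_new
    return F.argmax(axis=1)
```

Because the labeled rows are reset after every diffusion step, the known labels act as fixed sources while the unlabeled points converge toward the labels of the high-density regions they sit in.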