We consider the problem of how to improve the efficiency of Multiple Kernel Learning (MKL). In the literature, MKL is often solved by an alternating approach: (1) the minimization over the kernel weights is solved by complicated techniques, such as Semi-Infinite Linear Programming, Gradient Descent, or the Level method; (2) the maximization over the SVM dual variables can …
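To make the alternating scheme concrete, here is a minimal Python sketch of one round trip between the two steps: an SVM solve with a fixed kernel combination, followed by a closed-form weight update. The update shown (normalizing the per-kernel norms ||w_m|| = d_m sqrt(alpha' K_m alpha) onto the simplex) is one common choice from the MKL literature, assumed here for illustration; the function name and defaults are hypothetical, not the paper's method.

```python
import numpy as np
from sklearn.svm import SVC

def alternating_mkl(kernels, y, C=1.0, n_iter=20):
    """Sketch of alternating MKL: SVM step + closed-form weight step.

    `kernels`: list of precomputed (n, n) base kernel matrices; `y`: binary
    labels. The weight update is one common simplex-normalization choice,
    not necessarily the update used in the paper this abstract summarizes.
    """
    M = len(kernels)
    d = np.full(M, 1.0 / M)                  # kernel weights on the simplex
    for _ in range(n_iter):
        K = sum(w * Km for w, Km in zip(d, kernels))
        svm = SVC(C=C, kernel="precomputed").fit(K, y)
        # signed dual variables alpha_i * y_i, nonzero on support vectors
        a = np.zeros(len(y))
        a[svm.support_] = svm.dual_coef_.ravel()
        # per-kernel component norm: ||w_m|| = d_m * sqrt(a' K_m a)
        norms = np.array([d[m] * np.sqrt(max(a @ kernels[m] @ a, 0.0))
                          for m in range(M)])
        if norms.sum() == 0:
            break
        d = norms / norms.sum()              # closed-form simplex update
    return d, svm
```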
The goal of active learning is to select the most informative examples for manual labeling. Most of the previous studies in active learning have focused on selecting a single unlabeled example in each iteration. This could be inefficient since the classification model has to be retrained for every labeled example. In this paper, we present a …
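For contrast with single-example selection, here is a minimal batch-mode sketch: plain top-k uncertainty sampling, an illustrative baseline rather than the batch criterion proposed in the paper. Function and parameter names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def batch_uncertainty_query(X_lab, y_lab, X_pool, batch_size=10):
    """Select a whole batch per retraining round, avoiding one retrain
    per label. Uncertainty is measured by the margin between the two
    most likely classes (small margin = uncertain)."""
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    proba = np.sort(clf.predict_proba(X_pool), axis=1)
    margin = proba[:, -1] - proba[:, -2]
    return np.argsort(margin)[:batch_size]   # indices of most uncertain points
```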
We consider the problem of multiple kernel learning (MKL), which can be formulated as a convex-concave problem. In the past, two efficient methods, i.e., Semi-Infinite Linear Programming (SILP) and Subgradient Descent (SD), have been proposed for large-scale multiple kernel learning. Despite their success, both methods have their own shortcomings: (a) the …
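For reference, the convex-concave (saddle-point) form of MKL that both SILP and SD address is usually written as a min-max over simplex kernel weights d and box-constrained SVM dual variables alpha; the notation below is the standard textbook form and may differ from the paper's:

```latex
\min_{d \in \Delta} \; \max_{\alpha \in \mathcal{A}} \;
\sum_{i=1}^{n} \alpha_i
- \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j
  \sum_{m=1}^{M} d_m K_m(x_i, x_j),
\quad
\Delta = \Big\{ d \ge 0 : \textstyle\sum_{m=1}^{M} d_m = 1 \Big\},\;
\mathcal{A} = \{ 0 \le \alpha \le C,\ \alpha^\top y = 0 \}.
```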
The digital data explosion mandates the development of scalable tools to organize the data in a meaningful and easily accessible form. Clustering is a commonly used tool for data organization. However, many clustering algorithms designed to handle large data sets assume linear separability of the data and hence do not perform well on real-world data sets. While …
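A minimal kernel k-means sketch shows how kernels remove the linear-separability assumption: every distance in feature space reduces to kernel evaluations. This is the generic quadratic-cost baseline, not the scalable variant the abstract alludes to; names are hypothetical.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=50, seed=0):
    """Kernel k-means on a precomputed (n, n) kernel matrix K.

    Uses the expansion ||phi(x_i) - c_j||^2
      = K_ii - (2/|C_j|) sum_l K_il + (1/|C_j|^2) sum_lm K_lm,
    so no explicit feature map is needed."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=n)
    for _ in range(n_iter):
        dist = np.zeros((n, k))
        for j in range(k):
            idx = np.flatnonzero(labels == j)
            if idx.size == 0:
                dist[:, j] = np.inf            # empty cluster: never chosen
                continue
            dist[:, j] = (np.diag(K)
                          - 2.0 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):  # converged
            break
        labels = new_labels
    return labels
```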
Semi-supervised learning has attracted a significant amount of attention in pattern recognition and machine learning. Most previous studies have focused on designing special algorithms to effectively exploit the unlabeled data in conjunction with labeled data. Our goal is to improve the classification accuracy of any given supervised learning algorithm by …
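One generic illustration of wrapping an arbitrary supervised learner with unlabeled data is self-training; the abstract is truncated before the paper's actual mechanism is named, so treat this purely as a sketch of the problem setting, not the proposed method.

```python
import numpy as np
from sklearn.base import clone

def self_training(base_clf, X_lab, y_lab, X_unlab, rounds=5, conf_thresh=0.95):
    """Improve any probabilistic classifier with unlabeled data by
    iteratively pseudo-labeling its most confident predictions.
    Inputs are numpy arrays; `base_clf` must support predict_proba."""
    X_l, y_l, X_u = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    clf = clone(base_clf).fit(X_l, y_l)
    for _ in range(rounds):
        if len(X_u) == 0:
            break
        proba = clf.predict_proba(X_u)
        keep = proba.max(axis=1) >= conf_thresh   # only confident points
        if not keep.any():
            break
        X_l = np.vstack([X_l, X_u[keep]])
        y_l = np.concatenate([y_l, clf.classes_[proba[keep].argmax(axis=1)]])
        X_u = X_u[~keep]
        clf = clone(base_clf).fit(X_l, y_l)       # retrain on enlarged set
    return clf
```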
Active learning reduces the labeling cost by iteratively selecting the most valuable data to query their labels. It has attracted a lot of interest given the abundance of unlabeled data and the high cost of labeling. Most active learning approaches select either informative or representative unlabeled instances to query their labels, which could …
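A toy illustration of combining the two criteria: score each pool point by a convex combination of an informativeness term (classification margin) and a representativeness term (average similarity to the rest of the pool). Both terms and the combination weight `beta` are illustrative assumptions, not the paper's criterion.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

def query_informative_representative(X_lab, y_lab, X_pool, beta=0.5):
    """Pick one pool point balancing uncertainty and density."""
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    proba = np.sort(clf.predict_proba(X_pool), axis=1)
    informative = 1.0 - (proba[:, -1] - proba[:, -2])  # small margin -> high score
    sim = rbf_kernel(X_pool)                           # pairwise pool similarity
    representative = sim.mean(axis=1)                  # dense regions score high
    score = beta * informative + (1 - beta) * representative
    return int(score.argmax())                         # index of point to query
```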
Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performance. Unlike approaches based on random Fourier features, where the basis functions (i.e., …
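The structural difference the abstract points to can be seen directly in code: random Fourier features draw their basis functions independently of the data, while the Nyström method builds them from kernel columns at sampled landmark points. A minimal sketch for the RBF kernel k(x, z) = exp(-gamma ||x - z||^2); helper names are hypothetical.

```python
import numpy as np

def rff_features(X, n_features, gamma, seed=0):
    """Random Fourier features: frequencies drawn from a Gaussian,
    independent of the sample (Rahimi & Recht, 2007)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def nystrom_features(X, n_landmarks, gamma, seed=0):
    """Nystrom features: data-DEPENDENT basis built from kernel columns
    at sampled landmarks, so that Phi Phi' = K_nl K_ll^{-1} K_ln."""
    rng = np.random.default_rng(seed)
    L = X[rng.choice(X.shape[0], n_landmarks, replace=False)]
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    U, s, _ = np.linalg.svd(rbf(L, L))
    s = np.maximum(s, 1e-12)                 # guard tiny eigenvalues
    return rbf(X, L) @ U / np.sqrt(s)        # columns scaled by 1/sqrt(eig)
```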