Gustavo Malkomes

Clustering large data is a fundamental problem with a vast number of applications. Due to the increasing size of data, practitioners interested in clustering have turned to distributed computation methods. In this work, we consider the widely used k-center clustering problem and its variant used to handle noisy data, k-center with outliers. In the noise-free…
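For readers unfamiliar with the k-center objective (choose k centers so that the largest distance from any point to its nearest center is as small as possible), the following is a minimal sketch of the classical sequential greedy heuristic, farthest-first traversal. It is not the distributed method of the paper above and it does not handle outliers; the function names `greedy_k_center` and `clustering_radius` and the tuple-based point format are assumptions made for illustration.

```python
import math


def greedy_k_center(points, k):
    """Pick up to k centers by farthest-first traversal (illustrative sketch).

    points: list of equal-length coordinate tuples (assumed input format).
    """
    if not points or k <= 0:
        return []
    centers = [points[0]]  # the first center is an arbitrary choice
    # dist[i] holds the distance from points[i] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # the next center is the point farthest from all current centers
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers


def clustering_radius(points, centers):
    """k-center objective: largest distance from a point to its nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (10, 0), (11, 0)]
    chosen = greedy_k_center(data, 2)
    print(chosen, clustering_radius(data, chosen))
```

In the noise-free sequential setting this greedy rule is a well-known 2-approximation for metric k-center; a single far-away outlier would be selected as a center, which is the failure mode that the outlier variant is meant to address.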
Despite the success of kernel-based nonparametric methods, kernel selection still requires considerable expertise, and is often described as a “black art.” We present a sophisticated method for automatically searching for an appropriate kernel from an infinite space of potential choices. Previous efforts in this direction have focused on traversing a kernel…
We introduce a novel information-theoretic approach for active model selection and demonstrate its effectiveness in a real-world application. Although our method can work with arbitrary models, we focus on actively learning the appropriate structure for Gaussian process (GP) models with arbitrary observation likelihoods. We then apply this framework to…
We introduce a new model for cooperative agents that seek to optimize a common goal without communication or coordination. Given a universe of elements V, a set of agents, and a set function f, we ask each agent i to select a subset Si ⊂ V such that the size of Si is constrained (i.e., |Si| < k). The goal is for the agents to cooperatively choose the sets…
Image representation is an essential issue in problems related to image processing and understanding. In recent years, sparse representation modelling of signals has received a lot of attention due to its state-of-the-art performance in different computer vision tasks. One of the important factors in its success is the ability to…
In this section, we present the proof of Theorem 1. We assume that active search policies have access to the correct marginal probabilities f(x;D) = Pr(y = 1 | x,D), for any given point x and labeled data D, which may include “fictitious” observations. Further, the computational cost will be analyzed as the number of calls to f, i.e., f(x;D) has unit cost.…
Active search is a learning paradigm with the goal of actively identifying as many members of a given class as possible. Many real-world problems can be cast as an active search, including drug discovery, fraud detection, and product recommendation. Previous work has derived the Bayesian optimal policy for the problem, which is unfortunately intractable due…
In recent years, the sparse representation modeling of signals has received a lot of attention due to its state-of-the-art performance in different computer vision tasks. One important factor in its success is the ability to promote representations that are well adapted to the data. This is achieved by the use of dictionary learning algorithms. The most…