• Corpus ID: 233481568

Submodular Mutual Information for Targeted Data Subset Selection

  • S. Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff A. Bilmes, Rishabh K. Iyer
With the rapid growth of data, it is becoming increasingly difficult to train or improve deep learning models with the right subset of data. We show that this problem can be effectively solved, at an additional labeling cost, by targeted data subset selection (TSS), where a subset of unlabeled data points similar to an auxiliary set is added to the training data. We do so using a rich class of Submodular Mutual Information (SMI) functions and demonstrate their effectiveness for image…
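The targeted selection described in the abstract can be sketched with one simple member of the SMI family. The snippet below is a minimal illustration, not the authors' implementation: it assumes a precomputed pairwise similarity matrix `sim` and greedily picks unlabeled points that maximize a graph-cut-style SMI score, I_f(A; Q) = 2 Σ_{i∈A, j∈Q} s_ij, with respect to a target (query) set Q.

```python
import numpy as np

def gcmi_gain(sim, selected, candidate, query_idx):
    """Marginal gain of adding `candidate` under the graph-cut SMI score
    I_f(A; Q) = 2 * sum_{i in A, j in Q} s_ij. This score is modular in A,
    so the gain ignores `selected`; richer SMI instantiations would use it."""
    return 2.0 * sim[candidate, query_idx].sum()

def targeted_subset(sim, query_idx, budget):
    """Greedily select `budget` unlabeled points maximizing the SMI score
    with respect to the target (query) set indexed by `query_idx`."""
    n = sim.shape[0]
    candidates = [i for i in range(n) if i not in set(query_idx)]
    selected = []
    for _ in range(budget):
        best = max(candidates,
                   key=lambda c: gcmi_gain(sim, selected, c, query_idx))
        selected.append(best)
        candidates.remove(best)
    return selected
```

Because the graph-cut score is modular in the selected set, this greedy loop reduces to a top-k by query similarity; other SMI instantiations (e.g. facility-location based) make the gains depend on the current selection and genuinely need the greedy loop.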

The Online Submodular Cover Problem
Deep Residual Learning for Image Recognition
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning
This work formulates GLISTER as a mixed discrete-continuous bi-level optimization problem that selects a subset of the training data to maximize the log-likelihood on a held-out validation set, and proposes an iterative online algorithm, GLISTER-ONLINE, which performs data selection along with the parameter updates and can be applied to any loss-based learning algorithm.
A Unified Framework for Generic, Query-Focused, Privacy Preserving and Update Summarization using Submodular Information Measures
This work shows that several previous query-focused and update summarization techniques have, unknowingly, used various instantiations of the aforesaid submodular information measures, providing evidence for the benefit and naturalness of these models.
Submodular Combinatorial Information Measures with Applications in Machine Learning
This paper studies combinatorial information measures that generalize independence, (conditional) entropy, (conditional) mutual information, and total correlation, defined over sets of (not necessarily random) variables, and shows that, unlike entropic mutual information in general, the submodular mutual information is itself submodular in one argument when the other is held fixed.
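For reference, the submodular mutual information discussed in that paper is defined, for a submodular function $f$ and sets $A$ and $B$, as

```latex
I_f(A; B) \;=\; f(A) + f(B) - f(A \cup B)
```

Taking $f$ to be the entropy function recovers Shannon mutual information; the paper's observation is that, unlike the entropic case, $I_f(A; B)$ remains submodular in $A$ for fixed $B$.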
On characterization of entropy function via information inequalities
  • Zhen Zhang, R. Yeung
  • Computer Science, Mathematics
    Proceedings of the 1998 IEEE International Symposium on Information Theory
  • 1998
The main discovery is a new information inequality involving four discrete random variables, which gives a negative answer to the fundamental question of whether Shannon-type inequalities fully characterize the entropy function.
Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds
This work designs a new algorithm for batch active learning with deep neural network models that samples groups of points that are disparate and high-magnitude when represented in a hallucinated gradient space, and shows that while other approaches sometimes succeed for particular batch sizes or architectures, BADGE consistently performs as well or better, making it a versatile option for practical active learning problems.
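The two ingredients of BADGE summarized above, last-layer gradient embeddings under the model's predicted labels and k-means++ seeding for diversity, can be sketched as follows. This is a hedged illustration with assumed array shapes, not the paper's code:

```python
import numpy as np

def badge_embeddings(probs, feats):
    """Hypothetical last-layer gradient embeddings: gradient of the
    cross-entropy loss w.r.t. the final linear layer, assuming the model's
    most likely label. Shapes: probs (n, C), feats (n, d) -> (n, C*d)."""
    preds = probs.argmax(axis=1)
    onehot = np.eye(probs.shape[1])[preds]
    g = (probs - onehot)[:, :, None] * feats[:, None, :]
    return g.reshape(len(probs), -1)

def kmeanspp_select(X, k, rng):
    """k-means++ seeding: sample points with probability proportional to
    squared distance from the current selection, favoring embeddings that
    are both diverse and high-magnitude."""
    idx = [int(rng.integers(len(X)))]
    for _ in range(k - 1):
        d2 = np.min(((X[:, None] - X[idx][None]) ** 2).sum(-1), axis=1)
        idx.append(int(rng.choice(len(X), p=d2 / d2.sum())))
    return idx
```

Already-selected points have zero squared distance, so they are never re-sampled; the magnitude of the gradient embedding acts as the uncertainty signal, and the distance-proportional sampling supplies the diversity.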
Submodularity in natural language processing: algorithms and applications
This thesis demonstrates the applicability of submodular function optimization to three natural language processing tasks: word alignment for machine translation, optimal corpus creation, and document summarization. It also introduces a class of submodular functions that is not only monotone but also models relevance and diversity simultaneously for document summarization.
Submodular functions and optimization
Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision
This work empirically demonstrates the effectiveness of two diversity models, the Facility-Location and Dispersion models, for training-data subset selection and reduced labeling effort, allowing complex machine learning models such as Convolutional Neural Networks to be trained with much less data and labeling cost while incurring minimal performance loss.
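The facility-location model mentioned above, f(A) = Σ_i max_{j∈A} s_ij, is monotone submodular, so greedy selection carries the standard 1 − 1/e approximation guarantee. A minimal sketch, assuming a dense pairwise similarity matrix:

```python
import numpy as np

def facility_location_greedy(sim, budget):
    """Greedy maximization of the facility-location function
    f(A) = sum_i max_{j in A} s_ij over columns of `sim` (n x n)."""
    n = sim.shape[0]
    cover = np.zeros(n)        # current best similarity per point
    selected = []
    for _ in range(budget):
        # Marginal gain of each candidate: total improvement in coverage.
        gains = np.maximum(sim, cover[:, None]).sum(axis=0) - cover.sum()
        gains[selected] = -np.inf          # never re-pick a selected point
        best = int(gains.argmax())
        selected.append(best)
        cover = np.maximum(cover, sim[:, best])
    return selected
```

Each selected point "covers" the points most similar to it, so the greedy loop naturally spreads the budget across clusters, which is why facility location serves as a diversity model for subset selection.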