Training Experimentally Robust and Interpretable Binarized Regression Models Using Mixed-Integer Programming

  • Sanjana Tule, Nhi H. Le, B. Say
  • Published 1 December 2021
  • Computer Science
  • 2022 IEEE Symposium Series on Computational Intelligence (SSCI)
In this paper, we explore a model-based approach to training robust and interpretable binarized regression models for multiclass classification tasks using Mixed-Integer Programming (MIP). Our MIP model balances the optimization of prediction margin and model size by using a weighted objective that: minimizes the total margin of incorrectly classified training instances, maximizes the total margin of correctly classified training instances, and maximizes the overall model regularization. We… 
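The weighted objective described in the abstract can be sketched as one formula; the sets, margin symbols, and trade-off weights below are assumed notation for illustration, not taken from the paper:

```latex
% \mathcal{M}, \mathcal{C}: incorrectly / correctly classified training instances
% \mu_i: prediction margin of instance i;  R(w): model-size regularizer
% \lambda_1, \lambda_2, \lambda_3 \ge 0: trade-off weights
\min_{w}\quad
\lambda_1 \sum_{i \in \mathcal{M}} \mu_i
\;-\; \lambda_2 \sum_{i \in \mathcal{C}} \mu_i
\;-\; \lambda_3\, R(w)
```

Minimizing this expression simultaneously minimizes the total margin of misclassified instances, maximizes the total margin of correctly classified instances, and maximizes the regularization term, matching the three goals the abstract lists.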

Training Binarized Neural Networks Using MIP and CP

The experimental results on the MNIST digit recognition dataset suggest that, when training data is limited, the BNNs found by the model-based approach generalize better than those obtained from a state-of-the-art gradient descent method.

Learning with Noisy Labels

The problem of binary classification under random classification noise is studied theoretically: the learner sees labels that have been independently flipped with some small probability, and methods used in practice, such as the biased SVM and weighted logistic regression, are shown to be provably noise-tolerant.
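One such noise-tolerant construction, a noise-corrected logistic loss, can be sketched in a few lines; the unbiased-estimator form and the flip probabilities `rho_pos`/`rho_neg` are assumptions in the spirit of this line of work, not the paper's exact formulation:

```python
import math

def logistic_loss(t, y):
    # Numerically stable log(1 + exp(-y*t)) for label y in {-1, +1}.
    z = -y * t
    return math.log1p(math.exp(z)) if z < 30 else z

def noise_corrected_loss(t, y, rho_pos, rho_neg):
    # Unbiased estimate of the clean loss under class-conditional label
    # noise: rho_pos / rho_neg are the assumed flip probabilities for the
    # positive / negative class. With zero noise it reduces to the plain loss.
    rho_y, rho_other = (rho_pos, rho_neg) if y == 1 else (rho_neg, rho_pos)
    num = (1 - rho_other) * logistic_loss(t, y) - rho_y * logistic_loss(t, -y)
    return num / (1 - rho_pos - rho_neg)
```

Averaged over noisy samples, this reweighted loss has the same expectation as the loss on clean labels, which is what makes the learner provably noise-tolerant.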

Binarized Neural Networks

A binary matrix multiplication GPU kernel is written with which it is possible to run the MNIST BNN 7 times faster than with an unoptimized GPU kernel, without suffering any loss in classification accuracy.
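The arithmetic identity such an XNOR + popcount kernel exploits can be illustrated in plain Python; this is a sketch of the identity itself, not GPU code:

```python
def binarize(xs):
    # Map real values to {-1, +1} by sign (zero maps to +1).
    return [1 if x >= 0 else -1 for x in xs]

def binary_dot(a, b):
    # For {-1, +1} vectors of length n, the dot product equals
    # n - 2 * (number of positions where the signs differ), which is
    # exactly what XNOR followed by popcount computes on packed bits.
    mismatches = sum(1 for ai, bi in zip(a, b) if ai != bi)
    return len(a) - 2 * mismatches
```

A GPU kernel packs the ±1 entries into machine words and replaces the per-element comparison with one XNOR and one popcount instruction per word, which is where the speedup comes from.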

Definitions, methods, and applications in interpretable machine learning

This work defines interpretability in the context of machine learning and introduces the predictive, descriptive, relevant (PDR) framework for discussing interpretations, and introduces 3 overarching desiderata for evaluation: predictive accuracy, descriptive accuracy, and relevancy.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
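The Adam update rule can be sketched as a short self-contained routine; the hyperparameter defaults follow the commonly cited ones, and `adam_minimize` is an illustrative helper name:

```python
import math

def adam_minimize(grad_fn, x0, steps=2000, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # 1-D Adam: adaptive step sizes from estimates of the gradient's
    # first and second moments, with bias correction for the
    # zero-initialized moment accumulators.
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = b1 * m + (1 - b1) * g        # first-moment (mean) estimate
        v = b2 * v + (1 - b2) * g * g    # second-moment estimate
        m_hat = m / (1 - b1 ** t)        # bias-corrected moments
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x
```

On the toy quadratic f(x) = x², whose gradient is 2x, the iterate drifts toward the minimizer at roughly `lr` per step early on, since the normalized update behaves like the gradient's sign.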

Classification and regression trees

  • W. Loh
  • Computer Science
  • WIREs Data Mining Knowl. Discov.
  • 2011

This article gives an introduction to the subject of classification and regression trees by reviewing some widely available algorithms and comparing their capabilities, strengths, and weaknesses in two examples.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Co-teaching: Robust training of deep neural networks with extremely noisy labels

Empirical results on noisy versions of MNIST, CIFAR-10, and CIFAR-100 demonstrate that Co-teaching substantially outperforms state-of-the-art methods in the robustness of the trained deep models.

Certified Robustness to Label-Flipping Attacks via Randomized Smoothing

This work presents a unifying view of randomized smoothing over arbitrary functions, and uses this novel characterization to propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
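The generic randomized-smoothing idea (classify many randomized copies of an input and take a majority vote) can be sketched as below; this is the familiar test-time Gaussian-noise variant for illustration, not the paper's specific construction for label-flipping robustness:

```python
import random
from collections import Counter

def smoothed_predict(base_classifier, x, noise_scale=0.5, n_samples=200, seed=0):
    # Classify n_samples Gaussian-perturbed copies of x with the base
    # classifier and return the majority-vote label. The base classifier
    # and perturbation model here are illustrative assumptions.
    rng = random.Random(seed)
    votes = Counter(
        base_classifier([xi + rng.gauss(0.0, noise_scale) for xi in x])
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]
```

The certification argument then bounds how far the vote distribution can shift under a bounded perturbation, which is what turns the majority vote into a pointwise robustness certificate.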

Design of Poisoning Attacks on Linear Regression Using Bilevel Optimization

This work proposes a bilevel optimization problem to model the adversarial process between the attacker, who generates poisoning attacks, and the learner, which tries to learn the best predictive regression model.