QuantifyML: How Good is my Machine Learning Model?

Muhammad Usman, Divya Gopinath, Corina S. Pasareanu
The efficacy of machine learning models is typically determined by computing their accuracy on test data sets. However, this may often be misleading, since the test data may not be representative of the problem that is being studied. With QuantifyML we aim to precisely quantify the extent to which machine learning models have learned and generalized from the given data. Given a trained model, QuantifyML translates it into a C program and feeds it to the CBMC model checker to produce a formula… 
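The translation step mentioned in the abstract can be pictured with a minimal sketch. It assumes a hypothetical toy decision tree stored as nested dictionaries and emits C source text for it. The `TREE` layout and the `emit_c` and `tree_to_c` names are illustrative assumptions, not QuantifyML's actual encoding, and the hand-off to CBMC is not shown.

```python
# Hypothetical toy decision tree (an assumption for illustration only):
# internal nodes test `x[feature] <= threshold`, leaves carry a class label.
TREE = {
    "feature": 0, "threshold": 5,
    "left": {"label": 0},
    "right": {
        "feature": 1, "threshold": 2,
        "left": {"label": 1},
        "right": {"label": 0},
    },
}


def emit_c(node, indent=1):
    """Recursively render one tree node as nested C if/else statements."""
    pad = "    " * indent
    if "label" in node:
        # Leaf: return the predicted class.
        return f"{pad}return {node['label']};\n"
    return (
        f"{pad}if (x[{node['feature']}] <= {node['threshold']}) {{\n"
        + emit_c(node["left"], indent + 1)
        + f"{pad}}} else {{\n"
        + emit_c(node["right"], indent + 1)
        + f"{pad}}}\n"
    )


def tree_to_c(tree, name="classify"):
    """Wrap the rendered tree in a C function taking an integer feature array."""
    return f"int {name}(const int x[]) {{\n{emit_c(tree)}}}\n"


c_source = tree_to_c(TREE)
print(c_source)
```

Each root-to-leaf path of the tree becomes one branch of the generated C function, which is the kind of program a bounded model checker for C can analyze.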

Model Counting
Solving #SAT requires the solver to be cognizant of all solutions in the search space, thereby reducing the effectiveness and relevance of commonly used SAT heuristics designed to quickly narrow down the search to a single solution.
TestMC: Testing Model Counters using Differential and Metamorphic Testing
This experience paper presents an empirical study on testing industrial strength model counters by applying the principles of differential and metamorphic testing together with bounded exhaustive input generation and input minimization in the TestMC framework.
Bounded model checking
This article surveys a technique called Bounded Model Checking (BMC), which uses a propositional SAT solver rather than BDD manipulation techniques, and is widely perceived as a complementary technique to BDD-based model checking.
A Scalable Approximate Model Counter
This paper presents a novel algorithm, together with a reference implementation, that is the first scalable approximate model counter for CNF formulas. It scales to formulas with tens of thousands of variables and reports bounds with small tolerance and high confidence in cases that are too large for exact model counting.
#∃SAT: Projected Model Counting
This paper introduces the problem of model counting projected on a subset of the original variables, which the authors call priority variables. It discusses three different approaches to #∃SAT (two of which are novel) and compares their performance on different benchmark problems.
Exploiting Symbolic Techniques in Automated Synthesis of Distributed Programs with Large State Space
This paper presents a symbolic method for automatic synthesis of classical fault-tolerant distributed problems such as Byzantine agreement and token ring, and is the first illustration where programs with a large state space (beyond 2^100) are handled during synthesis.
CBMC - C Bounded Model Checker - (Competition Contribution)
CBMC implements bit-precise bounded model checking for C programs and is now capable of finding counterexamples in all of SV-COMP’s categories.
A Recursive Algorithm for Projected Model Counting
Based on a "standard" model counter, the algorithm projMC takes advantage of a disjunctive decomposition scheme of ∃X.Σ to count the models of a propositional formula Σ after eliminating from it a given set X of variables.
DeepSafe: A Data-Driven Approach for Assessing Robustness of Neural Networks
This work proposes DeepSafe, a novel approach for automatically assessing the overall robustness of a neural network, which applies clustering over known labeled data and leverages off-the-shelf constraint solvers to automatically identify and check safe regions in which the network is robust.
Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks
Results show that the novel, scalable, and efficient technique presented can successfully prove properties of networks that are an order of magnitude larger than the largest networks verified using existing methods.
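
The model counting (#SAT) and projected model counting (#∃SAT) problems named in the entries above can be illustrated with a brute-force sketch. It assumes a CNF formula given as lists of signed integers (DIMACS-style literals); the function names and the example formula are hypothetical, and the enumeration is exponential in the number of variables, so it only shows what is being counted and is not how the solvers listed above work.

```python
from itertools import product


def satisfies(assignment, clauses):
    """True if every clause has at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def all_assignments(num_vars):
    """Yield every truth assignment over variables 1..num_vars as a dict."""
    for bits in product([False, True], repeat=num_vars):
        yield {i + 1: bit for i, bit in enumerate(bits)}


def count_models(clauses, num_vars):
    """#SAT by exhaustive enumeration: count all satisfying assignments."""
    return sum(1 for a in all_assignments(num_vars) if satisfies(a, clauses))


def count_projected(clauses, num_vars, priority):
    """Projected counting: count distinct values of the priority variables
    that can be extended to a full satisfying assignment."""
    seen = set()
    for a in all_assignments(num_vars):
        if satisfies(a, clauses):
            seen.add(tuple(a[v] for v in priority))
    return len(seen)


# Example formula (an assumption): (x1 or x2) and (not x1 or x3)
CLAUSES = [[1, 2], [-1, 3]]

print(count_models(CLAUSES, 3))
print(count_projected(CLAUSES, 3, [1]))
print(count_projected(CLAUSES, 3, [1, 3]))
```

Unlike a satisfiability check, which can stop at the first satisfying assignment, both counters here must account for every solution, which is why single-solution search heuristics help less for counting.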