A user-guided approach to program analysis

  • Ravi Mangal, Xin Zhang, Aditya V. Nori, M. Naik
  • Published 30 August 2015
  • Computer Science
  • Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering
Program analysis tools often produce undesirable output due to various approximations. We present an approach and a system, EUGENE, that allow user feedback to guide such approximations towards producing the desired output. We formulate the problem of user-guided program analysis in terms of solving a combination of hard rules and soft rules: hard rules capture soundness, while soft rules capture degrees of approximation and the preferences of users. Our technique solves the rules using an off-the…
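The hard-rule and soft-rule formulation in the abstract can be illustrated with a minimal sketch. This is not EUGENE's implementation: the abstract is truncated before it names the solver, so the example below simply enumerates all assignments by brute force. The fact names (`alarm1`, `alarm2`, `path_feasible`), the rules, and the weights are all hypothetical, chosen only to show how a heavily weighted user preference can override a weaker approximation preference while every hard rule still holds.

```python
from itertools import product

# Hypothetical boolean facts an analysis might derive (illustrative names only).
VARIABLES = ["alarm1", "alarm2", "path_feasible"]

# Hard rules must hold in every accepted solution (they stand in for soundness).
HARD_RULES = [
    lambda a: (not a["path_feasible"]) or a["alarm1"],  # path_feasible => alarm1
    lambda a: (not a["alarm1"]) or a["alarm2"],         # alarm1 => alarm2
]

# Soft rules are (weight, rule) pairs; a satisfied rule contributes its weight.
APPROXIMATION = (1.0, lambda a: a["path_feasible"])  # analysis assumes the path is feasible
USER_FEEDBACK = (5.0, lambda a: not a["alarm2"])     # user marked alarm2 as a false alarm


def solve(variables, hard_rules, soft_rules):
    """Return the assignment that satisfies all hard rules and maximizes
    the total weight of satisfied soft rules (exhaustive search)."""
    best, best_weight = None, float("-inf")
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if not all(rule(assignment) for rule in hard_rules):
            continue  # violates a hard rule, so it is never a candidate
        weight = sum(w for w, rule in soft_rules if rule(assignment))
        if weight > best_weight:
            best, best_weight = assignment, weight
    return best, best_weight


if __name__ == "__main__":
    print(solve(VARIABLES, HARD_RULES, [APPROXIMATION]))
    print(solve(VARIABLES, HARD_RULES, [APPROXIMATION, USER_FEEDBACK]))
```

Without the feedback rule, the best assignment reports both alarms. With it, the higher-weight preference wins and the solver gives up the feasibility assumption instead of violating a hard rule. Exhaustive search is exponential in the number of facts, so it only serves to make the objective concrete.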
Beyond Deductive Methods in Program Analysis
A framework DIFFLOG is presented that fundamentally extends the deductive reasoning rules that underlie program analyses with numerical weights and advances the state-of-the-art in synthesizing non-trivial analyses.
Automatically generating features for learning program analysis heuristics for C-like languages
The technique goes through selected program-query pairs in codebases, and it reduces and abstracts the program in each pair to a few lines of code, while ensuring that the analysis behaves similarly for the original and the new programs with respect to the query.
User-guided program reasoning using Bayesian inference
A new approach to leverage user feedback to guide program analyses towards true alarms and away from false alarms is proposed, which associates each alarm with a confidence value by performing Bayesian inference on a probabilistic model derived from the analysis rules.
ARBITRAR: User-Guided API Misuse Detection
This work proposes a new approach that allows regular programmers to find API misuses and minimizes user burden by employing an active learning algorithm that ranks API usages by their likelihood of being invalid.
Effective interactive resolution of static analysis alarms
The approach synergistically combines a sound but imprecise analysis with precise but unsound heuristics, through user interaction, and enables interactive alarm resolution for any analysis specified in the declarative logic programming language Datalog.
Active Inductive Logic Programming for Code Search
This work designs a query language to model the structure and semantics of code as logic facts, and uses nested program structure as an inductive bias to leverage both positive and negative examples and is effective in refining search queries.
Boosting static analysis accuracy with instrumented test executions
DynaBoost is presented, a system which uses information obtained from test executions to prioritize the alarms of a static analyzer, and uses these results to bootstrap a probabilistic alarm ranking system.
Example-guided synthesis of relational queries
Evaluation shows that EGS outperforms state-of-the-art synthesizers based on enumerative search, constraint solving, and hybrid techniques in terms of synthesis time, quality of synthesized programs, and ability to prove unrealizability.
Effects of Precise and Imprecise Value-Set Analysis (VSA) Information on Manual Code Analysis.
  • L. Matzen, Michelle Leger, Geoffrey Reedy
  • Computer Science, Psychology
    Proposed for presentation at the Binary Analysis Research (BAR) workshop, associated with Network and Distributed System Security Symposium (NDSS) held February 21-25, 2021.
  • 2021
A human study, in which reverse engineers answered short information flow problems by determining whether code snippets would print sensitive information, showed that precise VSA information changed participants’ problem-solving strategies and supported faster, more accurate analyses.
Striking a Balance: Pruning False-Positives from Static Call Graphs
A technique is presented that leads to reporting fewer bugs but also far fewer false positives, by automatically producing a call-graph pruner through an ahead-of-time learning process.


Solving Weighted Constraints with Applications to Program Analysis
A lazy grounding algorithm is presented that generalizes and extends existing techniques for solving systems of weighted constraints, and achieves significant speedup over existing approaches without sacrificing soundness for several real-world program analysis applications.
From uncertainty to belief: inferring the specification within
A novel framework based on factor graphs is presented for automatically inferring specifications directly from programs; it can incorporate many disparate sources of evidence, extracting significantly more information from observations than previously published techniques.
Finding application errors and security flaws using PQL: a program query language
This paper presents a language called PQL (Program Query Language) that allows programmers to express such questions easily in an application-specific context and develops both static and dynamic techniques to find solutions to PQL queries.
Probabilistic, modular and scalable inference of typestate specifications
The results for the large benchmark show that ANEK can quickly infer specifications that are both accurate and qualitatively similar to those written by hand, and at 5% of the time taken to manually discover and hand-code the specifications.
Refinement-based context-sensitive points-to analysis for Java
This work has developed a refinement-based analysis that succeeds by simultaneously refining handling of method calls and heap accesses, allowing the analysis to precisely analyze important code while entirely skipping irrelevant code.
On abstraction refinement for program analyses in Datalog
This work presents a new approach for finding such abstractions for program analyses written in Datalog based on counterexample-guided abstraction refinement, which uses a boolean satisfiability formulation that is general, complete, and optimal.
Strictly declarative specification of sophisticated points-to analyses
The DOOP framework for points-to analysis of Java programs is presented, carrying the declarative approach further than past work by describing the full end-to-end analysis in Datalog and optimizing aggressively using a novel technique specifically targeting highly recursive Datalog programs.
Z-Ranking: Using Statistical Analysis to Counter the Impact of Static Analysis Approximations
This paper demonstrates that z-ranking applies to a range of program checking problems and that it performs up to an order of magnitude better than randomized ranking, and has transformed previously unusable analysis tools into effective program error finders.
Automated error diagnosis using abductive inference
The insight is that identifying missing facts is an instance of the abductive inference problem in logic, and a new algorithm is presented for computing the smallest and most general abductions in this setting.
Using Datalog with Binary Decision Diagrams for Program Analysis
Bddbddb is described, a BDD-Based Deductive DataBase, which implements the declarative language Datalog with stratified negation, totally-ordered finite domains and comparison operators, and it is shown that a context-insensitive points-to analysis implemented with bddbddb is about twice as fast as a carefully hand-tuned version.