We propose a novel methodology for automatically constructing benchmarks for Static Analysis Security Testing (SAST) tools from real-world software projects by differencing the vulnerable and fixed versions in FOSS repositories. The methodology allows us to evaluate the "actual" performance of SAST tools, that is, without unrelated alarms. To test our approach, we benchmarked 7 SAST tools (although we report results only for the two best tools) against 70 revisions of four major versions of Apache Tomcat, with 62 distinct CVEs as the source of ground-truth vulnerabilities.
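The differencing idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a SAST alarm is "related" to a vulnerability when it lands on a line of the vulnerable version that the fix touched, and the helper names `changed_lines` and `filter_alarms` and the `(line, message)` alarm format are hypothetical.

```python
import difflib
from typing import List, Set, Tuple


def changed_lines(vulnerable_src: str, fixed_src: str) -> Set[int]:
    """Return 1-based line numbers of the vulnerable version touched by the fix."""
    old = vulnerable_src.splitlines()
    new = fixed_src.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    touched: Set[int] = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            # Lines old[i1:i2] were modified or removed by the fix.
            touched.update(range(i1 + 1, i2 + 1))
        elif tag == "insert" and old:
            # Assumption: attribute a pure insertion to the adjacent old line.
            touched.add(min(max(i1, 1), len(old)))
    return touched


def filter_alarms(alarms: List[Tuple[int, str]], touched: Set[int]) -> List[Tuple[int, str]]:
    """Keep only the alarms reported on lines that the fix touched."""
    return [alarm for alarm in alarms if alarm[0] in touched]


# Toy example: the fix wraps the argument of exec() in a sanitizer.
vulnerable = "a = read()\nexec(a)\nprint('done')\n"
fixed = "a = read()\nexec(sanitize(a))\nprint('done')\n"
touched = changed_lines(vulnerable, fixed)
related = filter_alarms([(2, "injection"), (3, "style")], touched)
```

In this toy case only the alarm on line 2 survives the filter; the alarm on line 3 is treated as unrelated to the vulnerability because the fix left that line unchanged.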