SoK: On the Soundness and Precision of Dynamic Taint Analysis

Abstract

Taint analysis, also known as dynamic information flow tracking, is a key binary analysis technique for revealing data dependencies in programs. It has been used in many applications, including memory error detection, vulnerability analysis, malware analysis, and exploit diagnosis. While previous implementations are empirically effective for their chosen tasks, they rely on manually defined tainting rules that have not been proven to be sound (free of false negatives) or precise (free of false positives). Furthermore, even perfect tainting rules can be implemented incorrectly. We survey a number of existing systems and find that all of them suffer from either unsoundness or imprecision, and that many are quite imprecise. To improve the situation, we propose a set of formal methods to create tainting rules, to verify their soundness and precision, and to verify the correctness of tainting implementations. To show the practicality of this approach, we build a new taint analysis system that is formally verified to be sound and, in many cases, precise at the instruction level. Our tests with real-world workloads (tainted shell commands on both Windows and Linux, and keylogger detection) demonstrate observable advantages of provable soundness and precision over existing taint analysis systems.
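
To make the two properties concrete, the following is a minimal sketch, not the formal method or the system described in the paper. It checks three hand-written bit-level tainting rules for a bitwise AND instruction by exhaustive enumeration over 3-bit operands. The names `true_taint`, `check`, and the three rule functions are hypothetical, and the ground-truth definition (an output bit is truly tainted if it can change when the tainted input bits vary independently) is an assumption made for illustration.

```python
import operator
from itertools import product

WIDTH = 3                      # small width so exhaustive checking is cheap
MASK = (1 << WIDTH) - 1


def variants(value, taint):
    """All values that agree with `value` on its untainted bits."""
    base = value & ~taint & MASK
    out, sub = [], taint
    while True:                # enumerate every subset of the taint mask
        out.append(base | sub)
        if sub == 0:
            return out
        sub = (sub - 1) & taint


def true_taint(op, a, ta, b, tb):
    """Assumed ground truth: output bits that can change when the
    tainted input bits take other values."""
    results = [op(x, y) & MASK for x in variants(a, ta) for y in variants(b, tb)]
    changed = 0
    for r in results:
        changed |= r ^ results[0]
    return changed


# Three candidate rules for AND (all hypothetical, for illustration only).
def union_rule(a, ta, b, tb):
    return ta | tb             # taint if either operand bit is tainted


def both_rule(a, ta, b, tb):
    return ta & tb             # taint only if both operand bits are tainted


def value_aware_rule(a, ta, b, tb):
    # A tainted bit matters only if the other operand's bit is 1 or tainted.
    return (ta & tb) | (ta & b) | (tb & a)


def check(op, rule):
    """Return (sound, precise) for `rule` over all operands and taint masks."""
    sound = precise = True
    for a, ta, b, tb in product(range(MASK + 1), repeat=4):
        truth = true_taint(op, a, ta, b, tb)
        got = rule(a, ta, b, tb) & MASK
        if truth & ~got:       # missed a truly tainted bit: false negative
            sound = False
        if got & ~truth:       # flagged an unaffected bit: false positive
            precise = False
    return sound, precise


if __name__ == "__main__":
    for rule in (union_rule, both_rule, value_aware_rule):
        print(rule.__name__, check(operator.and_, rule))
```

Under these assumptions, the union rule comes out sound but imprecise, the both-operands rule precise but unsound, and the value-aware rule both sound and precise. Exhaustive enumeration only scales to tiny bit widths, which is one reason a formal proof is preferable for real instruction sets.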

Cite this paper

@inproceedings{Yan2017SoKOT, title={SoK: On the Soundness and Precision of Dynamic Taint Analysis}, author={Lok Kwong Yan and Heng Yin}, year={2017} }