Decompilation of binary programs

@article{Cifuentes1995DecompilationOB,
  title={Decompilation of binary programs},
  author={Cristina Garcia Cifuentes and Kevin John Gough},
  journal={Software: Practice and Experience},
  year={1995},
  volume={25}
}
The structure of a decompiler is presented, along with a thorough description of the different modules that form part of a decompiler, and the type of analyses that are performed on the machine code to regenerate high-level language code. The front-end is a machine dependent module that performs the loading, parsing and semantic analysis of the input program, as well as generating an intermediate representation of the program. The universal decompiling machine is a machine and language…

Reverse compilation techniques

TLDR
Techniques for writing reverse compilers or decompilers are presented in this thesis, based on compiler and optimization theory, and applied to decompilation in a unique way; these techniques have never before been published.

Decompilation as search

TLDR
This thesis makes the case that decompilation is more effectively accomplished through search, and proposes an approach to prototype recovery that follows the principle of conformant execution, in the form of inlined data source tracking, to infer arrays, pointer-to-pointers and recursive data structures.

LLVM-IR based Decompilation

TLDR
This thesis designs and implements the middle end of a decompiler framework, focusing on reducing low-level language properties with the optimization techniques of propagation and elimination, and performs data flow analysis and control flow analysis on LLVM-format code to generate high-level code using a functional programming language (FPL), Haskell.

Decompilation of Java bytecode to Prolog by partial evaluation

Using a decompiler for real-world source recovery

TLDR
This work describes the experience gained from applying a native executable decompiler, assisted by a commercial disassembler and hand editing, to a real-world Windows-based application.

Interprocedural data flow decompilation

Traditional compiler data flow analysis techniques are used to transform the intermediate representation of a decompiled program to a higher representation that eliminates low-level concepts such as

A transformational approach to binary translation of delayed branches

TLDR
A disciplined method for deriving case analyses for identifying problematic cases, showing the translations for the nonproblematic cases, and giving confidence that all cases are considered is presented.

A Refined Decompiler to Generate C Code with High Readability

TLDR
A practical decompiler for Windows C programs that uses a shadow stack to perform refined data flow analysis, adopts inter-basic-block register propagation to reduce redundant variables, and is able to recognize functions with lower false positive and false negative rates.

To Goto Where No Statement Has Gone Before

TLDR
The method always produces an expression, unlike the heuristics for decompilation which may fail, and is efficient: the resulting expression is linear in the size of the CFG by maintaining all sharing of subgraphs.

Practical dynamic reconstruction of control flow graphs

TLDR
Experimental results provide evidence that completeness, that is, the ability to conclude that the entire CFG has been discovered, is achievable on many functions that are part of industry‐strong benchmarks, and indicate that dynamic information greatly enhances the ability of DynInst, a state‐of‐the‐art binary reconstructor, to deal with code stripped of debugging information.

References

A Methodology for Decompilation

A proposed methodology for decompilation of binary programs is presented, along with a description of a particular implementation of this methodology, dcc. dcc is a decompiler for the Intel 80x86

Reverse compilation techniques

TLDR
Techniques for writing reverse compilers or decompilers are presented in this thesis, based on compiler and optimization theory, and applied to decompilation in a unique way; these techniques have never before been published.

Language Design Using Decompilation.

Abstract: This report represents the results of a project in which decompilation techniques were used to identify the essential characteristics of a high-level programming language suitable for

Decompilation: the enumeration of types and grammars

TLDR
The basic problem of enumerating the syntax trees of grammars, and then stopping, is shown to have no recursive solution, but methods of abstract interpretation can be used to guarantee the adequacy and completeness of the technique in practical instances, including the decompiler for the language presented here.

A Structuring Algorithm for Decompilation

TLDR
This paper presents a structuring algorithm for arbitrary reducible, unstructured graphs that makes use of structures such as if..then..else statements, while, repeat and loop loops, and case statements.

Interprocedural data flow decompilation

Traditional compiler data flow analysis techniques are used to transform the intermediate representation of a decompiled program to a higher representation that eliminates low-level concepts such as

Intercomputer Transportation of Assembly Language Software through Decompilation.

TLDR
A translator that performs a decompilation of the source program into an intermediate representation at a higher semantic level is described, and this translation scheme is shown to remove most of the machine dependency from assembly language software.

Taming control flow: a structured approach to eliminating goto statements

  • Ana M. Erosa, L. Hendren
  • Computer Science
    Proceedings of 1994 IEEE International Conference on Computer Languages (ICCL'94)
  • 1994
TLDR
A straightforward algorithm that structures C programs by eliminating all goto statements; it works directly on a high-level abstract syntax tree (AST) representation of the program and could easily be integrated into any compiler that uses an AST-based intermediate representation.

An Algorithm for Structuring Flowgraphs

TLDR
An algorithm which transforms a flowgraph into a program containing control constructs such as if-then-else statements, repeat (do forever) statements, multilevel break statements, and multilevel next statements; the resulting programs are substantially more readable than their Fortran counterparts.

The Theory of Parsing, Translation, and Compiling

TLDR
It is the hope that the algorithms and concepts presented in this book will survive the next generation of computers and programming languages, and that at least some of them will be applicable to fields other than compiler writing.