Diffing Portal

This documentation gathers resources on binary diffing, a technique that is handy for reverse engineering. Diffing tools and the associated publications are scattered online, so the goal is to centralize references to them here.

Diffing

Binary diffing is usually performed between two binaries, commonly referred to as primary and secondary. Diffing requires comparing the two programs using common artifacts. At the binary level, disassemblers usually lift the program into functions that encode the different functionalities it provides. This lifting requires accurately identifying the content of each function, its bounds, etc. It is usually one of the last refinement steps of disassembly before decompilation. As such, diffing is usually performed at the function level: it aims at computing an assignment between the functions of the primary and those of the secondary. The assignment is usually 1-to-1 but, as a result of optimization or obfuscation, functions can be inlined or split. Some tools therefore try to compute an M-to-N mapping between functions.
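To make the notion of a 1-to-1 assignment concrete, here is a minimal sketch in Python. It assumes each function is summarized by a set of instruction mnemonics and pairs functions greedily by Jaccard similarity. The function names, the feature choice, the `threshold` parameter and the greedy strategy are all illustrative assumptions; the sketch is not meant to reflect how any tool referenced here works.

```python
from itertools import product


def jaccard(a, b):
    """Similarity between two feature sets, in [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def match_functions(primary, secondary, threshold=0.5):
    """Greedy 1-to-1 assignment from primary functions to secondary ones.

    `primary` and `secondary` map a function name to a set of features
    (here: mnemonics). Both the representation and the threshold are
    assumptions made for this example.
    """
    # Score every candidate pair, best scores first (ties broken by name).
    scored = sorted(
        (
            (jaccard(p_feat, s_feat), p_name, s_name)
            for (p_name, p_feat), (s_name, s_feat) in product(
                primary.items(), secondary.items()
            )
        ),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    matches = {}
    used_secondary = set()
    for score, p_name, s_name in scored:
        if score < threshold:
            break  # remaining pairs are too dissimilar to be matched
        if p_name in matches or s_name in used_secondary:
            continue  # keep the assignment 1-to-1
        matches[p_name] = (s_name, score)
        used_secondary.add(s_name)
    return matches


# Hypothetical feature sets for two versions of a program.
primary = {
    "parse_header": {"mov", "cmp", "jne", "call"},
    "checksum": {"xor", "shl", "add", "loop"},
    "log_error": {"push", "call", "ret"},
}
secondary = {
    "sub_401000": {"mov", "cmp", "jne", "call", "lea"},
    "sub_401080": {"xor", "shl", "add", "loop"},
    "sub_401100": {"div", "mul"},
}
result = match_functions(primary, secondary)
for name, (other, score) in sorted(result.items()):
    print(f"{name} -> {other} (similarity {score:.2f})")
```

Functions left out of `result` (here `log_error`) have no counterpart above the threshold. An M-to-N mapping would relax the 1-to-1 constraint enforced by `used_secondary`, which is one way to account for inlined or split functions.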

Overview

Most differs work on a disassembled representation of the program and therefore rely on existing disassemblers such as IDA Pro or Ghidra. However, they usually go through an intermediate file format that allows the diff to be performed outside of the disassembler. The software generating this file is usually implemented as a disassembler plugin and is called an exporter. The figure below shows the relationships between some disassemblers, exporters and differs.

[Figure: three layers of tools. Differs: QBinDiff, BinDiff. Exporters: BinExport, Quokka. Disassemblers: IDA, Ghidra, Binary Ninja. The python-binexport and python-bindiff modules also appear in the figure, and one link is marked "in progress".]

As shown in the figure, the base layer is made of disassemblers, which differs try to be independent of. The second layer consists of exporters, which provide an interface to the disassembler by serializing the disassembly into a format-specific file that is later read by the differ to perform the diff. The figure shows various tools and modules discussed on this portal, in particular python-bindiff, python-binexport, qbindiff, and quokka.
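The layering described above can be illustrated with a small Python sketch in which an "exporter" serializes functions to a file and a "differ" works only from that file, never from the disassembler. Everything here is hypothetical: the JSON layout, the `export_program`, `load_export` and `diff_exports` helpers, and the exact-match strategy are assumptions made for illustration and do not correspond to the file formats or APIs of BinExport, Quokka, BinDiff or QBinDiff.

```python
import json
import os
import tempfile


def export_program(functions, path):
    """Hypothetical exporter: dump {address: (name, mnemonics)} to JSON.

    In a real setup this step would run inside the disassembler as a plugin.
    """
    payload = {
        "functions": [
            {"address": addr, "name": name, "mnemonics": mnemonics}
            for addr, (name, mnemonics) in sorted(functions.items())
        ]
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)


def load_export(path):
    """Read an exported file back into {address: function record}."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    return {f["address"]: f for f in payload["functions"]}


def diff_exports(primary_path, secondary_path):
    """Hypothetical differ: pair functions with identical mnemonic sequences.

    It only needs the two exported files, not the disassembler.
    """
    primary = load_export(primary_path)
    secondary = load_export(secondary_path)
    # Index secondary functions by their mnemonic sequence (first one wins).
    by_body = {}
    for addr, func in secondary.items():
        by_body.setdefault(tuple(func["mnemonics"]), addr)
    return {
        addr: by_body[tuple(func["mnemonics"])]
        for addr, func in primary.items()
        if tuple(func["mnemonics"]) in by_body
    }


with tempfile.TemporaryDirectory() as tmp:
    primary_file = os.path.join(tmp, "primary.json")
    secondary_file = os.path.join(tmp, "secondary.json")
    export_program(
        {0x1000: ("main", ["push", "call", "ret"]),
         0x1010: ("helper", ["xor", "ret"])},
        primary_file,
    )
    export_program(
        {0x2000: ("main", ["push", "call", "ret"]),
         0x2020: ("helper", ["xor", "inc", "ret"])},
        secondary_file,
    )
    matches = diff_exports(primary_file, secondary_file)

for p_addr, s_addr in sorted(matches.items()):
    print(f"{p_addr:#x} -> {s_addr:#x}")
```

Because the differ depends only on the exported file, the same diffing code could in principle be reused with any disassembler for which an exporter producing that format exists.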

How to Contribute?

This page aims at aggregating various resources related to diffing. Pull requests that add links to new tools or other resources are therefore warmly welcome.