Binary loader interface

Program

class qbindiff.Program(path: Path | str, *args, loader: LoaderType | None = None, backend: AbstractProgramBackend | None = None, **kwargs)

Bases: MutableMapping, GenericGraph

Program class that shadows the underlying program backend used.

It is a MutableMapping, where keys are function addresses and values are Function objects.

Parameters:
  • path – Path to the main file to load (depends on the underlying backend)

  • loader – The loader type. If not provided, the loader is inferred from the path

  • backend – Optional parameter to provide the object instance implementing the AbstractProgramBackend interface

  • args – extra positional parameters forwarded to the backend constructor

  • kwargs – extra parameters forwarded to the backend constructor

The node label is the function address; the node itself is the Function object.
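
Because Program is a MutableMapping keyed by function address, it can be walked like a dictionary. The helper below is a minimal sketch and not part of qbindiff: `function_names` is a hypothetical name, and it relies only on the documented `items()` method and on the `name` property of Function. A plain dict of stand-in objects is used so that the snippet runs without any backend.

```python
from types import SimpleNamespace


def function_names(program):
    """Map each function address to its name, ordered by address.

    `program` is anything that behaves like the documented Program mapping:
    items() yields (address, function) pairs and each function has a `name`.
    """
    ordered = sorted(program.items(), key=lambda item: item[0])
    return {addr: func.name for addr, func in ordered}


# Stand-in for a loaded Program (assumption: real objects expose the same attributes).
fake_program = {
    0x401000: SimpleNamespace(name="main"),
    0x400F00: SimpleNamespace(name="init"),
}

for addr, name in function_names(fake_program).items():
    print(f"{addr:#x}: {name}")
```

With a real binary, `fake_program` would be replaced by a Program instance obtained from one of the documented constructors.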

property callgraph: networkx.DiGraph

The function call graph as a networkx DiGraph

property capabilities: ProgramCapability

Returns the underlying backend capabilities

clear() → None

Remove all items from D.

property edges: OutEdgeView[tuple[Addr, Addr]]

Iterate over the edges. An edge is a pair (addr_a, addr_b)

Returns:

An OutEdgeView over the edges.

property exec_path: str | None

The executable path if it has been specified, None otherwise

follow_through(to_remove: Addr, target: Addr) → None

Remove the node to_remove and add a follow-through edge from each of its parents to the node target.

Example: { parents } -> (to_remove) -> (target)

--> { parents } -> (target)

Parameters:
  • to_remove – node to remove

  • target – target node

Returns:

None
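
The rewiring described above can be illustrated on a plain adjacency dictionary. This is a toy sketch of the documented behaviour, not qbindiff's implementation; the function name `follow_through_sketch`, the `children` dictionary, and the node names are all made up for the example.

```python
def follow_through_sketch(children, to_remove, target):
    """Drop `to_remove` and connect each of its parents directly to `target`.

    `children` maps a node to the set of nodes it points to (a toy call graph).
    """
    parents = [node for node, succ in children.items() if to_remove in succ]
    for parent in parents:
        children[parent].discard(to_remove)
        children[parent].add(target)
    children.pop(to_remove, None)


# { a, b } -> (thunk) -> (real)
graph = {"a": {"thunk"}, "b": {"thunk"}, "thunk": {"real"}, "real": set()}
follow_through_sketch(graph, "thunk", "real")
print(graph)
```

After the call, both `a` and `b` point straight at `real` and the `thunk` node is gone.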

static from_backend(backend: AbstractProgramBackend) → Program

Load the Program from an instantiated program backend object

static from_binexport(file_path: str, arch: str | None = None) → Program

Load the Program using the binexport backend

Parameters:
  • file_path – File path to the binexport file

  • arch – Architecture to pass to the capstone disassembler. This is useful when the architecture recorded in the BinExport file is not enough to correctly disassemble the binary (for example with ARM Thumb-2 or some MIPS modes).

Returns:

Program instance

static from_ida() → Program

Load the program using the IDA backend

Returns:

Program instance

static from_quokka(file_path: str, exec_path: str) → Program

Load the Program using the Quokka backend.

Parameters:
  • file_path – File path to the Quokka file

  • exec_path – Path of the raw binary

Returns:

Program instance

get(k[, d])

Returns D[k] if k is in D, else d. d defaults to None.

get_function(name: str) → Function

Returns the function by its name

Parameters:

name – name of the function

Returns:

the function

get_node(node_label: Addr) → Function

Get the function identified by the address node_label

Parameters:

node_label – the address of the function that will be returned

Returns:

the function identified by its address

items() → Iterator[tuple[Addr, Function]]

Iterate over the items. Each item is a pair (address, Function).

Returns:

An Iterator over the functions. Each element is a tuple (function_addr, function_obj)

keys()

Returns a set-like object providing a view on D's keys.

property name: str

Returns the name of the program as defined by the backend

property node_labels: Iterator[Addr]

Iterate over the function addresses

Returns:

An Iterator over the function addresses

property nodes: Iterator[Function]

Iterate over the functions

Returns:

An Iterator over the functions

pop(k[, d])

Removes the specified key and returns the corresponding value. If the key is not found, d is returned if given, otherwise KeyError is raised.

popitem()

Removes and returns some (key, value) pair as a 2-tuple; raises KeyError if D is empty.

remove_function(to_remove: Addr) → None

Remove the node to_remove from the call graph of the program.

WARNING: The follow-through edges from the parents to the children are not added. Example:

{ parents } -> (to_remove) -> { children }

--> { parents }                   { children }

Parameters:

to_remove – address of the function to remove

Returns:

None

set_function_filter(func: Callable[[Addr], bool]) → None

Filter out some functions, to ignore them in later processing.

Parameters:

func – callable that takes the function address (the node label) and returns whether to keep it.
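
A filter is simply a predicate over addresses. The sketch below builds one that keeps only the functions located in a given address range; `make_range_filter` and the addresses are hypothetical and chosen purely for illustration.

```python
def make_range_filter(start, end):
    """Build a predicate that keeps only addresses in [start, end)."""

    def keep(addr):
        return start <= addr < end

    return keep


keep_text = make_range_filter(0x401000, 0x402000)
print(keep_text(0x401234), keep_text(0x500000))

# With a loaded Program, the predicate would be registered with:
#     program.set_function_filter(keep_text)
```

Returning a closure keeps the range bounds out of global state, so several filters with different bounds can coexist.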

setdefault(k[, d])

Returns D.get(k, d), and also sets D[k] = d if k is not in D.

property structures: list[Structure]

Returns the list of structures defined in the program

update([E, ]**F)

Update D from mapping/iterable E and F.

  • If E is present and has a .keys() method: for k in E: D[k] = E[k]

  • If E is present and lacks a .keys() method: for (k, v) in E: D[k] = v

  • In either case, this is followed by: for k, v in F.items(): D[k] = v

values()

Returns an object providing a view on D's values.

Function

class qbindiff.Function(backend: AbstractFunctionBackend)

Bases: Mapping, GenericNode

Representation of a binary function.

This class is an immutable mapping from basic block addresses to the basic blocks themselves.

It lazily loads all the basic blocks when they are iterated over, or even when a single one is accessed, and unloads them all once the iteration has ended.

To keep a reference to the basic blocks, the with statement can be used, for example:

# func: Function
with func:  # Loading all the basic blocks
    for bb_addr, bb in func.items():  # Blocks are already loaded
        pass
    # The blocks are still loaded
    for bb_addr, bb in func.items():
        pass
# here the blocks have been unloaded

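
The load-on-demand and unload-after-use behaviour described above can be reproduced with a small context-managed mapping. This is only a sketch of one way to get such behaviour; qbindiff's actual implementation may differ, and `LazyBlocks`, its `loaded` property, and the loader callable are hypothetical names.

```python
from collections.abc import Mapping


class LazyBlocks(Mapping):
    """Toy mapping that loads its content on demand and unloads it afterwards."""

    def __init__(self, loader):
        self._loader = loader  # callable returning {address: block}
        self._blocks = None    # None means "not loaded"
        self._depth = 0        # number of active `with` blocks

    def __enter__(self):
        self._depth += 1
        if self._blocks is None:
            self._blocks = self._loader()
        return self

    def __exit__(self, *exc):
        self._depth -= 1
        if self._depth == 0:
            self._blocks = None  # release the blocks

    @property
    def loaded(self):
        return self._blocks is not None

    def __getitem__(self, addr):
        with self:
            return self._blocks[addr]

    def __iter__(self):
        with self:  # stays loaded for the whole iteration
            yield from list(self._blocks)

    def __len__(self):
        with self:
            return len(self._blocks)


func = LazyBlocks(lambda: {0x1000: "bb0", 0x1010: "bb1"})
print(func.loaded)      # nothing loaded before first use
with func:
    addrs = [addr for addr, _ in func.items()]
    print(func.loaded)  # still loaded inside the with block
print(func.loaded)      # unloaded once the with block exits
```

The nesting counter is what lets an outer `with` keep the blocks alive across several inner iterations.
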
property addr: Addr

Address of the function

property children: set[Addr]

Set of functions called by this function in the call graph.

property edges: list[tuple[Addr, Addr]]

Edges of the function flowgraph as a list of tuples with basic block addresses

property flowgraph: networkx.DiGraph

The networkx DiGraph of the function. It can be used to run networkx-based algorithms.

static from_backend(backend: AbstractFunctionBackend) → Function

Load the Function from an instantiated function backend object

get(k[, d])

Returns D[k] if k is in D, else d. d defaults to None.

get_label() → Addr

Get the address associated with this function

Returns:

The address associated with the function

is_alone() → bool

Returns whether the function has neither callers nor callees.

Returns:

bool

is_import() → bool

Returns whether this function is an imported function (and thus has no content).

Returns:

bool

is_library() → bool

Returns whether or not this function is a library function.

A library function is either a thunk function or it has been identified as part of an external library. It is not an imported function.

Returns:

bool

is_thunk() → bool

Returns whether this function is a thunk function.

Returns:

bool

items() → Iterator[tuple[Addr, BasicBlock]]

Returns a generator of tuples with the addresses of basic blocks and the corresponding basic block objects

Returns:

generator (addr, basicblock)

keys()

Returns a set-like object providing a view on D's keys.

property name: str

Name of the function

property parents: set[Addr]

Set of parents of this function in the call graph, i.e. the functions that call this function

property type: FunctionType

Returns the type of the function (as defined by IDA)

values()

Returns an object providing a view on D's values.

BasicBlock

class qbindiff.loader.BasicBlock(backend: AbstractBasicBlockBackend)

Bases: Iterable[Instruction]

Representation of a binary basic block. This class is an Iterable of Instruction.

property addr: int

Address of the basic block

property bytes: bytes

Raw bytes of basic block instructions.

static from_backend(backend: AbstractBasicBlockBackend) → BasicBlock

Load the BasicBlock from an instantiated basic block backend object

Parameters:

backend – backend to use

Returns:

the loaded basic block

property instructions: list[Instruction]

List of Instruction objects of the basic block
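
Since a Function maps addresses to BasicBlock objects and a BasicBlock is an Iterable of Instruction, a whole function can be traversed with two nested loops. The sketch below counts mnemonics that way. `mnemonic_histogram` is a hypothetical helper, and a dict of lists stands in for the real objects so that the snippet runs without a backend.

```python
from collections import Counter
from types import SimpleNamespace


def mnemonic_histogram(function):
    """Count instruction mnemonics over all basic blocks of a function-like mapping."""
    counts = Counter()
    for _addr, block in function.items():
        for instr in block:  # a BasicBlock is documented as an Iterable of Instruction
            counts[instr.mnemonic] += 1
    return counts


def fake_instr(mnemonic):
    # Stand-in (assumption): only the documented `mnemonic` property is mimicked.
    return SimpleNamespace(mnemonic=mnemonic)


# A dict plays the Function, lists play the BasicBlocks.
fake_function = {
    0x1000: [fake_instr("push"), fake_instr("mov"), fake_instr("call")],
    0x1010: [fake_instr("mov"), fake_instr("ret")],
}
print(mnemonic_histogram(fake_function))
```

On a real Function, wrapping the call in a `with func:` block would keep the basic blocks loaded for any further passes.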

Instruction

class qbindiff.loader.Instruction(backend: AbstractInstructionBackend)

Bases: object

Defines an Instruction object that wraps the underlying backend.

property addr: int

Returns the address of the instruction

property bytes: bytes

Returns the bytes representation of the instruction

property comment: str

Comment as set in IDA on the instruction

property data_references: list[Data]

Returns the list of data that are referenced by the instruction

Warning

The BinExport backend tends to return empty references, so the data references are often empty as well

static from_backend(backend: AbstractInstructionBackend) → Instruction

Load the Instruction from an instantiated instruction backend object

property groups: list[InstructionGroup]

Returns a list of groups of this instruction.

Warning

Requires INSTR_GROUP capability

property id: int

Return the instruction ID as int

property mnemonic: str

Returns the instruction mnemonic as a string

property operands: list[Operand]

Returns the list of operands as Operand objects.

property pcode_ops: list[PcodeOp]

List of PcodeOp associated with the instruction.

Warning

Requires PCODE capability

property references: dict[ReferenceType, list[Data | Structure | StructureMember]]

Returns all the references towards the instruction

Operand

class qbindiff.loader.Operand(backend: AbstractOperandBackend)

Bases: object

Represents an operand object which hides the underlying backend implementation

static from_backend(backend: AbstractOperandBackend) → Operand

Load the Operand from an instantiated operand backend object

is_immediate() → bool

Whether the operand is an immediate (not considering addresses)

property type: OperandType

The operand type, as defined in the IDA API. Example: 1 corresponds to a register (e.g. rax)

property value: int | None

The immediate value (not an address) used by the operand, or None if the operand is not an immediate.
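
The `is_immediate()` method and the `value` property combine naturally when collecting the constants used by an instruction. The following is a minimal sketch: `immediate_values` and `fake_operand` are hypothetical helpers, and the stand-in objects mimic only the documented members.

```python
from types import SimpleNamespace


def immediate_values(instruction):
    """Return the immediate values used by an instruction-like object."""
    return [op.value for op in instruction.operands if op.is_immediate()]


def fake_operand(value=None):
    # Stand-in (assumption): mimics the documented is_immediate() / value pair.
    return SimpleNamespace(value=value, is_immediate=lambda: value is not None)


# One non-immediate operand followed by one immediate operand.
fake_instruction = SimpleNamespace(operands=[fake_operand(), fake_operand(0x10)])
print(immediate_values(fake_instruction))
```

Filtering on `is_immediate()` rather than on `value is not None` keeps the intent explicit and follows the documented interface.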

Data

class qbindiff.loader.Data(data_type: DataType, addr: int, value: Any)

Bases: object

Class that represents a data reference

Structure

class qbindiff.loader.Structure(struct_type: StructureType, name: str, size: int)

Bases: object

Class that represents a struct reference

add_member(offset: int, data_type: DataType, name: str, size: int, value: Any) → None

Add a new member to the struct at the given offset

Parameters:
  • offset – offset where to add the member

  • data_type – type of the member

  • name – its name

  • size – its size

  • value – its value

Returns:

None

member_by_name(name: str) → StructureMember | None

Get member by name. WARNING: time complexity O(n)

Parameters:

name – name from which we want to recover the structure member

Returns:

member of the structure denoted by its name or None

StructureMember

class qbindiff.loader.StructureMember(data_type: DataType, name: str, size: int, value: Any, structure: Structure)

Bases: object

Class that represents a struct member reference

ReferenceType

enum qbindiff.loader.types.ReferenceType(value)

Reference types.

Member Type:

int

Valid values are as follows:

DATA = <ReferenceType.DATA: 0>

Reference is data

ENUM = <ReferenceType.ENUM: 1>

Reference is an enum

STRUC = <ReferenceType.STRUC: 2>

Reference is a structure

UNKNOWN = <ReferenceType.UNKNOWN: 3>

Reference type is unknown

LoaderType

enum qbindiff.loader.types.LoaderType(value)

Enum of different loaders (supported or not)

Member Type:

int

Valid values are as follows:

binexport = <LoaderType.binexport: 0>

binexport loader

diaphora = <LoaderType.diaphora: 1>

diaphora loader (not supported)

ida = <LoaderType.ida: 2>

IDA loader

quokka = <LoaderType.quokka: 3>

Quokka loader

OperandType

enum qbindiff.loader.types.OperandType(value)

All the operand types as defined by IDA

Member Type:

int

Valid values are as follows:

unknown = <OperandType.unknown: 0>

type is unknown

register = <OperandType.register: 1>

register (GPR)

memory = <OperandType.memory: 2>

Direct memory reference

immediate = <OperandType.immediate: 3>

Immediate value

float_point = <OperandType.float_point: 4>

Floating point operand

coprocessor = <OperandType.coprocessor: 5>

Coprocessor operand

arm_setend = <OperandType.arm_setend: 6>

operand for SETEND instruction ('BE'/'LE')

arm_sme = <OperandType.arm_sme: 7>

operand for SME instruction (matrix operation)

arm_memory_management = <OperandType.arm_memory_management: 8>

Memory management operand like prefetch, SYS and barrier

FunctionType

enum qbindiff.loader.types.FunctionType(value)

Function types as defined by IDA.

Member Type:

int

Valid values are as follows:

normal = <FunctionType.normal: 0>

Normal function

library = <FunctionType.library: 1>

Function identified as a library one

imported = <FunctionType.imported: 2>

Imported function, e.g. a function in the PLT

thunk = <FunctionType.thunk: 3>

Function identified as thunk (trampoline to another one)

invalid = <FunctionType.invalid: 4>

Invalid function (not properly disassembled)

extern = <FunctionType.extern: 5>

External symbol (function without content)
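
The member descriptions above suggest a simple rule of thumb: imported and extern functions carry no content. The sketch below encodes that reading. `FunctionTypeSketch` is a local mirror of the documented values so that the snippet runs without qbindiff, and `has_content` is a hypothetical helper, not a library function.

```python
from enum import IntEnum


class FunctionTypeSketch(IntEnum):
    """Local mirror of the documented FunctionType values."""

    normal = 0
    library = 1
    imported = 2
    thunk = 3
    invalid = 4
    extern = 5


# Types whose documentation states that the function has no content.
NO_CONTENT = {FunctionTypeSketch.imported, FunctionTypeSketch.extern}


def has_content(ftype):
    """Tell whether a function of this type is documented as having content."""
    return ftype not in NO_CONTENT


print([t.name for t in FunctionTypeSketch if has_content(t)])
```

Whether an `invalid` function has usable content is not stated in the descriptions, so the sketch leaves it on the "has content" side; callers may want to treat it separately.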

DataType

enum qbindiff.loader.types.DataType(value)

Types of data

Member Type:

int

Valid values are as follows:

UNKNOWN = <DataType.UNKNOWN: 0>

Data type is unknown

BYTE = <DataType.BYTE: 1>

1 byte

WORD = <DataType.WORD: 2>

2 bytes

DOUBLE_WORD = <DataType.DOUBLE_WORD: 3>

4 bytes

QUAD_WORD = <DataType.QUAD_WORD: 4>

8 bytes

OCTO_WORD = <DataType.OCTO_WORD: 5>

16 bytes

FLOAT = <DataType.FLOAT: 6>

float value

DOUBLE = <DataType.DOUBLE: 7>

double value

ASCII = <DataType.ASCII: 8>

ASCII string

StructureType

enum qbindiff.loader.types.StructureType(value)

Different structure types.

Member Type:

int

Valid values are as follows:

UNKNOWN = <StructureType.UNKNOWN: 0>

Type unknown

STRUCT = <StructureType.STRUCT: 1>

Type is structure

ENUM = <StructureType.ENUM: 2>

Type is enum

UNION = <StructureType.UNION: 3>

Type is union

ReferenceTarget

qbindiff.loader.types.ReferenceTarget: TypeAlias = 'Data | Structure | StructureMember'

Data reference target