Features

Address

class qbindiff.features.Address(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Address of the function as a feature

Parameters:

weight – weight to apply to this feature

key: str = 'addr'

feature name (short)

visit_function(_: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

DatName

class qbindiff.features.DatName(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

References to data in the instruction (as retrieved by the backend loader). This feature maps each data value to the number of references to it. It is a superset of the StrRef feature.

Parameters:

weight – weight to apply to this feature

key: str = 'dat'

feature name (short)

visit_instruction(_: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

StrRef

class qbindiff.features.StrRef(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

References to strings in the instruction. This feature maps each string to the number of references to it.
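
Example (illustrative sketch, not the library's implementation): the mapping from strings to reference counts can be pictured with a plain counter. The input shape, one list of referenced strings per instruction, and the helper name count_string_refs are assumptions made for this example.

```python
from collections import Counter

def count_string_refs(instructions):
    """Map each referenced string to its number of references.

    `instructions` is a hypothetical iterable in which every item is the
    list of strings referenced by one instruction.
    """
    counts = Counter()
    for string_refs in instructions:
        counts.update(string_refs)
    return dict(counts)

# Three instructions: the second one references no string.
refs = count_string_refs([["usage: %s"], [], ["usage: %s", "error"]])
```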

Parameters:

weight – weight to apply to this feature

key: str = 'strref'

feature name (short)

visit_instruction(_: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

Constant

class qbindiff.features.Constant(weight: float = 1.0)[source]

Bases: OperandFeatureExtractor

Numeric constants (32/64 bits) in the instruction, excluding addresses. This feature maps each numeric value to its number of occurrences. Addresses are excluded by relying on IDA to distinguish them from constants.

Parameters:

weight – weight to apply to this feature

key: str = 'cst'

feature name (short)

visit_operand(_: Program, operand: Operand, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an operand in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • operand – operand being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

FuncName

class qbindiff.features.FuncName(*args: Any, excluded_regex: Pattern[str] | None = None, **kwargs: Any)[source]

Bases: FunctionFeatureExtractor

Match the function names. Optionally, the constructor takes a regular expression pattern to exclude function names.

Parameters:
  • args – parameters of a feature extractor

  • excluded_regex – regex to apply in order to exclude names

  • kwargs – keyword arguments

is_excluded(function: Function) bool[source]

Returns whether the function should be excluded (and not considered), based on an optional regex.

Parameters:

function – function to consider

Returns:

bool
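
Example (illustrative sketch): the exclusion check can be pictured as a regex test on the function name. The pattern below and the use of match() are assumptions made for this example; the library may apply the pattern differently.

```python
import re

# Hypothetical pattern: ignore auto-generated names such as sub_401000.
excluded_regex = re.compile(r"^sub_[0-9A-Fa-f]+$")

def is_excluded(name, pattern=excluded_regex):
    """Return True when a pattern is set and the name matches it."""
    # Assumption: the pattern is tested from the start of the name.
    return pattern is not None and pattern.match(name) is not None
```

With no pattern (None), no function is excluded.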

key: str = 'fname'

feature name (short)

visit_function(_: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

BBlockNb

class qbindiff.features.BBlockNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Number of basic blocks in the function as a feature.

Parameters:

weight – weight to apply to this feature

key: str = 'bnb'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

JumpNb

class qbindiff.features.JumpNb(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

Number of jumps in the function.

Parameters:

weight – weight to apply to this feature

key: str = 'jnb'

feature name (short)

visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

MaxParentNb

class qbindiff.features.MaxParentNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Maximum number of parents of a basic block in the function.

Parameters:

weight – weight to apply to this feature

key: str = 'maxp'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

MaxChildNb

class qbindiff.features.MaxChildNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Maximum number of children of a basic block in the function.
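
Example (illustrative sketch, not the library's implementation): given the CFG as a list of (source, destination) block edges, the maximum number of parents and of children are the largest in-degree and out-degree. The edge-list input and the helper name are assumptions made for this example.

```python
from collections import Counter

def max_parents_and_children(edges):
    """Return (max in-degree, max out-degree) over the blocks of a CFG."""
    parents = Counter(dst for _, dst in edges)    # incoming edges per block
    children = Counter(src for src, _ in edges)   # outgoing edges per block
    return max(parents.values(), default=0), max(children.values(), default=0)

# Diamond-shaped CFG: 0 -> {1, 2} -> 3
result = max_parents_and_children([(0, 1), (0, 2), (1, 3), (2, 3)])
```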

Parameters:

weight – weight to apply to this feature

key: str = 'maxc'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

MaxInsNB

class qbindiff.features.MaxInsNB(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Maximum number of instructions per basic block in the function.

Parameters:

weight – weight to apply to this feature

key: str = 'maxins'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

MeanInsNB

class qbindiff.features.MeanInsNB(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Mean number of instructions per basic block in the function.

Parameters:

weight – weight to apply to this feature

key: str = 'meanins'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

InstNB

class qbindiff.features.InstNB(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Number of instructions in the function.

Parameters:

weight – weight to apply to this feature

key: str = 'totins'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphMeanDegree

class qbindiff.features.GraphMeanDegree(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Mean degree of the function flow graph (CFG).

Parameters:

weight – weight to apply to this feature

key: str = 'Gmd'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphDensity

class qbindiff.features.GraphDensity(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Density of the function flow graph (CFG).
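
Example (illustrative sketch): assuming the usual definition for a directed graph without self-loops, the density is the number of edges divided by the number of possible edges, n * (n - 1). Whether the library uses exactly this convention is an assumption.

```python
def graph_density(num_nodes, num_edges):
    """Density of a directed graph: edges / (nodes * (nodes - 1))."""
    if num_nodes < 2:
        # No edge is possible between distinct nodes.
        return 0.0
    return num_edges / (num_nodes * (num_nodes - 1))

# Diamond-shaped CFG: 4 blocks, 4 edges out of 12 possible ones.
density = graph_density(4, 4)
```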

Parameters:

weight – weight to apply to this feature

key: str = 'Gd'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphNbComponents

class qbindiff.features.GraphNbComponents(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Number of connected components in the function flow graph (i.e. disconnected subgraphs). This can happen because of the way IDA disassembles functions.

Parameters:

weight – weight to apply to this feature

key: str = 'Gnc'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphDiameter

class qbindiff.features.GraphDiameter(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Graph diameter of the function flow graph (CFG).

Parameters:

weight – weight to apply to this feature

key: str = 'Gdi'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphTransitivity

class qbindiff.features.GraphTransitivity(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Transitivity of the function flow graph (CFG).

Parameters:

weight – weight to apply to this feature

key: str = 'Gt'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

GraphCommunities

class qbindiff.features.GraphCommunities(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Number of graph communities (Louvain modularity).

Parameters:

weight – weight to apply to this feature

key: str = 'Gcom'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

MnemonicSimple

class qbindiff.features.MnemonicSimple(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

Mnemonic feature. It extracts a dictionary with mnemonic as key and 1 as value.

Parameters:

weight – weight to apply to this feature

key: str = 'M'

feature name (short)

visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

MnemonicTyped

class qbindiff.features.MnemonicTyped(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

Typed mnemonic feature. It extracts a dictionary whose keys combine the mnemonic with the types of its operands (e.g. I: immediate, R: register), so mov rax, 10 becomes MOVRI. The value is 1 if the typed mnemonic is present.
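
Example (illustrative sketch): building a typed-mnemonic key from a mnemonic and one-letter operand type codes. The helper name and the way operand types are supplied are assumptions made for this example; only the R and I codes come from the description above.

```python
def typed_mnemonic(mnemonic, operand_types):
    """Concatenate the upper-cased mnemonic with its operand type codes."""
    return mnemonic.upper() + "".join(operand_types)

# mov rax, 10: a register operand followed by an immediate operand
key = typed_mnemonic("mov", ["R", "I"])
features = {key: 1}
```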

Parameters:

weight – weight to apply to this feature

key: str = 'Mt'

feature name (short)

visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

GroupsCategory

class qbindiff.features.GroupsCategory(weight: float = 1.0)[source]

Bases: InstructionFeatureExtractor

Instruction categorization feature. A category can correspond to an instruction subset (XMM, AES, etc.) or to a more generic grouping (arithmetic, comparisons, etc.). As of now, it relies on Capstone groups.

Warning

This feature is under maintenance and currently does nothing.

Parameters:

weight – weight to apply to this feature

key: str = 'Gp'

feature name (short)

visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an instruction in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • instruction – instruction being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature

ChildNb

class qbindiff.features.ChildNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Function children number. This feature extracts the number of functions called by the current one (in call graph).

Parameters:

weight – weight to apply to this feature

key: str = 'cnb'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

ParentNb

class qbindiff.features.ParentNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Function parent number. This feature extracts the number of functions calling the current one (in call graph).

Parameters:

weight – weight to apply to this feature

key: str = 'pnb'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

RelativeNb

class qbindiff.features.RelativeNb(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Function relatives number. This feature counts both the parents and the children of the current function (in the call graph).
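
Example (illustrative sketch, not the library's implementation): with the call graph given as (caller, callee) pairs, the parents are the callers of a function and the children are the functions it calls. Counting distinct functions rather than individual call sites is an assumption made for this example.

```python
def relatives(call_edges, function):
    """Return (parents, children, relatives) counts for one function."""
    children = {callee for caller, callee in call_edges if caller == function}
    parents = {caller for caller, callee in call_edges if callee == function}
    return len(parents), len(children), len(parents) + len(children)

call_graph = [("main", "f"), ("main", "g"), ("f", "g"), ("g", "h")]
counts = relatives(call_graph, "g")  # g is called by main and f, and calls h
```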

Parameters:

weight – weight to apply to this feature

key: str = 'rnb'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

LibName

class qbindiff.features.LibName(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Library (internal) calls feature. This feature computes a dictionary with the called library functions as keys and their call counts as values. It relies on the backend loader to correctly identify a function as a library function.

Parameters:

weight – weight to apply to this feature

key: str = 'lib'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

ImpName

class qbindiff.features.ImpName(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

External calls feature. It computes a dictionary with the called external functions as keys and their call counts as values. External functions are functions imported dynamically.

Parameters:

weight – weight to apply to this feature

key: str = 'imp'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

WeisfeilerLehman

class qbindiff.features.WeisfeilerLehman(*args, lsh: type[LSH] | None = None, max_passes: int = -1, **kwargs)[source]

Bases: FunctionFeatureExtractor

Weisfeiler-Lehman Graph Kernel feature. It’s strongly suggested to use the cosine distance with this feature. Options: [‘max_passes’: int]

Extract a feature vector by using a custom defined node labeling scheme.

Parameters:
  • lsh – The Locality Sensitive Hashing function to use. Must inherit from LSH. If None is specified, BOWLSH is used.

  • max_passes – The maximum number of iterations allowed. If set to -1, no limit is applied.

key: str = 'wlgk'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector)[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

BOWLSH

class qbindiff.features.wlgk.BOWLSH(node: BasicBlock | None = None)[source]

Bases: LSH

Extract the bag-of-words representation of a block. The hashes are 4 bytes long.

add(lsh: LSH) None[source]

Add the hash lsh to the current hash

copy() BOWLSH[source]

class property hyperplanes

Generate the hyperplanes for the LSH. Each hyperplane is identified by its normal vector v from R^2000, i.e. it is the set of points x with v * x = 0. The dimension 2000 should be sufficient to characterize basic blocks of assembly. Warning: this method leaks memory, as the hyperplanes are never deallocated.
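
Example (illustrative sketch of generic random-hyperplane hashing, not the BOWLSH implementation): each hyperplane contributes one bit, set when the input vector lies on its positive side, so 32 hyperplanes yield a 4-byte hash. The Gaussian sampling, the fixed seed, the reduced dimension and the bit layout are all assumptions made for this example.

```python
import random

DIM = 8         # the documentation uses 2000; kept small here for readability
HASH_BITS = 32  # 4-byte hash, one bit per hyperplane

rng = random.Random(0)
# One normal vector per hash bit (assumed Gaussian components).
hyperplanes = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(HASH_BITS)]

def lsh_hash(x):
    """Set bit i when x lies on the positive side of hyperplane i (v_i . x > 0)."""
    value = 0
    for bit, normal in enumerate(hyperplanes):
        if sum(v * xi for v, xi in zip(normal, x)) > 0:
            value |= 1 << bit
    return value
```

Only the sign of each dot product matters, so scaling a vector by a positive factor leaves its hash unchanged.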

LSH

class qbindiff.features.wlgk.LSH(node: BasicBlock)[source]

Bases: object

Abstract class representing a Locality Sensitive Hashing function. It defines the interface to the function.

abstract add(lsh: LSH) None[source]

Add the hash lsh to the current hash

StronglyConnectedComponents

class qbindiff.features.StronglyConnectedComponents(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Number of strongly connected components in a function CFG.
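
Example (illustrative sketch, not the library's implementation): counting strongly connected components with Tarjan's algorithm on a CFG given as a node list and an edge list. The input representation and the function name are assumptions made for this example; the recursive form is only suitable for small graphs.

```python
def count_sccs(nodes, edges):
    """Count strongly connected components with Tarjan's algorithm."""
    succ = {n: [] for n in nodes}
    for src, dst in edges:
        succ[src].append(dst)

    index = {}     # discovery order of each visited node
    lowlink = {}   # smallest index reachable from the node's DFS subtree
    stack, on_stack = [], set()
    count = 0

    def visit(node):
        nonlocal count
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in succ[node]:
            if nxt not in index:
                visit(nxt)
                lowlink[node] = min(lowlink[node], lowlink[nxt])
            elif nxt in on_stack:
                lowlink[node] = min(lowlink[node], index[nxt])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop all of its members
            count += 1
            while True:
                member = stack.pop()
                on_stack.discard(member)
                if member == node:
                    break

    for node in nodes:
        if node not in index:
            visit(node)
    return count

# Blocks 1 and 2 form a loop; blocks 0 and 3 are components on their own.
scc_count = count_sccs([0, 1, 2, 3], [(0, 1), (1, 2), (2, 1), (2, 3)])
```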

Parameters:

weight – weight to apply to this feature

key: str = 'scc'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

BytesHash

class qbindiff.features.BytesHash(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Hash of the function, computed over its instructions sorted by address. The hashing function used is MD5.
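
Example (illustrative sketch, not the library's implementation): hashing the raw bytes of the instructions in address order with MD5. The (address, raw_bytes) input pairs and the hexadecimal output are assumptions made for this example.

```python
import hashlib

def bytes_hash(instructions):
    """MD5 over instruction bytes, visited in increasing address order.

    `instructions` is a hypothetical iterable of (address, raw_bytes) pairs.
    """
    digest = hashlib.md5()
    for _, raw in sorted(instructions, key=lambda pair: pair[0]):
        digest.update(raw)
    return digest.hexdigest()

# The input order does not matter because instructions are sorted first.
h = bytes_hash([(0x1001, b"\xc3"), (0x1000, b"\x90")])
```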

Parameters:

weight – weight to apply to this feature

key: str = 'bh'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

CyclomaticComplexity

class qbindiff.features.CyclomaticComplexity(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Cyclomatic complexity of the function CFG.
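
Example (illustrative sketch): the textbook definition of cyclomatic complexity is E - N + 2P, with E edges, N nodes and P connected components. Whether the library computes the feature exactly this way is an assumption.

```python
def cyclomatic_complexity(num_edges, num_nodes, num_components=1):
    """Textbook cyclomatic complexity: E - N + 2P."""
    return num_edges - num_nodes + 2 * num_components

# Diamond-shaped CFG (one if/else): 4 edges, 4 blocks, 1 component.
complexity = cyclomatic_complexity(4, 4)
```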

Parameters:

weight – weight to apply to this feature

key: str = 'cc'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

MDIndex

class qbindiff.features.MDIndex(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

MD-Index of the function, based on https://www.sto.nato.int/publications/STO%20Meeting%20Proceedings/RTO-MP-IST-091/MP-IST-091-26.pdf. This is a slightly modified version: the topological sort is only defined for directed acyclic graphs (DAGs), and a function CFG is not always one.

Parameters:

weight – weight to apply to this feature

key: str = 'mdidx'

feature name (short)

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

SmallPrimeNumbers

class qbindiff.features.SmallPrimeNumbers(weight: float = 1.0)[source]

Bases: FunctionFeatureExtractor

Small-Prime-Number hash based on mnemonics, as defined in BinDiff. This hash is slightly different from the theoretical implementation: the modulo is applied at each round instead of once at the end.
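
Example (illustrative sketch, not the library's implementation): a small-primes hash multiplies the primes assigned to the mnemonics of a function and reduces the running product at every round. The mnemonic-to-prime table and the modulus below are assumptions made for this example.

```python
# Hypothetical mnemonic-to-prime table; a real table would cover every mnemonic.
PRIMES = {"mov": 2, "add": 3, "cmp": 5, "jmp": 7, "ret": 11}
MODULUS = 2 ** 64  # assumed modulus, chosen only for illustration

def small_primes_hash(mnemonics):
    """Product of the primes of all mnemonics, reduced at each round."""
    value = 1
    for mnemonic in mnemonics:
        # Reduce at every round so the running product stays bounded.
        value = (value * PRIMES[mnemonic]) % MODULUS
    return value

sp_hash = small_primes_hash(["mov", "add", "ret"])
```

Because multiplication is commutative, the result does not depend on the order of the instructions.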

Parameters:

weight – weight to apply to this feature

key: str = 'spp'

feature name (short)

static primesbelow(n: int) list[int][source]

Utility function that returns a list of all the primes below n. This comes from Diaphora.

Parameters:

n – integer n

Returns:

list of prime integers below n
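
Example (illustrative sketch): a plain sieve of Eratosthenes with the same contract, returning all primes strictly below n. This is not the Diaphora-derived code, and the name primes_below is an assumption made for this example.

```python
def primes_below(n):
    """Return all primes strictly below n (sieve of Eratosthenes)."""
    if n < 3:
        return []
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if is_prime[i]:
            # Every multiple of i from i*i upwards is composite.
            for multiple in range(i * i, n, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]

small_primes = primes_below(20)
```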

visit_function(program: Program, function: Function, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters a function in the program. Inheriting classes implement the feature extraction in this method.

Parameters:
  • program – Program being visited

  • function – Function object being visited

  • collector – collector in which to save the feature value.

property weight: float

Weight applied to the feature

ReadWriteAccess

class qbindiff.features.ReadWriteAccess(weight: float = 1.0)[source]

Bases: OperandFeatureExtractor

Number of memory accesses in the function, both reads and writes.

Parameters:

weight – weight to apply to this feature

key: str = 'rwa'

feature name (short)

visit_operand(program: Program, operand: Operand, collector: FeatureCollector) None[source]

Function called by the visitor when it encounters an operand in the program. Inheriting classes must implement this method.

Parameters:
  • program – program being visited

  • operand – operand being visited

  • collector – collector in which to save the feature value

property weight: float

Weight applied to the feature