Features¶
Address¶
- class qbindiff.features.Address(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Address of the function as a feature
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(_: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
DatName¶
- class qbindiff.features.DatName(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
References to data in the instruction (as retrieved by the backend loader). This feature maps each data value to the number of references to it. It is a superset of the StrRef feature.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = "References to data in the instruction (as retrieved by the backend loader).\n This feature maps the data value to the number of reference occurences to it.\n It's a superset of StrRef (strref) feature."¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_instruction(_: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
StrRef¶
- class qbindiff.features.StrRef(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
References to strings in the instruction. This feature maps each string to the number of references to it.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'References to strings in the instruction.\n This feature maps the string to the number of occurences to it.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_instruction(_: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
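The mapping from strings to reference counts can be illustrated with a minimal sketch. It does not use qbindiff: `count_string_refs` is a hypothetical helper, and the per-instruction lists of referenced strings are made-up sample data standing in for what a backend loader would provide.

```python
from collections import Counter

def count_string_refs(instructions):
    """Map each referenced string to the number of references to it.

    `instructions` is an iterable of lists, one list of referenced
    strings per instruction (hypothetical stand-in for backend data).
    """
    counts = Counter()
    for refs in instructions:
        counts.update(refs)
    return dict(counts)

# Four instructions; the second one references no string at all.
sample = [["%s\n"], [], ["error", "%s\n"], ["error"]]
string_ref_counts = count_string_refs(sample)
print(string_ref_counts)
```

The same counting scheme would apply to DatName, with data values in place of strings.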
Constant¶
- class qbindiff.features.Constant(weight: float = 1.0)[source]¶
Bases:
OperandFeatureExtractor
Numeric constants (32/64 bits) in the instruction, excluding addresses. This feature maps each numerical value to its number of occurrences. It relies on IDA to distinguish addresses from constants.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Numeric constant (32/64bits) in the instruction (not addresses).\n This maps numerical values to the number of occurences to it.\n It excludes the addresses (relies on IDA to discriminate them).'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_operand(_: Program, operand: Operand, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an operand in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
operand – operand being visited
collector – collector in which to save the feature value
FuncName¶
- class qbindiff.features.FuncName(*args: Any, excluded_regex: Pattern[str] | None = None, **kwargs: Any)[source]¶
Bases:
FunctionFeatureExtractor
Match the function names. Optionally, the constructor takes a regular expression pattern to exclude function names.
- Parameters:
args – parameters of a feature extractor
excluded_regex – regex to apply in order to exclude names
kwargs – keyworded arguments
- help_msg: str = 'Match the function names.\n Optionally the constructor takes a regular expression pattern to exclude function names'¶
CLI help message
- is_excluded(function: Function) bool [source]¶
Returns whether the function should be excluded (and not considered), based on the optional regex
- Parameters:
function – function to consider
- Returns:
bool
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(_: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
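The exclusion logic can be sketched with the standard `re` module. This is only an illustration under assumptions: the helper below takes a plain name rather than a Function object, the choice of `match` over `search` is a guess, and the `sub_` pattern is a hypothetical example of auto-generated names.

```python
import re

def is_excluded(name, excluded_regex=None):
    """Return True when `name` matches the optional exclusion pattern."""
    if excluded_regex is None:
        # No pattern given: nothing is excluded.
        return False
    return excluded_regex.match(name) is not None

# Hypothetical pattern: skip auto-generated names such as "sub_401000".
auto_named = re.compile(r"^sub_[0-9A-Fa-f]+$")
names = ["main", "sub_401000", "parse_header"]
kept = [n for n in names if not is_excluded(n, auto_named)]
print(kept)
```

Excluding auto-generated names avoids matching functions on names that carry no information.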
BBlockNb¶
- class qbindiff.features.BBlockNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Number of basic blocks in the function as a feature.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
JumpNb¶
- class qbindiff.features.JumpNb(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
Number of jumps in the function. Requires INSTR_GROUP capability
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Number of jumps in the function.\n Requires INSTR_GROUP capability'¶
CLI help message
- required_capabilities = 2¶
Overrides the default (no required capabilities): this feature requires the INSTR_GROUP capability
- visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
MaxParentNb¶
- class qbindiff.features.MaxParentNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Maximum number of parents of a basic block in the function.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
MaxChildNb¶
- class qbindiff.features.MaxChildNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Maximum number of children of a basic block in the function.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
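To make the MaxParentNb and MaxChildNb quantities concrete, here is a minimal sketch on a toy control-flow graph. The dictionary representation (block name to list of successors) and the helper name are assumptions made for illustration; qbindiff works on its own Function objects.

```python
def max_children_and_parents(cfg):
    """Return (max number of children, max number of parents) over all blocks.

    `cfg` maps each basic block to the list of its successor blocks;
    every block is assumed to appear as a key.
    """
    parents = {block: 0 for block in cfg}
    for successors in cfg.values():
        for succ in successors:
            parents[succ] = parents.get(succ, 0) + 1
    max_children = max((len(s) for s in cfg.values()), default=0)
    max_parents = max(parents.values(), default=0)
    return max_children, max_parents

# Toy CFG: "entry" has three successors, "exit" has two predecessors.
cfg = {"entry": ["a", "b", "c"], "a": ["exit"], "b": ["exit"], "c": [], "exit": []}
result = max_children_and_parents(cfg)
print(result)
```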
MaxInsNB¶
- class qbindiff.features.MaxInsNB(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Maximum number of instructions per basic block in the function.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
MeanInsNB¶
- class qbindiff.features.MeanInsNB(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Mean number of instructions per basic block in the function.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
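The MaxInsNB and MeanInsNB values reduce to simple statistics over basic block sizes. The sketch below assumes the sizes are already available as a list of integers; the helper name and the convention of returning zeros for an empty function are illustrative choices, not qbindiff behavior.

```python
def block_size_stats(block_sizes):
    """Return (max, mean) of the instruction counts of a function's blocks.

    `block_sizes` holds the number of instructions in each basic block.
    """
    if not block_sizes:
        # Assumed convention for a function without basic blocks.
        return 0, 0.0
    return max(block_sizes), sum(block_sizes) / len(block_sizes)

max_ins, mean_ins = block_size_stats([4, 1, 7, 4])
print(max_ins, mean_ins)
```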
InstNB¶
- class qbindiff.features.InstNB(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Number of instructions in the function.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
GraphMeanDegree¶
- class qbindiff.features.GraphMeanDegree(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Mean degree of the function flow graph (CFG).
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
GraphDensity¶
- class qbindiff.features.GraphDensity(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Density of the function flow graph (CFG).
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
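As a rough illustration of graph density, the sketch below uses the usual definition for a directed simple graph, edges / (nodes * (nodes - 1)). Whether qbindiff computes the feature with exactly this formula is an assumption, and the dictionary-based CFG and the helper name are hypothetical.

```python
def directed_density(cfg):
    """Density of a directed simple graph: edges / (nodes * (nodes - 1)).

    `cfg` maps each basic block to the list of its successor blocks.
    """
    nodes = len(cfg)
    if nodes < 2:
        # Density is undefined for fewer than two nodes; 0.0 is a convention.
        return 0.0
    # Duplicate successors are counted once, as in a simple graph.
    edges = sum(len(set(successors)) for successors in cfg.values())
    return edges / (nodes * (nodes - 1))

# Toy CFG with 4 blocks and 4 edges, including one loop back edge.
cfg = {"entry": ["loop"], "loop": ["body", "exit"], "body": ["loop"], "exit": []}
density = directed_density(cfg)
print(density)
```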
GraphNbComponents¶
- class qbindiff.features.GraphNbComponents(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Number of components in the function (non-connected flow graphs). This can happen because of the way IDA disassembles functions.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Number of components in the function (non-connected flow graphs).\n (This can happen in the way IDA disassemble functions)'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
GraphDiameter¶
- class qbindiff.features.GraphDiameter(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Graph diameter of the function flow graph (CFG).
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
GraphTransitivity¶
- class qbindiff.features.GraphTransitivity(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Transitivity of the function flow graph (CFG).
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
GraphCommunities¶
- class qbindiff.features.GraphCommunities(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Number of graph communities (Louvain modularity).
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
PcodeMnemonicSimple¶
- class qbindiff.features.PcodeMnemonicSimple(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
Pcode mnemonic feature. It extracts a dictionary with mnemonic as key and 1 as value.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Pcode mnemonic feature.\n It extracts a dictionary with mnemonic as key and 1 as value.'¶
CLI help message
- required_capabilities = 1¶
Overrides the default (no required capabilities): this feature requires the capability with flag value 1
- visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
MnemonicSimple¶
- class qbindiff.features.MnemonicSimple(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
Mnemonic feature. It extracts a dictionary with mnemonic as key and 1 as value.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Mnemonic feature.\n It extracts a dictionary with mnemonic as key and 1 as value.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
MnemonicTyped¶
- class qbindiff.features.MnemonicTyped(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
Typed mnemonic feature. It extracts a dictionary where each key is a combination of the mnemonic and the types of its operands (e.g. I: immediate, R: register), so that mov rax, 10 becomes MOVRI. The value is 1 if the typed mnemonic is present.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Typed mnemonic feature.\n It extracts a dictionary where key is a combination of the mnemonic\n and the type of the operands.\n e.g I: immediate, R: Register, thus mov rax, 10, becomes MOVRI.\n Values of the dictionary is 1 if the typed mnemonic is present.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
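The key construction described for this feature can be sketched as follows. Only the I and R type letters come from the text above; the helper name and the way operand types are passed in are assumptions for illustration.

```python
def typed_mnemonic(mnemonic, operand_types):
    """Build a key from the mnemonic and one type letter per operand.

    `operand_types` is a list of single-letter codes, e.g. "R" for a
    register and "I" for an immediate.
    """
    return mnemonic.upper() + "".join(operand_types)

# mov rax, 10  ->  register operand followed by an immediate operand
key = typed_mnemonic("mov", ["R", "I"])
features = {key: 1}  # the value is 1 when the typed mnemonic is present
print(features)
```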
GroupsCategory¶
- class qbindiff.features.GroupsCategory(weight: float = 1.0)[source]¶
Bases:
InstructionFeatureExtractor
Categorization of instructions feature. It can correspond to an instruction subset (XMM, AES, etc.) or to a more generic grouping (arithmetic, comparisons, etc.). Requires INSTR_GROUP capability. It relies on InstructionGroups for the different categories.
Warning
As of now there are not many categories. This might change in the future.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Categorization of instructions feature.\n It can correspond to instructions subset (XMM, AES etc..),\n or more generic grouping like (arithmetic, comparisons etc..).\n Requires INSTR_GROUP capability.'¶
CLI help message
- required_capabilities = 2¶
Overrides the default (no required capabilities): this feature requires the INSTR_GROUP capability
- visit_instruction(program: Program, instruction: Instruction, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an instruction in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
instruction – instruction being visited
collector – collector in which to save the feature value
ChildNb¶
- class qbindiff.features.ChildNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Function children number. This feature extracts the number of functions called by the current one (in call graph).
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Function children number.\n This feature extracts the number of functions called by the current one (in call graph).'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
ParentNb¶
- class qbindiff.features.ParentNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Function parent number. This feature extracts the number of functions calling the current one (in call graph).
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Function parent number.\n This feature extracts the number of functions calling the current one (in call graph).'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
RelativeNb¶
- class qbindiff.features.RelativeNb(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Function relatives number. This feature counts both the parents and the children of the current function (in call graph).
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Function relatives number\n This feature counts both the number of parents and children of the current one (in call graph).'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
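The three call graph features (ChildNb, ParentNb, RelativeNb) can be illustrated together on a toy call graph. The sketch assumes the call graph is a dictionary from caller to set of callees and that the relatives count is the sum of the other two; both are assumptions, and the helper name is hypothetical.

```python
def call_graph_counts(call_graph, function):
    """Return (children, parents, relatives) counts for `function`.

    `call_graph` maps each function name to the set of functions it calls.
    """
    children = set(call_graph.get(function, ()))
    parents = {
        caller for caller, callees in call_graph.items() if function in callees
    }
    return len(children), len(parents), len(children) + len(parents)

# Toy call graph: main calls parse and log, parse calls log.
call_graph = {"main": {"parse", "log"}, "parse": {"log"}, "log": set()}
parse_counts = call_graph_counts(call_graph, "parse")
log_counts = call_graph_counts(call_graph, "log")
print(parse_counts, log_counts)
```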
LibName¶
- class qbindiff.features.LibName(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Library (internal) calls feature. This feature computes a dictionary with the library functions called as keys and the call counts as values. It relies on the backend loader to correctly identify a function as a library function.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Library (internal) calls feature.\n This features computes a dictionary of library functions called as keys and the count as values.\n It relies on the backend loader to correctly identify a function as a library.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
ImpName¶
- class qbindiff.features.ImpName(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
External calls feature. It computes a dictionary of external functions called as keys and the count as values. External functions are functions imported dynamically.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'External calls feature.\n It computes a dictionary of external functions called as keys and the count as values.\n External functions are functions imported dynamically.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
WeisfeilerLehman¶
- class qbindiff.features.WeisfeilerLehman(*args, lsh: type[LSH] | None = None, max_passes: int = -1, **kwargs)[source]¶
Bases:
FunctionFeatureExtractor
Weisfeiler-Lehman Graph Kernel feature. Using the cosine distance with this feature is strongly suggested. Options: ['max_passes': int]
Extracts a feature vector using a custom-defined node labeling scheme.
- Parameters:
lsh – The Locality-Sensitive Hashing function to use. Must inherit from LSH. If None is specified, BOWLSH is used
max_passes – The maximum number of iterations allowed. If set to -1, no limit is applied.
- help_msg: str = "Weisfeiler-Lehman Graph Kernel feature.\n It's strongly suggested to use the cosine distance with this feature. Options: ['max_passes': int]"¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector)[source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
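For readers unfamiliar with Weisfeiler-Lehman relabeling, here is a generic sketch of the refinement loop, including a `max_passes` argument that mirrors the -1 convention above. It is not qbindiff's implementation: node labels here are plain integers rather than LSH hashes of basic blocks, the adjacency dictionary is a hypothetical input format, and stopping once the labeling stabilizes is an assumed criterion.

```python
from collections import Counter

def wl_labels(adjacency, initial_labels, max_passes=-1):
    """Refine node labels with Weisfeiler-Lehman iterations.

    `adjacency` maps each node to the list of its neighbours.
    Returns the final labels and the number of passes performed.
    """
    labels = dict(initial_labels)
    passes = 0
    while max_passes < 0 or passes < max_passes:
        # A node's signature is its own label plus its neighbours' labels.
        signatures = {
            node: (labels[node], tuple(sorted(labels[n] for n in adjacency[node])))
            for node in adjacency
        }
        # Compress each distinct signature into a small integer label.
        ids = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_labels = {node: ids[sig] for node, sig in signatures.items()}
        # Stop once the refinement no longer splits any label class.
        if len(set(new_labels.values())) == len(set(labels.values())):
            break
        labels = new_labels
        passes += 1
    return labels, passes

# Path graph a - b - c - d, every node starting with the same label.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
labels, passes = wl_labels(adjacency, {n: 0 for n in adjacency})
feature_vector = Counter(labels.values())  # bag of final labels
print(labels, passes, dict(feature_vector))
```

The bag of labels collected at the end plays the role of the feature vector: two graphs with similar local structure yield similar label counts.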
BOWLSH¶
- class qbindiff.features.wlgk.BOWLSH(node: BasicBlock | None = None)[source]¶
Bases:
LSH
Extract the bag-of-words representation of a block. The hashes are 4 bytes long.
- class property hyperplanes¶
Generate the hyperplanes for the LSH. Each hyperplane is identified by its normal vector v in R^2000, i.e. it is the set of points x such that v * x = 0. A dimension of 2000 should be sufficient to characterize basic blocks of assembly. Warning: this method leaks memory, as the hyperplanes are never deallocated.
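The hyperplane idea can be sketched with a generic random-hyperplane LSH. This is an illustration under assumptions, not the BOWLSH code: the use of 32 hyperplanes is inferred from the 4-byte hash size, the Gaussian sampling, the seed and the sparse bag-of-words input format are all hypothetical choices.

```python
import random

DIMENSION = 2000  # dimension of the vector space named above
HASH_BITS = 32    # assumed: one bit per hyperplane, giving 4-byte hashes

def make_hyperplanes(seed=0):
    """Draw one random normal vector per hash bit."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(DIMENSION)] for _ in range(HASH_BITS)
    ]

def lsh_hash(bag_of_words, hyperplanes):
    """Hash a sparse vector given as {index: count}.

    Each bit records on which side of a hyperplane the vector lies.
    """
    value = 0
    for normal in hyperplanes:
        dot = sum(normal[i] * count for i, count in bag_of_words.items())
        value = (value << 1) | (dot >= 0)
    return value.to_bytes(4, "big")

hyperplanes = make_hyperplanes()
block = {3: 2, 17: 1}          # hypothetical word indices and counts
scaled_block = {3: 4, 17: 2}   # same direction, twice the length
block_hash = lsh_hash(block, hyperplanes)
print(block_hash.hex())
```

Because each bit depends only on the sign of a dot product, the hash depends on the direction of the vector and not on its length, so `block` and `scaled_block` receive the same hash.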
LSH¶
StronglyConnectedComponents¶
- class qbindiff.features.StronglyConnectedComponents(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Number of strongly connected components in a function CFG.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
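Counting strongly connected components can be sketched with Kosaraju's two-pass algorithm. Which algorithm qbindiff relies on is not stated here, so this is only one possible way to obtain the number; the dictionary-based CFG and the helper name are assumptions.

```python
def count_sccs(cfg):
    """Count strongly connected components with Kosaraju's algorithm.

    `cfg` maps each block to the list of its successors; every block
    is assumed to appear as a key.
    """
    # First pass: record nodes in order of DFS completion.
    order, seen = [], set()
    for root in cfg:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(cfg[root]))]
        while stack:
            node, successors = stack[-1]
            for succ in successors:
                if succ not in seen:
                    seen.add(succ)
                    stack.append((succ, iter(cfg[succ])))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()
    # Second pass: explore the reversed graph in reverse completion order.
    reverse = {node: [] for node in cfg}
    for node, successors in cfg.items():
        for succ in successors:
            reverse[succ].append(node)
    assigned, components = set(), 0
    for root in reversed(order):
        if root in assigned:
            continue
        components += 1
        assigned.add(root)
        stack = [root]
        while stack:
            node = stack.pop()
            for pred in reverse[node]:
                if pred not in assigned:
                    assigned.add(pred)
                    stack.append(pred)
    return components

# Toy CFG: "loop" and "body" form a cycle, the other blocks stand alone.
cfg = {"entry": ["loop"], "loop": ["body", "exit"], "body": ["loop"], "exit": []}
scc_count = count_sccs(cfg)
print(scc_count)
```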
BytesHash¶
- class qbindiff.features.BytesHash(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Hash of the function, using the instructions sorted by addresses. The hashing function used is MD5.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'Hash of the function, using the instructions sorted by addresses.\n The hashing function used is MD5.'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
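A minimal sketch of hashing a function from its instructions, sorted by address, using MD5 from the standard `hashlib` module. The input format (pairs of address and raw bytes) and the choice to hash the concatenated raw bytes are assumptions; the helper is hypothetical and not the qbindiff implementation.

```python
import hashlib

def bytes_hash(instructions):
    """MD5 over the instruction bytes, taken in increasing address order.

    `instructions` is an iterable of (address, raw_bytes) pairs.
    """
    digest = hashlib.md5()
    for _, raw in sorted(instructions, key=lambda item: item[0]):
        digest.update(raw)
    return digest.hexdigest()

# The same two instructions listed in different orders hash identically.
in_order = [(0x1000, b"\x90"), (0x1001, b"\xc3")]
shuffled = [(0x1001, b"\xc3"), (0x1000, b"\x90")]
function_hash = bytes_hash(in_order)
print(function_hash)
```

Sorting by address makes the result independent of the order in which instructions are enumerated.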
CyclomaticComplexity¶
- class qbindiff.features.CyclomaticComplexity(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
Cyclomatic complexity of the function CFG.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
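Cyclomatic complexity is commonly defined as M = E - N + 2P, where E is the number of edges, N the number of nodes and P the number of connected components of the CFG. The sketch below applies that formula to a toy graph; the dictionary-based CFG, the helper name and the default of P = 1 are assumptions, and qbindiff may compute the value differently.

```python
def cyclomatic_complexity(cfg, components=1):
    """Compute M = E - N + 2P for a CFG given as {block: [successors]}.

    `components` is P, the number of connected components (assumed 1).
    """
    nodes = len(cfg)
    edges = sum(len(successors) for successors in cfg.values())
    return edges - nodes + 2 * components

# Toy CFG with a single loop: 4 blocks, 4 edges.
cfg = {"entry": ["loop"], "loop": ["body", "exit"], "body": ["loop"], "exit": []}
complexity = cyclomatic_complexity(cfg)
print(complexity)
```

A straight-line function gives 1, and each additional decision point adds one.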
MDIndex¶
- class qbindiff.features.MDIndex(weight: float = 1.0)[source]¶
Bases:
FunctionFeatureExtractor
MD-Index of the function, based on https://www.sto.nato.int/publications/STO%20Meeting%20Proceedings/RTO-MP-IST-091/MP-IST-091-26.pdf. This is a slightly modified version: note that the topological sort is only defined for DAGs, which the function flow graph may not always be.
- Parameters:
weight – weight to apply to this feature
- help_msg: str = 'MD-Index of the function, based on https://www.sto.nato.int/publications/STO%20Meeting%20Proceedings/RTO-MP-IST-091/MP-IST-091-26.pdf.\n A slightly modified version of it: notice the topological sort is only available for\n DAG graphs (which may not always be the case)'¶
CLI help message
- required_capabilities = 0¶
By default there are no required capabilities
- visit_function(program: Program, function: Function, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering a function in the program. Inheriting classes implement the feature extraction in this method.
- Parameters:
program – Program being visited
function – Function object being visited
collector – collector in which to save the feature value.
ReadWriteAccess¶
- class qbindiff.features.ReadWriteAccess(weight: float = 1.0)[source]¶
Bases:
OperandFeatureExtractor
Number of memory accesses in the function, counting both reads and writes.
- Parameters:
weight – weight to apply to this feature
- required_capabilities = 0¶
By default there are no required capabilities
- visit_operand(program: Program, operand: Operand, collector: FeatureCollector) None [source]¶
Function being called by the visitor when encountering an operand in the program. Classes inheriting have to implement this method.
- Parameters:
program – program being visited
operand – operand being visited
collector – collector in which to save the feature value