Quokka: String Deciphering
Introduction
The sample to work on is a Mirai malware sample. It can be retrieved below.
It is an ELF executable in which all strings used internally are ciphered with a custom algorithm. The deciphering function is at 0x804f7e0 and uses a custom calling convention where the first two parameters are passed through edx and eax. The function takes two unrelated strings as parameters and deciphers them with the key 0x37. Strings are deciphered in-place.
This function is called many times to decipher all the strings used. The figure below summarizes the problem.
Question
The goal is to write a script that automatically deciphers all strings using the Quokka API.
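To make the in-place deciphering concrete before touching the binary, here is a minimal sketch. It assumes the routine is a single-byte XOR with 0x37, which is what the decrypt helper later in this tutorial uses; the name decipher_in_place is hypothetical and is not taken from the sample.

```python
KEY = 0x37  # key stated in the introduction


def decipher_in_place(buf: bytearray) -> None:
    """XOR every byte of buf with KEY, modifying the buffer itself."""
    for i, byte in enumerate(buf):
        buf[i] = byte ^ KEY


buf = bytearray(b"VSZ^Y")  # one of the ciphered strings found in the sample
decipher_in_place(buf)
print(buf.decode())  # admin

# XOR with a fixed key is its own inverse: a second pass restores the input.
decipher_in_place(buf)
print(bytes(buf))  # b'VSZ^Y'
```

Because the operation is its own inverse, the same helper can be used to re-cipher a string when checking results by hand.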
I. Loading the program
If the program has not yet been exported with Quokka, it can be exported with:
[1]:
from quokka import Program
program = Program.from_binary("33f46cac84fe0368f33a1e56712add18")
Otherwise it can be directly loaded with:
[3]:
from quokka import Program
program = Program("33f46cac84fe0368f33a1e56712add18.quokka", "33f46cac84fe0368f33a1e56712add18")
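The two cells above can be folded into one helper that exports only when needed. This is a sketch under one assumption: the export file sits next to the binary and is named after it with a .quokka suffix, as in the cell above. The helper names quokka_export_path and load_program are hypothetical.

```python
from pathlib import Path


def quokka_export_path(binary: str) -> Path:
    """Assumed location of the export: '<binary>.quokka' next to the binary."""
    return Path(str(binary) + ".quokka")


def load_program(binary: str):
    """Load an existing export if present, otherwise export the binary first."""
    from quokka import Program  # imported lazily so the path helper works standalone

    export = quokka_export_path(binary)
    if export.exists():
        return Program(str(export), str(binary))
    return Program.from_binary(str(binary))


# Usage (requires quokka and the sample on disk):
# program = load_program("33f46cac84fe0368f33a1e56712add18")
```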
II. String Deciphering
Given the structure of the code, solving the problem requires the following steps:
1. Retrieve a string from the .rodata section.
2. Get all the cross-references to 0x804f7e0 to find all locations where the deciphering function is called.
3. At each call location, backtrack to retrieve both the edx and eax values (which are pointers to .rodata).
4. Assemble everything to decipher the strings.
Reading a string
Reading a string in program memory can be done with Program.read_bytes. As the string length is not known in advance, it should be read byte by byte.
Exercise: Write a snippet that reads an arbitrary string at a given address.
[5]:
def read_ciphered_str(program: Program, address: int) -> bytes:
    # Read one byte at a time until the NUL terminator is reached
    offset = 0
    data = b""
    while (next_byte := program.read_bytes(address + offset, 1)) != b"\x00":
        data += next_byte
        offset += 1
    return data

read_ciphered_str(program, 0x8054a70)
[5]:
b'^GTVZhEC\x02\x04\x02\x07'
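The same byte-by-byte loop can be tried without the sample by using a stand-in for program memory. In the sketch below, FakeMemory and read_c_string are hypothetical names, and the buffer layout (a second string placed right after the first) is made up for illustration; only the first ciphered string comes from the output above.

```python
class FakeMemory:
    """Stand-in exposing a read_bytes(address, size) method over a bytes blob."""

    def __init__(self, base: int, blob: bytes) -> None:
        self.base = base
        self.blob = blob

    def read_bytes(self, address: int, size: int) -> bytes:
        start = address - self.base
        return self.blob[start:start + size]


def read_c_string(mem, address: int, max_len: int = 256) -> bytes:
    """Read bytes one at a time until a NUL byte, an empty read, or max_len."""
    data = bytearray()
    for offset in range(max_len):
        byte = mem.read_bytes(address + offset, 1)
        if byte in (b"", b"\x00"):
            break
        data += byte
    return bytes(data)


mem = FakeMemory(0x8054A70, b"^GTVZhEC\x02\x04\x02\x07\x00EXXC\x00")
print(read_c_string(mem, 0x8054A70))       # b'^GTVZhEC\x02\x04\x02\x07'
print(read_c_string(mem, 0x8054A70 + 13))  # b'EXXC'
```

The max_len bound and the empty-read check are a design choice of this sketch: they stop the loop if a read runs past the end of the buffer instead of waiting for a terminator that never comes.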
Cross-refs to a function
We now need to retrieve the addresses of all calls to the deciphering function. This can be achieved by looking at all cross-references to 0x804f7e0. As cross-references are attached to instructions, we retrieve the first instruction of the function and iterate over its call_references attribute.
Note: call_references returns a list of tuples of the form (Function, Block, instruction index in block).
Exercise: Iterate over and print all call sites of the function.
[16]:
inst = program.get_instruction(0x804f7e0)
for fun, block, idx in inst.call_references:
    # idx is the position of the call instruction inside block
    call_inst = list(block.instructions)[idx]
    print(f"call at: {call_inst.address:#08x}")
call at: 0x804fac7
call at: 0x804fadb
call at: 0x804faec
call at: 0x804fb00
call at: 0x804fb14
call at: 0x804fb28
call at: 0x804fb39
call at: 0x804fb4d
call at: 0x804fb61
call at: 0x804fb75
call at: 0x804fb89
call at: 0x804fb9d
call at: 0x804fbb1
call at: 0x804fbc5
call at: 0x804fbd9
call at: 0x804fbea
call at: 0x804fbfb
call at: 0x804fc0f
call at: 0x804fc23
call at: 0x804fc37
call at: 0x804fc4b
call at: 0x804fc5c
call at: 0x804fc70
call at: 0x804fc84
call at: 0x804fc98
call at: 0x804fcac
call at: 0x804fcc0
call at: 0x804fcd4
call at: 0x804fce8
call at: 0x804fcfc
call at: 0x804fd10
call at: 0x804fd24
call at: 0x804fd38
call at: 0x804fd4c
call at: 0x804fd60
call at: 0x804fd74
call at: 0x804fd88
call at: 0x804fd9c
call at: 0x804fdad
call at: 0x804fdc1
call at: 0x804fdd5
call at: 0x804fde9
call at: 0x804fdfd
call at: 0x804fe11
call at: 0x804fe25
call at: 0x804fe39
call at: 0x804fe4d
call at: 0x804fe61
call at: 0x804fe75
call at: 0x804fe89
call at: 0x804fe9d
call at: 0x804feb1
call at: 0x804fec5
call at: 0x804fed9
call at: 0x804feed
call at: 0x804ff01
call at: 0x804ff15
call at: 0x804ff29
call at: 0x804ff3d
call at: 0x804ff51
call at: 0x804ff65
call at: 0x804ff79
call at: 0x804ff8d
call at: 0x804ffa1
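To make the (Function, Block, index) tuple layout explicit, here is a small sketch that uses stand-in classes rather than Quokka objects. FakeInstruction, FakeBlock and call_site_addresses are hypothetical, and the addresses of the non-call instructions are made up; only the two call addresses are taken from the output above.

```python
from dataclasses import dataclass


@dataclass
class FakeInstruction:
    address: int


@dataclass
class FakeBlock:
    instructions: list


def call_site_addresses(call_references) -> list:
    """Resolve each (function, block, index) tuple to the call's address."""
    return [
        list(block.instructions)[idx].address
        for _func, block, idx in call_references
    ]


# Two blocks, each ending with a call to the deciphering function.
block_a = FakeBlock([FakeInstruction(0x804FABD), FakeInstruction(0x804FAC2),
                     FakeInstruction(0x804FAC7)])
block_b = FakeBlock([FakeInstruction(0x804FAD6), FakeInstruction(0x804FADB)])
refs = [("caller", block_a, 2), ("caller", block_b, 1)]

sites = call_site_addresses(refs)
print([f"{addr:#x}" for addr in sites])  # ['0x804fac7', '0x804fadb']
```

Keeping the block and the index, rather than only the address, matters for the next step: the instructions that set up the arguments are the ones located before that index in the same block.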
Retrieving call parameters
To retrieve the parameter values, we need to backtrack from the call instruction and find the relevant register assignments. Under the hood, Quokka uses capstone objects, so we can use capstone constants to identify a specific register.
[18]:
from capstone import x86_const
EDX_ID = x86_const.X86_REG_EDX
EAX_ID = x86_const.X86_REG_EAX
Quokka then provides the find_register_access utility, which finds a read or write of a specific register in a list of instructions.
Once the register assignment is found, we have to retrieve its data_references to the string in .rodata.
Exercise: Write a function to retrieve the data reference address for a specific register.
[26]:
from quokka import Block
from quokka.utils import find_register_access
from quokka.types import RegAccessMode

def read_reg_data_ref(reg_id: int, block: Block, call_idx: int) -> int | None:
    # Instructions preceding the call, nearest to the call first
    instructions = list(block.instructions)[:call_idx][::-1]
    while True:
        instr = find_register_access(reg_id, RegAccessMode.WRITE, instructions)
        if instr is None:  # No instruction found
            return None
        if instr.data_references:  # If there are data references, return that value
            return instr.data_references[0].address
        # No data references: follow the source register instead
        regs_read, regs_write = instr.cs_inst.regs_access()
        if not regs_read:  # Nothing left to follow
            return None
        reg_id = regs_read[0]
        # Keep searching only in the instructions located before the one just found
        instructions = instructions[instructions.index(instr) + 1 :]

data_addr = read_reg_data_ref(EDX_ID, block, idx)
data = read_ciphered_str(program, data_addr)
print(f"@{data_addr:#08x}: {repr(data)}")
@0x8054b2e: b'OVS\x14\x06\x05'
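The backtracking logic can be checked in isolation on a toy instruction list. The sketch below does not use Quokka or capstone: ToyInstr and backtrack are hypothetical, registers are plain strings, and the three-instruction block is invented to show a value passing through an intermediate register before reaching edx.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyInstr:
    text: str
    writes: str                     # register written by the instruction
    reads: Optional[str] = None     # register read, if any
    data_ref: Optional[int] = None  # address referenced, if any


def backtrack(reg: str, instrs_before_call: list) -> Optional[int]:
    """Walk backwards from the call to find the address loaded into reg."""
    remaining = instrs_before_call[::-1]  # nearest to the call first
    while remaining:
        hit_idx = next(
            (i for i, ins in enumerate(remaining) if ins.writes == reg), None
        )
        if hit_idx is None:
            return None
        hit = remaining[hit_idx]
        if hit.data_ref is not None:
            return hit.data_ref
        if hit.reads is None:
            return None
        # Follow the source register, looking only at earlier instructions.
        reg = hit.reads
        remaining = remaining[hit_idx + 1:]
    return None


toy_block = [
    ToyInstr("mov ecx, 0x8054b2e", writes="ecx", data_ref=0x8054B2E),
    ToyInstr("mov eax, 0x8054a70", writes="eax", data_ref=0x8054A70),
    ToyInstr("mov edx, ecx", writes="edx", reads="ecx"),
]
print(hex(backtrack("edx", toy_block)))  # 0x8054b2e
print(hex(backtrack("eax", toy_block)))  # 0x8054a70
```

Narrowing the search window after each hit is what guarantees termination: every iteration either returns or shortens the list of candidate instructions.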
Assembling everything together
We can now assemble everything together to script the deciphering of all strings.
Exercise: Decipher all strings!
[31]:
def decrypt(data: bytes):
    return bytes(map(lambda b: b ^ 0x37, data)) if data else b""

mirai_decrypt_func = program[0x804F7E0]  # decryption function
first_instr = mirai_decrypt_func.get_instruction(mirai_decrypt_func.start)  # get first instruction
for _, block, instr_idx in first_instr.call_references:
    # Extract the arguments
    # arg1 --> eax | arg2 --> edx
    arg1_addr = read_reg_data_ref(EAX_ID, block, instr_idx)
    arg2_addr = read_reg_data_ref(EDX_ID, block, instr_idx)
    # Read ciphered strings
    arg1_ciph = read_ciphered_str(program, arg1_addr)
    arg2_ciph = read_ciphered_str(program, arg2_addr)
    # Decrypt the string
    arg1_plain = decrypt(arg1_ciph)
    arg2_plain = decrypt(arg2_ciph)
    print(f"{repr(arg1_ciph)} -> {arg1_plain.decode()}")
    print(f"{repr(arg2_ciph)} -> {arg2_plain.decode()}")
b'VSZ^Y' -> admin
b'VSZ^Y' -> admin
b'EXXC' -> root
b'VSZ^Y' -> admin
b'BUYC' -> ubnt
b'BUYC' -> ubnt
b'SRQVB[C' -> default
b'[}@GUX\x01' -> lJwpbo6
b'SRQVB[C' -> default
b'd\x05QpFyqD' -> S2fGqNFs
b'SRQVB[C' -> default
b'xO_[@dp\x0f' -> OxhlwSG8
b'SRQVB[C' -> default
b'SRQVB[C' -> default
b'VSZ^Y' -> admin
b'GVDD@XES' -> password
b'EXXC' -> root
b'\x02BG' -> 5up
b'EXXC' -> root
b'M[OO\x19' -> zlxx.
b'EXXC' -> root
b'A^MOA' -> vizxv
b'EXXC' -> root
b'mCR\x02\x05\x06' -> Zte521
b'EXXC' -> root
b'VY\\X' -> anko
b'ADCVETVZ\x05\x07\x06\x02' -> vstarcam2015
b'\x05\x07\x06\x02\x07\x01\x07\x05' -> 20150602
b'EXXC' -> root
b'DAPXS^R' -> svgodie
b'BDRE' -> user
b'BDRE' -> user
b'PBRDC' -> guest
b'PBRDC' -> guest
b'PBRDC' -> guest
b'\x06\x05\x04\x03\x02' -> 12345
b'EXXC' -> root
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'GVDD@XES' -> password
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03' -> 1234
b'SVRZXY' -> daemon
b'SVRZXY' -> daemon
b'VSZ' -> adm
b'' ->
b'U^Y' -> bin
b'' ->
b'SVRZXY' -> daemon
b'' ->
b'EXXC' -> root
b'' ->
b'VSZ^Y' -> admin
b'' ->
b'SRQVB[C' -> default
b'' ->
b'VSZ^Y' -> admin
b'SAE\x05\x02\x0f\x07\x05\x05\x05' -> dvr2580222
b'EXXC' -> root
b'^GTVZhEC\x02\x04\x02\x07' -> ipcam_rt5350
b'EXXC' -> root
b'OZ_S^GT' -> xmhdipc
b'EXXC' -> root
b'SRQVB[C' -> default
b'EXXC' -> root
b']BVYCRT_' -> juantech
b'EXXC' -> root
b'\x02\x03\x04\x05\x06' -> 54321
b'VSZ^Y' -> admin
b'VSZ^Y\x06\x05\x04\x03' -> admin1234
b'SRQVB[C' -> default
b'VYCD[F' -> antslq
b'VSZ^Y' -> admin
b'T_VYPRZR' -> changeme
b'EXXC' -> root
b'T_VYPRZR' -> changeme
b'BDRE' -> user
b'BDRE' -> user
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'PBRDC' -> guest
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'SRQVB[C' -> default
b'EXXC' -> root
b'\x0f\x0f\x0f\x0f\x0f\x0f' -> 888888
b'EXXC' -> root
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'' ->
b'EXXC' -> root
b'pz\x0f\x06\x0f\x05' -> GM8182
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'VSZ^Y' -> admin
b'vsz~y' -> ADMIN
b'VSZ^Y' -> admin
b'VSZ^Y^DCEVCXE' -> administrator
b'VSZ^Y' -> admin
b'EVS^BD' -> radius
b'VSZ^Y' -> admin
b'VSZ^Y\x06\x05\x04' -> admin123
b'EXXC' -> root
b'EXXC\x06\x05\x04' -> root123
b'VSZ^Y' -> admin
b'' ->
b'ADCVETVZ\x05\x07\x06' -> vstarcam201
b'\x05\x07\x06' -> 201
b'EXXC' -> root
b'\x06\x05\x04\x03\x02' -> 12345
b'EXXC' -> root
b'VYY^R\x05\x07\x06\x02' -> annie2015
b'VSZ^Y' -> admin
b'ARECRO\x05\x02R\\C\\D\x06\x05\x04' -> vertex25ektks123
b'VSZ^Y' -> admin
b'\x06\x0e\x0f\x0f' -> 1988
b'EXXC' -> root
b'\x06\x07\x07\x06T_^Y' -> 1001chin
b'VSZ^Y' -> admin
b'\x02\x03\x04\x05\x06' -> 54321
b'D_R[[' -> shell
b'D_' -> sh
b'VSZ^Y' -> admin
b'TXZTXZTXZ' -> comcomcom
b'VSZ^Y' -> admin
b'bF\x1a\x03p~C\x04z' -> Uq-4GIt3M
b'VSZ^Y' -> admin
b'OVS\x14\x06\x05' -> xad#12
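Since many strings repeat, one possible post-processing step is to count the distinct plaintexts. This is a sketch and not part of the original exercise: the input list is the first six ciphered values from the output above, the helper count_plaintexts is hypothetical, and decrypt is redefined here only to keep the snippet self-contained.

```python
from collections import Counter


def decrypt(data: bytes, key: int = 0x37) -> bytes:
    """Single-byte XOR, mirroring the decrypt helper used in the tutorial."""
    return bytes(b ^ key for b in data)


def count_plaintexts(ciphered: list) -> Counter:
    """Decipher each string and count how often each plaintext occurs."""
    # errors="replace" keeps the loop going if a byte is not valid ASCII.
    return Counter(
        decrypt(c).decode("ascii", errors="replace") for c in ciphered
    )


ciphered = [b"VSZ^Y", b"VSZ^Y", b"EXXC", b"VSZ^Y", b"BUYC", b"BUYC"]
counts = count_plaintexts(ciphered)
for plain, n in counts.most_common():
    print(f"{plain}: {n}")
```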