Quokka: String Deciphering

Introduction

The sample to work on is a Mirai malware sample. It can be retrieved below (password: `infected`). As it is real-world malware, please be careful and do not execute it on your system.


It is an ELF executable in which all strings used internally are ciphered with a custom algorithm. The deciphering function is at 0x804f7e0 and uses a custom calling convention where the first two parameters are passed through edx and eax. The function takes two unrelated strings as parameters and deciphers them with the key 0x37. Strings are deciphered in place.
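
As a minimal sketch of what the deciphering routine does, assuming the custom algorithm is a byte-wise XOR with the key 0x37 (which is what the solution at the end of this tutorial applies), the in-place transformation can be reproduced in Python. The helper name `xor_in_place` is hypothetical, and the sample bytes are one of the ciphered strings found in the binary.

```python
KEY = 0x37  # single-byte key given in the description above


def xor_in_place(buf: bytearray, key: int = KEY) -> None:
    """XOR every byte of buf with key, modifying buf itself (hypothetical helper)."""
    for i in range(len(buf)):
        buf[i] ^= key


sample = bytearray(b"VSZ^Y")  # ciphered bytes taken from the sample
xor_in_place(sample)
print(sample.decode())  # prints "admin"
```

Because XOR is its own inverse, calling `xor_in_place` a second time on the same buffer restores the ciphered bytes.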

This function is called many times to decipher all strings used. The figure below summarizes the problem.

Figure: summary of the Mirai deciphering process

Question

The goal is to write a script that automatically deciphers all strings using the Quokka API.

I. Loading the program

If the program has not been exported with Quokka yet, it can be exported with:

[1]:
from quokka import Program

program = Program.from_binary("33f46cac84fe0368f33a1e56712add18")

Otherwise it can be directly loaded with:

[3]:
from quokka import Program

program = Program("33f46cac84fe0368f33a1e56712add18.quokka", "33f46cac84fe0368f33a1e56712add18")

II. String Deciphering

Given the structure of the code, solving the problem requires the following steps:

  1. Retrieve a string from the .rodata section

  2. Get all the cross-references to 0x804f7e0 to find all locations where the deciphering function is called

  3. At each call location, backtrack to retrieve both the edx and eax values (which are pointers to .rodata)

  4. Assemble everything to decipher strings

Reading a string

Reading a string in program memory can be done with Program.read_bytes. As the string length is not known in advance, it should be read byte by byte until the terminating NUL byte.

Exercise: Write the snippet to read an arbitrary string at a given address

[5]:
def read_ciphered_str(program: Program, address: int) -> bytes:
    offset = 0
    data = b""
    while (next_byte := program.read_bytes(address + offset, 1)) != b"\x00":
        data += next_byte
        offset += 1
    return data

read_ciphered_str(program, 0x8054a70)
[5]:
b'^GTVZhEC\x02\x04\x02\x07'
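
The loop above assumes that a NUL byte is eventually found. A defensive variant can cap the number of bytes read. The sketch below is only an illustration: `FakeProgram` is a hypothetical stand-in exposing a `read_bytes(address, size)` method so that the snippet runs without Quokka, `max_len` is an assumed parameter, and the memory layout after the first string is invented for the demo.

```python
class FakeProgram:
    """Hypothetical stand-in for a program object exposing only read_bytes."""

    def __init__(self, base: int, memory: bytes) -> None:
        self.base = base
        self.memory = memory

    def read_bytes(self, address: int, size: int) -> bytes:
        start = address - self.base
        return self.memory[start:start + size]


def read_cstring(program, address: int, max_len: int = 256) -> bytes:
    """Read bytes until a NUL byte, the end of the data, or max_len bytes."""
    data = bytearray()
    for offset in range(max_len):
        next_byte = program.read_bytes(address + offset, 1)
        if next_byte in (b"", b"\x00"):
            break
        data += next_byte
    return bytes(data)


# First string copied from the output above; what follows it is invented
fake = FakeProgram(0x8054A70, b"^GTVZhEC\x02\x04\x02\x07\x00EXXC\x00")
print(read_cstring(fake, 0x8054A70))
```

The same function accepts any object with a compatible `read_bytes` method, so it can be tried on the real `program` object as well.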

Cross-refs to a function

We now need to retrieve the addresses of all calls to the deciphering function. This can be achieved by looking at all cross-references to 0x804f7e0. As cross-references are attached to instructions, we should retrieve the first instruction of the function and iterate over its call_references attribute.

Note that call_references returns a list of tuples composed of (Function, Block, instruction index in block).

Exercise: Iterate and print all call sites of the function

[16]:
inst = program.get_instruction(0x804f7e0)
for fun, block, idx in inst.call_references:
    call_inst = list(block.instructions)[idx]
    print(f"call at: {call_inst.address:#08x}")
call at: 0x804fac7
call at: 0x804fadb
call at: 0x804faec
call at: 0x804fb00
call at: 0x804fb14
call at: 0x804fb28
call at: 0x804fb39
call at: 0x804fb4d
call at: 0x804fb61
call at: 0x804fb75
call at: 0x804fb89
call at: 0x804fb9d
call at: 0x804fbb1
call at: 0x804fbc5
call at: 0x804fbd9
call at: 0x804fbea
call at: 0x804fbfb
call at: 0x804fc0f
call at: 0x804fc23
call at: 0x804fc37
call at: 0x804fc4b
call at: 0x804fc5c
call at: 0x804fc70
call at: 0x804fc84
call at: 0x804fc98
call at: 0x804fcac
call at: 0x804fcc0
call at: 0x804fcd4
call at: 0x804fce8
call at: 0x804fcfc
call at: 0x804fd10
call at: 0x804fd24
call at: 0x804fd38
call at: 0x804fd4c
call at: 0x804fd60
call at: 0x804fd74
call at: 0x804fd88
call at: 0x804fd9c
call at: 0x804fdad
call at: 0x804fdc1
call at: 0x804fdd5
call at: 0x804fde9
call at: 0x804fdfd
call at: 0x804fe11
call at: 0x804fe25
call at: 0x804fe39
call at: 0x804fe4d
call at: 0x804fe61
call at: 0x804fe75
call at: 0x804fe89
call at: 0x804fe9d
call at: 0x804feb1
call at: 0x804fec5
call at: 0x804fed9
call at: 0x804feed
call at: 0x804ff01
call at: 0x804ff15
call at: 0x804ff29
call at: 0x804ff3d
call at: 0x804ff51
call at: 0x804ff65
call at: 0x804ff79
call at: 0x804ff8d
call at: 0x804ffa1

Retrieving call parameters

To retrieve parameter values, we need to backtrack from the call instruction and find specific register assignments. Under the hood, Quokka uses capstone objects. Thus, we can use capstone constants to identify a specific register.

[18]:
from capstone import x86_const

EDX_ID = x86_const.X86_REG_EDX
EAX_ID = x86_const.X86_REG_EAX

Quokka provides the find_register_access function to retrieve a read or write access to a specific register within a list of instructions.

From the instruction that assigns the register, we then have to retrieve the data_references pointing to the string in .rodata.

Exercise: Write a function to retrieve the data reference address for a specific register.

[26]:
from quokka import Block
from quokka.utils import find_register_access
from quokka.types import RegAccessMode

def read_reg_data_ref(reg_id: int, block: Block, call_idx: int) -> int | None:

    # Instructions preceding the call, in reverse order (closest to the call first)
    instructions = list(block.instructions)[:call_idx][::-1]

    while True:
        instr = find_register_access(reg_id, RegAccessMode.WRITE, instructions)

        if instr is None:  # No instruction found
            return None

        if instr.data_references:  # If there are data references, return that value
            return instr.data_references[0].address

        # No data references: follow the source register instead
        regs_read, regs_write = instr.cs_inst.regs_access()
        if not regs_read:  # The value does not come from another register
            return None
        reg_id = regs_read[0]
        # Only keep the instructions located before the one just found
        instructions = instructions[instructions.index(instr) + 1 :]

data_addr = read_reg_data_ref(EDX_ID, block, idx)
data = read_ciphered_str(program, data_addr)
print(f"@{data_addr:#08x}: {repr(data)}")
@0x8054b2e: b'OVS\x14\x06\x05'
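
To make the backtracking logic of read_reg_data_ref easier to follow, here is a toy model that runs without Quokka or capstone. It is a sketch under stated assumptions: `ToyInstr` and `backtrack` are hypothetical names, registers are plain strings, and the instruction sequence (including the address 0x8054b28) is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyInstr:
    """Simplified instruction: registers written/read and an optional data reference."""
    text: str
    writes: tuple = ()
    reads: tuple = ()
    data_ref: Optional[int] = None


def backtrack(reg: str, instrs_before_call: list) -> Optional[int]:
    """Walk backwards from the call to find the data address that ends up in reg."""
    remaining = instrs_before_call[::-1]  # nearest instruction first
    while remaining:
        # Closest earlier instruction that writes the tracked register
        pos = next((i for i, ins in enumerate(remaining) if reg in ins.writes), None)
        if pos is None:
            return None
        ins = remaining[pos]
        if ins.data_ref is not None:
            return ins.data_ref
        if not ins.reads:
            return None  # the value does not come from a register we can follow
        reg = ins.reads[0]  # follow the source register
        remaining = remaining[pos + 1:]  # only look at even earlier instructions
    return None


# Invented sequence: edx is loaded indirectly through ecx
block_instrs = [
    ToyInstr("mov ecx, 0x8054b2e", writes=("ecx",), data_ref=0x8054B2E),
    ToyInstr("mov eax, 0x8054b28", writes=("eax",), data_ref=0x8054B28),
    ToyInstr("mov edx, ecx", writes=("edx",), reads=("ecx",)),
]
print(hex(backtrack("edx", block_instrs)))  # prints 0x8054b2e
```

The key step is shrinking the search window after each hop, so that the write to the source register is looked up only among instructions that execute before the copy.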

Assembling everything together

We can now assemble everything together to script the deciphering of all strings.

Exercise: Decipher all strings!

[31]:
def decrypt(data: bytes):
    return bytes(map(lambda b: b ^ 0x37, data)) if data else b""

mirai_decrypt_func = program[0x804F7E0] # decryption function

first_instr = mirai_decrypt_func.get_instruction(mirai_decrypt_func.start)  # get first instruction

for _, block, instr_idx in first_instr.call_references:
    # Extract the arguments
    #  arg1  -->  eax | arg2  -->  edx
    arg1_addr = read_reg_data_ref(EAX_ID, block, instr_idx)
    arg2_addr = read_reg_data_ref(EDX_ID, block, instr_idx)

    # Read ciphered strings
    arg1_ciph = read_ciphered_str(program, arg1_addr)
    arg2_ciph = read_ciphered_str(program, arg2_addr)

    # Decrypt the string
    arg1_plain = decrypt(arg1_ciph)
    arg2_plain = decrypt(arg2_ciph)

    print(f"{repr(arg1_ciph)} -> {arg1_plain.decode()}")
    print(f"{repr(arg2_ciph)} -> {arg2_plain.decode()}")
b'VSZ^Y' -> admin
b'VSZ^Y' -> admin
b'EXXC' -> root
b'VSZ^Y' -> admin
b'BUYC' -> ubnt
b'BUYC' -> ubnt
b'SRQVB[C' -> default
b'[}@GUX\x01' -> lJwpbo6
b'SRQVB[C' -> default
b'd\x05QpFyqD' -> S2fGqNFs
b'SRQVB[C' -> default
b'xO_[@dp\x0f' -> OxhlwSG8
b'SRQVB[C' -> default
b'SRQVB[C' -> default
b'VSZ^Y' -> admin
b'GVDD@XES' -> password
b'EXXC' -> root
b'\x02BG' -> 5up
b'EXXC' -> root
b'M[OO\x19' -> zlxx.
b'EXXC' -> root
b'A^MOA' -> vizxv
b'EXXC' -> root
b'mCR\x02\x05\x06' -> Zte521
b'EXXC' -> root
b'VY\\X' -> anko
b'ADCVETVZ\x05\x07\x06\x02' -> vstarcam2015
b'\x05\x07\x06\x02\x07\x01\x07\x05' -> 20150602
b'EXXC' -> root
b'DAPXS^R' -> svgodie
b'BDRE' -> user
b'BDRE' -> user
b'PBRDC' -> guest
b'PBRDC' -> guest
b'PBRDC' -> guest
b'\x06\x05\x04\x03\x02' -> 12345
b'EXXC' -> root
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'GVDD@XES' -> password
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03' -> 1234
b'SVRZXY' -> daemon
b'SVRZXY' -> daemon
b'VSZ' -> adm
b'' ->
b'U^Y' -> bin
b'' ->
b'SVRZXY' -> daemon
b'' ->
b'EXXC' -> root
b'' ->
b'VSZ^Y' -> admin
b'' ->
b'SRQVB[C' -> default
b'' ->
b'VSZ^Y' -> admin
b'SAE\x05\x02\x0f\x07\x05\x05\x05' -> dvr2580222
b'EXXC' -> root
b'^GTVZhEC\x02\x04\x02\x07' -> ipcam_rt5350
b'EXXC' -> root
b'OZ_S^GT' -> xmhdipc
b'EXXC' -> root
b'SRQVB[C' -> default
b'EXXC' -> root
b']BVYCRT_' -> juantech
b'EXXC' -> root
b'\x02\x03\x04\x05\x06' -> 54321
b'VSZ^Y' -> admin
b'VSZ^Y\x06\x05\x04\x03' -> admin1234
b'SRQVB[C' -> default
b'VYCD[F' -> antslq
b'VSZ^Y' -> admin
b'T_VYPRZR' -> changeme
b'EXXC' -> root
b'T_VYPRZR' -> changeme
b'BDRE' -> user
b'BDRE' -> user
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'PBRDC' -> guest
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'SRQVB[C' -> default
b'EXXC' -> root
b'\x0f\x0f\x0f\x0f\x0f\x0f' -> 888888
b'EXXC' -> root
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'EXXC' -> root
b'' ->
b'EXXC' -> root
b'pz\x0f\x06\x0f\x05' -> GM8182
b'VSZ^Y' -> admin
b'\x06\x05\x04\x03\x02\x01' -> 123456
b'VSZ^Y' -> admin
b'vsz~y' -> ADMIN
b'VSZ^Y' -> admin
b'VSZ^Y^DCEVCXE' -> administrator
b'VSZ^Y' -> admin
b'EVS^BD' -> radius
b'VSZ^Y' -> admin
b'VSZ^Y\x06\x05\x04' -> admin123
b'EXXC' -> root
b'EXXC\x06\x05\x04' -> root123
b'VSZ^Y' -> admin
b'' ->
b'ADCVETVZ\x05\x07\x06' -> vstarcam201
b'\x05\x07\x06' -> 201
b'EXXC' -> root
b'\x06\x05\x04\x03\x02' -> 12345
b'EXXC' -> root
b'VYY^R\x05\x07\x06\x02' -> annie2015
b'VSZ^Y' -> admin
b'ARECRO\x05\x02R\\C\\D\x06\x05\x04' -> vertex25ektks123
b'VSZ^Y' -> admin
b'\x06\x0e\x0f\x0f' -> 1988
b'EXXC' -> root
b'\x06\x07\x07\x06T_^Y' -> 1001chin
b'VSZ^Y' -> admin
b'\x02\x03\x04\x05\x06' -> 54321
b'D_R[[' -> shell
b'D_' -> sh
b'VSZ^Y' -> admin
b'TXZTXZTXZ' -> comcomcom
b'VSZ^Y' -> admin
b'bF\x1a\x03p~C\x04z' -> Uq-4GIt3M
b'VSZ^Y' -> admin
b'OVS\x14\x06\x05' -> xad#12
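
Instead of printing the results, the two arguments of each call can be kept together and deduplicated for later use. The following is a minimal sketch, not part of the original solution: `decipher_pairs` is a hypothetical helper, the input pairs are ciphered bytes copied from the output above (with one duplicate added for the demo), and decoding uses `errors="replace"` on the assumption that deciphered bytes are not guaranteed to be valid ASCII.

```python
KEY = 0x37


def decrypt(data: bytes) -> bytes:
    """XOR each byte with the single-byte key."""
    return bytes(b ^ KEY for b in data)


def decipher_pairs(pairs):
    """Return the unique deciphered (arg1, arg2) pairs, in first-seen order."""
    seen = {}
    for first, second in pairs:
        plain = (
            decrypt(first).decode("ascii", errors="replace"),
            decrypt(second).decode("ascii", errors="replace"),
        )
        seen.setdefault(plain, None)  # dicts preserve insertion order
    return list(seen)


ciphered = [
    (b"VSZ^Y", b"VSZ^Y"),
    (b"EXXC", b"VSZ^Y"),
    (b"VSZ^Y", b"VSZ^Y"),  # duplicate added for the demo
]
for arg1, arg2 in decipher_pairs(ciphered):
    print(f"{arg1} / {arg2}")
```

With the real program, the list of pairs would be built inside the loop over call_references instead of being hard-coded.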