Untangling Exotic Architectures with Binary Ninja

October 13th marked the conclusion of FireEye’s fourth annual Flare-On Challenge. Every year the Flare-On challenge attracts thousands of hackers, security researchers, and enthusiasts alike in a race to solve a diverse suite of increasingly difficult reverse engineering challenges.

The eleventh challenge (second to last) presented itself as a single PE32 with a subleq based virtualized obfuscator, an architecture consisting of only a single instruction.

Dumping the subleq assembly for the challenge

Some of you will find this eerily reminiscent of movfuscator, a toy compiler by domas which implements a subset of the x86 instruction set using only the mov instruction.

In this post I’ll detail a practical approach towards untangling this challenge. We will implement a custom architecture plugin for Binary Ninja, and then proceed to augment it with some basic reasoning to de-obfuscate the challenge.

Subleq Architecture

To provide some context, subleq is a simple architecture with only a single instruction:

sub [A], [B], C

The operation of this instruction roughly translates to this pseudo code:

*B -= *A
if (C != 0 && *B <= 0)
    goto C

Tracing the extracted stream of bytecode though a verbose emulator is the quickest (least expensive) approach to start analyzing this mess, but it comes at the cost of situational awareness.

In contrast, a static disassembler for VM based challenges will often provide one with a better understanding of the ‘big picture’ bytecode program structure.

Building Architecture Plugins in Binary Ninja

Binary Ninja is a disassembler developed by Vector35 that exposes a rich and easy to use API. The following walkthrough demonstrates the ease of building a new architecture plugin to disassemble the subleq bytecode.

We will use the Architecture class to create our new disassembler module.

class Subleq(Architecture):
    name = "subleq"
    address_size = 4
    default_int_size = 4
    max_instr_length = 12 # Each instruction is 3 dwords

    # SP register is required, even if we are not going to use it
    regs = {'sp': RegisterInfo('sp', 2)}
    stack_pointer = 'sp'

    def perform_get_instruction_info(self,data,addr):
        # Return information about branches
        pass

    def perform_get_instruction_text(self, data, addr):
        # Return disassembly tokens to display
        pass

    def perform_get_instruction_low_level_il(self, data, addr, il):
        # Perform LLIL lifting (we will talk about this later)
        return None

Subleq.register()

The core functionality of this micro architecture plugin will be implemented in the perform_get_instruction_info and perform_get_instruction_text functions. Among other things, these functions will provide the branching and textual information Binary Ninja needs to create control flow graphs (CFGs) and perform additional analysis.

In perform_get_instruction_info we need to return an IntructionInfo object. The IntructionInfo object provides Binary Ninja with branching information for a given instruction.

def perform_get_instruction_info(self,data,addr):
    # If we can't decode an instruction return None
    if len(data) < 12:
        return None

    # Unpack our operands from the data
    a,b,c = struct.unpack('<3I',data[:12])

    # Create the InstructionInfo object for our instruction
    res = InstructionInfo()
    res.length = 12

    if c != 0:
        # True branch jumps to integer index c 
        res.add_branch(BranchType.TrueBranch, c*4)
        # False branch continues to next instruction
        res.add_branch(BranchType.FalseBranch, addr + 12)

    return res

Next, we will need to implement the perform_get_instruction_text function to return a list of InstructionTextToken objects. These objects hold the text we want to display in the linear disassembly, and graph views.

Highlighting a single `PossibleAddressToken` inside Binary Ninja

In Binary Ninja, the disassembly text is split into tokens. Text tokens help Binary Ninja augment the disassembly with colors, and synchronized highlights.

# Helper function to make tokens easier to make
def makeToken(tokenType, text, data=None):
    tokenType = {
            'inst':InstructionTextTokenType.InstructionToken,
            'text':InstructionTextTokenType.TextToken,
            'addr':InstructionTextTokenType.PossibleAddressToken,
            'sep':InstructionTextTokenType.OperandSeparatorToken
    }[tokenType]

    if data is None:
        return InstructionTextToken(tokenType, text)
    return InstructionTextToken(tokenType, text, data)

def perform_get_instruction_text(self, data, addr):
    # If we can't decode an instruction return None
    if len(data) < 12:
        return None

    # Unpack our operands from data
    a,b,c = struct.unpack('<3I',data[:12])

    tokens = []

    # sub [A], [B]
    tokens.append(makeToken('inst', '{:7s}'.format('sub')))
    tokens.append(makeToken('text', '['))
    tokens.append(makeToken('addr', hex(b*4), b*4))
    tokens.append(makeToken('text', ']'))
    tokens.append(makeToken('sep', ', '))
    tokens.append(makeToken('text', '['))
    tokens.append(makeToken('addr', hex(a*4), a*4))
    tokens.append(makeToken('text', ']'))

    # ; jmp C if [B] <= 0
    if c != 0:
        tokens.append(makeToken('sep', '; '))
        tokens.append(makeToken('inst', '{:7s}'.format('jmp')))
        tokens.append(makeToken('addr', hex(c*4), c*4))
        tokens.append(makeToken('sep', ' if '))
        tokens.append(makeToken('text', '['))
        tokens.append(makeToken('addr', hex(b*4), b*4))
        tokens.append(makeToken('text', ']'))
        tokens.append(makeToken('text', ' <= 0'))

    return tokens, 12

We now have the basis for our architecture plugin and can begin using it by placing the script into the plugins folder (which can be found with Tools --> Open Plugin folder).

To test the newly created plugin, we load the subleq bytecode (extracted from the challenge) into Binary Ninja and press p which will prompt for an architecture to disassemble with.

Architecture Select Menu

Binary Ninja will begin disassembling at the specified address, unfurling massive control flow graphs packed full of conditional jumps.

Simple subleq architecture plugin at work

It quickly becomes apparent that there are far too many jumps to make any sense of the bytecode’s control flow. Since every instruction in subleq has the possibility to branch elsewhere, the graph as it exists now is an unusable spaghetti monster.

Too many conditional jumps to make this graph readable

In the next section, we’ll enhance our architecture plugin with some basic reasoning to simplify the control flow graph into something more useful.

Untangling the Control Flow Graph

If we look closely at some of the jumps in this challenge, we see that many take this form:

sub [X], [X]; jump C if [X] <=0

This will always result in *X == 0, forcing the jump. This is an anti-reversing technique which utilizes opaque predicates as a means of binary obfuscation.

To neuter this obfuscation, we can modify our architecture plugin to recognize these predicates and substitute them with unconditional jumps.

This is done by adding a special case to our perform_get_instruction_info function that will detect the opaque predicate, and return a BranchType.UnconditionalBranch instead of True and False branches.

def perform_get_instruction_info(self,data,addr):
    ...
    if c != 0:
        if b == a:
            # Unconditional branch jumps to integer index c
            res.add_branch(BranchType.UnconditionalBranch, c*4)
        else:
            # True branch jumps to integer index c 
            res.add_branch(BranchType.TrueBranch, c*4)
            # False branch continues to next instruction
            res.add_branch(BranchType.FalseBranch, addr + 4*3)

    return res

Eliminating the opaque jumps dramatically simplifies the CFG and removes a large number of ‘dead’ basic blocks.

This CFG is much more reasonable than its previous form

The architecture plugin now produces human readable control flow graphs, but one problem remains: We cannot rename global variables. Labeling the gloabl variables referenced by the bytecode functions will make our job of understanding the remaining semantics much easier.

Minimal Lifting

To make Binary Ninja aware of global variable references, it is necessary that we employ a minimal Low Level Intermediate Language implementation into our architecture plugin.

To make use of intermediate language features (eg, LLIL) an architecture plugin is required to implement perform_get_instruction_low_level_il. We will use the LLIL to advise Binary Ninja that the subleq instructions are loading memory from two addresses.

def perform_get_instruction_low_level_il(self, data, addr, il):
    # If we can't decode an instruction return None
    if len(data) < 12:
        return None

    # Unpack our operands from the data
    a,b,c = struct.unpack('<3I',data[:12])

    # mem[A] and mem[B] pointers
    mem_a = il.load(4, il.const_pointer(4, a*4))
    mem_b = il.load(4, il.const_pointer(4, b*4))

    # Append our IL expressions
    il.append(mem_a)
    il.append(mem_b)

    return 12

In the code snippet above, we construct the two load operations that the subleq instruction performs.

Global rename for variables

This implementation is a very minimal example of lifting, but is enough to give us the capability to rename memory addresses (global variables) in Binary Ninja.

In this post we have only scratched the surface of Binary Ninja’s IL. If you would like to learn more about it, check out this fantastic blogpost by Josh Watson.

A full subleq lifter has been implemented in the final version of this architecture plugin.

Detecting Self Modifying Code

Since subleq is a pretty limited language, the challenge uses self modifying code to perform iterative memory operations. This is not visible in our CFG and can be difficult to follow.

As an addendum to our architecture plugin, we register a right click menu action that will automatically detect and annotate these self modifying patterns in the current function.

def markModifiableCode(bv, func):
    # Loop over all instructions in the function
    for t in (t for bb in func.basic_blocks for t in bb.disassembly_text):
        # Find our 'sub' tokens (to make sure we are on a real instruction)
        if not t.tokens[0].text.startswith('sub '):
            continue

        addr = t.tokens[2].value
        # Check if the address is in a basic block
        bbs = bv.get_basic_blocks_at(addr)
        if len(bbs) == 0:
            continue

        # Check that this address really is an instruction
        for tt in bbs[0].disassembly_text:
            if addr - tt.address >= 3 or addr - tt.address < 0:
                continue
            # Highlight it and add comments
            bbs[0].function.set_user_instr_highlight(tt.address,
                    HighlightStandardColor.RedHighlightColor)
            bbs[0].function.set_comment_at(tt.address, "Modified by 0x%x"%t.address)
            func.set_comment_at(t.address, "Modifies code at 0x%x"%tt.address)
            break

PluginCommand.register_for_function('Subleq check modifiable code', 'subleq', markModifiableCode)

It is now much easier to tell when the code modifies itself while reading the disassembly.

The annotated self-modifying code

Whew! That’s it! The challenge still has to be reverse engineered as one normally would, but our efforts have won us all the luxuries of a fully fledged disassembler and distilled the challenge down to its core.

Conclusion

In this post we demonstrated the practical value of tailored tooling. By building a custom architecture plugin for Binary Ninja and leveraging its intermediate language, we reduced the reverse engineering process to a matter of hours.

A copy of the final subleq architecture plugin can be downloaded here.

ENGINEERING BLOG