
Check Out This Great RCE PoC Walkthrough For The VMware ESXi OpenSLP Heap Overflow Vulnerability

Johnny Yu (@straight_blast)

During a recent engagement, I discovered a machine running VMware ESXi 6.7.0. After checking for known vulnerabilities associated with this version of the software, I identified that it may be vulnerable to the ESXi OpenSLP heap overflow (CVE-2021-21974). Through googling, I found a blog post by Lucas Leong (@_wmliang_) of Trend Micro’s Zero Day Initiative, the security researcher who found this bug. Lucas wrote a brief overview of how to exploit the vulnerability but shared no reference to a PoC. Since I couldn’t find any existing PoC on the internet, I thought it would be neat to develop an exploit based on Lucas’ approach. Before proceeding, I highly encourage fellow readers to review Lucas’ blog to get an overview of the bug and exploitation strategy from the finder’s perspective.

To set up a test environment, I needed a vulnerable copy of VMware ESXi for testing and debugging. VMware offers a trial version of ESXi for download. Setup is straightforward: deploy the image through VMware Fusion or a similar tool. Once installation completed, I used the web interface to enable SSH. To debug the ‘slpd’ binary on the server, I used the gdbserver that comes with the image. To talk to the gdbserver, I used SSH local port forwarding:

ssh -p 22 -L 1337:localhost:1337 root@<esxi-ip-address>

On the ESXi server, I attached gdbserver to ‘slpd’ as follows:

/etc/init.d/slpd restart ; sleep 1 ; gdbserver --attach localhost:1337 `ps | grep slpd | awk '{print $1}'`

Lastly, on my local gdb client, I connected to the gdbserver with the following command:

target remote localhost:1337

The Service Location Protocol (SLP) is a service discovery protocol that allows connecting devices to identify services available within the local area network by querying a directory server. This is similar to a person walking into a shopping center and looking at the directory listing to see what stores are in the mall. To keep this brief, a device can query a service and its location by making a ‘service request’ and specifying the type of service it wants to look up with a URL.

For example, to look up the VMwareInfrastructure service from the directory server, the device will make a request with ‘service:VMwareInfrastructure’ as the URL. The server will respond with something like ‘service:VMwareInfrastructure://localhost.localdomain’.

A device can also collect additional attributes and meta-data about a service by making an ‘attribute request’ with the same URL. Devices that want to be added to the directory can submit a ‘service registration’. This request includes information such as the IP of the device making the announcement, the type of service, and any meta-data it wants to share. SLP supports more functions, but the last message type I am interested in is the ‘directory agent advertisement’, because this is where the vulnerability lies. The ‘directory agent advertisement’ is a broadcast message sent by the server to let devices on the network know whom to contact when they want to query a service and its location. To learn more about SLP, please see this and that.

While the layout of the SLP structure differs slightly between SLP message types, they generally follow a header + body format.
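To make the header + body split concrete, here is a minimal sketch that packs and parses the common SLPv2 header as I understand it from RFC 2608 (version, function ID, 3-byte length, flags, 3-byte next-extension offset, XID, language tag). The helper names `pack_header` and `parse_header` are my own and are not taken from OpenSLP; the body is treated as opaque bytes.

```python
import struct

SLP_VERSION = 2
# Function IDs for the four message types discussed in this post (RFC 2608).
FUNCTION_IDS = {1: "SrvRqst", 3: "SrvReg", 6: "AttrRqst", 8: "DAAdvert"}
FIXED_HEADER_LEN = 14  # bytes before the variable-length language tag


def pack_header(function_id: int, body: bytes, xid: int = 1,
                lang: bytes = b"en") -> bytes:
    """Prefix an opaque body with an SLPv2 header (hypothetical helper)."""
    total = FIXED_HEADER_LEN + len(lang) + len(body)
    return (
        struct.pack(">BB", SLP_VERSION, function_id)
        + total.to_bytes(3, "big")        # length of the whole message
        + struct.pack(">H", 0)            # flags, all clear
        + (0).to_bytes(3, "big")          # next extension offset, none
        + struct.pack(">HH", xid, len(lang))
        + lang
        + body
    )


def parse_header(packet: bytes) -> dict:
    """Split a packet into header fields and the remaining body."""
    version, function_id = struct.unpack_from(">BB", packet, 0)
    length = int.from_bytes(packet[2:5], "big")
    xid, lang_len = struct.unpack_from(">HH", packet, 10)
    body_start = FIXED_HEADER_LEN + lang_len
    return {
        "version": version,
        "function": FUNCTION_IDS.get(function_id, "other"),
        "length": length,
        "xid": xid,
        "lang": packet[FIXED_HEADER_LEN:body_start],
        "body": packet[body_start:length],
    }


packet = pack_header(1, b"\x00\x00")
fields = parse_header(packet)
```

Each message type then defines its own body as a series of 2-byte length-prefixed fields, which is why the length bytes matter so much for the bug discussed next.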

A ‘service request’ packet looks like this:

An ‘attribute request’ packet looks like this:

A ‘service registration’ packet looks like this:

Lastly, a ‘directory agent advertisement’ packet looks like this:

As noted in Lucas’ blog, the bug is in the ‘SLPParseSrvURL’ function, which gets called when a ‘directory agent advertisement’ message is being processed.

On line 18, 0x1d is added to the length of the URL to form the size requested from ‘calloc’. On line 22, the ‘strstr’ function is called to find the position of the substring “:/” within the URL. On line 28, the content of the URL before the substring “:/” is copied into the memory allocated on line 18.

Another thing to note is that the ‘strstr’ function returns 0 (NULL) if it reaches a null character without finding the substring “:/”.

I suspect VMware’s test cases only used ‘scopes’ with a length below 256. If we look at the following ‘directory agent advertisement’ layout snippet, we see that the length of ‘scopes’ in sample 1 includes a null byte. This null byte accidentally acts as the string terminator for ‘URL’ since it sits right after it. If the length of ‘scopes’ is above 256 (and neither byte of the 2-byte length is zero), the hex representation of the length will not contain a null byte (as in sample 2), and therefore the ‘strstr’ function will read past the ‘URL’ and continue seeking the substring “:/” in ‘scopes’.

Therefore, the ‘memcpy’ call will lead to a heap overflow because the source contains content from ‘URL’ + part of ‘scopes’ while the destination only has space to fit ‘URL’.
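The effect of the length bytes can be reproduced in a toy model without touching ‘slpd’ at all. The sketch below is not the OpenSLP code: `c_strstr` and `url_then_scopes` are hypothetical helpers that imitate a C string search over a buffer laid out as ‘URL’, a 2-byte big-endian ‘scopes’ length, then ‘scopes’. The placeholder contents are arbitrary.

```python
import struct


def c_strstr(buf: bytes, start: int, needle: bytes):
    """Model C strstr: search from `start`, stopping at the first null byte."""
    end = buf.find(b"\x00", start)
    if end == -1:
        end = len(buf)  # toy model only: treat the buffer end as the terminator
    idx = buf.find(needle, start, end)
    return None if idx == -1 else idx


def url_then_scopes(url: bytes, scopes: bytes) -> bytes:
    """Lay out URL, the 2-byte big-endian scopes length, then scopes."""
    return url + struct.pack(">H", len(scopes)) + scopes


url = b"A" * 8                       # a URL with no ":/" in it
short_scopes = b":/" + b"B" * 14     # length 16  -> length bytes 00 10
long_scopes = b"B" * 255 + b":/"     # length 257 -> length bytes 01 01

# Short scopes: the 0x00 length byte terminates the search right after the URL.
short_hit = c_strstr(url_then_scopes(url, short_scopes), 0, b":/")

# Long scopes: no null byte, so the search walks through the length bytes
# and into scopes before it finds ":/".
long_hit = c_strstr(url_then_scopes(url, long_scopes), 0, b":/")

alloc_size = len(url) + 0x1d         # size computed from the URL length alone
```

In this model `short_hit` is `None`, while `long_hit` lands well beyond `alloc_size`, which is the mismatch between the copy length and the allocation described above.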

Here I will go over the relevant SLP components as they serve as the building blocks for exploitation.

_SLPDSocket

Every client that connects to the ‘slpd’ daemon causes a ‘slpd-socket’ object to be created on the heap. This object contains information on the current state of the connection, such as whether it is in a reading state or writing state. Other important information stored in this object includes the client’s IP address, the socket file descriptor in use for the connection, pointers to the ‘recv-buffer’ and ‘send-buffer’ for this specific connection, and pointers to the ‘slpd-socket’ objects created for prior and future established connections. The size of this object is fixed at 0xd0 and cannot be changed.

_SLPBuffer

Every SLP message received by the server will create at least two SLPBuffer objects. One is called ‘recv-buffer’, which stores the data received by the server from the client. Since I can control the size of the data I send from the client, I can control the size of the ‘recv-buffer’. The other SLPBuffer object is called ‘send-buffer’. This buffer stores the data that will be sent from the server to the client. The ‘send-buffer’ has a fixed size of 0x598, and I cannot control its size. Furthermore, the SLPBuffer has meta-data properties that point to the starting, current, and ending positions of said data.
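To illustrate why those position properties matter, here is a toy model of a buffer whose reads and writes are driven entirely by its start, current, and end positions. `BufferModel` and its field names are hypothetical and only loosely mirror the idea of an SLPBuffer; positions are plain offsets into a `bytearray` rather than real pointers.

```python
from dataclasses import dataclass


@dataclass
class BufferModel:
    memory: bytearray   # stands in for process memory
    start: int          # where the buffer's data begins
    curpos: int         # where the next read or write happens
    end: int            # one past the last usable byte

    def recv_into(self, data: bytes) -> int:
        """Store incoming bytes at curpos, never going past end."""
        chunk = data[: self.end - self.curpos]
        self.memory[self.curpos : self.curpos + len(chunk)] = chunk
        self.curpos += len(chunk)
        return len(chunk)

    def send_from(self, max_bytes: int) -> bytes:
        """Return outgoing bytes starting at curpos, never going past end."""
        count = min(max_bytes, self.end - self.curpos)
        out = bytes(self.memory[self.curpos : self.curpos + count])
        self.curpos += count
        return out


memory = bytearray(16)
recv_buffer = BufferModel(memory, start=4, curpos=4, end=8)
stored = recv_buffer.recv_into(b"ABCDEFGH")   # only 4 bytes fit
```

The point of the model is that the buffer object itself decides where data lands or is read from; whoever controls the position fields controls the location.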

SLP Socket State

The SLP Socket State defines the status of a particular connection. The state value is set in the _SLPDSocket object. A connection will either be calling ‘recv’ or ‘send’ depending on the state of the socket.

It is important to understand the properties of _SLPDSocket, _SLPBuffer and Socket States because the exploitation process requires modifying those values.

This section goes over the objectives required for a successful exploitation.

Objective 1

Achieve remote code execution by leveraging the heap overflow to overwrite ‘__free_hook’ so that it points to shellcode or a ROP chain.

Expectation 1

If I can overwrite the ‘position’ pointers in a _SLPBuffer ‘recv-buffer’ object, I can force incoming data to the server to be written to an arbitrary memory location.

Objective 2

In order to know the address of ‘__free_hook’, I have to leak an address referencing the libc library.

Expectation 2

If I can overwrite the ‘position’ pointers in a _SLPBuffer ‘send-buffer’ object, I can force outgoing data from the server to be read from an arbitrary memory location.

Now that I defined goals and objectives, I have to identify any limitations with the heap overflow vector and memory allocation in general.

Limitations

  1. ‘URL’ data stored in the “Directory Agent Advertisement’s URL” object cannot contain null bytes (due to the ‘strstr’ function). This limitation prevents me from directly overwriting meta-data within an adjacent ‘_SLPDSocket’ or ‘_SLPBuffer’ object because I would have to supply an invalid size value for the objects’ heap header before reaching those properties.
  2. The ‘slpd’ binary allocates ‘_SLPDSocket’ and ‘_SLPBuffer’ objects with ‘calloc’. The ‘calloc’ call will zero out the allocated memory slot. This limitation removes all past data of a memory slot, which could contain interesting pointers or stack addresses. This looks like a showstopper because if I were to overwrite a ‘position’ pointer in a _SLPBuffer, I would need to know a valid address value. Since I don’t know such a value, the next best thing I can do is partially overwrite a ‘position’ pointer to at least land in a valid address range that could be meaningful. With ‘calloc’ zeroing everything out, I lose that opportunity.

Fortunately, not all is lost. As shared in Lucas’ blog post, I can still get around the limitations.

Limitations Bypass

  1. Use the heap overflow to partially overwrite the adjacent free memory chunk’s size to extend it. By extending the free chunk, I can position it to overlap with its neighboring ‘_SLPDSocket’ or ‘_SLPBuffer’ object. When I allocate memory that occupies the extended free space, I can overwrite the object’s properties.
  2. The ‘calloc’ call will retain past data of a memory slot if it was marked as ‘IS_MMAPPED’ while it was still free. The key thing is that the ‘calloc’ call must request a chunk size that is exactly the size of the freed slot with the ‘IS_MMAPPED’ flag enabled to preserve its old data. If an ‘IS_MMAPPED’ freed chunk is split up by a ‘calloc’ request, ‘calloc’ will service a chunk without the ‘IS_MMAPPED’ flag and zero out the slot’s content.
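Both bypasses hinge on the size field of a heap chunk header. As a rough illustration, the sketch below decodes a glibc-style size field, where the low three bits are flags (glibc names them PREV_INUSE, IS_MMAPPED and NON_MAIN_ARENA). The helper names are my own, the sample values are arbitrary, and `calloc_would_zero` is a deliberate simplification of glibc’s behaviour rather than a faithful reimplementation.

```python
PREV_INUSE = 0x1      # previous chunk is in use
IS_MMAPPED = 0x2      # chunk is treated as obtained via mmap
NON_MAIN_ARENA = 0x4  # chunk belongs to a non-main arena
FLAG_MASK = 0x7


def decode_size_field(raw: int) -> dict:
    """Split a raw chunk size field into the real size and its flag bits."""
    return {
        "size": raw & ~FLAG_MASK,
        "prev_inuse": bool(raw & PREV_INUSE),
        "is_mmapped": bool(raw & IS_MMAPPED),
        "non_main_arena": bool(raw & NON_MAIN_ARENA),
    }


def calloc_would_zero(raw: int) -> bool:
    """Simplified: memory from chunks flagged as mmapped is assumed to be
    zero already, so the explicit clearing step is skipped for them."""
    return not (raw & IS_MMAPPED)


normal_chunk = decode_size_field(0x111)   # arbitrary example value
flagged_chunk = decode_size_field(0x112)  # same size, IS_MMAPPED set
```

Because the flags live in the lowest byte of the size field, a partial overwrite of only a byte or two is enough to change both the apparent size and the flag bits, which fits within the no-null-bytes constraint from limitation 1.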

There is still one more catch. Even if I can mark an arbitrary position to store or read data for the _SLPBuffer, the ‘slpd’ binary will not comply unless the associated socket state is set to the proper status. Therefore, the heap overflow will also have to overwrite the associated _SLPDSocket object’s meta-data in order to get the arbitrary read and write primitives to work.

This section goes over the heap grooming strategy to achieve the following:

The Building Blocks

Before I go over the heap grooming design, I want to say a few words about how the SLP messages mentioned earlier fit into the exploitation process.

service request: primarily used for creating a consecutive heap layout and holes.

directory agent advertisement: used to trigger the heap overflow vector to overwrite into the next neighboring memory block.

service registration: stores user-controlled data in the memory database, which will be retrieved through the ‘attribute request’ message. This message is solely to set up ‘attribute request’ and is not used for the purpose of heap grooming.

attribute request: pulls user-controlled data from the memory database. Its purpose is to create a ‘marker’ that can be used to identify the current position during the information leak stage. Also, the dynamic memory used to store the user-controlled data can be a good stack pivot spot with completely user-controllable content.

Overwrite _SLPBuffer ‘send-buffer’ object (Arbitrary Read Primitive)
