Saturday, November 29, 2014

Windows Thread Suspension Internals Part 2

In the last blog post I talked about the NtSuspendThread and PsSuspendThread kernel routines. If you haven't read the first part, I recommend reading it first.
This part is dedicated to the KeSuspendThread and KiSuspendThread routines (fun stuff).
Let's get started by looking at KeSuspendThread (Windows 7 32-bit SMP, as usual)
(pseudo-C) :
A quick overview of KeSuspendThread shows that it's the routine responsible for calling KiInsertQueueApc in order to queue the target thread's suspend APC in its kernel APC queue. But that's not the only thing happening here, so let's take it slow and go step by step through what the routine does.

As you can see, we start by raising the IRQL to DISPATCH_LEVEL. This means we're running at the same IRQL as the thread dispatcher, so our thread is guaranteed to keep running on this processor until the IRQL drops below DISPATCH_LEVEL. Since I'm on a multiprocessor machine, this alone doesn't make access to shared objects safe, as a thread executing on another processor might access the same object simultaneously. That's why a couple of locks must be acquired before the routine can continue. The first lock that KeSuspendThread tries to acquire is the APC queue lock (Thread->ApcQueueLock). After acquiring the lock, the thread's previous suspend count is saved and compared with the maximum value that a suspend count can reach (0x7F). If the suspend count is equal to that value, the IRQL is lowered to its old value and a fatal exception is raised with status STATUS_SUSPEND_COUNT_EXCEEDED. As I mentioned in the last part, PsSuspendThread calls KeSuspendThread within a try-except statement, so the machine won't bugcheck as a result of that exception.
If the target thread's suspend count is lower than 0x7F (the general case), a second check is done against the Thread->ApcQueueable bit to see whether APCs can be queued to that thread or not. Here I want to mention that if you clear that bit in a given thread object, using WinDbg or a driver, the thread becomes immune to suspension and even to termination, since termination is also done using an APC.
If the bit is set (also generally the case), the target thread's suspend count is incremented. Next, the routine checks that the thread is neither suspended nor frozen.
If that's also true, a third check is done:
line 29 : if(Thread->SuspendApc.Inserted == TRUE) { ....
The SuspendApc is a KAPC structure, and the Inserted field is a boolean that tells whether the APC was inserted in the APC queue or not.
Let's look at the else statement at line 38 first and then get back to this check. We end up in the else statement if SuspendApc.Inserted == FALSE. It simply sets the APC's Inserted boolean to TRUE and then calls KiInsertQueueApc to insert the suspend APC in the target thread's kernel APC queue. KiInsertQueueApc is called internally by the exported KeInsertQueueApc.

The check at line 29 is confusing: if SuspendApc.Inserted is TRUE, this already means that the suspend count is different from 0, so we shouldn't even reach this if statement. As we'll see in a later article, KeResumeThread is the routine that actually decrements the SuspendCount, but it doesn't do so until it acquires the APC queue lock, which rules out KeResumeThread and KeSuspendThread operating simultaneously on the same target thread (SMP machine). If this check turns out to be true for some reason, we acquire a lock to safely access and modify the SuspendSemaphore (through a pointer previously initialized with &Thread->SuspendSemaphore) and then decrement the semaphore count, apparently to turn it into the non-signaled state.
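To make the order of the checks easier to follow, here's a rough user-mode model of the decision logic described above. It is only a sketch and not the kernel's code: the `model_thread` structure, the `model_ke_suspend_thread` name, the `SUSPEND_COUNT_EXCEEDED` return value (standing in for the raised exception) and the `apcs_queued` counter (standing in for the call to KiInsertQueueApc) are all made up for illustration. It also assumes that "neither suspended nor frozen" means a previous suspend count of 0 and a freeze count of 0, and it leaves out the IRQL change and the locks entirely.

```c
#include <stdbool.h>

#define MAXIMUM_SUSPEND_COUNT   0x7F   /* limit quoted in the text */
#define SUSPEND_COUNT_EXCEEDED  (-1)   /* hypothetical stand-in for the raised status */

/* Hypothetical, heavily reduced stand-in for the thread object. */
typedef struct {
    int  suspend_count;
    int  freeze_count;            /* assumption: non-zero means "frozen" */
    bool apc_queueable;           /* models the ApcQueueable bit */
    bool suspend_apc_inserted;    /* models SuspendApc.Inserted */
    int  semaphore_signal_state;  /* models the suspend semaphore count */
    int  apcs_queued;             /* counts simulated KiInsertQueueApc calls */
} model_thread;

/* Returns the previous suspend count, or SUSPEND_COUNT_EXCEEDED
 * where the real routine would raise an exception. */
int model_ke_suspend_thread(model_thread *t)
{
    int old_count = t->suspend_count;

    /* First check: the suspend count is already at its maximum. */
    if (old_count == MAXIMUM_SUSPEND_COUNT)
        return SUSPEND_COUNT_EXCEEDED;

    /* Second check: can APCs be queued to this thread at all? */
    if (t->apc_queueable) {
        t->suspend_count++;

        /* Third check: only act if the thread was not already
         * suspended or frozen before this call. */
        if (old_count == 0 && t->freeze_count == 0) {
            if (t->suspend_apc_inserted) {
                /* The "confusing" branch: the APC is still queued,
                 * so only the semaphore count is decremented. */
                t->semaphore_signal_state--;
            } else {
                /* The usual branch: mark the APC inserted and queue it. */
                t->suspend_apc_inserted = true;
                t->apcs_queued++;
            }
        }
    }
    return old_count;
}
```

With a zero-initialized `model_thread` whose `apc_queueable` is true, the first call returns 0 and queues one simulated APC, and a second call returns 1 without queuing anything. Clearing `apc_queueable` leaves the thread untouched, which is the "immune to suspension" effect mentioned earlier.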
Once the SuspendApc is queued, its kernel and normal routines (KiSuspendNop and KiSuspendThread respectively) will be executed as soon as KiDeliverApc is called in the context of the target thread.
The SuspendApc is initialized in KeInitThread this way:
Let's now take a look at the KiSuspendThread normal APC routine:
It simply calls KeWaitForSingleObject to make the thread wait for the SuspendSemaphore to be in the signaled state.
The suspend semaphore is also initialized in the KeInitThread routine:
As you can see, the count limit is set to 2 and the initial count is 0. As we'll see later when talking about thread resumption, each synchronization object has a header structure, _DISPATCHER_HEADER, which contains the synchronization object's Type (mutant, thread, semaphore ...), Lock and SignalState fields and some other flags.
The SignalState field of a semaphore is the same as the semaphore count, and the semaphore count must not exceed the limit. A semaphore in the signaled state (semaphore count > 0) satisfies the wait of as many threads as its count allows and then becomes non-signaled. This means that if 4 threads are waiting on a semaphore and it becomes signaled with a semaphore count of 2, 2 threads will have their wait satisfied and the semaphore will become non-signaled. The next waiting thread won't get a chance to run until one of the released threads releases the semaphore, which increments its semaphore count (signaled state).
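The 4-waiters example can be played out with a tiny counting model. Again, this is only a sketch in plain C and not how the dispatcher is implemented: `model_semaphore`, `model_wait` and `model_release` are hypothetical names, waiters are just counted instead of being linked through wait blocks, and a release that would exceed the limit simply returns -1 here.

```c
#include <stdbool.h>

/* Hypothetical model of a semaphore's dispatcher state. */
typedef struct {
    int signal_state;  /* the semaphore count; > 0 means signaled */
    int limit;         /* the count must never exceed this value */
    int waiters;       /* number of threads currently blocked */
} model_semaphore;

/* Returns true if the wait is satisfied immediately,
 * otherwise records the caller as a waiter. */
bool model_wait(model_semaphore *s)
{
    if (s->signal_state > 0) {
        s->signal_state--;   /* satisfying a wait consumes one count */
        return true;
    }
    s->waiters++;
    return false;
}

/* Adds `adjustment` to the count and satisfies as many waiters as
 * the count allows. Returns the number of waiters released, or -1
 * if the adjustment is invalid or would exceed the limit. */
int model_release(model_semaphore *s, int adjustment)
{
    int released = 0;

    if (adjustment <= 0 || s->signal_state + adjustment > s->limit)
        return -1;

    s->signal_state += adjustment;
    while (s->waiters > 0 && s->signal_state > 0) {
        s->waiters--;
        s->signal_state--;
        released++;
    }
    return released;
}
```

Starting from a count of 0 and a limit of 2, four calls to `model_wait` leave four waiters. `model_release(&s, 2)` then releases two of them and the count drops back to 0, so the two remaining waiters stay blocked until another release.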

Let's get back to the SuspendSemaphore now. As I said earlier, it is initialized as non-signaled, so a suspended thread stays in the wait state until the semaphore becomes signaled. In fact, KeResumeThread is the routine responsible for turning the semaphore into the signaled state and then calling KiSignalSynchronizationObject to unlink the wait block and signal the suspended thread (future posts).

Now that we've seen in detail what happens when suspending a thread, the next blog posts will be dedicated to what happens when we call ResumeThread or ZwResumeThread. Stay tuned.

Follow me on twitter : here
- Souhail

Thursday, November 27, 2014

Windows Thread Suspension Internals Part 1

It's been a while since I shared anything concerning Windows internals, and I'm back to talk in detail about how Windows thread suspension and resumption work. I'm going to discuss these topics in this blog post and upcoming ones. The subject could be covered in one or two entries, but I'm quite busy with studies.

As you might already know, Windows uses APCs (Asynchronous Procedure Calls) to perform thread suspension. That gives an incomplete picture of what's going on, as other tasks are performed besides queuing the suspend APC. Throughout this article I will share the details of what's happening, along with some pseudo-code snippets of the reversed routines (Windows 7 32-bit SMP).

Let's say that a usermode thread 'A' wants to suspend a second usermode thread 'B'. It simply has to call SuspendThread with a previously opened handle to the target thread.
DWORD WINAPI SuspendThread(HANDLE hThread);
Upon the call we'll be taken to kernel32.dll and then to kernelbase.dll, which simply provides a supplementary argument to NtSuspendThread and calls it from ntdll.dll.
NTSTATUS NtSuspendThread(HANDLE ThreadHandle,PULONG PreviousSuspendCount );
The thread's previous suspend count is basically copied from kernel to *PreviousSuspendCount.
Ntdll then takes us to kernel land where we'll be executing NtSuspendThread.

- NtSuspendThread :
If we came from usermode (CurrentThread->PreviousMode == UserMode), probing the PreviousSuspendCount pointer for write is crucial. Next, a pointer to the target thread object is obtained by calling ObReferenceObjectByHandle. If that succeeds, PsSuspendThread is called; it returns an NTSTATUS, and that is the status code returned to the caller. Before returning, the routine calls ObDereferenceObject and stores the previous count value in the OUT argument (PreviousSuspendCount) if it's not NULL.

- PsSuspendThread :
Prototype : NTSTATUS PsSuspendThread(PETHREAD Thread,PULONG PreviousSuspendCount)
Here's a pseudo C manual decompilation of the routine code :

As you can see, PsSuspendThread starts by entering a critical region and then tries to acquire run-down protection on the target thread. Acquiring run-down protection guarantees that we can access and operate on the thread object safely without it being deleted. As you might already know, the presence of a thread object in memory doesn't mean that the thread isn't terminating or hasn't already terminated, simply because an object isn't deleted until all the references to it are released (reference count reaches zero). That's the reason for the next check, on the Terminated bit: if the thread is terminating or was terminated, PsSuspendThread returns STATUS_THREAD_IS_TERMINATING. Let's suppose that our thread is up and running. KeSuspendThread is then called and, unlike the previous routines, returns the previous suspend count that we talked about earlier. As we'll see later on, KeSuspendThread raises a critical exception (by calling RtlRaiseStatus) if the thread's suspend limit (0x7F) was reached, which causes a BSOD if no exception handler is in place, so the kernel calls this function within a try-except statement. Upon returning from KeSuspendThread successfully, the target thread is rechecked to see whether it started terminating while being suspended. If so, the thread is forced to resume right away by calling KeForceResumeThread (we'll see this routine in detail later when talking about thread resumption) and the previous suspend count is zeroed. Finally, the executing thread leaves the critical region and stores, through the PreviousSuspendCount pointer, the value returned from KeSuspendThread, or 0 in the case where KeForceResumeThread was called.
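The control flow of this paragraph can be summed up in a small user-mode model. This is a sketch written in portable C, not the decompiled routine: every name in it (`model_ethread`, `model_status`, `model_ke_suspend`, `model_ps_suspend_thread`, the `terminates_during_suspend` test hook) is invented for illustration. The try-except is replaced by a boolean return value, the critical region and run-down calls appear only as comments, and two points are my assumptions rather than something stated above: that a failed run-down acquisition also yields the "terminating" status, and that the status stays successful when the thread is force-resumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_SUSPEND 0x7F

typedef enum {
    MODEL_SUCCESS,
    MODEL_THREAD_IS_TERMINATING,
    MODEL_SUSPEND_COUNT_EXCEEDED
} model_status;

/* Hypothetical, reduced stand-in for the thread object. */
typedef struct {
    bool rundown_unavailable;        /* run-down protection can't be acquired */
    bool terminated;                 /* models the Terminated bit */
    int  suspend_count;
    bool terminates_during_suspend;  /* test hook: thread dies mid-suspend */
} model_ethread;

/* Stand-in for KeSuspendThread: returns false where the real
 * routine would raise STATUS_SUSPEND_COUNT_EXCEEDED. */
bool model_ke_suspend(model_ethread *t, unsigned long *old_count)
{
    if (t->suspend_count == MODEL_MAX_SUSPEND)
        return false;
    *old_count = (unsigned long)t->suspend_count++;
    if (t->terminates_during_suspend)
        t->terminated = true;
    return true;
}

model_status model_ps_suspend_thread(model_ethread *t, unsigned long *previous)
{
    model_status status = MODEL_SUCCESS;
    unsigned long old_count = 0;

    /* (enter critical region, try to acquire run-down protection) */
    if (t->rundown_unavailable || t->terminated) {
        status = MODEL_THREAD_IS_TERMINATING;
    } else if (!model_ke_suspend(t, &old_count)) {
        /* what the except block would catch */
        status = MODEL_SUSPEND_COUNT_EXCEEDED;
    } else if (t->terminated) {
        /* recheck: undo the suspension, as KeForceResumeThread would */
        t->suspend_count = 0;
        old_count = 0;
    }
    /* (release run-down protection, leave critical region) */

    if (previous != NULL)
        *previous = old_count;
    return status;
}
```

Calling `model_ps_suspend_thread` twice on a healthy thread stores 0 and then 1 through `previous`. With `terminates_during_suspend` set, the call still succeeds but stores 0 and leaves the suspend count at 0.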

That's all for this short entry. In the next parts about thread suspension I'll talk about KeSuspendThread, the suspend semaphore and the KiSuspendThread kernel APC routine.

- Souhail.