Monday, June 5, 2017

Exploring Virtual Address Descriptors under Windows 10

This blog post is about my personal attempt to superficially list VAD types under Windows 10. It all started when I was wondering, out of sheer curiosity, if there's any way to determine the VAD type (MMVAD_SHORT or MMVAD) other than by looking at the pool tag preceding the structure. In order to do that, I had to list all VAD types, do some reverse engineering, and then draw a table describing what I've been able to find.
You can view the full document by clicking here 

From the table above it is possible to deduce the VAD structure type from both the VadType and PrivateMemory flags.

VadType flag
PrivateMemory flag

To test it out, I wrote a kernel driver that prints the deduced VAD type for each node of calc.exe. It also prints the pool tag so we can check the result.

And that's all for this article.
You can follow me on Twitter : here

Monday, May 22, 2017

RCTF 2017 - Crackme 714 pts Writeup

Crackme 714 pts (9 solves) :

The crackme is an MFC application :


We can locate the routine of interest by setting a breakpoint on GetWindowTextW. Keep in mind that the input is in Unicode.
Later on, we find that the program generates two distinct blocks of values. These are generated from hard-coded values independently from the user input, so they're always the same. We call the first one : static_block1 and the second static_block2.
Then, there's the call to the encrypt function which takes static_block1 as an argument.
The encrypted block will then be XORed with static_block2.
We also find a reference to the encrypted valid key here, which we can extract easily during runtime :

The loop above performs a double-word by double-word comparison of the encrypted user input with the encrypted valid key that already came with the crackme.

In order to solve the challenge we need to reverse engineer the encrypt function and do the exact reverse. We also don't have to forget the XOR that is done with static_block2. For that matter, we supply to the decrypt function (the one we need to write) encrypted_valid_key XOR static_block2.

The script below has my implementation of the decrypt function, it outputs the key to flag.txt :

All we need to do now, is provide the decrypted key and the flag will be displayed.

The flag is : RCTF{rtf2017crackmebyw31y1l1n9}

See you again soon
Follow me on Twitter : here

Friday, April 7, 2017

Nuit du Hack XV Quals - Reverse 350: Matriochka step 4 (I did it again)

Binary file download :

This script, when executed under IDA, writes the correct input to an output file :

The flag is simply the md5sum of this file :

Sunday, March 1, 2015

Boston key party 2015 - Community College Reversing 300 Writeup

The binary is a c++ compiled code under MIPS architecture that takes the flag as a command line argument. It uses a c++ list to store the whole flag in binary form and a class called Wires to store 3 'bits' (words in fact) in 3 different fields. In order to access those field the class has 4 different functions, one to initialize the 3 fields, and others to retrieve the value of the each one of them.
A vector of type Wires is created in order to store the flag , e.i : the previously created list is converted to that vector. The difference is that each field of the vector stores 3 'bits' now and each new field is pushed onto the back of the vector.
After setting up the vector, the binary start to somehow shuffle (check script) the bits around using a recursive function that calls itself 8196 times. Finally, the result vector is converted to a string a compared to another string in memory :

Here's a C script automating the process and printing the flag :
flag : myheadmateisafredkin!

binary download : here
See you again soon.
Follow me on twitter : here

Saturday, February 14, 2015

Windows Internals - Thread resumption and synchronization objects

Hello, in the two previous blog entries I discussed how thread suspension works. I'll dedicate this post to share my research concerning thread resumption, it was crucial to explore some parts of the internal synchronization mechanisms to achieve a better understanding. As usual, the reversing was done on a Windows 7 32-bit machine.

To resume a suspended thread you normally call ResumeThread from usermode or ZwResumeThread from kernelmode, as a result you'll be landing in the NtResumeThread kernel function, it's very similar to NtSuspendThread that I already talked about in the previous posts.
This is the function's prototype :
NTSTATUS NtResumeThread(HANDLE ThreadHandle,PULONG PreviousSuspendCount)

It returns a status code and optionally a previous suspend count indicating the thread's suspend count before it was decremented, as you might already know suspending a thread x times requires resuming it x times to make it able to run again.
In order to start the thread resumption and to get the previous suspend count, NtResumeThread calls KeResumeThread which prototype is the following :
LONG KeResumeThread(PKTHREAD Thread)

KeResumeThread returns the previous suspend thread count and resumes the thread if the suspend count reached 0. Let's look more closely at this function :

First the IRQL is raised to DISPATCH_LEVEL and the target thread's ApcQueueLock is acquired, after that the previous thread count is saved. If it isn't null, the thread was in fact suspended and the routine wasn't called just by mistake on a thread in a different state. The suspend count is then decremented and checked alongside the freeze count against 0. If both of them are null, the thread must be resumed and here where it gets interesting : A thread is suspended when executing an APC that just waits on the thread's Suspend Semaphore to switch into the signaled state. This Semaphore is initially in the non-signaled state and won't switch its state until the thread has to be resumed or was forced to be resumed (KeForceResumeThread).

Like any other synchronization object (mutex,thread,timer...) a semaphore has a header structure (_DISPATCH_HEADER). Its most important fields are the type, signal state, lock and the wait list head.

The WaitListHead field is the doubly linked list head of the wait blocks (KWAIT_BLOCK) waiting for the synchronization object. Let's dump KWAIT_BLOCK structure :
- The pointers to the next and previous wait block waiting on the same synchronization object are in the LIST_ENTRY WaitListEntry field. e.i : if there are threads waiting on a synchronization object, the dispatch header's WaitListHead field points to the first block's WaitListEntry field. The object fields of the wait blocks in this list is the same, but the thread field isn't.
- The NextWaitBlock field points to the next wait block when the wait is performed using KeWaitForMultipleObjects and each object field in this list points to a different synchronization object but the thread field is the same.
- The WaitKey field is the index of the wait block in the array used by KeWaitForMultipleEvents (either the default or the supplied array : see msdn). Its type is NTSTATUS and serves to know the index of the signaled object in case WaitAny was supplied. In KeWaitForSingleEvent this field is set to 0x00 (STATUS_SUCCESS/STATUS_WAIT_0).
- WaitType : WaitAll or WaitAny when waiting for multiple objects. WaitAny by default when waiting on a single object.

Back to KeResumeThread, if the signal state field value is greater than 0, then the synchronization object is in the signaled state and the wait could be satisfied for the thread(s) waiting on that object (depends on the object though). Compared to a mutex a semaphore is able to satisfy a wait for more than one single thread, a semaphore object has a Limit field in its structure indicating the limit of those threads. In addition, a semaphore has a semaphore count which is the SignalState field ; its value can't be above the Limit. Being in the signaled state, a semaphore will satisfy the wait for semaphore count threads.
KeResumeThread turns the semaphore into the signaled state by incrementing its count and then it calls KiSignalSynchronizationObject. Here's the routine :
The WaitListHead comes into scene in this function, where it is used to walk the doubly linked list of KWAIT_BLOCK structures waiting on the synchronization object. I forgot to mention earlier that the thread object structure KTHREAD stores 4 KWAIT_BLOCK structures in an array, more than one WaitBlock is clearly used when the thread is waiting on multiple objects , the msdn documentation on KeWaitForMultipleObjects discusses that point. The WaitBlock is mainly initialized inside KeWaitForSingleObject or KeWaitForMultipleObjects and then inserted in the tail of the KWAIT_BLOCK structures waiting list of the synchronization object.
You notice from the code above that WaitBlock->WaitType is checked, let's see the type definition of the WaitType field type.

typedef enum _WAIT_TYPE {

- WaitAll means that the wait isn't satisfied for the waiting thread until all the synchronization object become in the signaled state (KeWaitForMultipleObjects).
- WaitAny means that the wait is satisfied for the thread if at least one synchronization object turns into the signaled state.

Let's get back to where we stopped and treat each case alone. If the WaitType is WaitAny, an attempt to unwait the waiting thread is made by calling KiTryUnwaitThread (we'll looking into this function shortly). If the thread exited the wait state, then the synchronization object's signaled state field is decremented. If it reached 0 as a result, we stop iterating through the wait blocks linked list because the synchronization object would be in the non-signaled state.
Now let's see if the WaitType is equal to WaitAll ; In that case only a call to KiTryUnwaitThread is made.
The arguments given to KiTryUnwaitThread are quite different in the two cases. Here is the decompilation of parts that interest us of the function :
The function appears to call KiSignalThread , let's take a look at it too :
In general, KiTryUnwaitThread calls KiSignalThread if the thread is waiting and return a boolean value describing if the thread was signaled or not. In fact this boolean value is returned by KiSignalThread, this function unlinks the thread from the linked list of threads in the waiting state for the processor it was executing in before exiting the running state (WaitPrcb), then it inserts the thread into the deferred ready list and set its state to DeferredReady , after that it sets the Thread->WaitStatus to the same status code passed to KiTryUnwaitThread and then it returns TRUE. KiSignalThread does what I described previously if the Thread->WaitRegister.State == 1; KiCommitThreadWait initializes this field to 1. But if Thread->WaitRegister.State == 0 (this field is initialized to 0 by KeDelayExecutionThread), the WaitStatus is set to the status code and TRUE is returned.
The Thread->WaitStatus field is returned by KiSwapThread function which is called by KeWaitForSingleObject and KeWaitForMultipleObjects. KiSwapThread basically won't return to KiCommitThreadWait until the waiting thread exited the wait state (KiSignalThread). In our case, KiCommitThreadWait returns to KeWaitForXXXObject(s) with the WaitStatus as a return value. This WaitStatus describes the reason why the thread was awaken from its wait state. KeWaitForXXXObject(s) checks on this return value. Here's a very simplified pseudo code of what interests us:

Everything has become quite clear at this stage to explain why KiSignalSynchronizationObject supplies different arguments to KiTryUnwaitThread and also why it decrements the SignalState when the wait type is WaitAny. Let me explain :
When the wait type is WaitAny, this means that the waiting thread entered the wait state upon calling KeWaitForSingleObject or KeWaitForMultipleObject with the WaitAny wait type.Thus, KiTryUnwaitThread is called with the WaitBlock->WaitKey as the wait status. So when the awaken thread returns from KiCommitThreadWait in KeWaitForMultipleObjects the wait status won't be STATUS_KERNEL_APC and we'll bail out directly returning the index of the signaled synchronization object. In this case, the synchronization object signal state wasn't touched that's why it must be decremented after successfully unwaiting the thread.
Let's see now, if the wait type is WaitAll ; this implicates that the waiting thread waits for multiple objects to become in the signal state. That's why KiTryUnwaitThread is called with STATUS_KERNEL_APC so that KeWaitForMultipleObjects iterates again and checks the signaled state of all synchronization objects. If it turns out that they're all signaled KeWaitForMultipleObject takes care this time of decrementing or zeroing (depends on the object) the signal state of all the synchronization objects the thread was waiting on.

Let's continue the story of the suspended thread and see what happens when it's finally resumed. Now that the semaphore is signaled and therefore the thread is deferred ready thanks to KiSignalThread, it will be scheduled to run soon. When it does run it will return from KeWaitForSingleEvent with a STATUS_SUCCESS/STATUS_WAIT_0 (Status != STATUS_KERNEL_APC). We're now in the kernel APC routine after returning...

Conclusion :
While thread suspension relies on queuing a kernel APC calling WaitForSingleEvent on a suspend semaphore, thread resumption takes us more deeply into exploring synchronization objects and how the waiting threads behave differently when waiting on a single or multiple objects.

I hope you enjoyed reading this post.
Follow me on twitter : Here

Saturday, November 29, 2014

Windows Thread Suspension Internals Part 2

In the last blog post I talked about both NtSuspendThread and PsSuspendThread kernel routines. If you didn't check the first part I recommend to check it first : here
This part is dedicated to KeSuspendThread and KiSuspendThread routines (fun stuff).
Let's get started by looking a KeSuspendThread : (Windows 7 32-bit SMP as usual)
(pseudo-C) :
A quick overview of KeSuspendThread shows that it's actually the one responsible of calling KiInsertQueueApc in order to queue the target thread's suspend APC in its kernel APC queue. But that's not the only thing happening here , so take it slow and go step by step into what routine does.

As you can notice we start first by raising the irql to DISPATCH_LEVEL, this means we're running in the same irql where the thread dispatcher does so our thread is guaranteed to be running on this processor until the irql drops below DISPATCH_LEVEL. As I'm on a multiprocessor machine this doesn't protect from accessing shared objects safely as another thread executing on another processor might access the object simultaneously. That's why a couple of locks must be acquired in order to continue the execution of the routine , the first lock that KeSuspendThread tries to acquire is the APC queue lock (Thread->ApcQueueLock). After acquiring the lock, execution continues and the thread's previous suspend count is saved , then it is compared with the maximum value that a suspend count might reach (0x7F). The irql is lowered to it's old value and a fatal exception is raised with status (STATUS_SUSPEND_COUNT_EXCEEDED) if the SuspendCount is equal to that value. As I mentioned in the last part PsSuspendThread calls KeSuspendThread within a try-except statement so the machine won't bugcheck as a result of that exception.
If the target thread's suspend count is lower that 0x7F (general case), a second check is done against Thread->ApcQueuable bit to check whether APCs can be queued to that thread or no. Here I want to mention that if you patch that bit using windbg or a driver of a given thread object that thread becomes immune to suspending and even termination as it is done also using an APC.
If the bit is set (generally the case also), the target thread's suspend count is incremented. Next , the routine checks if the thread isn't suspended nor frozen.
If that's also true a third check is done :
line 29 : if(Thread->SuspendApc.Inserted == TRUE) { ....
The SuspendApc is a KAPC stucture , and the Inserted field is a boolean that represents whether the APC was inserted in the APCs queue or not.
Let's start by seeing the else statement at line 38 first and get back to this check. So basically we'll be in the else statement if (SuspendApc.Inserted == FALSE) , it will simply set the APC's Inserted boolean to TRUE and then call KiInsertQueueApc to insert the suspend APC in the target's thread kernel APCs queue. KiInsertQueueApc is internally called by the exported KeInsertQueueApc.

The check at line 29 is confusing, since if the SuspendApc.Inserted is TRUE this already means that the suspend count is different than 0 so we won't even reach this if statement.As we'll see in a later article KeResumeThread is the routine that actually decrements the SuspendCount but it doesn't proceed to do so until it acquires the ApcQueue lock , so this eliminates the fact that KeResumeThread and KeSuspendThread are operating simultaneously on the same target thread (SMP machine). If this check turns out true for a reason , we acquire a lock to safely access and modify the SuspendSemaphore initialized previously by &Thread->SuspendSemaphore and then decrement the Semaphore Count to turn it into the non-signaled state apparently.
If the SuspendApc is now queued , its kernel and normal routines (KiSuspendNop and KiSuspendThread respectively) will be executed as soon as KiDeliverApc is called in the context of the target thread.
The SuspendApc is initialized in KeInitThread  this way :
Let's now take a look at KiSuspendThread normal APC routine :
It simply calls KeWaitForSingleObject to make the thread wait for the SuspendSemaphore to be in the signaled state.
The Suspend semaphore is also initialized in KeInitThread routine :
As you can see the count limit is set to 2 and the initial semaphore is 0. As we'll see later when talking about thread resumption : each synchronization object has a header structure defined as : _DISPATCHER_HEADER, this structure contains the synchronization object's Type (mutant , thread , semaphore ...) , Lock , SignalState fields and some other flags.
The SignalState field in a semaphore is the same as the semaphore count and the semaphore count must not exceed the limit. Semaphores ,when in signaled state (semaphore count > 0) , satisfy the wait for semaphore count threads and unsignal the semaphore. Means if 4 threads are waiting on a semaphore and it became in a signaled state with a semaphore count of 2 , 2 threads will satisfy the wait and the semaphore will become non-signaled. The next waiting thread won't get a chance to run until one of the released threads releases the semaphore , resulting in its semaphore count being incremented (signaled state). 

Let's get back to the SuspendSemaphore now. As I said earlier, it is initialized as non-signaled in the first place so when a thread is suspended it'll stay in the wait state until the semaphore becomes signaled. In fact KeResumeThread is the responsible routine for turning the semaphore into the signaled state and then calling KiSignalSynchronizationObject to unlink the wait block and signal the suspended thread (future posts).

As we discovered together what happens when suspending a thread in detail , the next blog posts will be dedicated to talking about what happens when we call ResumeThread or ZwResumeThread. Stay tuned.

Follow me on twitter : here
- Souhail

Thursday, November 27, 2014

Windows Thread Suspension Internals Part 1

It's been a while since I haven't shared anything concerning Windows internals and I'm back to talk in detail about how Windows thread suspension and resumption works. I'm going to discuss the mentioned topics in this blog post and incoming ones. Even though it can be discussed in one or two entries but I'm quite busy with studies.

As you might already know Windows uses APCs (Asynchronous Procedure Calls) to perform thread suspension. This may form an incomplete image of what's going on in detail as other tasks are being performed besides queuing the suspend APC. I will share throughout this article the details about what's happening and some pseudo code snippets of the reversed routines (Windows 7 32-bit SMP).

Let's say that a usermode thread 'A' wanted to suspend a second usermode thread 'B' , it has to simply call SuspendThread with a previously opened handle to the target thread.
DWORD WINAPI SuspendThread(HANDLE hThread);
Upon the call we'll be taken to kernel32.dll then to kernelbase.dll which simply provides a supplementary argument to NtSuspendThread and calls it from ntdll.dll .
NTSTATUS NtSuspendThread(HANDLE ThreadHandle,PULONG PreviousSuspendCount );
The thread's previous suspend count is basically copied from kernel to *PreviousSuspendCount.
Ntdll then takes us to kernel land where we'll be executing NtSuspendThread.

- NtSuspendThread :
 If we came from usermode (CurrentThread->PreviousMode == UserMode), probing the PreviousSuspendCount pointer for write is crucial. Next, a pointer to the target thread object is obtained by calling ObReferenceObjectByHandle , if we succeed PsSuspendThread is called ; its return type is NTSTATUS and that is the status code returned to the caller (in PreviousMode) after calling ObDereferenceObject and storing the previous count value in the OUT (PreviousSuspendCount) argument if it's not NULL.

- PsSuspendThread :
Prototype : NTSTATUS PsSuspendThread(PETHREAD Thread,PULONG PreviousSuspendCount)
Here's a pseudo C manual decompilation of the routine code :

As you can see, PsSuspendThread starts with entering a critical region and then it tries to acquire run-down protection of the target thread to suspend , acquiring run-down protection for the thread guarantees that we can access and operate on the thread object safely without it being deleted. As you might already know a present thread object in memory doesn't mean that the thread isn't terminating or wasn't terminated simply because an object isn't deleted until all the references on that object are released (reference count reaches zero). The next check of the Terminated bit explains it , so if the thread is actually terminating or was terminated PsSuspendProcess return STATUS_THREAD_IS_TERMINATING. Let's suppose that our thread is up and running. KeSuspendThread will be called as a result and ,unlike the previous routines, will returns the previous count that we've previously spoken about. As we'll see later on KeSuspendThread raises a critical exception (by calling RtlRaiseStatus) if the thread suspend limit was exceeded (0x7F) that causes a BSOD if no exception handler is in place , so the kernel calls this function within a try-except statement. Upon returning from KeSuspendThread successfully , a recheck of the target thread is done to see if the thread was terminating while suspending , if that's true the thread is forced to resume right away by calling KeForceResumeThread (we'll see this routine in detail later when talking about thread resumption) and the previous suspend count is zeroed. Finally the executing thread leaves the critical region and dereferences the PreviousSuspendCount pointer with the value returned from KeSuspendThread or 0 in the case where KeForceResumeThread was called.

That's all for this short entry , in the next parts about thread suspension I'll talk about KeSuspendThread , the suspend semaphore and the KiSuspendThread kernel APC routine.

Follow me on twitter : Here

- Souhail.