Identify hard-to-diagnose heap corruption problems
I would like to continue my series on heap debugging by demonstrating what is arguably the nuclear weaponry of heap debugging for Windows: full page heap. So, where do you purchase full page heap, and what does it do, exactly?
Full page heap is basically free and built into the Windows operating system. Heap corruption is one of the hardest problems to solve and reproduce. That's because, most of the time, the corruption occurs long before its effects are noticed. Full page heap puts measures in place to help capture the offender red-handed and at the precise point where the offense occurs. Full page heap offers a number of benefits, such as:
- It enables tracking for each allocation and free, including the call stack from when the operation occurred.
- It allows specific control of heap tracking as mentioned previously, including restricting the process to specific address ranges, specific allocation sizes, or allocations coming from specific DLLs.
- It provides support to catch buffer overruns and underruns when they occur.
- It provides support for identifying memory leaks at process shutdown.
- It provides support for catching dangling pointer use.
- It helps isolate corrupting wild memory writes.
The Sample Application
To demonstrate the capabilities of the full page heap, I created a sample application, as shown in Figure 1. You can build this application by using the Microsoft Visual Studio IDE, or you can build it from a Visual Studio or Windows SDK command line as follows (assuming the code is stored in test.cpp):
cl /EHs /Zi test.cpp /link /debug
Additionally, you must have the Debugging Tools for Windows installed. This component is currently available as part of the Windows SDK. You can get the latest build by running the Windows SDK web installer. At the time of this writing, the latest version is 7.1, and you can download it from the Microsoft Windows SDK page. All the debugging described in this article is performed with WinDbg. Notice that I included the __debugbreak() compiler intrinsic in the code. I like to set up WinDbg as my just-in-time (JIT) debugger, and __debugbreak() will immediately cause a break in the debugger. You can set up WinDbg as your JIT debugger by running the following:
windbg /I
If you prefer to run the application in the debugger manually, you can remove the __debugbreak() intrinsic. And assuming you compiled the test code into test.exe, the following command is a quick way to get the application to run within the debugger and to wait at the beginning of main():
windbg -Q -c "bu test!main;g;" test.exe
Enabling Full Page Heap
The easiest way to enable full page heap is by using the gflags.exe tool included with the Debugging Tools for Windows. Some of you may be familiar with gflags.exe from using it to set or clear various global flags that affect your application or the system at runtime. If you run gflags.exe without any command-line arguments, it assumes you want it to run in GUI mode. In this article, I will be demonstrating its use through the command line. To have gflags.exe show you how to set any of the global flag bits, execute the following from an elevated command prompt:
gflags /?
Also, these same bits may be inspected, set, or cleared from inside the debugger by using the !gflag command. However, there is a special command-line option that puts gflags.exe into a different mode of sorts: page heap mode. To see all the options available in page heap mode, execute the following from an elevated command prompt:
gflags /p /?
Now gflags.exe shows you all the options specific to page heap configuration. Assuming the sample code in Figure 1 is compiled into test.exe, execute the following command to turn on full page heap for test.exe:
gflags /p /enable test.exe /full
Catching Buffer Overruns
One of the most common causes of heap corruption is the simple but sinister buffer overrun. When your application is running as usual without page heap, and it oversteps the bounds of a heap-allocated buffer, it will most likely trash an internal heap structure or another heap allocation. By default, full page heap will put your requested allocation at the end of a natural page boundary, which on x86-based and x64-based platforms is a 4,096-byte granularity. How does this help catch buffer overruns? It helps because full page heap will then place an uncommitted page in virtual memory after that page, so that any read or write operation to the following uncommitted page will immediately result in an access violation.
This behavior helps catch heap corruptors red-handed. But it comes at a price: In order to put your allocation at the end of a page, full page heap must allocate and commit an entire page, no matter how small your requested memory may be. The uncommitted page that follows your allocation does not commit memory, but it does waste a page in virtual memory space. This is why gflags.exe has various command-line options that allow you to specify which allocations you want satisfied using full page heap. For example, if you know the buffer being overrun is likely being allocated in xyz.dll, you can tell full page heap to target allocations only from xyz.dll.
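To put that cost in rough numbers, here is a minimal sketch of the arithmetic. It is a simplified model, not the real page heap implementation: the function and type names are hypothetical, and the 0x40-byte bookkeeping header is an assumption taken from the x64 size of the page heap block structure discussed later in this article.

```cpp
#include <cassert>
#include <cstddef>

// Simplified, hypothetical model of one full page heap allocation.
const std::size_t kPageSize = 0x1000;  // 4,096-byte pages on x86 and x64
const std::size_t kHeaderSize = 0x40;  // assumed per-block bookkeeping header (x64)

struct Footprint {
    std::size_t committed;  // bytes in committed pages
    std::size_t reserved;   // committed pages plus the uncommitted guard page
};

// Rounds n up to the next multiple of `multiple`.
std::size_t round_up(std::size_t n, std::size_t multiple) {
    assert(multiple != 0);
    return (n + multiple - 1) / multiple * multiple;
}

// Estimates the virtual memory consumed by a request of `requested` bytes.
Footprint full_page_footprint(std::size_t requested) {
    Footprint f;
    f.committed = round_up(requested + kHeaderSize, kPageSize);
    f.reserved = f.committed + kPageSize;  // one extra page for the guard
    return f;
}
```

Under these assumptions, full_page_footprint(4) reports 0x1000 committed bytes and 0x2000 bytes of address space for a four-byte request, which is why targeting only the suspect allocations matters in a large process.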
Let's examine what it looks like to use full page heap from inside the debugger. Go ahead and run test.exe within WinDbg. After the application is running, set a breakpoint at the beginning of the code associated with the buffer overrun, and then choose "o" to overrun a buffer. After the code hits the breakpoint, step until immediately after the allocation of the buffer. Now, using the dv command, you can find the pointer to the buffer in the heap. (Make sure you build your application with optimizations turned off on x64 platforms for dv to work easily.) The output resembles the following:
buffer = 0x00000000`03e5bff0 "???"
Now you can use the !heap command within the debugger to inspect the full page heap allocation. Similar to gflags.exe, the !heap command's -p option enables page heap mode. For example, I encourage you to execute the following command from inside the debugger to get more information about !heap in page heap mode:
!heap -p -?
To inspect a particular allocation, use the -a option followed by the address of the allocation, as shown in Figure 2. The output in Figure 2 shows a lot of information, including the fact that the allocation is busy. But probably the most useful is the stack trace from the point of the allocation. This is very valuable when you are debugging a heap issue, and you need to know when a pointer to a heap allocation was allocated or freed. All the numbers in the command output in Figure 2 are hexadecimal, including the VirtSize of 0x2000, which is two pages. This includes both the committed page and the uncommitted page.
Buffer Memory After Allocation
Before we let the debugger continue, let's examine the actual memory of the buffer just after allocation. To do this, you can pass the address to dc, as shown below:
00000000`03e5bff0 c0c0c0c0 d0d0d0d0 d0d0d0d0 d0d0d0d0 ................
00000000`03e5c000 ???????? ???????? ???????? ???????? ????????????????
00000000`03e5c010 ???????? ???????? ???????? ???????? ????????????????
The bytes filled with 0xc0 are my allocation. In the code, I asked for only a four-byte buffer. The 0xd0 bytes are a fill pattern for the padding between my allocation and the end of the page. And the question marks indicate that the virtual address is not committed or backed by physical memory. Notice that the allocation is near the end of the page and that the question marks represent the uncommitted page following the allocated page. If you back up even more in memory from the address of the buffer, you'll find an internal page heap structure, DPH_BLOCK_INFORMATION. To get the size of this structure, execute the following within the debugger:
On x64 platforms, the size of this structure is 0x40 bytes. So, if you back up that amount and inspect memory again, you'll see the structure and its beginning signature marker, 0xabcdbbbb, and its end signature marker, 0xdcbabbbb. You can find more information about this structure on your own. And using '!heap -p -?' is a good starting point.
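The address arithmetic behind that inspection can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the 0x40 header size is the x64 value quoted above.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t kPageSize = 0x1000;     // 4,096-byte pages
const std::uint64_t kBlockInfoSize = 0x40;  // x64 structure size quoted in the article

// Start of the page that contains `address`.
std::uint64_t page_base(std::uint64_t address) {
    return address & ~(kPageSize - 1);
}

// Where the block information structure begins, assuming it sits
// immediately before the pointer returned to the caller.
std::uint64_t block_info_address(std::uint64_t user_pointer) {
    assert(user_pointer >= kBlockInfoSize);
    return user_pointer - kBlockInfoSize;
}

// Number of padding bytes between the end of the user buffer and the
// end of its page.
std::uint64_t bytes_to_page_end(std::uint64_t user_pointer, std::uint64_t requested) {
    return page_base(user_pointer) + kPageSize - (user_pointer + requested);
}
```

For the buffer at 0x03e5bff0, block_info_address yields 0x03e5bfb0 as the address to dump, and bytes_to_page_end with a four-byte request yields 12, matching the three 0xd0d0d0d0 values in the memory dump.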
Sharp readers may be wondering why my buffer allocation is not at the end of the page—only near the end of the page. After all, I just got through explaining that full page heap places allocations at the end of the page. Before I explain why, let's just let the debugger run and see what happens. Sure enough, we hit an access violation as expected. If you inspect the registers and the instruction the processor is attempting to execute, you'll see that we're attempting to write to the first byte of the page following our allocation. However, notice that the output on the command line is as follows:
If you examine the code, the first try to overrun the buffer only overstepped the bounds slightly. This caused me to stomp the 0xd0 fill pattern at the end of the page. If I had tried to free the allocation after that, full page heap would have complained because it checks the fill pattern on free. If it notices a corrupted suffix pattern, it will stop in the debugger with a corrupted suffix pattern message that resembles the one in Figure 3.
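The idea of a check-on-free can be illustrated with a small simulation. This is a sketch only: it models a block as a byte vector, the function names are hypothetical, and it does not reflect how page heap performs the check internally. The 0xc0 and 0xd0 values are the ones seen in the memory dump.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::uint8_t kUserFill = 0xc0;    // pattern seen in the user bytes
const std::uint8_t kSuffixFill = 0xd0;  // pattern seen in the padding

// Builds a simulated block: `requested` user bytes followed by fill
// padding up to `block_size` bytes.
std::vector<std::uint8_t> make_block(std::size_t requested, std::size_t block_size) {
    assert(requested <= block_size);
    std::vector<std::uint8_t> block(block_size, kSuffixFill);
    for (std::size_t i = 0; i < requested; ++i) {
        block[i] = kUserFill;
    }
    return block;
}

// Returns true when every padding byte after the user bytes still holds
// the fill pattern, which is the kind of check a free could perform.
bool suffix_intact(const std::vector<std::uint8_t>& block, std::size_t requested) {
    for (std::size_t i = requested; i < block.size(); ++i) {
        if (block[i] != kSuffixFill) {
            return false;
        }
    }
    return true;
}
```

With a block from make_block(4, 16), a write to index 3 stays inside the user bytes and leaves suffix_intact true, while a write to index 4 stomps the padding and is reported only when the check runs.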
My second attempt to overrun the buffer generates the expected behavior. So, why did full page heap not place my buffer at the end of the page? By default, the heap satisfies an allocation by returning a pointer that is correctly aligned for the platform where it is running. On x64 platforms, that means my four-byte request actually occupies 16 bytes at the end of the page, of which only the first four are supposed to be used. But many buffer overruns are, in fact, sinister off-by-one errors. So, based on this default behavior and using this example code, we would not learn about an off-by-one error until the buffer was freed. Finding this out at the point of free may be too late. So what are we to do?
Thankfully, gflags.exe offers a command-line argument to tell page heap that it is OK to return an unaligned pointer. To do this, execute the following command:
gflags /p /enable test.exe /full /unaligned
Now, if you redo the previous experiment and inspect the buffer immediately after allocation, you'll see that the allocation truly is at the end of the page, and full page heap catches my first attempt to overrun the buffer, as shown in Figure 4.
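The effect of alignment on placement comes down to simple arithmetic, sketched below. The function names are hypothetical, and an alignment of 1 is used here as a stand-in for the /unaligned behavior.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kPageSize = 0x1000;  // 4,096-byte pages

// Rounds n up to the next multiple of `multiple`.
std::size_t round_up(std::size_t n, std::size_t multiple) {
    assert(multiple != 0);
    return (n + multiple - 1) / multiple * multiple;
}

// Offset within the page at which the user buffer starts when the
// (aligned) allocation is pushed against the end of the page.
std::size_t start_offset(std::size_t requested, std::size_t alignment) {
    assert(requested > 0 && round_up(requested, alignment) <= kPageSize);
    return kPageSize - round_up(requested, alignment);
}

// Bytes past the end of the buffer that can be overwritten without
// touching the uncommitted page.
std::size_t slack_before_guard(std::size_t requested, std::size_t alignment) {
    return round_up(requested, alignment) - requested;
}
```

For a four-byte request with 16-byte alignment, start_offset is 0xff0 (the low bits of the buffer address seen earlier) and 12 bytes of slack remain. With an alignment of 1, the offset becomes 0xffc and the slack drops to zero, so even a one-byte overrun lands on the uncommitted page.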
The output shown in Figure 4 is similar to the first attempt of a buffer-overrun case without using the /unaligned page heap option. That is, the code trashed the _DPH_BLOCK_INFORMATION just before the buffer in memory, and page heap is telling you about it when we free the memory. Unfortunately, if your free occurs long after the corruption, it may be difficult to find the offender.
Note: This heap validation mechanism is how page heap helps catch wild writes that corrupt random areas in the heap. Unfortunately, you don't find out about this corruption until the allocation is freed instead of at the point of offense.
To solve this problem, gflags.exe offers a /backwards option that places your allocation at the front of the page and puts an uncommitted page just before the allocation in virtual memory. This triggers an access violation the moment the buffer underrun occurs. For example, configure full page heap by using the following command:
gflags /p /enable test.exe /full /backwards
Then, execute the test application again and choose the buffer underrun code path. Now the debugger catches the access violation immediately.
Incidentally, when the /backwards switch is used, structures such as _DPH_BLOCK_INFORMATION are placed at the end of the allocation page instead of just before the allocation.
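To see why the placement matters for underruns, consider this small model of the two layouts. It is a sketch under simplifying assumptions: the type and function names are hypothetical, alignment is ignored in the default layout, and only the position of the uncommitted page relative to the buffer is modeled.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t kPageSize = 0x1000;  // 4,096-byte pages

struct Layout {
    std::uint64_t user_start;   // first byte of the user buffer
    std::uint64_t guard_start;  // first byte of the uncommitted page
};

// Default placement: the buffer ends at the end of its page and the
// uncommitted page follows it.
Layout forward_layout(std::uint64_t page, std::uint64_t requested) {
    assert(page % kPageSize == 0 && requested > 0 && requested <= kPageSize);
    Layout l = {page + kPageSize - requested, page + kPageSize};
    return l;
}

// /backwards placement: the buffer starts at the beginning of its page
// and the uncommitted page precedes it.
Layout backwards_layout(std::uint64_t page) {
    assert(page % kPageSize == 0 && page >= kPageSize);
    Layout l = {page, page - kPageSize};
    return l;
}

// True when an access to `address` would land on the uncommitted page.
bool hits_guard(const Layout& l, std::uint64_t address) {
    return address >= l.guard_start && address < l.guard_start + kPageSize;
}
```

In the default layout, a write to user_start - 1 stays inside the committed page and goes unnoticed until the free, while in the backwards layout the same write hits the uncommitted page immediately.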
Catching Dangling Pointer Use
Another common cause of heap corruption involves using a pointer to an object or buffer after that allocation was already freed. This is generally known as using a dangling pointer. Full page heap does a really good job of handling this situation. To demonstrate, step through the dangling pointer code path of the sample application in the debugger.
Immediately after the instantiation of the new object, you can pass the returned pointer to '!heap -p -a' within the debugger to get the familiar detail of the allocation, including the stack from the point of allocation. But let's see what happens right after you step over the object delete. If you pass the pointer to '!heap -p -a' after the free, you will notice that it says the allocation is a "free-ed" allocation, and it also shows you the stack of code that freed it, as shown in Figure 5.
The fact that !heap shows you the stack at the point of free is very valuable because premature frees are common bugs leading to dangling pointer use. Now you can see which code path freed the allocation.
Now, try to step over the line of code that attempts to use the dangling pointer, and notice that the debugger breaks with a familiar-looking access violation. This happens because, after the allocation was put on the free list, full page heap marked the page as inaccessible. You can see this process by passing the address to !address within the debugger. You'll see that the page is marked PAGE_NOACCESS, as shown in Figure 6.
A Valuable Tool
As you can see, full page heap is a very valuable addition to your toolbox for catching the most common causes of heap corruption. The best part is, it's virtually free. Free, that is, because if you have a valid Windows license, you have full page heap. I highly encourage you to experiment with the various options and switches full page heap provides and to incorporate full page heap into your testing and troubleshooting regimen.
Trey Nash ( firstname.lastname@example.org) is a senior escalation engineer at Microsoft working on the Windows OSs and various other products and is the author of Accelerated C# 2010 (Apress). When not working feverishly within the bowels of the OS, he is delivering training on the .NET platform and kernel-mode debugging or playing ice hockey.