The blog continues at suszter.com/ReversingOnWindows

August 28, 2012

Read-Before-Write: An Example for False Positive Detection

During the Bank Holiday weekend I was testing and stabilizing one of my Windbg extension commands, which detects read-before-write bugs. According to the test results, some detections appeared to be false positives, and in this post I discuss one case that I find interesting.

Instead of analyzing the test result in the Windbg log file I decided to do it in IDA. I took the errors from the Windbg log and simply highlighted the instructions causing read-before-write in IDA. The highlight idea was borrowed from team ZDI (thanks).

Below is the code snippet in which an error was detected that appears to be a false positive.
There are two questions that need answering:
  • Why is it a false positive detection?
  • Why was the detection mistakenly issued?
Answering the first question is straightforward. A void* is pushed onto the stack as the parameter for delete(). delete() uses the cdecl calling convention, so it doesn't clean up the void* on the stack; the pop ecx in the red line does. pop ecx reads the void*. The memory address it reads from was written by the push instruction, so the red line cannot be a read-before-write access.

Answering the second question, why the detection was mistakenly issued, is a bit more complex and requires knowledge of how my Windbg extension works - read the principles here if you like. When push [ebp+var_78] is executed it does the following. First, it reads the var_78 local variable, then it writes (pushes) the value of var_78 onto the stack. The address of the read access and the address of the write access are both stack memory addresses. In this example, the protection flags were set on the stack memory region by the Windbg extension command. As mentioned above, when push is executed the read access occurs first. We handle the resulting exception by removing the protection flags and letting the push execute. Since the memory protection flags were removed, push didn't have a chance to cause a write access exception, and therefore the address was not added to the list of written addresses. Later in the execution, pop causes a read access violation and the plugin checks whether the address has been written. Since the address is not on the list of written addresses, it mistakenly issues a read-before-write notification. The short answer is that we didn't catch the write access of push.

The current implementation wrongly assumes that an instruction causes either a read or a write access. As seen above, an instruction can in fact cause both read and write accesses. A possible fix would be to handle instructions with multiple accesses. However, I don't think it's a priority to implement at the moment, because it's possible to post-process the output log to mark or remove these cases when they are shown in IDA.
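The multi-access handling discussed above could be sketched roughly as follows. This is only an illustration and not the extension's actual code: the names ReadBeforeWriteTracker, MemAccess and on_instruction are hypothetical, and the addresses are made up. The idea is that every access of an instruction is processed in order, so the write half of push is recorded before pop reads the same slot.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Kind of memory access performed by an instruction.
enum class Access { Read, Write };

// One memory access: the address touched and whether it was read or written.
struct MemAccess {
    std::uintptr_t address;
    Access kind;
};

// Hypothetical tracker (an assumption for illustration, not the plugin's structure).
class ReadBeforeWriteTracker {
public:
    // Process all accesses of one instruction in execution order.
    // Returns the addresses that were read without a previously recorded write.
    std::vector<std::uintptr_t> on_instruction(const std::vector<MemAccess>& accesses) {
        std::vector<std::uintptr_t> reported;
        for (const MemAccess& a : accesses) {
            if (a.kind == Access::Write) {
                written_.insert(a.address);
            } else if (written_.count(a.address) == 0) {
                reported.push_back(a.address);
            }
        }
        return reported;
    }

private:
    std::set<std::uintptr_t> written_;  // list of written addresses
};

// Replays the push/pop case from the post with made-up addresses.
void push_then_pop_example() {
    const std::uintptr_t var_78 = 0x18ff40;  // illustrative address of the local
    const std::uintptr_t slot = 0x18ff30;    // illustrative address of the pushed value

    // Both accesses of push are recorded, so pop is not reported.
    ReadBeforeWriteTracker both;
    both.on_instruction({MemAccess{var_78, Access::Write}});  // earlier initialization
    both.on_instruction({MemAccess{var_78, Access::Read},
                         MemAccess{slot, Access::Write}});    // push [ebp+var_78]
    assert(both.on_instruction({MemAccess{slot, Access::Read}}).empty());  // pop ecx

    // Only the first access of push is recorded: pop is reported, a false positive.
    ReadBeforeWriteTracker first_only;
    first_only.on_instruction({MemAccess{var_78, Access::Write}});
    first_only.on_instruction({MemAccess{var_78, Access::Read}});
    assert(first_only.on_instruction({MemAccess{slot, Access::Read}}).size() == 1);
}
```

How the debugger would obtain both accesses of a single instruction (for example by decoding the operands) is left open here.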

August 20, 2012

Research notes of finding Stale Pointer bugs

Keywords: use-after-free, dangling pointer, stale pointer, invalid pointer dereference, double free, deleted object

In 2009 Mozilla fixed one of the vulnerabilities (CVE-2009-2467) I reported to them. It had allowed a freed object to be referenced, leading to code execution. Some time after this, in 2010, I got interested in researching this class of security bugs. I pursued a dynamic approach to discover them, but I wanted a different approach than feeding the application with fuzzed data. It was also a requirement to discover these problems in binaries without having source code or debug information available.

I thought it would be nice to make it visible when a pointer is dangling. I knew that detecting dangling pointers could come with lots of false positives, because some of them cannot be referenced from the execution flow.

Anyway, I came up with the following train of thought. malloc() is used to allocate a region of memory on the heap. The pointer referencing the region of memory can be on the stack or in another region of the heap. So we need to handle a dangling pointer differently depending on whether it is on the stack or on the heap.

I kept working on this approach and thought we need to maintain a structure recording which regions of memory were allocated and freed. I thought that when free() is called we could check whether there is a reference to the freed memory. Here is a skeleton of the approach I drew back in 2010 - it's unprofessional and not so important so you might wanna continue reading instead... :)

The solution above wouldn't have worked in practice, however. The conceptual problem is that the existence of a reference to a freed object doesn't mean the code that would use the pointer is reachable. When a region of memory is freed, a reference to the freed object might still exist even if you explicitly set the pointer to NULL. This is because the compiler may optimize the assignment out if the pointer cannot be reached afterwards. Another conceptual problem is the original idea itself: we depend on checking for references only when free() is called.

I discussed with people how this idea could be improved, but concluded that none of the solutions would be practically applicable. Possible improvements involve timing and applied static analysis. Static analysis in a dynamic approach might be an area to explore further, but this would require significant research effort and showed only a little benefit at that time.

The low-level constructs are complex, and in such a complex environment the playground is so big that I knew we could detect something that indicates the presence of a dangling pointer. I suspended working on this for a long time, however, keeping in mind that I had a solution that showed signs of working. It was just not optimal enough to use in practice.

A couple of weeks ago I started working on a debugger extension to place a data breakpoint on a memory region of arbitrary size. I had huge success and have already built two functionalities on it - both of them detect possible security problems. I thought, why not give it a try and explore the old dangling pointer project further using this new approach.

From the previous posts, you might know that it's possible to track data access and to determine the kind of access, which is either read or write. By hooking malloc() and free() we can maintain a list of allocated and freed memory blocks. When there is a read data access to a pointer to freed memory we can issue a notification: pointer to freed memory has been read.
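The bookkeeping described above might be sketched like this. It is a minimal illustration under stated assumptions, not the extension's implementation: HeapTracker and its member functions are hypothetical names, the hooks are assumed to deliver the block base and size, and block reuse is only handled when a new allocation starts at the same base as a freed block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical bookkeeping fed by hooks on malloc() and free().
class HeapTracker {
public:
    // Called from the malloc() hook with the returned block.
    void on_malloc(std::uintptr_t base, std::size_t size) {
        freed_.erase(base);  // simplification: reuse is detected by identical base only
        live_[base] = size;
    }

    // Called from the free() hook with the pointer being freed.
    void on_free(std::uintptr_t base) {
        auto it = live_.find(base);
        if (it == live_.end()) return;  // unknown block or double free: ignored here
        freed_[base] = it->second;
        live_.erase(it);
    }

    // True if value, interpreted as a pointer, points into a freed block.
    bool points_to_freed(std::uintptr_t value) const {
        auto it = freed_.upper_bound(value);  // first freed block starting after value
        if (it == freed_.begin()) return false;
        --it;                                 // freed block with the largest base <= value
        return value - it->first < it->second;
    }

private:
    std::map<std::uintptr_t, std::size_t> live_;   // allocated blocks: base -> size
    std::map<std::uintptr_t, std::size_t> freed_;  // freed blocks: base -> size
};

// Made-up addresses, for illustration only.
void freed_pointer_example() {
    HeapTracker heap;
    heap.on_malloc(0x2b1a40, sizeof(int));
    assert(!heap.points_to_freed(0x2b1a40));  // block is still allocated
    heap.on_free(0x2b1a40);
    assert(heap.points_to_freed(0x2b1a40));   // the value now points to freed memory
}
```

On a read data access, the value that was read would be passed to points_to_freed(), and a notification issued when it returns true.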

Here is an isolated code example that reads a pointer to freed memory.
#include <stdlib.h>

int *read_freed_ptr(void)
{
    int *ptr = (int *)malloc(sizeof(int));

    free(ptr);

    // read pointer to freed memory
    return ptr;
}
When a freed pointer is read, it might not cause a crash later in the execution, but it definitely could if the pointer is dereferenced. It is possible, for example, that a lot of fuzzing cases cause the application to read a freed pointer but the bugs remain undetected because the pointer is never dereferenced. I'm particularly interested in researching this approach further on JIT-emitted code.

Anyway, here is the assembly code for the above C code. I highlighted the area where the pointer to the freed memory is read (the mov at 0040101F, right after the call to free).
00401000 55                   push        ebp  
00401001 8B EC                mov         ebp,esp  
00401003 51                   push        ecx  
00401004 6A 04                push        4  
00401006 FF 15 A4 20 40 00    call        dword ptr [__imp__malloc (4020A4h)]  
0040100C 83 C4 04             add         esp,4  
0040100F 89 45 FC             mov         dword ptr [ptr],eax  
00401012 8B 45 FC             mov         eax,dword ptr [ptr]  
00401015 50                   push        eax  
00401016 FF 15 9C 20 40 00    call        dword ptr [__imp__free (40209Ch)]  
0040101C 83 C4 04             add         esp,4  
0040101F 8B 45 FC             mov         eax,dword ptr [ptr]  
00401022 8B E5                mov         esp,ebp  
00401024 5D                   pop         ebp  
00401025 C3                   ret
Detecting a read of a pointer to freed memory is possible, and a prototype implementation is a straightforward task.

--Attila Suszter (@reon_wi)

August 13, 2012

An approach to detect signedness conversion

Here come the details of the code that is built on top of the functionality for detecting uninitialized read access to stack memory.

With the current implementation it's possible to make visible how certain integers on the stack are treated: as signed or as unsigned. This means that if an integer is accessed from multiple locations in the execution flow, and we successfully determine how the integer is treated at each of those locations, we are able to tell whether the integer is treated both as signed and as unsigned.

When there is a read access to stack memory (a local variable read) and the instruction causing the read access exception is CMP, we read the next instruction. If the next instruction is one of the following, the comparison is signed, because these jumps are based on signed comparisons: JG, JGE, JL, JLE, JNG, JNGE, JNL, JNLE. If the next instruction is one of the following, the comparison is unsigned: JA, JAE, JB, JBE, JNA, JNAE, JNB, JNBE. Below is an example.
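Separately from the example referred to above, the classification rule can be sketched in code. This is an illustrative sketch and not the extension's source: classify_jump and SignednessTracker are hypothetical names, mnemonics are assumed to arrive in lowercase as in Windbg disassembly, and the jump mnemonics used in the example function are chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

enum class Signedness { Unknown, Signed, Unsigned };

// Classify the conditional jump that follows a CMP, using the two lists above.
Signedness classify_jump(const std::string& mnemonic) {
    static const char* const kSigned[] = {"jg", "jge", "jl", "jle",
                                          "jng", "jnge", "jnl", "jnle"};
    static const char* const kUnsigned[] = {"ja", "jae", "jb", "jbe",
                                            "jna", "jnae", "jnb", "jnbe"};
    for (const char* m : kSigned) {
        if (mnemonic == m) return Signedness::Signed;
    }
    for (const char* m : kUnsigned) {
        if (mnemonic == m) return Signedness::Unsigned;
    }
    return Signedness::Unknown;  // e.g. je/jne carry no signedness information
}

// Remembers, per memory address, which kinds of comparison have been seen.
class SignednessTracker {
public:
    // Records one comparison; returns true once the value at address has been
    // compared both as signed and as unsigned.
    bool record(std::uintptr_t address, Signedness s) {
        unsigned& flags = seen_[address];
        if (s == Signedness::Signed) flags |= 1u;
        if (s == Signedness::Unsigned) flags |= 2u;
        return flags == 3u;
    }

private:
    std::map<std::uintptr_t, unsigned> seen_;  // bit 0: signed, bit 1: unsigned
};

// A stack variable compared first as signed, then as unsigned.
void signedness_tracking_example() {
    SignednessTracker tracker;
    assert(!tracker.record(0x18ff40, classify_jump("jle")));  // signed only so far
    assert(tracker.record(0x18ff40, classify_jump("jbe")));   // now treated both ways
}
```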

Signedness conversion is not a vulnerability in itself but could easily become one. For example, a developer may eliminate a signed/unsigned mismatch compiler warning in an if() condition by using an explicit typecast. This could lead to a situation where the program works normally, but when the variable contains an unexpected value the execution continues on a different code path than it should, due to the signedness conversion.

The signedness conversion could happen implicitly, too.
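To make the point about diverging code paths concrete, here is a small sketch with hypothetical function names. It compares the same check done as a signed comparison and with an explicit cast to unsigned; an implicit conversion (comparing an int directly against an unsigned int) gives the same result as the cast version, only with a compiler warning.

```cpp
#include <cassert>

// Signed comparison: both operands are int.
bool below_limit_signed(int length, int limit) {
    return length < limit;
}

// The explicit cast silences the signed/unsigned mismatch warning,
// but the comparison is now unsigned.
bool below_limit_cast(int length, unsigned int limit) {
    return static_cast<unsigned int>(length) < limit;
}

void signedness_conversion_example() {
    // Expected values: both versions agree.
    assert(below_limit_signed(3, 16));
    assert(below_limit_cast(3, 16u));

    // Unexpected value: -1 converts to the largest unsigned int,
    // so the two versions take different paths.
    assert(below_limit_signed(-1, 16));
    assert(!below_limit_cast(-1, 16u));
}
```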

I wrote a test that you can see below to show how the code works.
// TestSignednessConversion.cpp : Example program for signedness conversion.
// Compile with optimization disabled.

#include "stdafx.h"

void set(int* value, int* value2)
{
    if (*value > 2)
    {
        *value = 2;
    }

    if (*value2 < 2)
    {
        *value2 = 2;
    }
}

void set2(unsigned int* value)
{
    if (*value > 2)
    {
        *value = 2;
    }
}

void set3(int* value2)
{
    if (*value2 < 0xffffffffU)
    {
        *value2 = 2;
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    int          value  = 3; // Treated both as signed and as unsigned (explicit cast)
    int          value2 = 1; // Treated both as signed and as unsigned (implicit)
    unsigned int value3 = 3; // Treated as unsigned only

    // value treated as signed
    // value2 treated as signed
    set(&value, &value2);

    // value treated as unsigned (explicit cast)
    set2((unsigned int*)&value);

    // value2 treated as unsigned (implicit)
    set3(&value2);

    // value3 treated as unsigned
    set2(&value3);

    return 0;
}
And here is the corresponding log produced by the Windbg extension. The log is verbose, showing both read and write stack memory accesses. I highlighted the parts where the signedness of the comparison is determined so you can match them to the source code above. You can also see when a variable is treated both as signed and as unsigned. U stands for unsigned comparison, S stands for signed comparison.
0:000> g wmain
eax=002b1a40 ebx=00000000 ecx=5d47471c edx=00000000 esi=00000001 edi=00403374
eip=00401070 esp=0018ff48 ebp=0018ff88 iopl=0         nv up ei pl zr na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
TestSignednessConversion!wmain:
00401070 55              push    ebp
0:000> l-t
Source options are 0:
    None
0:000> sxi av
0:000> !load c:\work\ext
0:000> !vprot esp
BaseAddress:       0018f000
AllocationBase:    00090000
AllocationProtect: 00000004  PAGE_READWRITE
RegionSize:        00001000
State:             00001000  MEM_COMMIT
Protect:           00000004  PAGE_READWRITE
Type:              00020000  MEM_PRIVATE
0:000> !region 56 18f000 1000
[Data Access] EIP=00401070 Data=0018ff44 W -
[Data Access] 00401070 55              push    ebp
[Data Access] EIP=00401076 Data=0018ff40 W -
[Data Access] 00401076 c745fc03000000  mov     dword ptr [ebp-4],3
[Data Access] EIP=0040107d Data=0018ff38 W -
[Data Access] 0040107d c745f401000000  mov     dword ptr [ebp-0Ch],1
[Data Access] EIP=00401084 Data=0018ff3c W -
[Data Access] 00401084 c745f803000000  mov     dword ptr [ebp-8],3
[Data Access] EIP=0040108e Data=0018ff34 W -
[Data Access] 0040108e 50              push    eax
[Data Access] EIP=00401092 Data=0018ff30 W -
[Data Access] 00401092 51              push    ecx
[Data Access] EIP=00401093 Data=0018ff2c W -
[Data Access] 00401093 e868ffffff      call    TestSignednessConversion!set (00401000)
[Data Access] EIP=00401000 Data=0018ff28 W -
[Data Access] 00401000 55              push    ebp
[Data Access] Value at 0018ff30 is read.
[Data Access] EIP=00401003 Data=0018ff30 R -
[Data Access] 00401003 8b4508          mov     eax,dword ptr [ebp+8]
[Data Access] Value at 0018ff40 is read.
[Data Access] EIP=00401006 Data=0018ff40 R S
[Data Access] 00401006 833802          cmp     dword ptr [eax],2
[Data Access] Value at 0018ff30 is read.
[Data Access] EIP=0040100b Data=0018ff30 R -
[Data Access] 0040100b 8b4d08          mov     ecx,dword ptr [ebp+8]
[Data Access] EIP=0040100e Data=0018ff40 W -
[Data Access] 0040100e c70102000000    mov     dword ptr [ecx],2
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=00401014 Data=0018ff34 R -
[Data Access] 00401014 8b550c          mov     edx,dword ptr [ebp+0Ch]
[Data Access] Value at 0018ff38 is read.
[Data Access] EIP=00401017 Data=0018ff38 R S
[Data Access] 00401017 833a02          cmp     dword ptr [edx],2
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=0040101c Data=0018ff34 R -
[Data Access] 0040101c 8b450c          mov     eax,dword ptr [ebp+0Ch]
[Data Access] EIP=0040101f Data=0018ff38 W -
[Data Access] 0040101f c70002000000    mov     dword ptr [eax],2
[Data Access] Value at 0018ff28 is read.
[Data Access] EIP=00401025 Data=0018ff28 R -
[Data Access] 00401025 5d              pop     ebp
[Data Access] Value at 0018ff2c is read.
[Data Access] EIP=00401026 Data=0018ff2c R -
[Data Access] 00401026 c3              ret
[Data Access] EIP=0040109e Data=0018ff34 W -
[Data Access] 0040109e 52              push    edx
[Data Access] EIP=0040109f Data=0018ff30 W -
[Data Access] 0040109f e88cffffff      call    TestSignednessConversion!set2 (00401030)
[Data Access] EIP=00401030 Data=0018ff2c W -
[Data Access] 00401030 55              push    ebp
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=00401033 Data=0018ff34 R -
[Data Access] 00401033 8b4508          mov     eax,dword ptr [ebp+8]
[Data Access] Value at 0018ff40 is treated both as signed and as unsigned. 
[Data Access] EIP=00401036 Data=0018ff40 R U
[Data Access] 00401036 833802          cmp     dword ptr [eax],2
[Data Access] Value at 0018ff2c is read.
[Data Access] EIP=00401044 Data=0018ff2c R -
[Data Access] 00401044 5d              pop     ebp
[Data Access] Value at 0018ff30 is read.
[Data Access] EIP=00401045 Data=0018ff30 R -
[Data Access] 00401045 c3              ret
[Data Access] EIP=004010aa Data=0018ff34 W -
[Data Access] 004010aa 50              push    eax
[Data Access] EIP=004010ab Data=0018ff30 W -
[Data Access] 004010ab e8a0ffffff      call    TestSignednessConversion!set3 (00401050)
[Data Access] EIP=00401050 Data=0018ff2c W -
[Data Access] 00401050 55              push    ebp
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=00401053 Data=0018ff34 R -
[Data Access] 00401053 8b4508          mov     eax,dword ptr [ebp+8]
[Data Access] Value at 0018ff38 is treated both as signed and as unsigned. 
[Data Access] EIP=00401056 Data=0018ff38 R U
[Data Access] 00401056 8338ff          cmp     dword ptr [eax],0FFFFFFFFh
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=0040105b Data=0018ff34 R -
[Data Access] 0040105b 8b4d08          mov     ecx,dword ptr [ebp+8]
[Data Access] EIP=0040105e Data=0018ff38 W -
[Data Access] 0040105e c70102000000    mov     dword ptr [ecx],2
[Data Access] Value at 0018ff2c is read.
[Data Access] EIP=00401064 Data=0018ff2c R -
[Data Access] 00401064 5d              pop     ebp
[Data Access] Value at 0018ff30 is read.
[Data Access] EIP=00401065 Data=0018ff30 R -
[Data Access] 00401065 c3              ret
[Data Access] EIP=004010b6 Data=0018ff34 W -
[Data Access] 004010b6 51              push    ecx
[Data Access] EIP=004010b7 Data=0018ff30 W -
[Data Access] 004010b7 e874ffffff      call    TestSignednessConversion!set2 (00401030)
[Data Access] EIP=00401030 Data=0018ff2c W -
[Data Access] 00401030 55              push    ebp
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=00401033 Data=0018ff34 R -
[Data Access] 00401033 8b4508          mov     eax,dword ptr [ebp+8]
[Data Access] Value at 0018ff3c is read.
[Data Access] EIP=00401036 Data=0018ff3c R U
[Data Access] 00401036 833802          cmp     dword ptr [eax],2
[Data Access] Value at 0018ff34 is read.
[Data Access] EIP=0040103b Data=0018ff34 R -
[Data Access] 0040103b 8b4d08          mov     ecx,dword ptr [ebp+8]
[Data Access] EIP=0040103e Data=0018ff3c W -
[Data Access] 0040103e c70102000000    mov     dword ptr [ecx],2
[Data Access] Value at 0018ff2c is read.
[Data Access] EIP=00401044 Data=0018ff2c R -
[Data Access] 00401044 5d              pop     ebp
[Data Access] Value at 0018ff30 is read.
[Data Access] EIP=00401045 Data=0018ff30 R -
[Data Access] 00401045 c3              ret
[Data Access] Value at 0018ff44 is read.
[Data Access] EIP=004010c3 Data=0018ff44 R -
[Data Access] 004010c3 5d              pop     ebp
[Data Access] Value at 0018ff48 is read.
[Data Access] EIP=004010c4 Data=0018ff48 R -
[Data Access] 004010c4 c3              ret
Break reason: 00000010
In the log, l-t is set to disable stepping to the next source line and to step to the next instruction instead. sxi av is used to let the event callback implementation handle the exception. 18f000 is the base address of the stack region and 1000 is its size. 56 sets some flags, including verbose logging.

August 8, 2012

Experiences with Signedness II

Data that is read from memory can be treated as a signed integer or an unsigned integer. It's possible that at one stage of the execution the integer is treated as unsigned, but at another point of the execution it's treated as signed. When writing code there can be circumstances when you are not immediately aware of how the integer is treated unless you take extra care, for example by looking at the compiled code. This is definitely an attack surface, and the root cause of lots of published vulnerabilities.

In case you want to see some examples of what I mean, earlier last year I wrote a little about my experiences with signed/unsigned comparisons.

Some time ago, I started developing a Windbg plugin command that has a tracing functionality and the ability to break into the debugger when a signed comparison is reached. However, if EIP is not in the user-defined range, e.g. due to an API call, the program executes normally. When EIP is in the user-defined range again, tracing resumes.
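The range-limited tracing described above can be modeled with a small simulation. This is only a sketch over a pre-recorded list of steps, not debugger code: Step, TraceRange and first_break are hypothetical names, and whether an instruction is a signed comparison is assumed to be known already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One executed instruction, reduced to what this sketch needs.
struct Step {
    std::uintptr_t eip;   // address of the instruction
    bool signed_compare;  // true if it is a CMP followed by a signed jump
};

// User-defined trace range [begin, end).
struct TraceRange {
    std::uintptr_t begin;
    std::uintptr_t end;
    bool contains(std::uintptr_t eip) const { return eip >= begin && eip < end; }
};

// Returns the index of the first step at which the tracer would break,
// or steps.size() if no signed comparison is reached inside the range.
std::size_t first_break(const std::vector<Step>& steps, const TraceRange& range) {
    assert(range.begin <= range.end);
    for (std::size_t i = 0; i < steps.size(); ++i) {
        if (!range.contains(steps[i].eip)) continue;  // outside: program runs normally
        if (steps[i].signed_compare) return i;        // inside: break on signed compare
    }
    return steps.size();
}
```

A signed comparison executed outside the range, for instance inside an API call, is skipped, and the break happens only on one inside the range.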

I was able to trace some functions in a Visual C++ project, but I needed to run the l-t command beforehand to step by assembly instructions rather than source lines. Here is how to use Windbg in VS.

This plugin can be extended to work with signed instructions other than signed comparisons. In addition, the plugin can be extended to execute the program until a comparison is reached rather than to trace, similar to how the ph command works.

One possible area to explore further is to record how data read from memory is treated in terms of signedness. Another is to detect weak points to attack, or even to detect signedness conversions.
This blog is written and maintained by Attila Suszter.