Manually Unpacking Malware (part 2)

Intro

This is a follow-up to the first post, Manually Unpacking Malware, where I described one way to break on the real entry point of a packed malware sample. This post covers a second technique, call stack backtracing: you find a system call to break on and then walk backward through the chain of function calls to reach the code where it originated.

The malware sample used in this blog post

MD5 hash: 5B47FBC2DA1868022F30317EC7DDA39F
VirusTotal link to sample: https://www.virustotal.com/gui/file/1bb8c898b249c47249c31d9901b12add9d307f027e5cbff97937769d7e5c732e/detection

Call stack backtracing

Definition (from Techopedia)

A stack trace is a report that provides information about program subroutines. It is commonly used for certain kinds of debugging, where a stack trace can help software engineers figure out where a problem lies or how various subroutines work together during execution.

A stack trace is also known as a stack traceback or a stack backtrace. [source]

A stack trace can usually be visualized graphically or textually to show how the functions in the chain call each other. We will use Immunity Debugger to do this.

For our purposes, the idea is to break at some point within the call stack and then walk backward to the root of the chain to find where it started.

An example fragment from our sample

The following is an example of one point in this piece of malware where it calls WinExec, which in turn calls CreateProcessA, which in turn calls CreateProcessInternalA.

Immunity Debugger's stack view shows how this is visualized. When a function makes a call, the CALL instruction pushes the address of the instruction immediately after it (the return address) onto the stack. These addresses appear in the image below next to the RETURN comments.
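The push performed by a CALL can be modeled in a few lines. The following is a minimal sketch, not debugger output: the stack is a plain Python dict, and the call-site address 0040172A and the starting ESP value are hypothetical, chosen only so that a 6-byte indirect call (the FF 15 form commonly used for imported functions) produces the return address 00401730 seen later in this post.

```python
def simulate_call(stack, esp, call_addr, call_len):
    """Model what an x86 CALL does to the stack.

    stack     -- dict mapping stack address -> 32-bit value (a toy model)
    esp       -- stack pointer value before the CALL executes
    call_addr -- address of the CALL instruction itself
    call_len  -- encoded length of the CALL (5 for E8 rel32, 6 for FF 15 [mem])
    """
    return_addr = call_addr + call_len  # address of the next instruction
    esp -= 4                            # the stack grows toward lower addresses
    stack[esp] = return_addr            # this is the value shown next to RETURN
    return esp, return_addr


# Hypothetical call site and stack pointer, for illustration only.
toy_stack = {}
new_esp, pushed = simulate_call(toy_stack, 0x0012FF00, 0x0040172A, 6)
print("ESP after call: %08X, return address: %08X" % (new_esp, pushed))
```

The return address always points just past the CALL, which is why following it in the disassembler lands on the instruction after the call site rather than on the call itself.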

Just above each return address is a stack memory address. Highlight it and press Enter to jump back to the previous stack frame and see what called it. Repeating this lets us traverse the call stack backward until we reach the starting code of the thread. Immunity Debugger also draws some lines to help identify these stack frames.

The green dots in the image below mark the stack addresses you can jump between to move from frame to frame.

Continuing, we jump to where the original KERNEL32.WinExec call occurred.
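The same frame-to-frame walk can be sketched outside the debugger. This is an illustrative model under stated assumptions: every function in the chain uses a standard EBP frame (saved EBP at [EBP], return address at [EBP+4]), the stack is represented as a dict, and all addresses except the return address 00401730 are made up. Code compiled with frame-pointer omission would not follow this layout.

```python
def walk_frames(memory, ebp, max_frames=64):
    """Follow the saved-EBP chain and collect (frame address, return address).

    memory -- dict mapping stack address -> 32-bit value (a toy model)
    ebp    -- frame pointer of the innermost frame, where the breakpoint hit
    """
    frames = []
    seen = set()
    while ebp in memory and ebp not in seen and len(frames) < max_frames:
        seen.add(ebp)                      # guard against a corrupted, looping chain
        return_addr = memory.get(ebp + 4)  # return address sits just below saved EBP
        if return_addr is None:
            break
        frames.append((ebp, return_addr))
        ebp = memory[ebp]                  # saved EBP points at the caller's frame
    return frames


# Hypothetical three-frame stack; only 0x00401730 comes from the sample.
toy_stack = {
    0x0012FE00: 0x0012FE40, 0x0012FE04: 0x00402A10,  # innermost frame
    0x0012FE40: 0x0012FF00, 0x0012FE44: 0x00401500,  # its caller
    0x0012FF00: 0x00000000, 0x0012FF04: 0x00401730,  # outermost frame
}
for frame, ret in walk_frames(toy_stack, 0x0012FE00):
    print("frame %08X returns to %08X" % (frame, ret))
```

The last entry in the list is the outermost frame, which is the one of interest when looking for where the thread's code started.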

Getting OEP in our sample via this technique

To find the true OEP (original entry point) address of the unpacked malware in our sample, we can look for calls to functions that typically run very early in the lifecycle of a Windows application. The following functions are possible candidates:

KERNEL32.GetStartupInfoA
KERNEL32.GetCommandLineA

Upon opening the malware binary in Immunity Debugger, setting a breakpoint on each of these two functions, and pressing F9, we land in the KERNEL32 memory section at the GetStartupInfoA function. In this case, it was called at the first level in the main thread, and the stack shows a return address of 00401730.

If we follow the return address 00401730, we can scroll up through the instructions to find the beginning of the function. You will see the instruction PUSH EBP in the prologue of the function's assembly. Another indicator that we are at the beginning is the RETN instruction a little above it, which ends the preceding function.
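Scrolling up to find the function start can also be expressed as a simple byte scan. The sketch below is a heuristic under stated assumptions, not a general disassembler: it looks backward from the return address for PUSH EBP; MOV EBP, ESP (either encoding) that directly follows a RETN or padding byte. The byte buffer and its base address are invented for the example; in practice the bytes would be dumped from the unpacked memory region.

```python
# PUSH EBP; MOV EBP, ESP in its two common encodings.
PROLOGUES = (b"\x55\x8b\xec", b"\x55\x89\xe5")


def find_function_start(code, base, return_addr):
    """Scan backward from return_addr for a likely function prologue.

    code        -- bytes dumped from memory
    base        -- virtual address of code[0]
    return_addr -- return address taken from the stack
    Returns the virtual address of the prologue, or None if none is found.
    """
    end = return_addr - base
    if not 0 < end <= len(code):
        raise ValueError("return address is outside the supplied code buffer")
    for idx in range(end - 3, -1, -1):
        if code[idx:idx + 3] not in PROLOGUES:
            continue
        # Accept if preceded by RETN (C3), INT3 or NOP padding, or at buffer start.
        if idx == 0 or code[idx - 1] in (0xC3, 0xCC, 0x90):
            return base + idx
        # Accept if preceded by RETN imm16 (C2 xx xx).
        if idx >= 3 and code[idx - 3] == 0xC2:
            return base + idx
    return None


# Hypothetical bytes: NOP NOP RETN, a prologue, PUSH -1, then a 6-byte CALL.
sample = b"\x90\x90\xc3" + b"\x55\x8b\xec\x6a\xff" + b"\xff\x15\x00\x20\x40\x00" + b"\x90"
start = find_function_start(sample, 0x00401722, 0x00401730)
print("candidate function start: %08X" % start)
```

A hit from a scan like this is only a candidate. It is worth confirming in the disassembler that the surrounding instructions look like a real function before treating the address as the OEP.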

Conclusion

The backtracing technique is useful in reverse engineering for many purposes beyond this one. A common use, besides finding where the unpacked code starts, is locating where strings or system calls are used elsewhere in the binary and understanding how and why they are used, without having to work through the entire binary to reach the most interesting parts.

A final caveat applies when multiple threads are running. Each thread has its own stack, so set breakpoints on the creation of new threads and backtrace each one individually.

Further Reading

Manually Unpacking Malware

Digging into obfuscated excel formula code

PEB/TEB/TIB Structure Offsets