Advanced Static Analysis and Reverse Engineering
Unit 8 - Advanced Static Analysis and Reverse Engineering
Static Analysis
Basic Static Analysis
- Trying to understand a program (or in our case, a piece of malware), without actually running it
- Looking at the code / structure of program
Examples of this would be:
- Analysing the sample with AVs
- Hash signatures
- Strings
- Function names
Analysing the sample with AVs
- May confirm maliciousness
- Signature-based vs. Heuristic
- Lots of techniques to counter detection
Hash signatures
- Known malicious samples
- Sharing
- Label samples
- Reduce size of datasets for experiments
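As a minimal sketch of how a sample is fingerprinted, the snippet below hashes a byte string with Python's standard hashlib module. The helper name fingerprint and the demo bytes are illustrative assumptions, not part of the unit material.

```python
import hashlib


def fingerprint(data: bytes) -> dict[str, str]:
    """Return the MD5, SHA-1 and SHA-256 hex digests of a sample's bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Made-up bytes standing in for a sample read from disk.
original = b"MZ\x90\x00 demo sample"
patched = original + b"\x00"  # a single extra byte

print(fingerprint(original)["sha256"])
print(fingerprint(patched)["sha256"])  # differs from the digest above
```

Because any change to the bytes changes the digest, a hash identifies one exact file, which is why it suits labelling, sharing and de-duplicating samples.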
Strings
- IP addresses/URLs
- Function names
- DLLs
- Error messages
- Packer signatures
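The idea behind a strings-style tool can be sketched in a few lines: scan the raw bytes for runs of printable ASCII characters. The function name extract_strings, the minimum length of 4 and the demo bytes are assumptions chosen for illustration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


# Made-up bytes: a DLL name, a URL and a function name between non-printable bytes.
blob = b"\x00\x01kernel32.dll\x00\xff\xfehttp://example.test/c2\x00ab\x00VirtualAlloc\x90"

print(extract_strings(blob))
# ['kernel32.dll', 'http://example.test/c2', 'VirtualAlloc']
```

This sketch only finds single-byte ASCII text; wide (UTF-16) strings, which are common in Windows binaries, would need a second pattern.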
Function names
- Windows API function names capitalise the first letter of each word, e.g. SetLayout
- Imports
- Import functions from another program
- Exports
- Conversely, export functions to another program
- Link Libraries
- Static linking: the library code is copied into the program's own executable
- Runtime linking: the program loads a library only when one of its functions is needed
- Dynamic linking: the program imports its referenced libraries as soon as it starts
Obfuscation
But what happens if our malware was packed/obfuscated?
- AVs may not tag it as malicious
- Hash signatures will differ
- Not all strings will be shown
- Not all function names will be displayed, but some functions can hint that a sample is packed, e.g. GetProcAddress, VirtualAlloc
Portable Executable (PE) File Format
Basic structure of a Windows executable file
- PE header (metadata)
- Sections:
- .text
- .idata
- .data
- .rdata
- .edata
- .pdata
- .rsrc
- .reloc
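To make the header-plus-sections layout concrete, here is a rough sketch that walks the PE header and lists the section names. It relies on fixed offsets from the PE format: the offset of the PE signature is stored at 0x3C, the COFF file header is 20 bytes, and each section table entry is 40 bytes with the name in its first 8 bytes. The functions list_pe_sections and build_demo_pe are hypothetical helpers, and the synthetic image is just enough bytes to parse, not a runnable executable.

```python
import struct


def list_pe_sections(data: bytes) -> list[str]:
    """Return the section names recorded in a PE image's section table."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of "PE\0\0"
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    # COFF header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes in total).
    _machine, n_sections, _, _, _, opt_size, _ = struct.unpack_from("<HHIIIHH", data, coff)
    table = coff + 20 + opt_size  # the section table follows the optional header
    names = []
    for i in range(n_sections):
        raw = data[table + i * 40: table + i * 40 + 8]  # 8-byte, NUL-padded name
        names.append(raw.rstrip(b"\x00").decode("ascii", errors="replace"))
    return names


def build_demo_pe(names: list[str]) -> bytes:
    """Build a tiny synthetic image holding only the headers parsed above."""
    dos = bytearray(64)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)  # the PE signature follows the DOS header
    coff = struct.pack("<HHIIIHH", 0x14C, len(names), 0, 0, 0, 0, 0)
    sections = b"".join(n.encode("ascii").ljust(8, b"\x00") + bytes(32) for n in names)
    return bytes(dos) + b"PE\x00\x00" + coff + sections


print(list_pe_sections(build_demo_pe([".text", ".rdata", "UPX0"])))
# ['.text', '.rdata', 'UPX0']
```

A name such as UPX0, which the UPX packer uses, sitting next to the usual .text and .rdata is the kind of uncommon section that can hint at packing.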
Why do you need to know about it?
- Understand the inner workings of what you are working with
- Spot uncommon sections, which could raise suspicion
- Possibly identify packers/obfuscation methods
- Manual unpacking
- Modifying an executable
- Research
What do you need to remember?
- What the PE file format is and its structure
- What the PE header contains
- What some of the typical sections are
- What a malware analyst could find by looking at the PE header and sections of an executable
- Examples of related tools
Tools
- www.virustotal.com
- 'strings'
- md5sum/sha1sum/sha1deep/WinMD5/etc.
- PEiD: detecting packed files
- PEView: examining PE Files
- PE Browse: browsing a PE header
- PE Explorer: browsing a PE header
- ImpREC: rebuilding the Import Table
- LordPE: dumping an executable from memory
- Dependencies: exploring DLLs and functions imported by a piece of malware.
Reverse Engineering
Reversing
Reverse engineering is a process where an engineered artefact (such as a car, a jet engine, or a software program) is deconstructed in a way that reveals its innermost details, such as its design and architecture.
- i.e. dissecting a product to understand how it works
- An advanced static analysis technique
- Can vary from easy... to very challenging
- Requires a lot of patience
Why learn Reverse Engineering?
- Strengthen understanding (applies to literally everything)
- Penetration-Testing
- Problem solving skills
- Solving legacy system issues
- Benefits of looking at code written by others (think open source)
Why apply it to Malware Analysis?
- Basic static analysis not always sufficient
- Complex problems require complex solutions
Key Concepts
Machine code
- Result of compiling a high-level language
- Raw binary data
Low-level languages
- Machine code is too much hassle
- Hence, we use assembly
- Uses "mnemonics": human-readable names for opcodes
- Different instructions depending on platform
- Different syntax... e.g. Intel vs. AT&T
- Different processors require different approaches
High-level languages
- E.g. C, C++
- Easy to use and understand
Interpreted languages
- E.g. C#, Java
- Code → intermediate language (bytecode) → machine code
Processor and registers
Registers are small data storage units inside the processor
- Quick access
- Different registers serve different purposes
Description | 32-bit register | 16-bit register | 8-bit register |
---|---|---|---|
Extended Accumulator Register | EAX | AX | AH/AL |
Extended Base Register | EBX | BX | BH/BL |
Extended Counter Register | ECX | CX | CH/CL |
Extended Data Register | EDX | DX | DH/DL |
Extended Base Pointer | EBP | BP | |
Extended Stack Pointer | ESP | SP | |
Extended Source Index | ESI | SI | |
Extended Destination Index | EDI | DI | |
Description | Register(s) |
---|---|
Extended Instruction Pointer | EIP |
Represent the outcome of computations and control the operation of the CPU | EFLAGS |
Segment registers (used to describe different segments of memory) | CS, DS, ES, FS, GS, SS |
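The 32-, 16- and 8-bit names in the first table refer to overlapping parts of the same register. A minimal sketch with bit masks shows the relationship for EAX; the function name sub_registers and the example value are assumptions for illustration.

```python
def sub_registers(eax: int) -> dict[str, int]:
    """Split a 32-bit EAX value into the AX, AH and AL views of the same register."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF              # low 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,    # high byte of AX
        "AL": ax & 0xFF,           # low byte of AX
    }


for name, value in sub_registers(0xA9DC81F5).items():
    print(f"{name} = {value:#x}")
# EAX = 0xa9dc81f5
# AX = 0x81f5
# AH = 0x81
# AL = 0xf5
```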
Fetch and execute cycle
Main memory (RAM)
- Data: contains values
- Code: instructions to be fetched by the CPU to execute
- Heap: dynamic memory that the program allocates and frees while it runs
- Stack: local variables and function parameters; also helps control program flow
What could happen if an attacker corrupted the instruction pointer?
- Modify it so they could then execute their own code in memory
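The fetch and execute cycle, and the effect of a corrupted instruction pointer, can be illustrated with a toy machine. Everything here is hypothetical: the three-instruction set (mov, ret, halt), the run function and the program are a teaching sketch, not real x86 behaviour. The ret instruction loads the instruction pointer from a saved address on the stack, so overwriting that saved address redirects execution.

```python
def run(program, stack, max_steps=32):
    """Fetch and execute toy instructions; return the registers and the addresses executed."""
    regs = {"eax": 0}
    ip = 0                            # toy instruction pointer
    trace = []
    for _ in range(max_steps):
        op, *args = program[ip]       # fetch the instruction the pointer refers to
        trace.append(ip)
        ip += 1                       # by default, continue with the next instruction
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "ret":
            ip = stack.pop()          # the saved return address becomes the new pointer
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs, trace


PROGRAM = [
    ("mov", "eax", 1),                # 0: normal work
    ("ret",),                         # 1: return to the saved address
    ("halt",),                        # 2: legitimate return target
    ("mov", "eax", 0x41414141),       # 3: stand-in for attacker-supplied code
    ("halt",),                        # 4
]

print(run(PROGRAM, stack=[2]))        # intact return address
print(run(PROGRAM, stack=[3]))        # overwritten return address
```

With the intact return address the trace is [0, 1, 2]; with the overwritten one it becomes [0, 1, 3, 4], and the attacker's instruction has run.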
Assembly Primers
Instructions
- mnemonics, e.g. ADD, MOV, SUB, XOR, INC
- Take zero or more operands, e.g. RET (usually none), POP EAX (one), MOV EAX, EBX (two)
Operands
- Registers
- Values
- Memory addresses
- x86 → little-endian
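Little-endian means the least significant byte of a multi-byte value is stored at the lowest address. A short sketch using Python's struct module shows how a 32-bit address appears in memory on x86; the helper names are illustrative assumptions.

```python
import struct


def to_little_endian(value: int) -> bytes:
    """Encode a 32-bit value as it is laid out in memory on x86."""
    return struct.pack("<I", value)


def from_little_endian(raw: bytes) -> int:
    """Decode 4 little-endian bytes back into a 32-bit value."""
    return struct.unpack("<I", raw)[0]


raw = to_little_endian(0x004037C4)
print(raw.hex(" "))                   # c4 37 40 00
print(hex(from_little_endian(raw)))   # 0x4037c4
```

This is why an address such as 0x004037C4 shows up as C4 37 40 00 in a hex dump of an x86 binary.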
Syntax
- A source and a destination
- Intel (<mnemonic> <dst>,<src>)
- AT&T (<mnemonic> <src>,<dst>)
Simple instructions
Instruction | Description |
---|---|
mov eax, ebx | Copies the contents of EBX into the EAX register |
mov eax, 0x42 | Copies the value 0x42 into the EAX register |
mov eax, [0x4037C4] | Copies the 4 bytes at the memory location 0x4037C4 into the EAX register |
mov eax, [ebx] | Copies the 4 bytes at the memory location specified by the EBX register into the EAX register |
mov eax, [ebx+esi*4] | Copies the 4 bytes at the memory location specified by the result of the equation ebx+esi*4 into the EAX register |
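The bracketed operands are memory reads: the expression inside the brackets is evaluated to an address, and 4 bytes are loaded from it. The sketch below models this with a bytearray standing in for memory. The helper names, the 64-byte memory and the table contents are assumptions for illustration.

```python
import struct


def effective_address(base: int, index: int = 0, scale: int = 1, disp: int = 0) -> int:
    """Compute base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFF


def load_dword(memory, address: int) -> int:
    """Read 4 little-endian bytes at address, as mov eax, [address] would."""
    return struct.unpack_from("<I", memory, address)[0]


memory = bytearray(64)
# An array of four 32-bit values placed at address 16.
struct.pack_into("<4I", memory, 16, 0x11111111, 0x22222222, 0x33333333, 0x44444444)

ebx, esi = 16, 2                                          # array base and element index
eax = load_dword(memory, effective_address(ebx, esi, 4))  # mov eax, [ebx+esi*4]
print(hex(eax))                                           # 0x33333333
```

The ebx+esi*4 form is typical of indexing into an array of 4-byte elements, with EBX holding the base address and ESI the index.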
Arithmetic
Instruction | Description |
---|---|
sub eax, 0x10 | Subtracts 0x10 from EAX |
add eax, ebx | Adds EBX to EAX and stores the result in EAX |
inc edx | Increments EDX by 1 |
dec edx | Decrements EDX by 1 |
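mul ecx | Multiplies EAX by ECX (e.g. ECX = 0x50) and stores the 64-bit result in EDX:EAX; MUL takes a register or memory operand, not an immediate value |
div ecx | Divides EDX:EAX by ECX (e.g. ECX = 0x75) and stores the quotient in EAX and the remainder in EDX; DIV also takes a register or memory operand |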
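MUL and DIV are the least obvious rows because they use the EDX:EAX register pair implicitly. A minimal sketch of the unsigned 32-bit forms follows; the function names mul32 and div32 are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax: int, operand: int) -> tuple[int, int]:
    """Unsigned multiply: return (EDX, EAX), the high and low halves of the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def div32(edx: int, eax: int, operand: int) -> tuple[int, int]:
    """Unsigned divide of EDX:EAX by operand: return (EAX, EDX) = (quotient, remainder)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand & MASK32)
    if quotient > MASK32:
        # The real CPU raises a divide error when the quotient does not fit in EAX.
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder


print([hex(v) for v in mul32(0x80000000, 0x50)])  # ['0x28', '0x0']
print(div32(0, 1000, 0x75))                       # (8, 64)
```

The first call shows why the result needs two registers: 0x80000000 * 0x50 does not fit in 32 bits, so the upper half lands in EDX.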
Instruction | Description |
---|---|
xor eax, eax | Clears the EAX register |
or eax, 0x7575 | Performs the logical or operation on EAX with 0x7575 |
mov eax, 0xA; shl eax, 2 | Shifts the EAX register to the left 2 bits; these two instructions result in EAX = 0x28, because 1010 (0xA in binary) shifted 2 bits left is 101000 (0x28) |
mov bl, 0xA; ror bl, 2 | Rotates the BL register to the right 2 bits; these two instructions result in BL = 10000010, because 1010 rotated 2 bits right is 10000010 |
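The shift and rotate results can be checked with a short sketch. The helpers shl32 and ror8 are illustrative names that model a 32-bit shift and an 8-bit rotate respectively.

```python
def shl32(value: int, count: int) -> int:
    """Shift left within 32 bits; bits shifted past bit 31 are lost."""
    return (value << count) & 0xFFFFFFFF


def ror8(value: int, count: int) -> int:
    """Rotate right within 8 bits; bits leaving the low end re-enter at the high end."""
    count %= 8
    value &= 0xFF
    return ((value >> count) | (value << (8 - count))) & 0xFF


eax = 0x12345678
print(hex(eax ^ eax))              # 0x0       (xor eax, eax clears the register)
print(hex(shl32(0xA, 2)))          # 0x28      (mov eax, 0xA; shl eax, 2)
print(format(ror8(0xA, 2), "08b")) # 10000010  (mov bl, 0xA; ror bl, 2)
```

The rotate result depends on the register width: the two low bits of 00001010 wrap around to the top of the 8-bit BL register.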
Conditionals
cmp dst, src - The CMP instruction performs the same subtraction as SUB (dst - src) but discards the result, so the operands are unchanged and only the flags are updated. The zero flag (ZF) and carry flag (CF) are the two of most interest here.
cmp dst, src | ZF | CF |
---|---|---|
dst = src | 1 | 0 |
dst < src | 0 | 1 |
dst > src | 0 | 0 |
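test dst, src - The TEST instruction performs a bitwise AND of the two operands but discards the result and only updates the flags. ZF = 1 if the result is zero, i.e. no bit is set in both operands. A common idiom is test eax, eax, which checks whether EAX is zero.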
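The flag outcomes of CMP and TEST can be modelled in a few lines. This sketch covers only ZF and CF for unsigned 32-bit operands; the function names flags_after_cmp and flags_after_test are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def flags_after_cmp(dst: int, src: int) -> dict[str, int]:
    """Model cmp dst, src: compute dst - src, keep only the flags."""
    dst &= MASK32
    src &= MASK32
    return {
        "ZF": int(dst == src),   # the difference is zero
        "CF": int(dst < src),    # an unsigned borrow was needed
    }


def flags_after_test(dst: int, src: int) -> dict[str, int]:
    """Model test dst, src: compute dst AND src, keep only the flags."""
    return {"ZF": int((dst & src & MASK32) == 0)}


print(flags_after_cmp(5, 5))    # {'ZF': 1, 'CF': 0}
print(flags_after_cmp(3, 5))    # {'ZF': 0, 'CF': 1}
print(flags_after_cmp(7, 5))    # {'ZF': 0, 'CF': 0}
print(flags_after_test(0, 0))   # {'ZF': 1}
```

The three cmp results match the three rows of the ZF/CF table, and the last line models the test eax, eax idiom with EAX = 0.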
Branching
Instruction | Description |
---|---|
jz loc | Jump to specified location if ZF = 1 |
jnz loc | Jump to specified location if ZF = 0 |
je loc | Same as jz, but commonly used after a cmp instruction. Jump will occur if the destination operand equals the source operand |
jne loc | Same as jnz, but commonly used after a cmp instruction. Jump will occur if the destination operand is not equal to the source operand |
jg loc | Performs signed comparison jump after a cmp if the destination operand is greater than the source operand |
jge loc | Performs signed comparison jump after a cmp if the destination operand is greater than or equal to the source operand |
ja loc | Same as jg, but an unsigned comparison is performed |
jae loc | Same as jge, but an unsigned comparison is performed |
jl loc | Performs signed comparison jump after a cmp if the destination operand is less than the source operand |
jle loc | Performs signed comparison jump after a cmp if the destination operand is less than or equal to the source operand |
jb loc | Same as jl, but an unsigned comparison is performed |
jbe loc | Same as jle, but an unsigned comparison is performed |
jo loc | Jump if the previous instruction set the overflow flag (OF = 1) |
js loc | Jump if the sign flag is set (SF = 1) |
jecxz loc | Jump to location if ECX = 0 |
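The signed and unsigned jump families differ only in how the same 32 bits are interpreted. The sketch below decides whether a jump following cmp dst, src would be taken; jump_taken and to_signed32 are hypothetical helpers, and it compares the operand values directly rather than modelling the individual flags.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed number."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value


def jump_taken(mnemonic: str, dst: int, src: int) -> bool:
    """Return True if the conditional jump after cmp dst, src would be taken."""
    u_dst, u_src = dst & MASK32, src & MASK32
    s_dst, s_src = to_signed32(dst), to_signed32(src)
    conditions = {
        "je": u_dst == u_src, "jne": u_dst != u_src,
        # signed comparisons
        "jg": s_dst > s_src, "jge": s_dst >= s_src,
        "jl": s_dst < s_src, "jle": s_dst <= s_src,
        # unsigned comparisons
        "ja": u_dst > u_src, "jae": u_dst >= u_src,
        "jb": u_dst < u_src, "jbe": u_dst <= u_src,
    }
    return conditions[mnemonic]


# 0xFFFFFFFF is -1 when signed but 4294967295 when unsigned.
print(jump_taken("jg", 0xFFFFFFFF, 1))  # False
print(jump_taken("ja", 0xFFFFFFFF, 1))  # True
print(jump_taken("jl", 0xFFFFFFFF, 1))  # True
```

Spotting which family the compiler emitted is a useful clue when reversing: ja/jb suggest unsigned variables, jg/jl suggest signed ones.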
Recognising IF Statements
Recognising FOR loop
Recognising WHILE loop
Recognising SWITCH statements
"Hello World" Example
Current Research
- Eilam, E. (2011). Reversing: Secrets of Reverse Engineering. John Wiley & Sons.
- Sikorski, M., & Honig, A. (2012). Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software. No Starch Press.
- http://www.peter-cockerell.net/aalp/html/frames.html
- Penetration-Testing module: Lecture 5 Exploitation/Vulnerability Validation