Reverse engineering (Rev) is the art of figuring out how a system 'works'. Often you have little to work with, sometimes only a binary file, a datasheet or a user interface. Rev is one of the most crucial steps when attacking a device, server or program: you can only craft a successful attack when you understand the system. In many CTF categories (Web, Binary exploitation, etc.) Rev is an implicit step in the challenge. However, it is also often a separate category that focuses mostly on reverse engineering binary code.
Binary exploitation challenges introduce us to the wonderful world of low-level memory shenanigans and other forms of manipulation. Most exercises revolve around attacks on insecure C code, but you may also encounter other languages. For this training we will use challenges from PicoCTF.
For this training we will use challenges from Sofia Santos and Cipherstick. Sofia Santos has a great repository of image-based OSINT challenges on her website. Each exercise also has writeups (video and written) that explain how to solve it. The site is a great starting point if you are new to OSINT, but it also offers exercises for more experienced people.
For the Cipherstick challenges we will need to find missing persons. Each challenge has many smaller sub-challenges that incrementally lead to solving the mystery.
For today's training session we will use PicoCTF. This platform provides many interesting challenges on a variety of topics, for both beginners and experienced people. To view the exercises and submit flags you will need to create an account.
In general you can follow these steps when analyzing binary files:
Static analysis: use a decompiler and try to answer these questions:
Dynamic analysis: use a debugger and try to answer these questions:
Platform-specific analysis: use tools that are specifically designed to analyze software built for that platform (e.g. a Java decompiler if the binary was written in Java).
Tools:
YouTube guides/tutorials:
Example Rev challenge walkthroughs
There are many useful tools and techniques for solving binary exploitation challenges. In general you can use the following steps to attack programs:
Program memory layout:
Addr.
0000 +---------------+
     |               |
     |     .text     |  executable code
     |               |
     +---------------+
     |     .data     |  initialized data
     +---------------+
     |     .bss      |  uninitialized data
     +---------------+
     |               |  |
     |     Heap      |  |  grows to larger
     |               |  |  addresses
     +---------------+  V
     | unused memory |
     +---------------+  A
     |               |  |  grows to lower
     |     Stack     |  |  addresses
     |               |  |
1024 +---------------+
Size known at compile time:
- .text, .data, .bss
Size changes at runtime:
- Heap, unused memory, Stack
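To see these regions in a real process, a small sketch like the one below can collect one sample address from each region. The names (`region_samples`, `collect_addresses`, `print_addresses` and the two globals) are made up for illustration. The actual addresses and their relative order depend on the OS, the compiler and ASLR, so the output will not necessarily match the simplified diagram above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 42;   /* typically placed in .data */
int uninitialized_global;      /* typically placed in .bss  */

/* One sample address per memory region (hypothetical helper type). */
struct region_samples {
    uintptr_t text;   /* address of a function               */
    uintptr_t data;   /* address of an initialized global    */
    uintptr_t bss;    /* address of an uninitialized global  */
    uintptr_t heap;   /* address returned by malloc          */
    uintptr_t stack;  /* address of a local variable         */
};

struct region_samples collect_addresses(void) {
    struct region_samples s;
    int local = 0;                    /* lives in this function's stack frame */
    void *heap_block = malloc(16);    /* lives on the heap */

    s.text  = (uintptr_t)&collect_addresses;
    s.data  = (uintptr_t)&initialized_global;
    s.bss   = (uintptr_t)&uninitialized_global;
    s.heap  = (uintptr_t)heap_block;
    s.stack = (uintptr_t)&local;

    free(heap_block);
    return s;
}

void print_addresses(const struct region_samples *s) {
    printf(".text : 0x%" PRIxPTR "\n", s->text);
    printf(".data : 0x%" PRIxPTR "\n", s->data);
    printf(".bss  : 0x%" PRIxPTR "\n", s->bss);
    printf("heap  : 0x%" PRIxPTR "\n", s->heap);
    printf("stack : 0x%" PRIxPTR "\n", s->stack);
}
```

To try it, call `collect_addresses()` from your own `main` and pass the result to `print_addresses`. On Linux you can compare the printed values with the ranges listed in `/proc/<pid>/maps` for the running process.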
Stack layout:
#----------------
# PROGRAM
#----------------
void hello(int n) {
    // Do something cool
}

void greet() {
    hello(5); // say hello 5 times
}

int main() {
    greet();
    return 0;
}
#----------------
# Stack
#----------------
Addr.
XXXX +---------------+
     |               |
     | unused memory |
     |               |
     + - - - - - - - +  A
     |     Hello     |  |
     + - - - - - - - +  |  Stack grows up
     |     Greet     |  |
     + - - - - - - - +  |
     |     Main      |  |
1024 +---------------+
The stack consists of frames "stacked" on top of each other. Each frame corresponds to a function call and contains all the information necessary to execute that function (arguments, local variables and control flow data). The frame on top of the stack corresponds to the function that is currently being executed. When a function returns, its frame is removed from the stack. In the program shown above, the frames are pushed in the order main, greet, hello, and popped in the reverse order as each function returns.
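The push and pop order can be traced with a small sketch. It does not inspect the real stack: it only counts how many calls are active, using two hypothetical helpers (`frame_push`, `frame_pop`) that each function calls on entry and exit. `run_program` stands in for `main` from the program above.

```c
#include <stdio.h>
#include <string.h>

int depth = 0;        /* number of frames currently on the simulated stack */
int max_depth = 0;    /* deepest nesting seen so far */
char trace[256];      /* human-readable log of pushes and pops */

/* Called on function entry: a new frame goes on top of the stack. */
void frame_push(const char *name) {
    char line[64];
    depth++;
    if (depth > max_depth) {
        max_depth = depth;
    }
    snprintf(line, sizeof line, "push %s (depth %d)\n", name, depth);
    strncat(trace, line, sizeof trace - strlen(trace) - 1);
}

/* Called just before returning: the top frame is removed. */
void frame_pop(const char *name) {
    char line[64];
    snprintf(line, sizeof line, "pop  %s (depth %d)\n", name, depth);
    strncat(trace, line, sizeof trace - strlen(trace) - 1);
    depth--;
}

void hello(int n) {
    frame_push("hello");
    (void)n;              /* do something cool */
    frame_pop("hello");
}

void greet(void) {
    frame_push("greet");
    hello(5);
    frame_pop("greet");
}

/* Stands in for main() from the example program. */
int run_program(void) {
    frame_push("main");
    greet();
    frame_pop("main");
    return 0;
}
```

After `run_program()` returns, `trace` lists the frames in the order they were pushed (main, greet, hello) followed by the pops in reverse order, and `depth` is back at 0.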
Frame layout:
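Registers in the CPU:
- PC: Program counter, holds the address of the next instruction to execute
- SP: Stack pointer, points to the top of the stack (the most recently pushed value)
- BP: Base pointer, points to a fixed position inside the current frame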
#----------------
# PROGRAM
#----------------
int test(int a, int b) {
    int c = 5;
    int d = 18;
    return a + b + c + d;
}
#----------------
# FRAME
#----------------
Address      Test Frame
             +-----------+
[ebp - 8]    |     d     |  <-- SP
             + - - - - - +
[ebp - 4]    |     c     |  1st local variable
             + - - - - - +
[ ebp ]      |   oldBP   |  <-- BP
             + - - - - - +
[ebp + 4]    |return addr|
             + - - - - - +
[ebp + 8]    |     a     |  1st argument
             + - - - - - +
[ebp + 12]   |     b     |  2nd argument
             +-----------+
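When test returns, SP, BP and PC are updated: SP is reset to BP (discarding the local variables), BP is restored from the saved oldBP, and PC is set to the return address.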
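The register updates around a call to test can be sketched with a tiny simulated machine. This is a simplified model and not real compiler output: it assumes a 32-bit x86 style convention with 4-byte stack slots and arguments pushed right to left, and all names (`struct cpu`, `sim_call`, `sim_prologue`, `sim_body`, `sim_epilogue`) and the addresses used are made up for illustration.

```c
#include <stdint.h>

#define STACK_WORDS 64            /* simulated stack size in 4-byte words */

struct cpu {
    uint32_t pc;                  /* program counter              */
    uint32_t sp;                  /* stack pointer (byte address) */
    uint32_t bp;                  /* base pointer (byte address)  */
    uint32_t mem[STACK_WORDS];    /* simulated stack memory       */
};

void sim_push(struct cpu *c, uint32_t value) {
    c->sp -= 4;                   /* the stack grows to lower addresses */
    c->mem[c->sp / 4] = value;
}

uint32_t sim_pop(struct cpu *c) {
    uint32_t value = c->mem[c->sp / 4];
    c->sp += 4;
    return value;
}

uint32_t sim_load(const struct cpu *c, uint32_t addr) {
    return c->mem[addr / 4];
}

/* Caller side: push the arguments, then the return address, then jump. */
void sim_call(struct cpu *c, uint32_t a, uint32_t b,
              uint32_t target, uint32_t return_addr) {
    sim_push(c, b);               /* 2nd argument, ends up at [ebp + 12] */
    sim_push(c, a);               /* 1st argument, ends up at [ebp + 8]  */
    sim_push(c, return_addr);     /* return address, at [ebp + 4]        */
    c->pc = target;
}

/* Callee entry: save the caller's BP and make room for the locals. */
void sim_prologue(struct cpu *c) {
    sim_push(c, c->bp);           /* oldBP, at [ebp] */
    c->bp = c->sp;
    sim_push(c, 5);               /* c, at [ebp - 4] */
    sim_push(c, 18);              /* d, at [ebp - 8] */
}

/* Body of test(): every value is read relative to BP. */
uint32_t sim_body(const struct cpu *c) {
    return sim_load(c, c->bp + 8) + sim_load(c, c->bp + 12)
         + sim_load(c, c->bp - 4) + sim_load(c, c->bp - 8);
}

/* Callee exit: undo the prologue and jump back to the caller. */
void sim_epilogue(struct cpu *c) {
    c->sp = c->bp;                /* discard the locals        */
    c->bp = sim_pop(c);           /* restore the caller's BP   */
    c->pc = sim_pop(c);           /* continue after the call   */
}
```

In this model the two arguments are still on the stack after `sim_epilogue`, so removing them is left to the caller. The model also shows why overwriting the slot at [ebp + 4] is interesting for an attacker: whatever value sits there ends up in PC.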
Open Source Intelligence (OSINT) is a subset of "Intelligence" that uses only public sources to gather, evaluate and assess information about a target. In the context of CTFs this means that you use the internet itself to solve challenges. Tools like Google Maps and reverse image search typically prove very useful.
OSINT Framework
Site: overview of OSINT tools. This website contains a huge list of tools (other websites) categorized by the type of information you are searching for.
OSINT playbook
Site: The OSINT playbook: essential tools and tutorials for every analyst
There are many guides and tools online about various web attacks. Below we have compiled a small list of YouTube playlists related to this topic. hexdump's playlist is more educational in nature, while CTF School's and LiveOverflow's playlists lean a bit more towards the "entertaining side".
hexdump
YouTube playlist: detailed explanation of common web attacks
CTF School
YouTube playlist: random topics related to web exploitation
LiveOverflow
YouTube playlist: educational videos and some random topics
We highly recommend that you have access to a Linux OS when solving CTF challenges. Many tools are only supported on Linux (and sometimes Mac). There are a lot of different distros (Linux versions) that you can choose from. Some, such as Kali or ParrotOS, come preinstalled with many useful tools for CTFs. However, a more "general purpose" OS like Ubuntu also works perfectly fine.
(WSL) Windows Subsystem for Linux
Installation guide: What Is The Windows Subsystem for Linux (WSL) For?!
(VM) Virtual machine
Installation guide: you need to learn Virtual Machines RIGHT NOW!! (Kali Linux VM, Ubuntu, Windows)
(Native) Run Linux as a normal OS
Installation guide: How to Dual Boot Ubuntu 24.04 LTS and Windows 10 / 11