In this blog series, I will be putting the spotlight on useful Ghidra features you may have missed. Each post will look at a different feature and show how it helps you save time and be more effective in your reverse engineering workflows. Ghidra is an incredibly powerful tool, but much of this power comes from knowing how to use it effectively.
There are several circumstances where it can be helpful to make a modification to code or data within a compiled program. Sometimes, it is necessary to fix a vulnerability or compatibility issue without functional source code or compilers. This can happen when source code gets lost, systems go out of support, or software firms go out of business. In case you should find yourself in this situation, keep calm and read on to learn how to do this within Ghidra.
Until recently, Ghidra was rather limited in this capability. This changed with the summer 2021 release of Ghidra 10.0, which introduced the ability to export programs with proper executable formats for Windows (PE) and Linux (ELF). Ghidra versions before 10.0, and executable formats other than PE and ELF, require a raw import and raw export, an approach that is generally far less robust. In this post, I will review a Windows x86 executable, but the general strategy is applicable more broadly with some nuances for specific platforms and architectures.
The first step for preparing a program patch is to gauge the complexity/length of the required patch and identify roughly where it needs to be inserted. If the patch is short enough, it may be possible to directly replace existing code inline. Patches introducing completely new functionality generally cannot be written inline and will require a different strategy. In this scenario, we must locate unused bytes which are loaded from the program file into executable memory space. These code caves are commonly generated when an executable section requires specific byte alignment. Longer patches can be written into a code cave along with appropriate branching instructions to insert the patch code into the right code path.
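As a rough illustration of the code cave search described above, here is a minimal Python sketch that scans a byte buffer for long runs of a single filler byte. The function name `find_code_caves`, the default minimum length, and the choice of `0x00` and `0xCC` as filler bytes are assumptions made for this example; they are not something Ghidra provides. The sketch only looks at raw bytes, so you would still need to confirm that a candidate run sits inside a section that is mapped as executable.

```python
def find_code_caves(data: bytes, min_len: int = 16, fillers: bytes = b"\x00\xcc"):
    """Return (offset, length, filler_byte) for each run of one filler byte
    that is at least min_len bytes long.

    Offsets are positions within `data`, not virtual addresses.
    """
    caves = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b in fillers:
            # Extend the run while the same filler byte repeats.
            j = i
            while j < n and data[j] == b:
                j += 1
            if j - i >= min_len:
                caves.append((i, j - i, b))
            i = j
        else:
            i += 1
    return caves


# Hypothetical section contents: a few instruction bytes followed by padding.
sample = b"\x55\x8b\xec" + b"\x90" * 3 + b"\xc3" + b"\x00" * 32 + b"\xcc" * 8
caves = find_code_caves(sample)
```

With the sample buffer above, only the 32-byte run of zeros qualifies; the 8-byte run of `0xCC` falls below the default minimum length. In practice you would pass in the bytes of an executable section and translate each offset back to an address before writing a patch there.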
Let’s take an example to see this process in action. In case you haven’t seen them, MalwareTech has a fun set of reversing and exploitation challenges available online. Each reversing challenge presents an executable which, when executed, will display a message box containing the MD5 sum of a secret flag string. You are expected to recover the flag string using only static analysis techniques, but for this blog, we will be altering and then running the challenge program to directly print the flag. (Don’t worry, it’s not cheating if it is in the name of science, right?)
In this post, I will use the shellcode2 challenge, and I encourage readers to follow along and then attempt to repeat the process with a different challenge file. The objective for our patch is to reveal the flag value after it has been decoded by the shellcode and before it has been hashed. Let’s start by looking at how shellcode2.exe_ is structured:
In this snippet, we see local_bc being initialized as an MD5 object followed by the construction of a stack string. When looking at the end of the entry function, we can see where the flag is hashed and the message box is created:
In this snippet, the MD5 object at local_bc is being referenced to invoke the MD5::digestString() method with the address of local_2c as input. A reference to the resulting hash is stored at local_c0. The instructions from 4023a2-4023b2 pass this value into the MessageBoxA API call with a particular window title and style.
The first patch we’ll look at changes the arguments to MessageBoxA so that it prints the value from local_2c rather than the value referred to by local_c0. The address of the hash is loaded into EAX with the MOV instruction at 4023a9 and then pushed to the stack as an argument for MessageBoxA. This will need to be patched so that the address of local_2c is pushed instead. The LEA (Load Effective Address) instruction allows us to do just that.
Begin by right-clicking the MOV instruction and selecting Patch Instruction:
The instruction will change to an editable field with autocompletion:
Patch this to be LEA with the operands EAX, [EBP + -0x28] so that EAX receives the address of local_2c:
Note that the offset from EBP is -0x28 rather than -0x2c. Ghidra names local_2c by its offset from the stack pointer at function entry, but the original EBP is pushed to the stack before EBP is loaded with the new stack pointer, so EBP sits 4 bytes below that reference point. The resulting offset is converted to its two’s complement as shown here:
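The offset arithmetic can also be checked outside Ghidra. The following Python sketch assumes a standard 32-bit x86 prologue (PUSH EBP followed by MOV EBP, ESP) and assumes that the number in a Ghidra local name is the distance below the stack pointer at function entry. Both helper names are hypothetical and exist only for this example.

```python
def ebp_offset_for_ghidra_local(local_offset: int, pushed_before_frame: int = 4) -> int:
    """Convert the number in a Ghidra local name (e.g. 0x2c for local_2c)
    into an EBP-relative offset.

    Assumes PUSH EBP (4 bytes) runs before MOV EBP, ESP, so EBP ends up
    pushed_before_frame bytes below the entry stack pointer.
    """
    return -local_offset + pushed_before_frame


def twos_complement(value: int, bits: int = 8) -> int:
    """Encode a signed integer as an unsigned two's complement value."""
    return value & ((1 << bits) - 1)


offset = ebp_offset_for_ghidra_local(0x2C)  # EBP-relative offset of local_2c
disp8 = twos_complement(offset, 8)          # as an 8-bit displacement
disp32 = twos_complement(offset, 32)        # as a 32-bit displacement

print(hex(offset), hex(disp8), hex(disp32))
```

Under those assumptions the script prints `-0x28 0xd8 0xffffffd8`. Whether the assembled instruction carries the 8-bit or the 32-bit form of the displacement depends on the encoding chosen when the instruction is patched.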
The program can now be exported from the File -> Export Program menu as PE format. Running the exe file produces our new MessageBoxA: