This project provides a framework for modeling and analyzing the behavior of parsers for executable file formats, such as those found in operating system loaders and reverse engineering tools.
The key problem we deal with: there is no reference implementation for parsing PE files, and there is no comprehensive specification of the PE file format. Reimplementation is the de facto rule, and there is a lot of room for implementation differences. This, in turn, leads to discrepancies between software that actually needs to load PE executables (e.g., the Windows OS) and reverse engineering / malware analysis tools. This is a problem: these discrepancies can be used to mislead malware analysis and reverse engineering tools.
Key contributions:
- We developed and released an analysis framework to systematically explore this problem and enumerate discrepancies among different software, especially OS loaders vs. reverse engineering / malware analysis tools.
- We developed and released models for various versions of Windows (XP, 7, 10) and reverse engineering tools (ClamAV, Yara, radare2).
- We can automatically validate and generate PE samples "exploiting" these discrepancies, thus tricking reverse engineering tools into extracting misleading information.
This project ships some ready-to-use models as well as the code of the analysis framework.
The models can be found in the dedicated submodule, while the interpreter's code for the custom language is in the modelLang directory.
The first step in the analysis consists in writing a "model" of the parser using the custom language supported by the framework.
Here follows an example extracted from the models of the Windows loader.
INPUT HEADER 2048 as DOSHeader
## Check the MZ magic number
V1: AND EQ HEADER.magic[0] "M" EQ HEADER.magic[1] "Z" term
V2: ULE (ADD HEADER.e_lfanew 0xf8) FILESIZE term
P: NT_HEADER <- HEADER[HEADER.e_lfanew, sizeof _IMAGE_NT_HEADERS] as _IMAGE_NT_HEADERS
## Check the PE magic number
V3: EQ NT_HEADER.Signature 0x4550 term
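For intuition, the three constraints above can be sketched in plain Python. The offsets follow the standard PE layout (e_lfanew is the little-endian DWORD at offset 0x3c of the DOS header); this is only an illustration of what the model expresses, not the framework's own verifier:

```python
import struct

def passes_model(data: bytes) -> bool:
    """Re-check the three example constraints on a raw byte buffer."""
    # V1: the DOS header must start with the MZ magic.
    if len(data) < 0x40 or data[0:2] != b"MZ":
        return False
    # V2: the NT headers (0xf8 bytes, as in the model) must fit in the file.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew + 0xF8 > len(data):
        return False
    # V3: the NT header must start with the PE signature 0x4550 ("PE\x00\x00").
    (signature,) = struct.unpack_from("<I", data, e_lfanew)
    return signature == 0x4550

# A minimal synthetic buffer that satisfies all three constraints.
buf = bytearray(0x40 + 0xF8)
buf[0:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x40)    # e_lfanew -> right after the DOS header
struct.pack_into("<I", buf, 0x40, 0x4550)  # PE signature
assert passes_model(bytes(buf))
assert not passes_model(b"ZM" + bytes(buf[2:]))  # broken MZ magic
```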
For more information about the modeling language, check out SPECIFICATIONS.md.
Given an executable and the model of a parser as input, the framework can determine whether the former meets the constraints of the latter; in other words, whether the modeled software considers the input file a valid executable. To check whether an executable meets the constraints of a model, you can run:
python3 verify.py <model file> <sample>
For example, if you want to check whether an unknown sample can run under Windows 10 by using the ready-to-use models in this project, you can launch:
python3 verify.py models/windows/10/MiCreateImageFile.lmod path/to/the/sample
The script returns 0 if the executable is valid, or 1 otherwise (in the latter case, the script also prints a line pointing to the violated constraint in the model).
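The exit-code convention makes the script easy to drive from other tools. A minimal wrapper sketch (the paths in the commented example are placeholders; adjust them to your checkout):

```python
import subprocess

def verdict(returncode: int) -> str:
    # verify.py's convention: 0 = the sample satisfies every constraint
    # in the model, 1 = at least one constraint is violated.
    return "valid" if returncode == 0 else "invalid"

def check_sample(model_path: str, sample_path: str) -> str:
    """Run verify.py on a sample and return "valid" or "invalid"."""
    proc = subprocess.run(
        ["python3", "verify.py", model_path, sample_path],
        capture_output=True, text=True,
    )
    return verdict(proc.returncode)

# Example (placeholder paths):
# check_sample("models/windows/10/MiCreateImageFile.lmod", "path/to/the/sample")
```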
The framework can create program headers that are valid according to one or more models.
The logic for generating valid samples is implemented in the generate.py
script, which can be invoked as follows:
python3 generate.py -A <model 1> [<model 2> [<model 3> ... ]]
For example, to generate a valid test case for Windows 7, you can run the following command that combines the models of both the kernel-space and user-space portions of its loader:
python3 generate.py -A models/windows/7/MiCreateImageFile.lmod models/windows/7/LdrpInitializeProcess.lmod
The output file can be specified with the -O flag (default: testcase).
Given two or more models, the framework can create program headers that are valid according to a subset of them but invalid for the others.
generate.py also implements differential test case generation and can be invoked with:
python3 generate.py -A <model 1> [<model 2> [...]] -N <model 3> [<model 4> [...]]
For example, to generate a sample that runs in Windows 10 but not in Windows 7, you can execute:
./generate.py -A models/windows/10/MiCreateImageFileMap.lmod models/windows/10/LdrpInitializeProcess.lmod -N models/windows/7/MiCreateImageFileMap.lmod models/windows/7/LdrpInitializeProcess.lmod
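Conceptually, differential generation searches for an input that satisfies every constraint of the -A models while violating at least one constraint of the -N models. A toy sketch with Python predicates standing in for model constraints (the constraint values are invented; the real framework encodes whole-header constraints for an SMT solver):

```python
# Each toy "model" is a list of constraints over a single candidate value.
model_a = [lambda x: x % 8 == 0, lambda x: x <= 0x200]  # "accept" side (-A)
model_b = [lambda x: x % 8 == 0, lambda x: x <= 0x100]  # "reject" side (-N)

def differential(accept, reject, search_space):
    """Find a value that is valid under `accept` but invalid under `reject`."""
    for x in search_space:
        if all(c(x) for c in accept) and not all(c(x) for c in reject):
            return x
    return None

witness = differential(model_a, model_b, range(0x400))
assert witness is not None
assert all(c(witness) for c in model_a)
assert not all(c(witness) for c in model_b)
```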
Given two models, the framework is able to create many different differential test cases, each exploiting a different discrepancy between the two models.
The differential.py script implements the logic for enumerating differences and can be invoked in the same way as generate.py.
Given one or more models, the framework can create test cases that cover all the possible configurations the models consider valid.
This technique is implemented in the explore_conditions.py script, which can be run with the following command:
python3 explore_conditions.py -M <model 1> [<model 2> [...]]
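The idea can be illustrated with a toy model over a few boolean header "features" (the feature names are invented for the example; the real script enumerates configurations over actual model constraints). One test case would be generated per configuration the model accepts:

```python
from itertools import product

def toy_model(rich_header, bound_imports, relocs):
    # Invented constraint: accept anything except "bound imports
    # without relocations". rich_header is left unconstrained.
    return relocs or not bound_imports

# Enumerate every configuration of the three features and keep the
# ones the model considers valid.
valid_configs = [cfg for cfg in product([False, True], repeat=3)
                 if toy_model(*cfg)]
assert len(valid_configs) == 6  # 8 combinations, 2 rejected
```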
The best way to start using this project is by creating a virtual environment.
mkvirtualenv --python=python3 models
This project uses Python 3-specific features and, as such, is unlikely to work with Python 2.
Most of the project dependencies can be installed using pip:
pip install -r requirements.txt
The z3 solver needs to be installed separately. On Ubuntu 20.04, you can do that with:
sudo apt install z3
This work was published at the 24th International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2021).
You can read the paper here.
If you want to cite this work in your academic paper, you can use this:
@inproceedings{10.1145/3471621.3471848,
author = {Nisi, Dario and Graziano, Mariano and Fratantonio, Yanick and Balzarotti, Davide},
title = {Lost in the Loader: The Many Faces of the Windows PE File Format},
year = {2021},
publisher = {Association for Computing Machinery},
booktitle = {24th International Symposium on Research in Attacks, Intrusions and Defenses},
location = {San Sebastian, Spain},
series = {RAID '21}
}
We will also present this project at Black Hat Europe 2021.