why not ¯\_(ツ)_/¯
One of my biggest projects, where I learned a lot about LLVM internals, binary formats, assembly and obfuscation techniques.
I believe that learning through building is the best way to learn, thus I built this project to learn more about these topics.
I have written 4 blog posts explaining the concepts behind Zyrox:
- Part I: Building Zyrox: A Custom LLVM Obfuscator
- Part II: Control Flow Flattening
- Part III: Encrypted Jump Tables
- Part IV: The Finale
These parts go deeper than this readme, and are definitely worth a read if you are interested in the topic.
This is intended for anyone who wants to quickly test Zyrox, or learn how to integrate it into a CMake project.
Follow the steps in the Zyrox Template repo.
Install LLVM:
sudo apt update
sudo apt install llvm-18 llvm-18-dev clang-18
Clone and compile Zyrox:
git clone --recurse-submodules https://github.com/PeterHackz/zyrox.git
cd zyrox
cmake -S . -B build -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++
cmake --build build --parallel 4
Make sure you have python3 and pip installed.
# Create a virtual environment
python3 -m venv .venv
# Activate the env
source .venv/bin/activate
pip install -r requirements.txt
Compile your code with the plugin loaded:
clang -O0 -flto=full -c main.c -o out/main.o
clang -flto=full -fuse-ld=lld -Wl,--load-pass-plugin=./build/libzyrox.so out/main.o -o out/main
After obfuscation, run PyPlugin.py to encrypt jump tables:
# if you installed dependencies in a virtual environment, activate it first:
source .venv/bin/activate
# then run with:
python PyPlugin.py --in=<input_file> [--out=<output_file>] [--tables=<zyrox_tables_file>] [--android]
Check out the Zyrox Template repo for an example CMake integration.
I get that this is a complex topic, and this project was mostly for educational purposes, as well as to serve BSD Brawl. If you have any questions, or just want to chat, feel free to reach out to me:
- Discord: @s.b
- Email: [email protected] or [email protected]
- Discord Server
Any help, through pull requests or issues, is appreciated!
ZyroxPlugin.cpp registers the pass, then links siphash (more on this later) and calls StringEncryption to encrypt
strings.
Strings are encrypted early so that the decryption logic itself gets obfuscated by the later passes.
It then calls ModuleUtils::ExpandCustomAnnotations
and QuickConfig::RegisterPasses to parse all __attribute__((annotate("..."))) expressions and run the
QuickJS config (located in ZyroxConfig.js).
Every function is obfuscated by calling Zyrox::RunOnFunction, located in ZyroxCore.cpp;
more documentation about this will be provided in the future.
Switches create jump tables and PHI nodes are annoying to deal with, so we use FunctionUtils and BasicBlockUtils
to flatten switches (into if statements) and demote PHI nodes, respectively.
oh man, where do I start
- Basic Block Splitter
- Control Flow Flattening
- Indirect Branching
- Simple Indirect Branching
- Mixed Boolean Arithmetic
All js-plugin arguments are listed in index.d.ts, so they are not covered in this documentation.
For annotation documentation, see the annotations part at the end of this readme.
This pass splits a basic block into smaller ones and shuffles them. Suppose we have this:
int __test_fn(int x)
{
    if (x == 2) {
        printf("x is 2\n");
    } else {
        printf("x is not 2!, x is: %d\n", x);
    }
    return x + 4 * x - 2 / 4;
}
Which gets compiled into:
define internal i32 @__test_fn(i32 noundef %0) #0 !zyrox !8 !obfuscated !11 {
%2 = alloca i32, align 4
store i32 %0, ptr %2, align 4
%3 = load i32, ptr %2, align 4
%4 = icmp eq i32 %3, 2
br i1 %4, label %5, label %7
5: ; preds = %1
%6 = call i32 (ptr, ...) @printf(ptr noundef @.str.1)
br label %10
7: ; preds = %1
%8 = load i32, ptr %2, align 4
%9 = call i32 (ptr, ...) @printf(ptr noundef @.str.2, i32 noundef %8)
br label %10
10: ; preds = %7, %5
%11 = load i32, ptr %2, align 4
%12 = load i32, ptr %2, align 4
%13 = mul nsw i32 4, %12
%14 = add nsw i32 %11, %13
%15 = sub nsw i32 %14, 0
ret i32 %15
}
when using Basic Block Splitter with this config:
z.RegisterPass(ObfuscationType.BasicBlockSplitter, {
PassIterations: 1,
"BasicBlockSplitter.SplitBlockChance": 100,
"BasicBlockSplitter.SplitBlockMinSize": 2,
"BasicBlockSplitter.SplitBlockMaxSize": 5,
});
It becomes:
define internal i32 @__test_fn(i32 noundef %0) #0 !zyrox !8 !obfuscated !11 {
%2 = alloca i32, align 4
store i32 %0, ptr %2, align 4
%3 = load i32, ptr %2, align 4
%4 = icmp eq i32 %3, 2
br i1 %4, label %5, label %14
5: ; preds = %1
%6 = call i32 (ptr, ...) @printf(ptr noundef @.str.1)
br label %7
7: ; preds = %14, %5
%8 = load i32, ptr %2, align 4
%9 = load i32, ptr %2, align 4
%10 = mul nsw i32 4, %9
%11 = add nsw i32 %8, %10
br label %12
12: ; preds = %7
%13 = sub nsw i32 %11, 0
ret i32 %13
14: ; preds = %1
%15 = load i32, ptr %2, align 4
%16 = call i32 (ptr, ...) @printf(ptr noundef @.str.2, i32 noundef %15)
br label %7
}
The result is not very different for such a small function, but notice how a basic block was split. This is helpful when combined with other passes such as Control Flow Flattening.
Oh man, this pass has the most features of them all lol. I will start by explaining how it works, then cover its config. Suppose we have this code:
LABEL_A: bool b = x == 2;
IF EQ: goto LABEL_B
goto LABEL_C
LABEL_B do_stuff()
LABEL_C do_other_stuff()
goto LABEL_A
Each basic block (A, B and C) gets assigned a unique dispatcher state. Example (simplified):
states = {
1: LABEL_A,
2: LABEL_B,
3: LABEL_C,
};
Then we inject a dispatcher block that controls everything, and the code becomes:
int state = 1; // start at LABEL_A
LABEL_D goto LABEL_CA // dispatcher label jumps to first condition block, label condition A
LABEL_CA if state == 1: goto LABEL_A
// if not 1, go to check if it is label B (fallback)
LABEL_CB if state == 2: goto LABEL_B
LABEL_CC if state == 3: goto LABEL_C
// unreachable
goto LABEL_D
LABEL_A: bool b = x == 2;
// IF EQ: goto LABEL_B
// goto LABEL_C
state = 2 if b else 3 // update state for the block we want and back to dispatcher
goto LABEL_D
LABEL_B do_stuff()
LABEL_C do_other_stuff()
state = 1
goto LABEL_D
This simple version has some flaws that the obfuscator fixes. As you can see, since we have a single dispatcher variable, it is easy to deobfuscate this, because we know where a block is going after it sets state. Easy to fix!
z.RegisterPass(ObfuscationType.ControlFlowFlattening, {
PassIterations: 1,
"ControlFlowFlattening.UseFunctionResolverChance": 60,
"ControlFlowFlattening.UseGlobalStateVariablesChance": 60,
"ControlFlowFlattening.UseOpaqueTransformationChance": 40,
"ControlFlowFlattening.UseGlobalVariableOpaquesChance": 80,
"ControlFlowFlattening.UseSipHashedStateChance": 40,
"ControlFlowFlattening.CloneSipHashChance": 80,
});
Let's go over the options one by one:
- UseFunctionResolverChance: injects a function to check the state, so instead of doing if (state == expected_state), it does if (injected_resolver(state)). Example:
  bool __fastcall cff_resolve_state_check_3585(__int64 a1) { return a1 == 0x288A6154F8A5E3E2LL; }
- UseGlobalStateVariablesChance: saves the state value to be compared in a global variable:
  bool __fastcall cff_resolve_state_check_506(__int64 a1) { return a1 == qword_1B20D8; }
- UseOpaqueTransformationChance: obfuscates the check into a transformation that only yields true for a specific state:
  bool __fastcall cff_resolve_state_check_7901(__int64 a1) { return ((((a1 ^ 0xEA9E45BB6099BC6ELL) + qword_1C64D8) << qword_1A63F0) | (((a1 ^ 0xEA9E45BB6099BC6ELL) + qword_1C64D8) >> qword_1ACE98)) == qword_1B0B80; }
- UseGlobalVariableOpaquesChance: uses global variables instead of numeric constants when applying UseOpaqueTransformationChance, as you may have noticed in the example above (qword_1C64D8, qword_1A63F0, qword_1ACE98).
- UseSipHashedStateChance: uses a tiny customized siphash function to check the state, so if (state == 23872) becomes something like if (siphash(state) == 3874872081), making it harder to find what a block is jumping to. The block still does state = 23872, while the condition that decides where it goes is hashed. Each siphash call uses random values to make it harder to emulate.
- CloneSipHashChance: clones the siphash function, and inlines it when possible, so that more than one copy exists and hooking a single function is not enough. Using this is strongly preferred, as it only increases binary size and does not affect performance.
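To make the dispatcher idea and the state-check options above concrete, here is a minimal hand-written sketch in C. It is not output from Zyrox: the function names, the state constants (other than the one borrowed from the example above), the ADD and ROT values, and the choice of which check style guards which block are all assumptions made for illustration. It shows a plain resolver, a global-variable resolver, and an opaque xor/add/rotate resolver guarding the three blocks of a tiny function.

```c
#include <stdint.h>

/* Hypothetical dispatcher states; Zyrox is described as picking its own. */
#define STATE_A 0x288A6154F8A5E3E2ULL
#define STATE_B 0x1B2C3D4E5F607182ULL
#define STATE_C 0x0F1E2D3C4B5A6978ULL

/* Hypothetical opaque transformation: xor, add, rotate left (a bijection). */
#define KEY 0xEA9E45BB6099BC6EULL
#define ADD 0x0123456789ABCDEFULL
#define ROT 13
#define ROTL64(v, n) (((v) << (n)) | ((v) >> (64 - (n))))
#define OPAQUE(s) ROTL64((((s) ^ KEY) + ADD), ROT)

/* Original control flow, kept for comparison. */
int classify_plain(int x) {
    int r = 0;
    if (x == 2) r += 10; /* block B */
    r += 1;              /* block C */
    return r;
}

/* Function-resolver style check: plain comparison hidden in a function. */
static int is_state_a(uint64_t s) { return s == STATE_A; }

/* Opaque-transformation style check: only STATE_B maps to expected_b. */
static const uint64_t expected_b = OPAQUE(STATE_B);
static int is_state_b(uint64_t s) { return OPAQUE(s) == expected_b; }

/* Global-state-variable style check: the constant lives in a global. */
static uint64_t g_state_c = STATE_C;
static int is_state_c(uint64_t s) { return s == g_state_c; }

/* Flattened version: every block returns to the dispatcher loop. */
int classify_flat(int x) {
    uint64_t state = STATE_A;
    int r = 0;
    for (;;) {
        if (is_state_a(state)) {            /* block A: the comparison */
            state = (x == 2) ? STATE_B : STATE_C;
        } else if (is_state_b(state)) {     /* block B */
            r += 10;
            state = STATE_C;
        } else if (is_state_c(state)) {     /* block C */
            r += 1;
            return r;
        } else {
            return -1;                      /* unreachable */
        }
    }
}
```

Both functions return the same value for every input; only the shape of the control flow differs. Because the opaque transformation is a bijection, exactly one state satisfies each check.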
suppose we have this code:
if (x == 2) goto LABEL_A
goto LABEL_B
LABEL_A: do_stuff()
LABEL_B: // ...
It would transform into:
@global jump_table = {0, &LABEL_A, &LABEL_B};
if (x == 2) goto jump_table[0] + @inline(decrypt(jump_table[1]));
goto jump_table[0] + @inline(decrypt(jump_table[2]));
// ...
When this pass is used, the plugin outputs a zyrox_tables.txt file to be used by PyPlugin.py.
PyPlugin.py encrypts the jump tables and patches the relocation entries, then makes the relocator point to jump_table[0]
of every table. A relocator basically does this:
target.writePointer(base.add(value)), so by setting value to 1, we make the relocator give us the base address and put it
in the jump table at runtime, and we use it along with the goto to generate the runtime address. On arm32 Thumb mode,
the pass automatically adds | 1 after decryption.
To use PyPlugin.py, simply do the following
(if using a venv, activate it first):
python3 PyPlugin.py --in <out_obfuscated_file> --android
Passing --android is important if you are targeting the arm64 version, as the x86_64 version has a different relocator
signature.
You can also pass --out (by default it uses the same file passed to --in) and --tables (by default it
is zyrox_tables.txt)
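The base-plus-decrypted-offset lookup described above can be sketched in portable C with function pointers standing in for block labels. Everything here is an assumption for illustration: the XOR cipher and XOR_KEY are placeholders (this readme does not say which cipher PyPlugin.py uses), init_table plays the role that PyPlugin.py and the relocator play in the real pipeline, and do_stuff is used as a stand-in for the module base.

```c
#include <stdint.h>

/* Hypothetical key; the real encryption scheme is not described here. */
#define XOR_KEY ((uintptr_t)0x5A5A5A5Au)

typedef int (*handler_fn)(void);

static int do_stuff(void) { return 1; }       /* stands in for LABEL_A */
static int do_other_stuff(void) { return 2; } /* stands in for LABEL_B */

/* Layout mirrors the pseudocode: {base, enc(offset A), enc(offset B)}. */
static uintptr_t jump_table[3];

/* Stand-in for the build-time patching plus load-time relocation:
 * slot 0 receives the base address, the others encrypted offsets. */
void init_table(void) {
    uintptr_t base = (uintptr_t)&do_stuff;
    jump_table[0] = base;
    jump_table[1] = ((uintptr_t)&do_stuff - base) ^ XOR_KEY;
    jump_table[2] = ((uintptr_t)&do_other_stuff - base) ^ XOR_KEY;
}

/* Decrypt the offset and add the base to rebuild the runtime address. */
static handler_fn resolve(int index) {
    return (handler_fn)(jump_table[0] + (jump_table[index] ^ XOR_KEY));
}

int indirect_branch(int x) {
    if (x == 2) return resolve(1)();
    return resolve(2)();
}
```

Call init_table() once before indirect_branch(). Offsets are stored as unsigned values, so a target located below the base still resolves correctly through wraparound.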
While indirect branching seems great, it also comes with a performance hit because it decrypts pointers at runtime. This is a simpler version that does not affect performance, where this:
if (x == 2) goto LABEL_A
goto LABEL_B
LABEL_A: do_stuff()
LABEL_B: // ...
Becomes:
@stack jump_table = {&LABEL_A, &LABEL_B}
goto jump_table[!(x == 2)]
LABEL_A: do_stuff()
LABEL_B: // ...
While this seems simple and easily breakable (I agree), it is enough to break IDA and Ghidra without affecting performance.
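The same trick can be written by hand in standard C, using function pointers in place of label addresses. This is only a sketch of the idea, not what the pass emits: on_equal, on_other and simple_branch are hypothetical names, and the real pass works on basic blocks inside a single function.

```c
static int on_equal(void) { return 1; } /* stands in for LABEL_A */
static int on_other(void) { return 0; } /* stands in for LABEL_B */

int simple_branch(int x) {
    /* Table built on the stack; index 0 is the "condition true" target. */
    int (*table[2])(void) = { on_equal, on_other };

    /* !(x == 2) is 0 when x == 2 and 1 otherwise, so the comparison
     * result selects the target without a conditional jump. */
    return table[!(x == 2)]();
}
```

The branch target is now data rather than an immediate in a jump instruction, which is what makes it harder for a disassembler to follow statically.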
Also known as MBA Sub (Mixed Boolean Arithmetic Substitution), this pass converts simple operations into more complex ones that give the same output. It uses a pre-defined set of substitutions. Example:
a ^ b = (~a & b) | (a & ~b)
b * c = (((b | c) * (b & c)) + ((b & ~c) * (c & ~b)))
r = rand(); c = b + r; a = a + c; a = a - r
You can see the full list in Passes/MBASub.cpp if you are interested.
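The three substitutions above can be checked with a few lines of C. The function names are hypothetical, and unsigned 32-bit arithmetic is an assumption made so that overflow wraps around cleanly; the third rewrite is read here as a replacement for a + b, with r passed in instead of coming from rand().

```c
#include <stdint.h>

/* a ^ b rewritten with AND, OR and NOT. */
uint32_t mba_xor(uint32_t a, uint32_t b) {
    return (~a & b) | (a & ~b);
}

/* b * c rewritten as two products over bitwise combinations. */
uint32_t mba_mul(uint32_t b, uint32_t c) {
    return ((b | c) * (b & c)) + ((b & ~c) * (c & ~b));
}

/* a + b rewritten by adding a value r and subtracting it again. */
uint32_t mba_add(uint32_t a, uint32_t b, uint32_t r) {
    uint32_t c = b + r;
    a = a + c;
    a = a - r;
    return a;
}
```

The multiplication identity holds because b = (b & c) + (b & ~c) and c = (b & c) + (c & ~b); expanding the product of those sums gives the same four terms as the rewritten expression.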
Just check index.d.ts; the annotation parser uses the same argument order.
to mark a function just do the following:
__attribute__((annotate("ibr:1,100"))) void hello_world () {
some_hello ();
}
Annotation codes:
- Basic Block Splitter: bbs
- Control Flow Flattening: cff
- Indirect Branching: ibr
- Simple Indirect Branching: sibr
- Mixed Boolean Arithmetic: mba
example:
in index.d.ts we see:
{
"BasicBlockSplitter.SplitBlockMinSize"?: number;
"BasicBlockSplitter.SplitBlockMaxSize"?: number;
"BasicBlockSplitter.SplitBlockChance"?: number;
};
Now here's the thing: the first argument, shared by all passes, is PassIterations, so it is always the first
argument in an annotation.
to annotate something with bbs we do:
__attribute__((annotate("bbs:1,15,30,100"))) void hello_world () {
some_hello ();
}
This means: run Basic Block Splitter on hello_world 1 time with min size = 15, max size = 30
and chance = 100.
you can also combine passes:
__attribute__((annotate("bbs:1,15,30,100 ibr:1,100 sibr:1,100"))) void hello_world () {
some_hello ();
}
This means: run Basic Block Splitter, then Indirect Branching, and then
Simple Indirect Branching on hello_world.
Passes run in the order they are written, from left
to right.
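To make the annotation format concrete, here is a small stand-alone parser sketch in C. It is not Zyrox's actual parser (that lives in the plugin's C++ code); PassSpec, MAX_ARGS and parse_annotation are hypothetical names, and the grammar is inferred only from the examples above: space-separated passes, each written as code:arg,arg,... with PassIterations first.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 8

typedef struct {
    char code[8];        /* pass code, e.g. "bbs" */
    long args[MAX_ARGS]; /* args[0] is PassIterations */
    int  argc;
} PassSpec;

/* Parses e.g. "bbs:1,15,30,100 ibr:1,100" into out[], left to right.
 * Returns the number of passes parsed, or -1 on malformed input. */
int parse_annotation(const char *text, PassSpec *out, int max_specs) {
    int count = 0;
    const char *p = text;

    while (*p != '\0') {
        while (*p == ' ') p++;               /* skip separators */
        if (*p == '\0') break;

        const char *colon = strchr(p, ':');
        if (colon == NULL || count >= max_specs) return -1;

        size_t len = (size_t)(colon - p);
        if (len == 0 || len >= sizeof out[count].code) return -1;
        if (memchr(p, ' ', len) != NULL) return -1; /* code has no args */

        memcpy(out[count].code, p, len);
        out[count].code[len] = '\0';
        out[count].argc = 0;
        p = colon + 1;

        for (;;) {                           /* comma-separated integers */
            char *end;
            long value = strtol(p, &end, 10);
            if (end == p || out[count].argc >= MAX_ARGS) return -1;
            out[count].args[out[count].argc++] = value;
            p = end;
            if (*p != ',') break;
            p++;
        }
        if (*p != ' ' && *p != '\0') return -1;
        count++;
    }
    return count;
}
```

For "bbs:1,15,30,100 ibr:1,100 sibr:1,100" this yields three entries in the order written, which matches the left-to-right execution order described above.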