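When I was learning LLVM passes, I wondered whether a VMP could be implemented as an LLVM pass. I thought about it for a while, got nowhere, and dropped the idea.
It was only after reading the article 「[原创]写一个简单的VMP-不造轮子,何以知轮之精髓?」 that I realised an LLVM pass really can implement a VMP, and that the two fit together very well.
That article is based on the xVMP project. After reading its source I was impressed by how neatly it is designed, so I wrote this post to share it with everyone. If I got anything wrong, please point it out!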
Project address: xVMP on GitHub.
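The xVMP flow can be split into 3 steps; as runOnModule() shows, they are driven by GOVMTranslator, GOVMInterpreter and GOVMModifier in turn.
Notes on some of the global variables:
The GOVMTranslator constructor calls init() to do some initialisation.
setup_callinst_handler() creates a wrapper function, referred to here as new_call_handler(). At this stage it is still an empty shell (it holds only the single piece of func id logic); handle_callinst() fills in the main logic later, and in the end it replaces call_handler() in the template.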
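The IR translation logic lives in the run() method of the GOVMTranslator class.
If the function's return type is not void, curr_data_offset is advanced by the size of that type; offset 0 is reserved by default for the return value.
Note: curr_data_offset records the current size of gv_data_seg.
Iterate over the current function's arguments and record them in value_map.
(Later, handle_inst() looks up each value's data_off in value_map when encoding an instruction.)
Iterate over the BBs of the current Function. Each BB gets its own opcode_seed and vm_code_seed: the former is used only to encrypt the opcodes, the latter to encrypt vm_code as a whole (analysed later).
Iterate over each Inst in the BB and over each of its operands. If an operand is a ConstantExpr (constant expression), it has to be "pulled out" into a standalone instruction that is inserted before the Inst.
Every Inst is passed to handle_inst() to be encoded.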
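Let's look at how handle_inst() translates an IR instruction into xVMP's own bytecode.
AllocaInst: (e.g. %ptr = alloca i32)
GET_PACK_VALUE is a macro that ends up calling packValue().
It first builds packType = {size, TypeID} (2 bytes), then checks whether the Value passed in is a constant; in this case it obviously is not.
It then builds packed = pack((*value_map)[value], POINTER_SIZE), which here means serialising AllocaInst_Res (the data offset of the alloca result) into an 8-byte stream.
When packValue() meets a global variable, it also records it in gv_value_map for later processing (analysed below).
In the end the alloca instruction is translated into the following byte stream, which is stored in gv_code_seg.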
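To make the layout `<ALLOCA_OP> | <var_size, var_type, var_data_off> | <alloca_area>` concrete, here is a minimal standalone sketch of the encoding step. It is not xVMP's code: `encode_alloca()` and the value of `ALLOCA_OP` are made up for illustration, the opcode is assumed to be one unencrypted byte, and `pack()` is assumed to be little-endian because the interpreter's unpack_code() reads little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t POINTER_SIZE = 8;   // 8-byte offsets, as in the text
constexpr std::uint8_t ALLOCA_OP = 0x01;  // hypothetical opcode value, illustration only

// Serialise the low `size` bytes of `value` in little-endian order (size <= 8).
std::vector<std::uint8_t> pack(std::uint64_t value, std::size_t size) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < size; ++i)
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    return out;
}

// Build <ALLOCA_OP> | <var_size, var_type, var_data_off> | <alloca_area>.
std::vector<std::uint8_t> encode_alloca(std::uint8_t var_size,
                                        std::uint64_t var_data_off,
                                        std::uint64_t area_offset) {
    // packValue() sets the type byte to 0 for non-constant operands.
    std::vector<std::uint8_t> code{ALLOCA_OP, var_size, 0};
    const std::vector<std::uint8_t> off = pack(var_data_off, POINTER_SIZE);
    const std::vector<std::uint8_t> area = pack(area_offset, POINTER_SIZE);
    code.insert(code.end(), off.begin(), off.end());
    code.insert(code.end(), area.begin(), area.end());
    return code;
}
```

Under these assumptions one alloca costs 1 + 2 + 8 + 8 = 19 bytes of bytecode, e.g. `encode_alloca(8, 0x10, 0x18)` for a result slot at offset 0x10 whose reserved area starts at 0x18.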
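Now let's see how the VM interpreter parses that byte stream.
get_byte_code() returns gv_code_seg[ip++]. Since the result of alloca is always an 8-byte variable, var_size and var_type carry no useful information here.
unpack_code() reads a value of the given size from gv_code_seg in little-endian order. data_seg_addr is the runtime address of gv_data_seg, and the last statement stores the value data_seg_addr + area_offset at the address data_seg_addr + var_offset.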
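The decoding side can be sketched the same way. The `MiniVM` struct below is a made-up stand-in for the template's globals (code segment, data segment, ip); it assumes the opcode byte has already been consumed by the dispatcher and that the bytecode is not encrypted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t POINTER_SIZE = 8;

// Hypothetical miniature of the interpreter state, for illustration only.
struct MiniVM {
    const std::uint8_t *code;  // stands in for gv_code_seg
    std::uint8_t *data;        // stands in for data_seg_addr
    std::size_t ip;            // index of the next bytecode byte

    std::uint8_t get_byte_code() { return code[ip++]; }

    // Read `size` bytes as a little-endian integer (size <= 8).
    std::uint64_t unpack_code(std::size_t size) {
        std::uint64_t v = 0;
        for (std::size_t i = 0; i < size; ++i)
            v |= static_cast<std::uint64_t>(code[ip++]) << (8 * i);
        return v;
    }

    void alloca_handler() {
        get_byte_code();  // var_size: always 8 for an alloca result
        get_byte_code();  // var_type: not needed here
        const std::uint64_t var_offset = unpack_code(POINTER_SIZE);
        const std::uint64_t area_offset = unpack_code(POINTER_SIZE);
        // Store the absolute address of the reserved area in the result slot.
        std::uint8_t *area = data + area_offset;
        std::memcpy(data + var_offset, &area, sizeof(area));
    }
};
```

The point to notice is the last two lines of the handler: the result slot receives a real pointer into the data segment, not the bare offset.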
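LoadInst: (e.g. %val = load i32, i32* %ptr)
StoreInst: (e.g. store i32 3, ptr %ptr)
GetElementPtrInst: (e.g. %elem_ptr = getelementptr i32, ptr %array_ptr, i32 2)
Why is it enough to look at only the last index? Let's look at a concrete example.
At -O0, the code above corresponds to the following LLVM IR.
Fetching arr[2][0] is split into 2 gep instructions; the first gep returns &arr[2].
The second gep has two indices: the first 0 stands for &arr[2] (because <ty> is [2 x i32]), and the second 0 stands for &arr[2][0].
So at -O0 a multi-level pointer operation is split across several gep instructions, a single gep has at most 2 indices (I am not sure whether there are exceptions), and only the last index is "useful".
At -O1 the 2 gep instructions are merged into 1: the type is changed to i8 and there is a single index, 16, so it returns ((uint8_t*)arr + 16), which is equivalent to &arr[2][0].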
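The equivalence between the two address computations can be checked in plain C++. This is only an illustration of the arithmetic, assuming a 4-byte int; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// -O0 style: the first step yields &arr[2], the second yields &arr[2][0].
int *addr_two_geps(int (*arr)[2]) {
    int (*row)[2] = arr + 2;  // step over two [2 x i32] rows
    return &(*row)[0];        // indices 0, 0 inside that row
}

// -O1 style: a single byte-wise step of 2 * sizeof(int[2]) bytes.
int *addr_one_gep(int (*arr)[2]) {
    std::uint8_t *bytes = reinterpret_cast<std::uint8_t *>(arr);
    return reinterpret_cast<int *>(bytes + 2 * sizeof(int[2]));
}
```

With 4-byte ints, `2 * sizeof(int[2])` is the 16 that appears as the single index of the merged gep.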
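BranchInst: (e.g. br i1 %cmp, label %if.then, label %if.end)
The handling is simple: every branch target is first emitted as a placeholder, and the jump offsets (relative to gv_code_seg) are back-filled from br_map afterwards.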
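The placeholder-then-backpatch pattern can be sketched without LLVM. In this hypothetical `BranchFixups` helper, integer block ids replace BasicBlock pointers, `block_start` plays the role of basicblock_map and `pending` the role of br_map; the opcode byte is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

constexpr std::size_t POINTER_SIZE = 8;

struct BranchFixups {
    std::vector<std::uint8_t> vm_code;
    std::map<int, std::size_t> block_start;            // block id -> offset in vm_code
    std::vector<std::pair<std::size_t, int>> pending;  // (patch position, target block id)

    void begin_block(int id) { block_start[id] = vm_code.size(); }

    // Emit an opcode followed by an 8-byte placeholder for the target offset.
    void emit_jump(std::uint8_t opcode, int target) {
        vm_code.push_back(opcode);
        pending.emplace_back(vm_code.size(), target);
        vm_code.insert(vm_code.end(), POINTER_SIZE, std::uint8_t{0});
    }

    // Once every block start is known, overwrite the placeholders (little-endian).
    void patch() {
        for (const auto &fix : pending) {
            const std::uint64_t dest = block_start.at(fix.second);
            for (std::size_t i = 0; i < POINTER_SIZE; ++i)
                vm_code[fix.first + i] = static_cast<std::uint8_t>(dest >> (8 * i));
        }
    }
};
```

Deferring the patch is what allows forward jumps: when a branch to a later block is emitted, that block's offset does not exist yet.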
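CallInst: (e.g. call void @decryptString.2(ptr %9))
Afterwards (once all instructions have been encoded), handle_callinst() is called to deal with the CallInsts specifically.
handle_callinst() keeps extending new_call_handler(), and new_call_handler() finally replaces call_handler() in the template.
It first processes the CallInst's arguments, in two cases: constants are passed through as they are, and everything else is loaded from gv_data_seg at its value_map offset.
It then creates a new BasicBlock in which the CallInst is rebuilt; a CallInst is either a direct call or an indirect call.
If the return type is not void, IR is emitted to store the return value at its slot in gv_data_seg.
The final assignments prepare for the next round of handle_callinst().
After the last handle_callinst(), finish_callinst_handler() is called to make sure the function is complete.
The final new_call_handler() ends up with roughly this structure: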
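As a rough picture of that structure, here is a hand-written C++ equivalent. It is only a sketch of the shape described above, not xVMP's generated IR: the callees `add` and `bump`, the helper templates and all data-segment offsets are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static std::uint8_t gv_data_seg[64];  // stand-in for the VM data segment

// Hypothetical callees standing in for functions the protected code calls.
static int add(int a, int b) { return a + b; }
static void bump(int *p) { ++*p; }

// Typed load/store at a data-segment offset.
template <typename T> T load_at(unsigned off) {
    T v;
    std::memcpy(&v, gv_data_seg + off, sizeof(T));
    return v;
}
template <typename T> void store_at(unsigned off, T v) {
    std::memcpy(gv_data_seg + off, &v, sizeof(T));
}

// One branch per CallInst: match the id, load the arguments from the
// data segment, perform the real call, store the result if there is one.
void new_call_handler(std::uint64_t func_id) {
    if (func_id == 0) {
        const int r = add(load_at<int>(8), load_at<int>(12));
        store_at<int>(16, r);
        return;
    }
    if (func_id == 1) {
        bump(load_at<int *>(24));  // void callee: nothing to store
        return;
    }
}
```

The bytecode therefore only carries a func id; the native call itself stays in ordinary compiled code inside the handler.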
Before handle_callinst(), encrypt_vm_code() and construct_gv() are also called.
encrypt_vm_code() encrypts vm_code one BB at a time (as mentioned above, each BB has its own vm_code_seed, recorded in vm_code_seed_map).
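The source does not show how the seed is applied, so the following is only one plausible reading: a standard xorshift32 generator (suggested by the names init_xorshift32 and opcode_xorshift32_state) produces a keystream that is XORed over the bytes of one basic block. `xor_block()` and its parameters are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard xorshift32 step (Marsaglia); the state must be non-zero.
std::uint32_t xorshift32(std::uint32_t &state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// XOR the bytes in [begin, end) with a keystream derived from `seed`.
// Applying the same call twice restores the original bytes.
void xor_block(std::vector<std::uint8_t> &code, std::size_t begin,
               std::size_t end, std::uint32_t seed) {
    std::uint32_t state = seed;
    for (std::size_t i = begin; i < end; ++i)
        code[i] ^= static_cast<std::uint8_t>(xorshift32(state));
}
```

Because each block restarts the keystream from its own seed, the interpreter can begin decrypting at any block entry after a jump without having processed the preceding bytes.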
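construct_gv() builds IR global variables, for example the gv_code_seg global from vm_code.
These IR globals are built so that they can later replace the corresponding globals in the template.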
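The relevant logic is in GOVMInterpreter::run().
It calls llvm_parse_bitcode_from_string() to load the template.
llvm_parse_bitcode_from_string() is implemented as follows: binary_ir_length is the length of the .bc bitcode, and binary_ir_vector is an array of strings, each element being .bc bitcode in string form.
parseIR() parses the .bc bitcode and finally returns a Module*.
Some globals in the template are replaced with the ones built by construct_gv() above (opcode_xorshift32_state and vm_code_state are built in the GOVMInterpreter constructor).
call_handler() in the template is replaced with new_call_handler() from above.
vm_interpreter() from the template is then cloned into the current Module and renamed vm_interpreter_<func_name> to avoid name clashes.
All calls to vm_interpreter() are then changed to vm_interpreter_<func_name>().
Note: any other function in the template that is not declared inline needs the same treatment.
Finally, every function without callers is deleted.
govm_interpreter holds the vm_interpreter_<func_name> function cloned above.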
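The relevant logic is in GOVMModifier::run().
It first empties the original function body and creates body_bbl and ret_bbl.
The original function's arguments are saved in args_map.
It then walks gv_value_map and emits IR that stores the global variables used by the original function into gv_data_seg.
Concretely, the addresses of those globals, which are only resolved when the program is built and loaded, are copied into the VM's data segment so that the interpreter can reach each global's pointer through an offset at run time.
It then emits IR that stores the original function's arguments into gv_data_seg.
In the same way, the addresses of gv_data_seg and gv_code_seg are converted to 64-bit values and written into data_seg_addr and code_seg_addr.
Every access the template makes to gv_data_seg and gv_code_seg goes through data_seg_addr and code_seg_addr, which hold the runtime addresses of gv_data_seg and gv_code_seg respectively.
(Reading and writing only in the form gv_data_seg + offset runs into trouble once pointer operations are involved, as discussed later.)
Finally a call to govm_interpreter is inserted, and the result is handled according to whether the original function has a return value.
Note: the return value is stored at gv_data_seg[0] by default.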
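Put together, the rewritten function behaves like the following hand-written equivalent for a hypothetical `int add(int, int)`. The interpreter is replaced by a stub, and the layout (return slot at offset 0, arguments at offsets 4 and 8) follows the offset rules described earlier; everything else is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static std::uint8_t gv_data_seg[32];  // stand-in for the VM data segment

// Stub for vm_interpreter_<func_name>(): here it simply adds the two arguments.
// Assumed layout: return slot at offset 0, arguments at offsets 4 and 8.
static void vm_interpreter_add() {
    std::int32_t a, b;
    std::memcpy(&a, gv_data_seg + 4, sizeof(a));
    std::memcpy(&b, gv_data_seg + 8, sizeof(b));
    const std::int32_t r = a + b;
    std::memcpy(gv_data_seg + 0, &r, sizeof(r));
}

// What the rewritten body of `int add(int, int)` amounts to.
std::int32_t add(std::int32_t a, std::int32_t b) {
    std::memcpy(gv_data_seg + 4, &a, sizeof(a));  // body_bbl: spill the arguments
    std::memcpy(gv_data_seg + 8, &b, sizeof(b));
    vm_interpreter_add();                         // run the bytecode
    std::int32_t ret;                             // ret_bbl: fetch the result
    std::memcpy(&ret, gv_data_seg + 0, sizeof(ret));
    return ret;
}
```

Callers of `add()` see the same signature as before; only the body has been swapped for the spill, interpret, reload sequence.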
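Here are the problems I ran into while porting xVMP to LLVM 19 (not necessarily xVMP's fault; they may come from my port).
The way vm_interpreter() is cloned above no longer works in LLVM 19 and fails with:
CloneFunctionInto() now only allows NewFunc either to have the same parent as OldFunc or to have no parent at all.
At the time I did not understand why vm_interpreter() had to be cloned and assumed it was only there to avoid name clashes, so I tried simply calling setName() to rename the function.
That produced another error:
It means: "this global symbol (GlobalVariable / Function / GlobalAlias) is semantically externally visible, but it has not been given an external linkage (ExternalLinkage / WeakLinkage / ExternalWeakLinkage ...), and that combination is illegal."
Changing the linkage to ExternalLinkage gets past this error, but linking then fails later:
The eventual fix was to create a function NewFn with no parent, initialise NewFn from Fn, and then clone it and add it to the current Module.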
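xVMP translates the alloca instruction as follows: it reserves a region in gv_data_seg to emulate the stack space being allocated.
At first I did not read carefully and thought that, when the interpreter handles alloca, <result> is stored as a data_off.
Looking again, I found that what it stores is a real address (virtual address) pointing to data_seg_addr + area_offset, where data_seg_addr is the base address of gv_data_seg.
What goes wrong if only area_offset is stored? Nothing much, as long as no pointer operations are involved.
But as soon as a function that operates on pointers, such as my_memcpy(), comes along, problems appear.
The IR of test5() before VMP is as follows:
After VMP, the handler for the call instruction is as follows.

The problem is that the values of var_a and var_b are the data_offs reserved by the alloca instructions.
my_memcpy() treats var_a and var_b as pointers, i.e. it treats 0xC as an address, which obviously crashes.

So the <result> of alloca must be stored as a real address. This is also why xVMP converts gv_data_seg to an address, stores it in data_seg_addr, and performs all later operations through data_seg_addr.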
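The difference is easy to reproduce outside the VM. In this sketch `vm_alloca()` and `load_ptr()` are hypothetical helpers that mimic the alloca handler and an operand load, and `my_memcpy()` stands in for any native callee that dereferences its arguments; the offsets are arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static std::uint8_t data_seg[64];  // stand-in for gv_data_seg

// A native callee that treats its arguments as ordinary pointers.
void my_memcpy(void *dst, const void *src, std::size_t n) {
    std::memcpy(dst, src, n);
}

// Mimics the alloca handler: the slot at var_off receives the absolute
// address of the reserved area, not the bare area_off.
void vm_alloca(std::size_t var_off, std::size_t area_off) {
    std::uint8_t *area = data_seg + area_off;
    std::memcpy(data_seg + var_off, &area, sizeof(area));
}

// Mimics loading a pointer-typed operand from its slot.
void *load_ptr(std::size_t var_off) {
    void *p = nullptr;
    std::memcpy(&p, data_seg + var_off, sizeof(p));
    return p;
}
```

After `vm_alloca(0, 16)` and `vm_alloca(8, 32)`, the values in the two slots can be passed straight to `my_memcpy()`. Had the slots held the offsets 16 and 32 instead, the callee would dereference address 0x10, which is the crash described above.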
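Below is an indirect function call in LLVM IR form.
By xVMP's logic, translating store ptr @_Z15on_ace_detectedPKc, ptr %Func, align 8 should store the actual address of the function @_Z15on_ace_detectedPKc into %Func.
The problem is that @_Z15on_ace_detectedPKc is a Function, i.e. a GlobalValue but not a GlobalVariable, and packValue() only handles GlobalVariable, so @_Z15on_ace_detectedPKc never ends up in gv_value_map.
Note: GlobalVariable derives from GlobalValue, and so does Function.
As a result, the later walk over gv_value_map skips @_Z15on_ace_detectedPKc (the processing meant here: obtain the actual address → write it into gv_data_seg).
When the VM finally runs, it calls a null address and segfaults.
Arguments of a call instruction can be global variables, but xVMP seems to overlook this when handling them.
Beyond that, xVMP does not support phi, select, switch, extractvalue, insertvalue and similar instructions, though these are relatively easy to add.
The invoke instruction, however, seems impossible to handle? This means any code that involves std logic cannot be VMP-protected.
(If anyone has ideas for handling invoke, I would love to hear them!)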
Sample: (link not recoverable)
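(More complex samples are welcome too!)
The result after VMP looks roughly like this.

Combined with control-flow flattening, the following effect can be achieved.
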
virtual bool runOnModule(Module &M)
{
    for (auto Func = M.begin(); Func != M.end(); ++Func)
    {
        Function *F = &*Func;
        if (toObfuscateFunction(this->flag, F, "vmp"))
        {
            // Variadic functions are skipped.
            if (F->isVarArg())
                continue;
            // Reset the per-function globals.
            govm_interpreter = nullptr;
            gv_code_seg = nullptr;
            gv_data_seg = nullptr;
            ip = nullptr;
            data_seg_addr = nullptr;
            // Step 1: translate the IR into bytecode.
            GOVMTranslator *translator = new GOVMTranslator(F);
            translator->run();
            // Step 2: instantiate the interpreter template for this function.
            GOVMInterpreter *interpreter = new GOVMInterpreter(F, translator->get_callinst_handler());
            interpreter->run();
            // Step 3: replace the original body with a call into the interpreter.
            GOVMModifier *modifier = new GOVMModifier(F, translator->get_gv_value_map());
            modifier->run();
        }
    }
    return true;
}
void init()
{
    setup_callinst_handler();
    init_xorshift32();
}
void GOVMTranslator::setup_callinst_handler()
{
    // Signature of the wrapper: void(i64 func_id)
    std::vector<Type *> FuncTy_args;
    FuncTy_args.push_back(Type::getInt64Ty(Mod->getContext()));
    FunctionType *FuncTy = FunctionType::get(
        Type::getVoidTy(this->Mod->getContext()),
        FuncTy_args,
        false);
    Constant *tmp = Function::Create(FuncTy, llvm::GlobalValue::LinkageTypes::InternalLinkage, "vm_interpreter_callinst_dispatch_" + F->getName(), Mod);
    Function *func = cast<Function>(tmp);
    BasicBlock *entryBB = BasicBlock::Create(func->getContext(), "entryBB", func);
    IRBuilder<> IRBentryBB(entryBB);
    Value *target_id_value;
    for (auto arg = func->arg_begin(); arg != func->arg_end(); arg++)
    {
        Value *tmparg = &*arg;
        if (arg == func->arg_begin())
        {
            // Spill the func id argument and reload it.
            Value *paramPtr = IRBentryBB.CreateAlloca(Type::getInt64Ty(Mod->getContext()));
            IRBentryBB.CreateStore(tmparg, paramPtr);
            target_id_value = IRBentryBB.CreateLoad(paramPtr);
        }
    }
    BasicBlock *conBBL = BasicBlock::Create(func->getContext(), "conBBL", func);
    IRBentryBB.CreateBr(conBBL);
    this->callinst_handler_curr_idx = 0;
    this->callinst_handler = func;
    this->callinst_handler_conBBL = conBBL;
    this->targetfunc_id = target_id_value;
}
if (!F->getReturnType()->isVoidTy())
{
    curr_data_offset += modDataLayout->getTypeAllocSize(F->getReturnType());
}
if (!F->isVarArg())
{
    for (auto arg = F->arg_begin(); arg != F->arg_end(); ++arg)
    {
        Value *tmparg = &*arg;
        insert_to_value_map(&value_map, tmparg, curr_data_offset);
        curr_data_offset += modDataLayout->getTypeAllocSize(tmparg->getType());
    }
}
basicblock_map.insert(pair<BasicBlock *, int>(bb, vm_code.size()));
uint32_t opcode_seed = opcode_seed_setup();
uint32_t vm_code_seed = vm_code_seed_setup();
uint32_t currbb_begin = vm_code.size();
for (auto ins = bbl->begin(); ins != bbl->end(); ins++)
{
    Instruction *inst = dyn_cast<Instruction>(ins);
    for (unsigned idx = 0; idx < inst->getNumOperands(); idx++)
    {
        if (ConstantExpr *Op = dyn_cast<ConstantExpr>(inst->getOperand(idx)))
        {
            // Hoist the constant expression into its own instruction.
            Instruction *const_inst = Op->getAsInstruction();
            const_inst->insertBefore(inst);
            inst->setOperand(idx, const_inst);
            handle_inst(const_inst);
        }
    }
    handle_inst(inst);
}
if (AllocaInst *inst = dyn_cast<AllocaInst>(ins))
{
    // Slot that will hold the alloca result (a pointer).
    int pointer_offset = curr_data_offset;
    insert_to_value_map(&value_map, inst, curr_data_offset);
    int res_size = modDataLayout->getTypeAllocSize(inst->getType());
    curr_data_offset += res_size;
    std::vector<uint8_t> packed_res = GET_PACK_VALUE(inst);
    // Area reserved in gv_data_seg for the allocated object itself.
    int area_offset = curr_data_offset;
    int alloca_size = modDataLayout->getTypeAllocSize(inst->getAllocatedType());
    curr_data_offset += alloca_size;
    std::vector<uint8_t> hex_code;
    ins_to_hex(hex_code, pack_op(ALLOCA_OP), packed_res, pack(area_offset, POINTER_SIZE));
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
}
#define GET_PACK_VALUE(value) (packValue(value, &value_map))
std::vector<uint8_t> type_to_hex(Type *type)
{
    std::vector<uint8_t> res;
    res.push_back(modDataLayout->getTypeAllocSize(type));
    res.push_back(type->getTypeID());
    return res;
}
std::vector<uint8_t> packValue(Value *value, std::map<Value *, int> *value_map)
{
    std::vector<uint8_t> res;
    std::vector<uint8_t> packed;
    std::vector<uint8_t> packType = type_to_hex(value->getType());
    if (ConstantData *CD = dyn_cast<ConstantData>(value))
    {
        packed = pack_const_value(value);
    }
    else
    {
        if (value_map->find(value) == value_map->end())
        {
            // Only GlobalVariable is handled here.
            if (GlobalVariable *gv = dyn_cast<GlobalVariable>(value))
            {
                insert_to_value_map(value_map, value, curr_data_offset);
                gv_value_map.insert(pair<GlobalVariable *, int>(gv, curr_data_offset));
                int res_size = modDataLayout->getTypeAllocSize(gv->getType());
                curr_data_offset += res_size;
            }
            else
            {
                assert(value_map->find(value) != value_map->end());
            }
        }
        packed = pack((*value_map)[value], POINTER_SIZE);
        packType[1] = 0;
    }
    res.insert(res.end(), packType.begin(), packType.end());
    res.insert(res.end(), packed.begin(), packed.end());
    return res;
}
<ALLOCA_OP> | <var_size, var_type, var_data_off> | <alloca_area>
#ifdef IS_INLINE_FUNC
__inline__ __attribute__((always_inline))
#endif
void alloca_handler() {
    uint8_t var_size = get_byte_code();
    uint8_t var_type = get_byte_code();
    uint64_t var_offset = unpack_code(POINTER_SIZE);
    uint64_t area_offset = unpack_code(POINTER_SIZE);
    pack_store_addr(data_seg_addr+var_offset, data_seg_addr+area_offset, var_size);
}
else if (LoadInst *inst = dyn_cast<LoadInst>(ins))
{
    int res_offset = curr_data_offset;
    insert_to_value_map(&value_map, inst, curr_data_offset);
    int res_size = modDataLayout->getTypeAllocSize(inst->getType());
    curr_data_offset += res_size;
    std::vector<uint8_t> packed_res = GET_PACK_VALUE(inst);
    std::vector<uint8_t> packed_pointer_operand = GET_PACK_VALUE(inst->getPointerOperand());
    std::vector<uint8_t> hex_code;
    ins_to_hex(hex_code, pack_op(LOAD_OP), packed_res, packed_pointer_operand);
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
}
else if (StoreInst *inst = dyn_cast<StoreInst>(ins))
{
    std::vector<uint8_t> packed_value_operand = GET_PACK_VALUE(inst->getValueOperand());
    std::vector<uint8_t> packed_pointer_operand = GET_PACK_VALUE(inst->getPointerOperand());
    std::vector<uint8_t> hex_code;
    ins_to_hex(hex_code, pack_op(STORE_OP), packed_value_operand, packed_pointer_operand);
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
}
else if (GetElementPtrInst *inst = dyn_cast<GetElementPtrInst>(ins))
{
    int res_offset = curr_data_offset;
    insert_to_value_map(&value_map, inst, curr_data_offset);
    int res_size = modDataLayout->getTypeAllocSize(inst->getType());
    curr_data_offset += res_size;
    std::vector<uint8_t> packed_res = GET_PACK_VALUE(inst);
    std::vector<uint8_t> packed_ptr = GET_PACK_VALUE(inst->getPointerOperand());
    std::vector<Value *> indices;
    for (auto curr_idx = inst->idx_begin(); curr_idx != inst->idx_end(); curr_idx++)
    {
        indices.push_back(*curr_idx);
    }
    Type *srcType = inst->getSourceElementType();
    std::vector<uint8_t> gep_type;
    std::vector<uint8_t> packed_value;
    if (dyn_cast<StructType>(srcType))
    {
        // Struct: turn the last (constant) index into a byte offset.
        StructType *st = dyn_cast<StructType>(srcType);
        gep_type = {0, 0};
        ConstantInt *CI = dyn_cast<ConstantInt>(indices[indices.size() - 1]);
        int element_idx = CI->getSExtValue();
        int curr_element_offset = 0;
        for (int i = 0; i < element_idx; i++)
        {
            curr_element_offset += modDataLayout->getTypeAllocSize(st->getElementType(i));
        }
        packed_value = {0, 0};
        std::vector<uint8_t> tmp = pack(curr_element_offset, POINTER_SIZE);
        packed_value.insert(packed_value.end(), tmp.begin(), tmp.end());
    }
    else
    {
        // Otherwise: element type plus the last index.
        gep_type = type_to_hex(inst->getResultElementType());
        packed_value = GET_PACK_VALUE(indices[indices.size() - 1]);
    }
    std::vector<uint8_t> hex_code;
    ins_to_hex(hex_code, pack_op(GEP_OP), gep_type, packed_res, packed_ptr, packed_value);
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
}
int test16(int arr[3][2]) {
    return arr[2][0];
}
else if (BranchInst *inst = dyn_cast<BranchInst>(ins))
{
    std::vector<uint8_t> hex_code;
    hex_code = pack_op(BR_OP);
    // Placeholder for a target offset; patched later from br_map.
    std::vector<uint8_t> padding = pack(0, POINTER_SIZE);
    if (inst->isUnconditional())
    {
        hex_code.push_back(0);
        br_map.push_back(pair<int, BasicBlock *>(vm_code.size() + hex_code.size(), inst->getSuccessor(0)));
        hex_code.insert(hex_code.end(), padding.begin(), padding.end());
    }
    else
    {
        hex_code.push_back(1);
        std::vector<uint8_t> pack_condition = packValue(inst->getCondition(), &value_map);
        hex_code.insert(hex_code.end(), pack_condition.begin(), pack_condition.end());
        br_map.push_back(pair<int, BasicBlock *>(vm_code.size() + hex_code.size(), inst->getSuccessor(0)));
        hex_code.insert(hex_code.end(), padding.begin(), padding.end());
        br_map.push_back(pair<int, BasicBlock *>(vm_code.size() + hex_code.size(), inst->getSuccessor(1)));
        hex_code.insert(hex_code.end(), padding.begin(), padding.end());
    }
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
}
for (auto it = br_map.rbegin(); it != br_map.rend(); it++)
{
    int code_pos = it->first;
    BasicBlock *target_bb = it->second;
    std::vector<uint8_t> bb_addr = pack(basicblock_map[target_bb], POINTER_SIZE);
    std::copy(bb_addr.begin(), bb_addr.end(), vm_code.begin() + code_pos);
}
else if (CallInst *inst = dyn_cast<CallInst>(ins))
{
    // Each CallInst gets its own func id; only the id goes into the bytecode.
    long long curr_func_id = this->callinst_handler_curr_idx++;
    std::vector<uint8_t> packed_funcid = pack(curr_func_id, POINTER_SIZE);
    std::vector<uint8_t> packed_res;
    if (inst->getType() != Type::getVoidTy(this->Mod->getContext()))
    {
        int res_offset = curr_data_offset;
        insert_to_value_map(&value_map, inst, curr_data_offset);
        int res_size = modDataLayout->getTypeAllocSize(inst->getType());
        curr_data_offset += res_size;
        packed_res = GET_PACK_VALUE(inst);
    }
    std::vector<uint8_t> hex_code;
    ins_to_hex(hex_code, pack_op(Call_OP), packed_funcid);
    vm_code.insert(vm_code.end(), hex_code.begin(), hex_code.end());
    callinst_map.insert(std::pair<CallInst *, long long>(inst, curr_func_id));
}
for (auto p : callinst_map)
{
    handle_callinst(p.first, p.second);
}
IRBuilder<> IRBcon(this->callinst_handler_conBBL);
std::vector<Value *> target_func_args;
for (unsigned idx = 0; idx < inst->getNumArgOperands(); idx++)
{
    Value *currarg = inst->getArgOperand(idx);
    // Case 1: constants are passed through unchanged.
    if (ConstantData *CD = dyn_cast<ConstantData>(currarg))
    {
        target_func_args.push_back(currarg);
        continue;
    }
    // Case 2: everything else is loaded from gv_data_seg at its offset.
    unsigned curroffset = value_map[currarg];
    ConstantInt *Zero = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), 0);
    Value *offset_value = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), curroffset);
    Value *gepinst = IRBcon.CreateGEP(gv_data_seg, {Zero, offset_value}, "");
    PointerType *target_ptr_type = PointerType::get(currarg->getType(), cast<PointerType>(gepinst->getType())->getAddressSpace());
    Value *ptr = IRBcon.CreatePointerCast(gepinst, target_ptr_type);
    Value *arg = IRBcon.CreateLoad(ptr);
    target_func_args.push_back(arg);
}
BasicBlock *callFunction = BasicBlock::Create(Mod->getContext(), "callFunction_" + to_string(this->callinst_handler_curr_idx), this->callinst_handler);
IRBuilder<> IRBcallFunction(callFunction);
Value *resultValue;
if (!inst->isIndirectCall())
{
    // Direct call: the callee is known statically.
    Function *callee = inst->getCalledFunction();
    resultValue = IRBcallFunction.CreateCall(callee->getFunctionType(), callee,
                                             ArrayRef<Value *>(target_func_args));
}
else
{
    // Indirect call: load the function pointer from gv_data_seg first.
    Value *called_value = inst->getCalledValue();
    unsigned called_value_offset = value_map[called_value];
    ConstantInt *Zero = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), 0);
    Value *offset_value = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), called_value_offset);
    Value *gepinst = IRBcallFunction.CreateGEP(gv_data_seg, {Zero, offset_value}, "");
    PointerType *target_ptr_type = PointerType::get(called_value->getType(), cast<PointerType>(gepinst->getType())->getAddressSpace());
    Value *ptr = IRBcallFunction.CreatePointerCast(gepinst, target_ptr_type);
    Value *value = IRBcallFunction.CreateLoad(ptr);
    resultValue = IRBcallFunction.CreateCall(value, ArrayRef<Value *>(target_func_args));
}
if (inst->getType() != Type::getVoidTy(this->Mod->getContext()))
{
    // Store the call result at its slot in gv_data_seg.
    unsigned result_value_offset = value_map[inst];
    ConstantInt *Zero = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), 0);
    Value *offset_value = ConstantInt::get(Type::getInt64Ty(Mod->getContext()), result_value_offset);
    Value *gepinst = IRBcallFunction.CreateGEP(gv_data_seg, {Zero, offset_value}, "");
    PointerType *target_ptr_type = PointerType::get(resultValue->getType(), cast<PointerType>(gepinst->getType())->getAddressSpace());
    Value *ptr = IRBcallFunction.CreatePointerCast(gepinst, target_ptr_type);
    IRBcallFunction.CreateStore(resultValue, ptr);
}
IRBcallFunction.CreateRetVoid();