[Original] A beginner's notes on the dex loading flow in Android 8.1, part 3: the OatFile::Open flow and obtaining OatDexFile
Posted: 2020-3-16 14:55
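I have had a lot of distractions lately, but I finally stumbled through the general flow of oat_file.cc, and I am recording my notes on the forum for future reference.
The focus of this function is how the oat file is opened and how the oat_dex_file data structure is built by parsing it. The oat_dex_file holds the complete dex information. If you take the route of obtaining a dex_file through an oat_file, then from OpenDexFilesFromOat all the way to DexFile::Open, the dex_file is obtained mainly by parsing the oat_dex_file data structure.
As before, I will post the flow first; some unimportant functions are omitted.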
OatFile::Open {
  GetVdexFilename
  OatFileBase::OpenOatFile<DlOpenOatFile> or OatFileBase::OpenOatFile<ElfOatFile> {
    PreLoad
    LoadVdex {
      VdexFile::Open
    }
    Load {
      Dlopen
    }
    ComputeFields {
      FindDynamicSymbolAddress & oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods, oatbssroots
    }
    PreSetup {
      dl_iterate_phdr(dl_iterate_context::callback, &context)  // walks all loaded ELF objects and calls callback, which registers the oat file's mappings
    }
    Setup {
      GetOatHeader
      GetInstructionSetPointerSize
      GetOatDexFilesOffset  // reaches the start of the OatDexFile records
      GetDexFileCount
      ReadOatDexFileData & dex_file_location_size
      ResolveRelativeEncodedDexLocation
      ReadOatDexFileData & dex_file_checksum
      ReadOatDexFileData & dex_file_offset
      ReadOatDexFileData & class_offsets_offset
      ReadOatDexFileData & lookup_table_offset  // speeds up class lookup
      ReadOatDexFileData & dex_layout_sections_offset
      ReadOatDexFileData & method_bss_mapping_offset
      FindDexFileMapItem & call_sites_item  // call site id item
      new OatDexFile
    }
  }
}
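1. Whether the type is DlOpenOatFile or ElfOatFile, both go through OpenOatFile, which calls PreLoad, LoadVdex, Load, ComputeFields, PreSetup, and Setup in order. Their results are used later to obtain the dex files and so on. The flow is fairly clear.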
template <typename kOatFileBaseSubType>
OatFileBase* OatFileBase::OpenOatFile(const std::string& vdex_filename,
                                      const std::string& elf_filename,
                                      const std::string& location,
                                      uint8_t* requested_base,
                                      uint8_t* oat_file_begin,
                                      bool writable,
                                      bool executable,
                                      bool low_4gb,
                                      const char* abs_dex_location,
                                      std::string* error_msg) {
  // Whether the subtype is DlOpenOatFile or ElfOatFile, the new object is held
  // through an OatFileBase pointer.
  std::unique_ptr<OatFileBase> ret(new kOatFileBaseSubType(location, executable));

  ret->PreLoad();

  if (kIsVdexEnabled && !ret->LoadVdex(vdex_filename, writable, low_4gb, error_msg)) {
    return nullptr;
  }

  if (!ret->Load(elf_filename, oat_file_begin, writable, executable, low_4gb, error_msg)) {
    return nullptr;
  }

  if (!ret->ComputeFields(requested_base, elf_filename, error_msg)) {
    return nullptr;
  }

  ret->PreSetup(elf_filename);

  if (!ret->Setup(abs_dex_location, error_msg)) {
    return nullptr;
  }

  return ret.release();
}
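To make the shape of OpenOatFile easier to see, here is a minimal sketch of the same staged-open pattern outside of ART. Everything in it (StagedFile, AlwaysLoads, NeverLoads, the stages() log) is a hypothetical stand-in and not ART code; it only illustrates how a template parameter picks the subtype while a unique_ptr cleans up when any stage fails.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for OatFileBase. Each stage records its name so the
// call order can be inspected afterwards.
class StagedFile {
 public:
  virtual ~StagedFile() = default;

  // Same shape as OpenOatFile: build the subtype, run the stages in order,
  // and release ownership to the caller only if every stage succeeded.
  template <typename SubType>
  static StagedFile* Open(const std::string& path, std::string* error_msg) {
    std::unique_ptr<StagedFile> ret(new SubType());
    ret->PreLoad();
    if (!ret->Load(path, error_msg)) return nullptr;  // unique_ptr frees ret.
    if (!ret->ComputeFields(error_msg)) return nullptr;
    ret->PreSetup();
    if (!ret->Setup(error_msg)) return nullptr;
    return ret.release();
  }

  const std::vector<std::string>& stages() const { return stages_; }

 protected:
  virtual void PreLoad() { stages_.push_back("PreLoad"); }
  virtual bool Load(const std::string& path, std::string* error_msg) = 0;
  virtual void PreSetup() { stages_.push_back("PreSetup"); }

  bool ComputeFields(std::string* /* error_msg */) {
    stages_.push_back("ComputeFields");
    return true;
  }
  bool Setup(std::string* /* error_msg */) {
    stages_.push_back("Setup");
    return true;
  }

  std::vector<std::string> stages_;
};

// Hypothetical subtype whose Load stage always succeeds.
class AlwaysLoads : public StagedFile {
  bool Load(const std::string&, std::string*) override {
    stages_.push_back("Load");
    return true;
  }
};

// Hypothetical subtype whose Load stage always fails with a message.
class NeverLoads : public StagedFile {
  bool Load(const std::string& path, std::string* error_msg) override {
    *error_msg = "cannot load '" + path + "'";
    return false;
  }
};
```

Calling StagedFile::Open<AlwaysLoads>("a.oat", &err) returns an object whose stages() lists the five stages in order, while Open<NeverLoads> returns nullptr and leaves the reason in err, which is the same contract OpenOatFile offers its callers.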
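2. First, PreLoad. It uses dl_iterate_phdr to walk every loaded ELF object and receive its dl_phdr_info, incrementing count on each callback, and then stores count in shared_objects_before_. PreSetup below uses this shared_objects_before_ variable.
Pay particular attention to struct dl_phdr_info, which holds several important fields of an ELF object: its base address, its name, a pointer to its array of ELF program headers, and the number of headers.
The struct dl_iterate_context here has only a count field, used to count the ELF objects visited, and its callback is simple. PreSetup below has another dl_iterate_context struct whose callback is more involved: it walks the program segments of the oat file and records a MemMap for each loadable one.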
void DlOpenOatFile::PreLoad() {
#ifdef __APPLE__
  UNUSED(shared_objects_before_);
  LOG(FATAL) << "Should not reach here.";
  UNREACHABLE();
#else
  // Count the entries in dl_iterate_phdr we get at this point in time.
  // (Walks the program headers of every loaded ELF object.)
  struct dl_iterate_context {
    static int callback(struct dl_phdr_info *info ATTRIBUTE_UNUSED,
                        size_t size ATTRIBUTE_UNUSED,
                        void *data) {
      // struct dl_phdr_info {
      //   ElfW(Addr)        dlpi_addr;   /* Base address of object */
      //   const char       *dlpi_name;   /* (Null-terminated) name of object */
      //   const ElfW(Phdr) *dlpi_phdr;   /* Pointer to array of ELF program headers
      //                                     for this object */
      //   ElfW(Half)        dlpi_phnum;  /* # of items in dlpi_phdr */
      // }
      reinterpret_cast<dl_iterate_context*>(data)->count++;  // count is incremented once per object
      return 0;  // Continue iteration.
    }
    size_t count = 0;
  } context;

  // Walk all loaded ELF objects; the callback here only increments count.
  dl_iterate_phdr(dl_iterate_context::callback, &context);
  shared_objects_before_ = context.count;  // store the final count in shared_objects_before_
#endif
}
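The counting trick in PreLoad can be tried on any Linux host. Below is a minimal sketch, assuming glibc or bionic (where <link.h> declares dl_iterate_phdr); CountCallback and CountLoadedObjects are names I made up for the illustration.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (glibc / bionic)

#include <cassert>
#include <cstddef>

// Called once per loaded ELF object. `data` is the pointer passed to
// dl_iterate_phdr; here it points at a plain counter.
static int CountCallback(dl_phdr_info* /* info */, size_t /* size */, void* data) {
  ++*static_cast<size_t*>(data);
  return 0;  // Returning 0 tells dl_iterate_phdr to keep iterating.
}

// Returns how many ELF objects the dynamic linker currently reports,
// which is the value PreLoad keeps in shared_objects_before_.
size_t CountLoadedObjects() {
  size_t count = 0;
  dl_iterate_phdr(CountCallback, &count);
  return count;
}
```

Calling CountLoadedObjects() before and after a dlopen of a library that was not yet loaded should show the count grow, which is the assumption PreSetup relies on when it skips the objects that were already present.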
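3. Next is LoadVdex, which ends up calling VdexFile::Open. The vdex file is new as of Android 8.0: the dex files that used to be stored inside the oat file now appear to live in the vdex file, possibly quickened, so the oat file and vdex_ have to be combined to obtain a complete oat_dex_file.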
bool OatFileBase::LoadVdex(const std::string& vdex_filename,
                           bool writable,
                           bool low_4gb,
                           std::string* error_msg) {
  // Open the vdex file and keep it in vdex_.
  vdex_ = VdexFile::Open(vdex_filename, writable, low_4gb, /* unquicken*/ false, error_msg);
  if (vdex_.get() == nullptr) {
    *error_msg = StringPrintf("Failed to load vdex file '%s' %s",
                              vdex_filename.c_str(),
                              error_msg->c_str());
    return false;
  }
  return true;
}
The basic vdex layout, from vdex_file.h. It contains the dex files and QuickeningInfo:
// File format:
//   VdexFile::Header    fixed-length header
//
//   DEX[0]              array of the input DEX files
//   DEX[1]              the bytecode may have been quickened
//   ...
//   DEX[D]
//   QuickeningInfo
//     uint8[]                     quickening data
//     unaligned_uint32_t[2][]     table of offsets pair:
//                                    uint32_t[0] contains code_item_offset
//                                    uint32_t[1] contains quickening data offset from the start
//                                                of QuickeningInfo
//     unaligned_uint32_t[D]       start offsets (from the start of QuickeningInfo) in previous
//                                 table for each dex file
4. Next is the Load function, which ends up calling Dlopen to load the oat file and obtain dlopen_handle_.
bool DlOpenOatFile::Load(const std::string& elf_filename,
                         uint8_t* oat_file_begin,
                         bool writable,
                         bool executable,
                         bool low_4gb,
                         std::string* error_msg) {
  // Use dlopen only when flagged to do so, and when it's OK to load things executable.
  // TODO: Also try when not executable? The issue here could be re-mapping as writable (as
  //       !executable is a sign that we may want to patch), which may not be allowed for
  //       various reasons.
  if (!kUseDlopen) {
    *error_msg = "DlOpen is disabled.";
    return false;
  }
  if (low_4gb) {
    *error_msg = "DlOpen does not support low 4gb loading.";
    return false;
  }
  if (writable) {
    *error_msg = "DlOpen does not support writable loading.";
    return false;
  }
  if (!executable) {
    *error_msg = "DlOpen does not support non-executable loading.";
    return false;
  }

  // dlopen always returns the same library if it is already opened on the host. For this reason
  // we only use dlopen if we are the target or we do not already have the dex file opened. Having
  // the same library loaded multiple times at different addresses is required for class unloading
  // and for having dex caches arrays in the .bss section.
  if (!kIsTargetBuild) {
    if (!kUseDlopenOnHost) {
      *error_msg = "DlOpen disabled for host.";
      return false;
    }
  }

  // Call Dlopen to load the oat file and obtain dlopen_handle_.
  bool success = Dlopen(elf_filename, oat_file_begin, error_msg);
  DCHECK(dlopen_handle_ != nullptr || !success);

  return success;
}
Now look at Dlopen, which ends up calling android_dlopen_ext (on Android targets) or dlopen (on host builds).
bool DlOpenOatFile::Dlopen(const std::string& elf_filename,
                           uint8_t* oat_file_begin,
                           std::string* error_msg) {
#ifdef __APPLE__
  // The dl_iterate_phdr syscall is missing.  There is similar API on OSX,
  // but let's fallback to the custom loading code for the time being.
  UNUSED(elf_filename, oat_file_begin);
  *error_msg = "Dlopen unsupported on Mac.";
  return false;
#else
  {
    UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr));
    if (absolute_path == nullptr) {
      *error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str());
      return false;
    }
#ifdef ART_TARGET_ANDROID
    android_dlextinfo extinfo = {};
    // typedef struct {
    //   uint64_t flags;
    //   void*    reserved_addr;
    //   size_t   reserved_size;
    //   int      relro_fd;
    //   int      library_fd;
    // } android_dlextinfo;

    // ANDROID_DLEXT_FORCE_LOAD: force-load, don't reuse handle (open oat files multiple times).
    // ANDROID_DLEXT_FORCE_FIXED_VADDR: take a non-zero vaddr as absolute (non-pic boot image).
    extinfo.flags = ANDROID_DLEXT_FORCE_LOAD | ANDROID_DLEXT_FORCE_FIXED_VADDR;
    if (oat_file_begin != nullptr) {
      // Use the requested addr if vaddr = 0 (pic boot image).
      extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS;
      extinfo.reserved_addr = oat_file_begin;
    }
    // On Android targets the file is always opened through android_dlopen_ext
    // (see bionic/libdl/libdl.c); a non-null oat_file_begin only adds the fixed load address.
    dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);
#else
    UNUSED(oat_file_begin);
    static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles");
    MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_);
    // On host builds, load from the path with plain dlopen.
    dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);
    if (dlopen_handle_ != nullptr) {
      // Record dlopen_handle_ in host_dlopen_handles_; a duplicate means dlopen re-used a handle.
      if (!host_dlopen_handles_.insert(dlopen_handle_).second) {
        dlclose(dlopen_handle_);
        dlopen_handle_ = nullptr;
        *error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str());
        return false;
      }
    }
#endif  // ART_TARGET_ANDROID
  }
  if (dlopen_handle_ == nullptr) {
    *error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror());
    return false;
  }
  return true;
#endif
}
5. Next is ComputeFields. It calls FindDynamicSymbolAddress to locate the addresses of the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods, and oatbssroots. Of these, oatdata gives begin_, and oatlastword plus sizeof(uint32_t) gives end_.
bool OatFileBase::ComputeFields(uint8_t* requested_base,
                                const std::string& file_path,
                                std::string* error_msg) {
  // Locates the symbols oatdata, oatlastword, oatbss, oatbsslastword,
  // oatbssmethods and oatbssroots, starting with oatdata for begin_.
  std::string symbol_error_msg;
  begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg);
  if (begin_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  if (requested_base != nullptr && begin_ != requested_base) {
    // Host can fail this check. Do not dump there to avoid polluting the output.
    if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) {
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
    }
    *error_msg = StringPrintf("Failed to find oatdata symbol at expected address: "
        "oatdata=%p != expected=%p. See process maps in the log.",
        begin_, requested_base);
    return false;
  }
  end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg);
  if (end_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  // Readjust to be non-inclusive upper bound.
  end_ += sizeof(uint32_t);

  bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg));
  if (bss_begin_ == nullptr) {
    // No .bss section.
    bss_end_ = nullptr;
  } else {
    bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg));
    if (bss_end_ == nullptr) {
      *error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str());
      return false;
    }
    // Readjust to be non-inclusive upper bound.
    bss_end_ += sizeof(uint32_t);
    // Find bss methods if present.
    bss_methods_ =
        const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg));
    // Find bss roots if present. (The roots are related to GC.)
    bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));
  }

  return true;
}
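The "Readjust to be non-inclusive upper bound" step is easy to misread, so here is a tiny sketch of just that arithmetic. Range and FromLastWord are hypothetical names for this illustration; the only thing taken from the source is that the "last word" symbol points at the final 4-byte word, so the exclusive end is that address plus sizeof(uint32_t).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Half-open byte range [begin, end), like begin_/end_ in OatFile.
struct Range {
  const uint8_t* begin;
  const uint8_t* end;
  size_t size() const { return static_cast<size_t>(end - begin); }
};

// `last_word` is the address of the final uint32_t inside the region
// (an inclusive bound). Adding the word size turns it into an exclusive end.
Range FromLastWord(const uint8_t* first, const uint8_t* last_word) {
  return Range{first, last_word + sizeof(uint32_t)};
}
```

For a 16-byte region whose last word starts at offset 12, FromLastWord yields a range of size 16, which is how Size() ends up covering the whole oat data.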
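6. Next is PreSetup. The main thing here is to understand how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr walks all currently loaded ELF objects, provides a dl_phdr_info for each one, and calls callback on it.
Compared with the struct in PreLoad above, this dl_iterate_context has several more fields:
begin_ is obtained from Begin(), i.e. the begin_ of the oat file;
shared_objects_before is the number of loaded ELF objects counted by the dl_iterate_context in PreLoad;
shared_objects_seen counts the ELF objects this dl_iterate_context has visited so far during its own dl_iterate_phdr walk;
dlopen_mmaps_ is a vector that stores the MemMap pointers created by MemMap::MapDummy for each loadable segment of the oat file.
So PreSetup roughly works as follows:
Declare a dl_iterate_context struct.
Walk the loaded ELF objects with dl_iterate_phdr, incrementing shared_objects_seen on each callback. While shared_objects_seen is less than shared_objects_before, the object was already loaded before the oat file was opened, so the callback returns 0 and the walk continues. Objects from that point on are examined with the logic below.
Walk the dlpi_phnum program headers of the object. For each header with p_type == PT_LOAD, compute the in-memory address of the segment as dlpi_addr + dlpi_phdr[i].p_vaddr and take its size from dlpi_phdr[i].p_memsz. If vaddr <= begin_ < vaddr + memsz, set contains_begin = true: this object is the oat file that was just loaded, so break out of the loop and run the logic below.
Walk dlpi_phdr again and, for each PT_LOAD segment, call MemMap::MapDummy with the vaddr and memsz of the segment and push the result into dlopen_mmaps_. As the comment in the source says, the linker has already mapped the file; this step only notifies ART's mmap wrapper of those regions. The callback then returns 1 to stop the walk.
To be honest, I did not fully understand this function and I hope the experts can correct me; I will study it more carefully when I have time.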
void DlOpenOatFile::PreSetup(const std::string& elf_filename) {
  // Ask the linker where it mmaped the file and notify our mmap wrapper of the regions.
#ifdef __APPLE__
  UNUSED(elf_filename);
  LOG(FATAL) << "Should not reach here.";
  UNREACHABLE();
#else
  struct dl_iterate_context {
    static int callback(struct dl_phdr_info *info, size_t /* size */, void *data) {
      /* struct dl_phdr_info {
           ElfW(Addr)        dlpi_addr;
           const char*       dlpi_name;
           const ElfW(Phdr)* dlpi_phdr;
           ElfW(Half)        dlpi_phnum;
         } */
      auto* context = reinterpret_cast<dl_iterate_context*>(data);
      // shared_objects_seen is incremented here and compared with shared_objects_before.
      context->shared_objects_seen++;
      if (context->shared_objects_seen < context->shared_objects_before) {
        // Still among the objects that were loaded before the oat file, so skip.
        // If another thread unloaded an ELF object in the meantime, this can go wrong.
        // We haven't been called yet for anything we haven't seen before. Just continue.
        // Note: this is aggressively optimistic. If another thread was unloading a library,
        //       we may miss out here. However, this does not happen often in practice.
        return 0;
      }

      // See whether this callback corresponds to the file which we have just loaded.
      // contains_begin becomes true once a PT_LOAD segment contains begin_, which comes
      // from Begin(), i.e. the begin_ of the oat file.
      bool contains_begin = false;
      for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD) {
          uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
              info->dlpi_phdr[i].p_vaddr);
          size_t memsz = info->dlpi_phdr[i].p_memsz;
          if (vaddr <= context->begin_ && context->begin_ < vaddr + memsz) {
            contains_begin = true;
            break;
          }
        }
      }
      // Add dummy mmaps for this file.
      if (contains_begin) {
        // For every PT_LOAD segment, record a dummy MemMap built from its vaddr and memsz.
        // The linker has already mapped the segment; nothing new is mapped here.
        for (int i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type == PT_LOAD) {
            uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                info->dlpi_phdr[i].p_vaddr);
            size_t memsz = info->dlpi_phdr[i].p_memsz;
            MemMap* mmap = MemMap::MapDummy(info->dlpi_name, vaddr, memsz);
            context->dlopen_mmaps_->push_back(std::unique_ptr<MemMap>(mmap));  // add the new MemMap to dlopen_mmaps_
          }
        }
        return 1;  // Stop iteration and return 1 from dl_iterate_phdr.
      }
      return 0;  // Continue iteration and return 0 from dl_iterate_phdr when finished.
    }
    const uint8_t* const begin_;  // from Begin(), i.e. the begin_ of the oat file
    std::vector<std::unique_ptr<MemMap>>* const dlopen_mmaps_;
    const size_t shared_objects_before;  // number of loaded ELF objects counted in PreLoad
    size_t shared_objects_seen;  // number of ELF objects visited so far in this walk
  };  // end of struct dl_iterate_context

  dl_iterate_context context = { Begin(), &dlopen_mmaps_, shared_objects_before_, 0};

  // The callback records a MemMap for each loadable segment of the oat file.
  if (dl_iterate_phdr(dl_iterate_context::callback, &context) == 0) {
    // Hm. Maybe our optimization went wrong. Try another time with shared_objects_before == 0
    // before giving up. This should be unusual.
    VLOG(oat) << "Need a second run in PreSetup, didn't find with shared_objects_before="
              << shared_objects_before_;
    dl_iterate_context context0 = { Begin(), &dlopen_mmaps_, 0, 0};
    if (dl_iterate_phdr(dl_iterate_context::callback, &context0) == 0) {
      // OK, give up and print an error.
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
      LOG(ERROR) << "File " << elf_filename << " loaded with dlopen but cannot find its mmaps.";
    }
  }
#endif
}
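The core of the PreSetup callback, finding which loaded object contains a given address, can be reproduced on a Linux host without any ART types. This is a minimal sketch assuming glibc or bionic; FindContext, FindCallback and FindObjectContaining are hypothetical names, and instead of creating MemMap entries it only counts the PT_LOAD segments of the matching object.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, ElfW, PT_LOAD

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical context, playing the role of dl_iterate_context in PreSetup.
struct FindContext {
  const uint8_t* target;  // Address to look for (begin_ in the ART code).
  size_t load_segments;   // PT_LOAD count of the object that contains target.
  bool found;
};

static int FindCallback(dl_phdr_info* info, size_t /* size */, void* data) {
  auto* ctx = static_cast<FindContext*>(data);
  bool contains = false;
  size_t loads = 0;
  for (int i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr)& ph = info->dlpi_phdr[i];
    if (ph.p_type != PT_LOAD) continue;
    ++loads;
    // Run-time address of the segment: load bias plus the segment's vaddr.
    const uint8_t* vaddr =
        reinterpret_cast<const uint8_t*>(info->dlpi_addr + ph.p_vaddr);
    if (vaddr <= ctx->target && ctx->target < vaddr + ph.p_memsz) {
      contains = true;
    }
  }
  if (!contains) return 0;  // Not this object, keep iterating.
  ctx->load_segments = loads;
  ctx->found = true;
  return 1;  // Non-zero stops the walk, like `return 1` in PreSetup.
}

// Returns true if some loaded object has a PT_LOAD segment covering addr.
bool FindObjectContaining(const void* addr, size_t* load_segments) {
  FindContext ctx{static_cast<const uint8_t*>(addr), 0, false};
  dl_iterate_phdr(FindCallback, &ctx);
  if (ctx.found && load_segments != nullptr) *load_segments = ctx.load_segments;
  return ctx.found;
}
```

Passing the address of any global or static variable should find the main executable (or the library that defines it) and report at least one PT_LOAD segment. The sketch leaves out the shared_objects_before shortcut, which is only an optimization in the original.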
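7. Further down is OatFileBase::Setup. It mainly uses ReadOatDexFileData on the oat file loaded above to build the oat_dex_file records, which are later used to obtain dex_file. The resulting oat_file data structure combines information from both the oat file and the vdex file.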
Setup {
  GetOatHeader
  GetInstructionSetPointerSize
  GetOatDexFilesOffset  // reaches the offset of the OatDexFile records
  GetDexFileCount
  ReadOatDexFileData & dex_file_location_size
  ResolveRelativeEncodedDexLocation
  ReadOatDexFileData & dex_file_checksum
  ReadOatDexFileData & dex_file_offset
  ReadOatDexFileData & class_offsets_offset
  ReadOatDexFileData & lookup_table_offset  // speeds up class lookup
  ReadOatDexFileData & dex_layout_sections_offset
  ReadOatDexFileData & method_bss_mapping_offset
  FindDexFileMapItem & call_sites_item  // call site id item
  new OatDexFile  // built from the information above, so that GetBestOatFile can obtain it later
}
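The source flow is fairly clear. The key points are ReadOatDexFileData and how the oat pointer advances; the creation of oat_dex_file at the end is the most important part.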
bool OatFileBase::Setup(const char* abs_dex_location, std::string* error_msg) {
  if (!GetOatHeader().IsValid()) {
    std::string cause = GetOatHeader().GetValidationErrorMessage();
    *error_msg = StringPrintf("Invalid oat header for '%s': %s",
                              GetLocation().c_str(),
                              cause.c_str());
    return false;
  }
  PointerSize pointer_size = GetInstructionSetPointerSize(GetOatHeader().GetInstructionSet());
  size_t key_value_store_size =
      (Size() >= sizeof(OatHeader)) ? GetOatHeader().GetKeyValueStoreSize() : 0u;
  if (Size() < sizeof(OatHeader) + key_value_store_size) {
    *error_msg = StringPrintf("In oat file '%s' found truncated OatHeader, "
                                  "size = %zu < %zu + %zu",
                              GetLocation().c_str(),
                              Size(),
                              sizeof(OatHeader),
                              key_value_store_size);
    return false;
  }

  size_t oat_dex_files_offset = GetOatHeader().GetOatDexFilesOffset();
  if (oat_dex_files_offset < GetOatHeader().GetHeaderSize() || oat_dex_files_offset > Size()) {
    *error_msg = StringPrintf("In oat file '%s' found invalid oat dex files offset: "
                                  "%zu is not in [%zu, %zu]",
                              GetLocation().c_str(),
                              oat_dex_files_offset,
                              GetOatHeader().GetHeaderSize(),
                              Size());
    return false;
  }
  // Jump to the OatDexFile records. (The oat pointer now points at the first OatDexFile.)
  const uint8_t* oat = Begin() + oat_dex_files_offset;

  DCHECK_GE(static_cast<size_t>(pointer_size), alignof(GcRoot<mirror::Object>));
  if (!IsAligned<kPageSize>(bss_begin_) ||
      !IsAlignedParam(bss_methods_, static_cast<size_t>(pointer_size)) ||
      !IsAlignedParam(bss_roots_, static_cast<size_t>(pointer_size)) ||
      !IsAligned<alignof(GcRoot<mirror::Object>)>(bss_end_)) {
    *error_msg = StringPrintf("In oat file '%s' found unaligned bss symbol(s): "
                                  "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(),
                              bss_begin_,
                              bss_methods_,
                              bss_roots_,
                              bss_end_);
    return false;
  }

  if ((bss_methods_ != nullptr && (bss_methods_ < bss_begin_ || bss_methods_ > bss_end_)) ||
      (bss_roots_ != nullptr && (bss_roots_ < bss_begin_ || bss_roots_ > bss_end_)) ||
      (bss_methods_ != nullptr && bss_roots_ != nullptr && bss_methods_ > bss_roots_)) {
    *error_msg = StringPrintf("In oat file '%s' found bss symbol(s) outside .bss or unordered: "
                                  "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(),
                              bss_begin_,
                              bss_methods_,
                              bss_roots_,
                              bss_end_);
    return false;
  }

  uint8_t* after_arrays = (bss_methods_ != nullptr) ? bss_methods_ : bss_roots_;  // May be null.
  uint8_t* dex_cache_arrays = (bss_begin_ == after_arrays) ? nullptr : bss_begin_;
  uint8_t* dex_cache_arrays_end =
      (bss_begin_ == after_arrays) ? nullptr : (after_arrays != nullptr) ? after_arrays : bss_end_;
  DCHECK_EQ(dex_cache_arrays != nullptr, dex_cache_arrays_end != nullptr);
  uint32_t dex_file_count = GetOatHeader().GetDexFileCount();  // number of dex files
  oat_dex_files_storage_.reserve(dex_file_count);
  for (size_t i = 0; i < dex_file_count; i++) {
    // Each ReadOatDexFileData call reads one field and advances the oat pointer.
    uint32_t dex_file_location_size;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_location_size))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu truncated after dex file "
                                    "location size",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    if (UNLIKELY(dex_file_location_size == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with empty location name",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    if (UNLIKELY(static_cast<size_t>(End() - oat) < dex_file_location_size)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with truncated dex file "
                                    "location",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    const char* dex_file_location_data = reinterpret_cast<const char*>(oat);
    oat += dex_file_location_size;

    std::string dex_file_location = ResolveRelativeEncodedDexLocation(
        abs_dex_location,
        std::string(dex_file_location_data, dex_file_location_size));

    uint32_t dex_file_checksum;  // read dex_file_checksum and advance oat
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_checksum))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated after "
                                    "dex file checksum",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }

    uint32_t dex_file_offset;  // read dex_file_offset and advance oat
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                    "after dex file offsets",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with zero dex "
                                    "file offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset > DexSize())) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                    "offset %u > %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                DexSize());
      return false;
    }
    if (UNLIKELY(DexSize() - dex_file_offset < sizeof(DexFile::Header))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                    "offset %u of %zu but the size of dex file header is %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                DexSize(),
                                sizeof(DexFile::Header));
      return false;
    }

    const uint8_t* dex_file_pointer = DexBegin() + dex_file_offset;
    if (UNLIKELY(!DexFile::IsMagicValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                    "dex file magic '%s'",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    if (UNLIKELY(!DexFile::IsVersionValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                    "dex file version '%s'",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    const DexFile::Header* header = reinterpret_cast<const DexFile::Header*>(dex_file_pointer);
    if (DexSize() - dex_file_offset < header->file_size_) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                    "offset %u and size %u truncated at %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                header->file_size_,
                                DexSize());
      return false;
    }

    uint32_t class_offsets_offset;  // read class_offsets_offset and advance oat
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                    "after class offsets offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(class_offsets_offset > Size()) ||
        UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                    "class offsets, offset %u of %zu, class defs %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                class_offsets_offset,
                                Size(),
                                header->class_defs_size_);
      return false;
    }
    if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned "
                                    "class offsets, offset %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                class_offsets_offset);
      return false;
    }
    const uint32_t* class_offsets_pointer =
        reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset);

    // Read lookup_table_offset and advance oat; the lookup table speeds up class lookup.
    uint32_t lookup_table_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                    "after lookup table offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const uint8_t* lookup_table_data = lookup_table_offset != 0u
        ? Begin() + lookup_table_offset
        : nullptr;
    if (lookup_table_offset != 0u &&
        (UNLIKELY(lookup_table_offset > Size()) ||
            UNLIKELY(Size() - lookup_table_offset <
                     TypeLookupTable::RawDataLength(header->class_defs_size_)))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                    "type lookup table, offset %u of %zu, class defs %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                lookup_table_offset,
                                Size(),
                                header->class_defs_size_);
      return false;
    }

    uint32_t dex_layout_sections_offset;  // read dex_layout_sections_offset and advance oat
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                    "after dex layout sections offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0
        ? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset)
        : nullptr;

    uint32_t method_bss_mapping_offset;  // read method_bss_mapping_offset and advance oat
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                    "after method bss mapping offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const bool readable_method_bss_mapping_size =
        method_bss_mapping_offset != 0u &&
        method_bss_mapping_offset <= Size() &&
        IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) &&
        Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0);
    const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size
        ? reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset)
        : nullptr;
    if (method_bss_mapping_offset != 0u &&
        (UNLIKELY(method_bss_mapping == nullptr) ||
            UNLIKELY(method_bss_mapping->size() == 0u) ||
            UNLIKELY(Size() - method_bss_mapping_offset <
                     MethodBssMapping::ComputeSize(method_bss_mapping->size())))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or "
                                    " truncated method bss mapping, offset %u of %zu, length %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                method_bss_mapping_offset,
                                Size(),
                                method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u);
      return false;
    }
    if (kIsDebugBuild && method_bss_mapping != nullptr) {
      const MethodBssMappingEntry* prev_entry = nullptr;
      for (const MethodBssMappingEntry& entry : *method_bss_mapping) {
        CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size));
        CHECK_LT(entry.bss_offset, BssSize());
        CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size), entry.bss_offset);
        size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u;
        CHECK_LE(index_mask_span, entry.method_index);
        if (prev_entry != nullptr) {
          CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span);
        }
        prev_entry = &entry;
      }
      CHECK_LT(prev_entry->method_index,
               reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_);
    }

    uint8_t* current_dex_cache_arrays = nullptr;
    if (dex_cache_arrays != nullptr) {
      // All DexCache types except for CallSite have their instance counts in the
      // DexFile header. For CallSites, we need to read the info from the MapList.
      // (The call site count is not stored in the header.)
      const DexFile::MapItem* call_sites_item = nullptr;
      // FindDexFileMapItem looks up the call_sites_item in the dex map list.
      if (!FindDexFileMapItem(DexBegin(),
                              DexEnd(),
                              DexFile::MapItemType::kDexTypeCallSiteIdItem,
                              &call_sites_item)) {
        *error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map",
                                  GetLocation().c_str());
        return false;
      }
      size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_;
      DexCacheArraysLayout layout(pointer_size, *header, num_call_sites);
      if (layout.Size() != 0u) {
        if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) {
          *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with "
                                        "truncated dex cache arrays, %zu < %zu.",
                                    GetLocation().c_str(),
                                    i,
                                    dex_file_location.c_str(),
                                    static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays),
                                    layout.Size());
          return false;
        }
        current_dex_cache_arrays = dex_cache_arrays;
        dex_cache_arrays += layout.Size();
      }
    }

    std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str());

    // Create the OatDexFile and add it to the owning container.
    // (Built from the values gathered above via ReadOatDexFileData and FindDexFileMapItem.)
    OatDexFile* oat_dex_file = new OatDexFile(this,
                                              dex_file_location,
                                              canonical_location,
                                              dex_file_checksum,
                                              dex_file_pointer,
                                              lookup_table_data,
                                              method_bss_mapping,
                                              class_offsets_pointer,
                                              current_dex_cache_arrays,
                                              dex_layout_sections);
    oat_dex_files_storage_.push_back(oat_dex_file);

    // Add the location and canonical location (if different) to the oat_dex_files_ table.
    StringPiece key(oat_dex_file->GetDexFileLocation());
    oat_dex_files_.Put(key, oat_dex_file);
    if (canonical_location != dex_file_location) {
      StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation());
      oat_dex_files_.Put(canonical_key, oat_dex_file);
    }
  }

  if (dex_cache_arrays != dex_cache_arrays_end) {
    // We expect the bss section to be either empty (dex_cache_arrays and bss_end_
    // both null) or contain just the dex cache arrays and optionally some GC roots.
    *error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.",
                              GetLocation().c_str(),
                              static_cast<size_t>(bss_end_ - dex_cache_arrays));
    return false;
  }
  return true;
}
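Since Setup is mostly "read a field, check it, advance the pointer", a small sketch of that cursor pattern may help. This is not the ART implementation of ReadOatDexFileData; ReadValue, DexRecord, ParseRecord and AppendU32 are hypothetical names, the record only carries the first three fields (location, checksum, dex offset), and memcpy is used for the unaligned read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Bounds-checked read of one value at *cursor; advances *cursor on success.
template <typename T>
bool ReadValue(const uint8_t* end, const uint8_t** cursor, T* value) {
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) return false;  // Truncated.
  std::memcpy(value, *cursor, sizeof(T));  // Safe for unaligned data.
  *cursor += sizeof(T);
  return true;
}

// Hypothetical, simplified record: only the first fields that Setup reads.
struct DexRecord {
  std::string location;
  uint32_t checksum = 0;
  uint32_t dex_offset = 0;
};

// Parses: uint32 location_size, location bytes, uint32 checksum, uint32 offset.
bool ParseRecord(const uint8_t* end, const uint8_t** cursor, DexRecord* out) {
  uint32_t location_size = 0;
  if (!ReadValue(end, cursor, &location_size)) return false;
  if (location_size == 0u) return false;  // Empty location name.
  if (static_cast<size_t>(end - *cursor) < location_size) return false;
  out->location.assign(reinterpret_cast<const char*>(*cursor), location_size);
  *cursor += location_size;
  return ReadValue(end, cursor, &out->checksum) &&
         ReadValue(end, cursor, &out->dex_offset);
}

// Helper for building a sample buffer in host byte order.
void AppendU32(std::vector<uint8_t>* buf, uint32_t v) {
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&v);
  buf->insert(buf->end(), p, p + sizeof(v));
}
```

Building a buffer with AppendU32 and the raw location bytes, then calling ParseRecord with a cursor at the start, leaves the cursor at the start of the next record, which is how the loop in Setup walks from one OatDexFile entry to the next.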
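There is another way to open the file, through ElfOatFile, which appears to use ART's own ELF loader instead of the system linker. The overall flow should be similar; I will analyze it when I have time.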
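Finally, to recap the flow, roughly:
PreLoad: walks all loaded ELF objects to receive their dl_phdr_info, counts them, and stores the count in shared_objects_before_.
LoadVdex: loads the vdex file through VdexFile::Open; the vdex file also holds dex file data.
Load: calls Dlopen to load the oat file and obtain dlopen_handle_.
ComputeFields: uses FindDynamicSymbolAddress to locate the symbol addresses, starting with oatdata for begin_, which bounds the oat file's range in memory.
PreSetup: walks the loaded ELF objects again, finds the one whose PT_LOAD segment contains begin_, and records a dummy MemMap for each of its loadable segments.
Setup: parses the oat file with ReadOatDexFileData and related functions and assembles oat_dex_file.
With these steps, oat_dex_file is finally obtained through oat_file.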
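There are places I have not fully figured out, so there are bound to be some mistakes, and some statements are not phrased well; I am an outsider to this area and not much of a writer. The overall flow should be right, though. I hope the experts will point out my mistakes so that I can correct them soon.
Reference: Lao Luo's Android Journey https://www.kancloud.cn/alex_wsc/androids/473622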
OatFileBase* OatFileBase::OpenOatFile(const std::string& vdex_filename, const std::string& elf_filename, const std::string& location, uint8_t* requested_base, uint8_t* oat_file_begin, bool writable, bool executable, bool low_4gb, const char* abs_dex_location, std::string* error_msg) { std::unique_ptr<OatFileBase> ret(new kOatFileBaseSubType(location, executable));//不管是DlOpenOatFile还是ElfOatFile都先转换成OatFileBase指针 ret->PreLoad(); if (kIsVdexEnabled && !ret->LoadVdex(vdex_filename, writable, low_4gb, error_msg)) { return nullptr; } if (!ret->Load(elf_filename, oat_file_begin, writable, executable, low_4gb, error_msg)) { return nullptr; } if (!ret->ComputeFields(requested_base, elf_filename, error_msg)) { return nullptr; } ret->PreSetup(elf_filename); if (!ret->Setup(abs_dex_location, error_msg)) { return nullptr; } return ret.release(); }
2.先看 PreLoad,通过
dl_iterate_phdr遍历所有加载的elf对象获得它们的dl_phdr_info,每次循环count+1, 然后把
count
存储在shared_objects_before_,下面
PreSetup
会使用
shared_objects_before_这个变量
这里重点关注一下
结构dl_phdr_info
,存储了elf的address,name,Pointer to array of ELF program headers等几个重要字段。
这里的 struct dl_iterate_context只有一个count字段,用于存储计数遍历的elf对象,
callback
功能也比较简单,下面PreSetup还有一个dl_iterate_context 结构,它的callback函数就比较复杂了,遍历并且映射了oat_file的program segments
void DlOpenOatFile::PreLoad() { #ifdef __APPLE__ UNUSED(shared_objects_before_); LOG(FATAL) << "Should not reach here."; UNREACHABLE(); #else // Count the entries in dl_iterate_phdr we get at this point in time.//遍历所有elf的phdr struct dl_iterate_context { static int callback(struct dl_phdr_info *info ATTRIBUTE_UNUSED, size_t size ATTRIBUTE_UNUSED, void *data) { // struct dl_phdr_info { // ElfW(Addr) dlpi_addr; /* Base address of object */ // const char *dlpi_name; /* (Null-terminated) name of // object */ // const ElfW(Phdr) *dlpi_phdr; /* Pointer to array of // ELF program headers // for this object */ // ElfW(Half) dlpi_phnum; /* # of items in dlpi_phdr */ // } reinterpret_cast<dl_iterate_context*>(data)->count++;//每次循环count自增 return 0; // Continue iteration. } size_t count = 0; } context; dl_iterate_phdr(dl_iterate_context::callback, &context);//遍历所有elf对象获得dl_phdr_info并调用callback,这里的callback就是count自增1 shared_objects_before_ = context.count; //把count最终值存储到shared_objects_before_ #endif }
3.然后 LoadVdex,最终调用了 VdexFile::Open,这里的vdex是8.0以后的新变化,原先存储在oat里的dexfile现在似乎被quickene后放在在vdex里,组合oat_file和vdex_才能获得完整的oat_dex_file
bool OatFileBase::LoadVdex(const std::string& vdex_filename, bool writable, bool low_4gb, std::string* error_msg) { vdex_ = VdexFile::Open(vdex_filename, writable, low_4gb, /* unquicken*/ false, error_msg);//打开并获得vdex_ if (vdex_.get() == nullptr) { *error_msg = StringPrintf("Failed to load vdex file '%s' %s", vdex_filename.c_str(), error_msg->c_str()); return false; } return true; }
A quick look at the vdex layout, from vdex_file.h: it contains the dex files and the QuickeningInfo.
// File format:
//   VdexFile::Header    fixed-length header
//
//   DEX[0]              array of the input DEX files
//   DEX[1]              the bytecode may have been quickened
//   ...
//   DEX[D]
//   QuickeningInfo
//     uint8[]                     quickening data
//     unaligned_uint32_t[2][]     table of offsets pair:
//                                    uint32_t[0] contains code_item_offset
//                                    uint32_t[1] contains quickening data offset from the start
//                                                of QuickeningInfo
//     unalgined_uint32_t[D]       start offsets (from the start of QuickeningInfo) in previous
//                                 table for each dex file
4. Next is Load, which ends up calling Dlopen to load the oat file and obtain dlopen_handle_.
bool DlOpenOatFile::Load(const std::string& elf_filename,
                         uint8_t* oat_file_begin,
                         bool writable,
                         bool executable,
                         bool low_4gb,
                         std::string* error_msg) {
  // Use dlopen only when flagged to do so, and when it's OK to load things executable.
  // TODO: Also try when not executable? The issue here could be re-mapping as writable (as
  //       !executable is a sign that we may want to patch), which may not be allowed for
  //       various reasons.
  if (!kUseDlopen) {
    *error_msg = "DlOpen is disabled.";
    return false;
  }
  if (low_4gb) {
    *error_msg = "DlOpen does not support low 4gb loading.";
    return false;
  }
  if (writable) {
    *error_msg = "DlOpen does not support writable loading.";
    return false;
  }
  if (!executable) {
    *error_msg = "DlOpen does not support non-executable loading.";
    return false;
  }
  // dlopen always returns the same library if it is already opened on the host. For this reason
  // we only use dlopen if we are the target or we do not already have the dex file opened. Having
  // the same library loaded multiple times at different addresses is required for class unloading
  // and for having dex caches arrays in the .bss section.
  if (!kIsTargetBuild) {
    if (!kUseDlopenOnHost) {
      *error_msg = "DlOpen disabled for host.";
      return false;
    }
  }
  // Call Dlopen to load the oat file and obtain dlopen_handle_.
  bool success = Dlopen(elf_filename, oat_file_begin, error_msg);
  DCHECK(dlopen_handle_ != nullptr || !success);
  return success;
}
Now look at Dlopen. It ends up calling android_dlopen_ext when built for an Android target (ART_TARGET_ANDROID), and plain dlopen otherwise.
bool DlOpenOatFile::Dlopen(const std::string& elf_filename,
                           uint8_t* oat_file_begin,
                           std::string* error_msg) {
#ifdef __APPLE__
  // The dl_iterate_phdr syscall is missing.  There is similar API on OSX,
  // but let's fallback to the custom loading code for the time being.
  UNUSED(elf_filename, oat_file_begin);
  *error_msg = "Dlopen unsupported on Mac.";
  return false;
#else
  {
    UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr));
    if (absolute_path == nullptr) {
      *error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str());
      return false;
    }
#ifdef ART_TARGET_ANDROID
    android_dlextinfo extinfo = {};
    // typedef struct {
    //   uint64_t flags;
    //   void*    reserved_addr;
    //   size_t   reserved_size;
    //   int      relro_fd;
    //   int      library_fd;
    // } android_dlextinfo;
    extinfo.flags = ANDROID_DLEXT_FORCE_LOAD |        // Force-load, don't reuse handle
                                                      //   (open oat files multiple times).
                    ANDROID_DLEXT_FORCE_FIXED_VADDR;  // Take a non-zero vaddr as absolute
                                                      //   (non-pic boot image).
    if (oat_file_begin != nullptr) {
      extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS;  // Use the requested addr if
      extinfo.reserved_addr = oat_file_begin;                 // vaddr = 0 (pic boot image).
    }
    // On Android targets the file is opened with android_dlopen_ext (see
    // /bionic/libdl/libdl.c); a non-null oat_file_begin only adds the fixed-address flag above.
    dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);
#else
    UNUSED(oat_file_begin);
    static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles");
    MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_);
    // On non-Android builds, plain dlopen loads the file from its path.
    dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);
    if (dlopen_handle_ != nullptr) {
      // Record the handle in host_dlopen_handles_; a failed insert means it was already open.
      if (!host_dlopen_handles_.insert(dlopen_handle_).second) {
        dlclose(dlopen_handle_);
        dlopen_handle_ = nullptr;
        *error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str());
        return false;
      }
    }
#endif  // ART_TARGET_ANDROID
  }
  if (dlopen_handle_ == nullptr) {
    *error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror());
    return false;
  }
  return true;
#endif
}
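The first step of Dlopen, resolving the absolute path with realpath and letting a smart pointer free the result, can be tried on its own. The sketch below assumes a POSIX system; FreeDeleter and ResolvePath are names I made up to stand in for ART's UniqueCPtr, they are not ART APIs.

```cpp
#include <stdlib.h>  // realpath (POSIX), free
#include <memory>
#include <string>

// Hypothetical deleter: realpath(path, nullptr) returns malloc'ed memory, so release it with free.
struct FreeDeleter {
  void operator()(char* p) const { free(p); }
};

// Returns the canonical absolute path, or an empty string if the path cannot be resolved.
std::string ResolvePath(const std::string& path) {
  std::unique_ptr<char, FreeDeleter> resolved(realpath(path.c_str(), nullptr));
  return resolved ? std::string(resolved.get()) : std::string();
}
```

As in Dlopen, a null result from realpath is treated as a failure before any attempt to load the file.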
5. Next is ComputeFields. It calls FindDynamicSymbolAddress to look up the addresses of the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods and oatbssroots. Of these, oatdata and oatlastword determine begin_ and end_.
bool OatFileBase::ComputeFields(uint8_t* requested_base,
                                const std::string& file_path,
                                std::string* error_msg) {
  // Locate the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods, oatbssroots.
  std::string symbol_error_msg;
  begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg);
  if (begin_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  if (requested_base != nullptr && begin_ != requested_base) {
    // Host can fail this check. Do not dump there to avoid polluting the output.
    if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) {
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
    }
    *error_msg = StringPrintf("Failed to find oatdata symbol at expected address: "
                              "oatdata=%p != expected=%p. See process maps in the log.",
                              begin_, requested_base);
    return false;
  }
  end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg);
  if (end_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  // Readjust to be non-inclusive upper bound.
  end_ += sizeof(uint32_t);

  bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg));
  if (bss_begin_ == nullptr) {
    // No .bss section.
    bss_end_ = nullptr;
  } else {
    bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg));
    if (bss_end_ == nullptr) {
      *error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str());
      return false;
    }
    // Readjust to be non-inclusive upper bound.
    bss_end_ += sizeof(uint32_t);
    // Find bss methods if present.
    bss_methods_ =
        const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg));
    // Find bss roots if present. (The roots are related to GC.)
    bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));
  }
  return true;
}
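The "Readjust to be non-inclusive upper bound" step is easy to miss: oatlastword addresses the last 32-bit word of the data, so the exclusive end is 4 bytes past it. A small sketch of just that arithmetic follows; Range and MakeRange are hypothetical names for this note, not ART types.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the [begin_, end_) range that ComputeFields derives.
struct Range {
  const uint8_t* begin;
  const uint8_t* end;  // Exclusive upper bound.
  size_t Size() const { return static_cast<size_t>(end - begin); }
};

// `first` plays the role of oatdata and `last_word` the role of oatlastword.
// The latter points AT the final 32-bit word, so the exclusive end lies one word past it.
Range MakeRange(const uint8_t* first, const uint8_t* last_word) {
  return Range{first, last_word + sizeof(uint32_t)};
}
```

With a 16-byte image whose last word starts at offset 12, the resulting range covers all 16 bytes.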
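6. Next is PreSetup. The key here is to understand how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr walks every ELF object currently loaded, obtains each object's dl_phdr_info, and invokes callback on it.
Compared with the struct of the same name in PreLoad, this dl_iterate_context has several more fields:
begin_ comes from Begin(), i.e. it is the oat file's begin_;
shared_objects_before is the number of loaded ELF objects that PreLoad counted with dl_iterate_phdr;
shared_objects_seen counts the ELF objects visited so far during this iteration;
dlopen_mmaps_ is a vector holding the MemMap pointers that MapDummy creates for each loadable segment of the oat file.
So PreSetup roughly does the following:
Declare a dl_iterate_context.
Iterate over the loaded ELF objects with dl_iterate_phdr, incrementing shared_objects_seen on every callback. While shared_objects_seen is still less than shared_objects_before, the object is assumed to have been loaded before Load ran, so the callback returns 0 and the iteration moves on. Only once the counter has caught up does the logic below run.
Walk the dlpi_phnum program headers of the object. For every header with p_type == PT_LOAD, compute the segment's address in memory as dlpi_addr + dlpi_phdr[i].p_vaddr and its size as dlpi_phdr[i].p_memsz. If vaddr <= begin_ < vaddr + memsz, set contains_begin = true and leave the loop: this object is the oat file that was just loaded.
Walk dlpi_phdr again and, for every PT_LOAD header, call MemMap::MapDummy with the segment's vaddr and memsz. This does not map anything new; the linker has already mapped the segments, and MapDummy only registers those regions with ART's MemMap bookkeeping. The callback then returns 1 to stop the iteration.
If the whole iteration returns 0 (nothing matched), PreSetup retries once with shared_objects_before set to 0 before logging an error.
To be honest, I have not fully understood this function. Corrections are welcome, and I will study it more carefully when I have time.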
void DlOpenOatFile::PreSetup(const std::string& elf_filename) {
  // Ask the linker where it mmaped the file and notify our mmap wrapper of the regions.
#ifdef __APPLE__
  UNUSED(elf_filename);
  LOG(FATAL) << "Should not reach here.";
  UNREACHABLE();
#else
  struct dl_iterate_context {
    static int callback(struct dl_phdr_info *info, size_t /* size */, void *data) {
      /* struct dl_phdr_info {
           ElfW(Addr)        dlpi_addr;
           const char*       dlpi_name;
           const ElfW(Phdr)* dlpi_phdr;
           ElfW(Half)        dlpi_phnum;
         } */
      auto* context = reinterpret_cast<dl_iterate_context*>(data);
      context->shared_objects_seen++;  // Compared against shared_objects_before below.
      if (context->shared_objects_seen < context->shared_objects_before) {
        // We haven't been called yet for anything we haven't seen before. Just continue.
        // Note: this is aggressively optimistic. If another thread was unloading a library,
        //       we may miss out here. However, this does not happen often in practice.
        return 0;
      }

      // See whether this callback corresponds to the file which we have just loaded.
      // begin_ was obtained from Begin(), i.e. it is the oat file's begin_.
      bool contains_begin = false;
      for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD) {
          uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                                                      info->dlpi_phdr[i].p_vaddr);
          size_t memsz = info->dlpi_phdr[i].p_memsz;
          if (vaddr <= context->begin_ && context->begin_ < vaddr + memsz) {
            contains_begin = true;
            break;
          }
        }
      }
      // Add dummy mmaps for this file.
      // For every PT_LOAD segment, MemMap::MapDummy records its vaddr and memsz.
      if (contains_begin) {
        for (int i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type == PT_LOAD) {
            uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                                                        info->dlpi_phdr[i].p_vaddr);
            size_t memsz = info->dlpi_phdr[i].p_memsz;
            MemMap* mmap = MemMap::MapDummy(info->dlpi_name, vaddr, memsz);
            // Append the new MemMap to dlopen_mmaps_.
            context->dlopen_mmaps_->push_back(std::unique_ptr<MemMap>(mmap));
          }
        }
        return 1;  // Stop iteration and return 1 from dl_iterate_phdr.
      }
      return 0;  // Continue iteration and return 0 from dl_iterate_phdr when finished.
    }
    const uint8_t* const begin_;  // From Begin(), i.e. the oat file's begin_.
    std::vector<std::unique_ptr<MemMap>>* const dlopen_mmaps_;
    const size_t shared_objects_before;  // Number of ELF objects counted in PreLoad.
    size_t shared_objects_seen;          // Number of ELF objects visited in this iteration.
  };
  dl_iterate_context context = { Begin(), &dlopen_mmaps_, shared_objects_before_, 0};
  // The callback registers a MemMap for each loaded segment of the oat file.
  if (dl_iterate_phdr(dl_iterate_context::callback, &context) == 0) {
    // Hm. Maybe our optimization went wrong. Try another time with shared_objects_before == 0
    // before giving up. This should be unusual.
    VLOG(oat) << "Need a second run in PreSetup, didn't find with shared_objects_before="
              << shared_objects_before_;
    dl_iterate_context context0 = { Begin(), &dlopen_mmaps_, 0, 0};
    if (dl_iterate_phdr(dl_iterate_context::callback, &context0) == 0) {
      // OK, give up and print an error.
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
      LOG(ERROR) << "File " << elf_filename << " loaded with dlopen but cannot find its mmaps.";
    }
  }
#endif
}
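Since this function was the hardest part for me, here is a simplified model of its search logic that runs without the dynamic linker. Everything in it (Segment, LoadedObject, FindObjectContaining, FindWithFallback) is hypothetical and only imitates the callback: skip objects counted before the load, test whether an address falls inside a load segment, and retry from the start if the optimistic skip missed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one PT_LOAD segment and of one loaded ELF object.
struct Segment {
  uintptr_t vaddr;  // Start address in memory (dlpi_addr + p_vaddr in the real code).
  size_t memsz;     // Size in memory (p_memsz).
};
struct LoadedObject {
  std::vector<Segment> load_segments;
};

// Returns the index of the first inspected object with a segment containing `addr`, or -1.
// Objects are skipped while the running count is below `skip_before`,
// which imitates the shared_objects_seen < shared_objects_before test.
int FindObjectContaining(const std::vector<LoadedObject>& objects,
                         uintptr_t addr,
                         size_t skip_before) {
  size_t seen = 0;
  for (size_t i = 0; i < objects.size(); ++i) {
    ++seen;
    if (seen < skip_before) {
      continue;  // Assumed to be loaded earlier; not inspected.
    }
    for (const Segment& s : objects[i].load_segments) {
      if (s.vaddr <= addr && addr < s.vaddr + s.memsz) {
        return static_cast<int>(i);
      }
    }
  }
  return -1;
}

// Imitates the second run in PreSetup: if the optimistic skip misses, search everything.
int FindWithFallback(const std::vector<LoadedObject>& objects,
                     uintptr_t addr,
                     size_t skip_before) {
  int index = FindObjectContaining(objects, addr, skip_before);
  return index >= 0 ? index : FindObjectContaining(objects, addr, 0);
}
```

The skip is only an optimization: the containment test alone decides which object is the oat file, which is why the retry with a skip count of 0 is safe.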
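7. Further down is OatFileBase::Setup. It mainly uses ReadOatDexFileData to parse the oat file loaded above and build the oat_dex_file objects, which are later used to obtain dex_file. The resulting oat_file data structure combines information from both the oat file and the vdex file.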
Setup
{
    GetOatHeader
    GetInstructionSetPointerSize
    GetOatDexFilesOffset  // reaches the offset of the OatDexFile records
    GetDexFileCount
    ReadOatDexFileData &dex_file_location_size
    ResolveRelativeEncodedDexLocation
    ReadOatDexFileData &dex_file_checksum
    ReadOatDexFileData &dex_file_offset
    ReadOatDexFileData &class_offsets_offset
    ReadOatDexFileData &lookup_table_offset  // speeds up class lookup
    ReadOatDexFileData &dex_layout_sections_offset
    ReadOatDexFileData &method_bss_mapping_offset
    FindDexFileMapItem &call_sites_item  // call site identifiers
    new OatDexFile  // built from the information above so that GetBestOatFile can obtain it
}
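The flow of the source is fairly clear. The main things to follow are ReadOatDexFileData and how the oat pointer advances; the creation of oat_dex_file at the end is the most important part.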
bool OatFileBase::Setup(const char* abs_dex_location, std::string* error_msg) {
  if (!GetOatHeader().IsValid()) {
    std::string cause = GetOatHeader().GetValidationErrorMessage();
    *error_msg = StringPrintf("Invalid oat header for '%s': %s",
                              GetLocation().c_str(), cause.c_str());
    return false;
  }
  PointerSize pointer_size = GetInstructionSetPointerSize(GetOatHeader().GetInstructionSet());
  size_t key_value_store_size =
      (Size() >= sizeof(OatHeader)) ? GetOatHeader().GetKeyValueStoreSize() : 0u;
  if (Size() < sizeof(OatHeader) + key_value_store_size) {
    *error_msg = StringPrintf("In oat file '%s' found truncated OatHeader, "
                              "size = %zu < %zu + %zu",
                              GetLocation().c_str(), Size(), sizeof(OatHeader),
                              key_value_store_size);
    return false;
  }
  size_t oat_dex_files_offset = GetOatHeader().GetOatDexFilesOffset();
  if (oat_dex_files_offset < GetOatHeader().GetHeaderSize() || oat_dex_files_offset > Size()) {
    *error_msg = StringPrintf("In oat file '%s' found invalid oat dex files offset: "
                              "%zu is not in [%zu, %zu]",
                              GetLocation().c_str(), oat_dex_files_offset,
                              GetOatHeader().GetHeaderSize(), Size());
    return false;
  }
  // Jump to the OatDexFile records: the oat pointer now points at the first one.
  const uint8_t* oat = Begin() + oat_dex_files_offset;
  DCHECK_GE(static_cast<size_t>(pointer_size), alignof(GcRoot<mirror::Object>));
  if (!IsAligned<kPageSize>(bss_begin_) ||
      !IsAlignedParam(bss_methods_, static_cast<size_t>(pointer_size)) ||
      !IsAlignedParam(bss_roots_, static_cast<size_t>(pointer_size)) ||
      !IsAligned<alignof(GcRoot<mirror::Object>)>(bss_end_)) {
    *error_msg = StringPrintf("In oat file '%s' found unaligned bss symbol(s): "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(), bss_begin_, bss_methods_, bss_roots_,
                              bss_end_);
    return false;
  }
  if ((bss_methods_ != nullptr && (bss_methods_ < bss_begin_ || bss_methods_ > bss_end_)) ||
      (bss_roots_ != nullptr && (bss_roots_ < bss_begin_ || bss_roots_ > bss_end_)) ||
      (bss_methods_ != nullptr && bss_roots_ != nullptr && bss_methods_ > bss_roots_)) {
    *error_msg = StringPrintf("In oat file '%s' found bss symbol(s) outside .bss or unordered: "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(), bss_begin_, bss_methods_, bss_roots_,
                              bss_end_);
    return false;
  }
  uint8_t* after_arrays = (bss_methods_ != nullptr) ? bss_methods_ : bss_roots_;  // May be null.
  uint8_t* dex_cache_arrays = (bss_begin_ == after_arrays) ? nullptr : bss_begin_;
  uint8_t* dex_cache_arrays_end =
      (bss_begin_ == after_arrays) ? nullptr : (after_arrays != nullptr) ? after_arrays : bss_end_;
  DCHECK_EQ(dex_cache_arrays != nullptr, dex_cache_arrays_end != nullptr);
  uint32_t dex_file_count = GetOatHeader().GetDexFileCount();  // Number of dex files.
  oat_dex_files_storage_.reserve(dex_file_count);
  for (size_t i = 0; i < dex_file_count; i++) {
    // Each ReadOatDexFileData call reads one field and advances the oat pointer.
    uint32_t dex_file_location_size;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_location_size))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu truncated after dex file "
                                "location size",
                                GetLocation().c_str(), i);
      return false;
    }
    if (UNLIKELY(dex_file_location_size == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with empty location name",
                                GetLocation().c_str(), i);
      return false;
    }
    if (UNLIKELY(static_cast<size_t>(End() - oat) < dex_file_location_size)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with truncated dex file "
                                "location",
                                GetLocation().c_str(), i);
      return false;
    }
    const char* dex_file_location_data = reinterpret_cast<const char*>(oat);
    oat += dex_file_location_size;
    std::string dex_file_location = ResolveRelativeEncodedDexLocation(
        abs_dex_location,
        std::string(dex_file_location_data, dex_file_location_size));
    // Read dex_file_checksum.
    uint32_t dex_file_checksum;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_checksum))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated after "
                                "dex file checksum",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    // Read dex_file_offset.
    uint32_t dex_file_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after dex file offsets",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with zero dex "
                                "file offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset > DexSize())) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u > %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, DexSize());
      return false;
    }
    if (UNLIKELY(DexSize() - dex_file_offset < sizeof(DexFile::Header))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u of %zu but the size of dex file header is %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, DexSize(), sizeof(DexFile::Header));
      return false;
    }
    const uint8_t* dex_file_pointer = DexBegin() + dex_file_offset;
    if (UNLIKELY(!DexFile::IsMagicValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file magic '%s'",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    if (UNLIKELY(!DexFile::IsVersionValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file version '%s'",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    const DexFile::Header* header = reinterpret_cast<const DexFile::Header*>(dex_file_pointer);
    if (DexSize() - dex_file_offset < header->file_size_) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u and size %u truncated at %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, header->file_size_, DexSize());
      return false;
    }
    // Read class_offsets_offset.
    uint32_t class_offsets_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after class offsets offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(class_offsets_offset > Size()) ||
        UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "class offsets, offset %u of %zu, class defs %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                class_offsets_offset, Size(), header->class_defs_size_);
      return false;
    }
    if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned "
                                "class offsets, offset %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                class_offsets_offset);
      return false;
    }
    const uint32_t* class_offsets_pointer =
        reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset);
    // Read lookup_table_offset; the lookup table speeds up class lookup.
    uint32_t lookup_table_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after lookup table offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const uint8_t* lookup_table_data = lookup_table_offset != 0u
        ? Begin() + lookup_table_offset
        : nullptr;
    if (lookup_table_offset != 0u &&
        (UNLIKELY(lookup_table_offset > Size()) ||
         UNLIKELY(Size() - lookup_table_offset <
                  TypeLookupTable::RawDataLength(header->class_defs_size_)))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "type lookup table, offset %u of %zu, class defs %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                lookup_table_offset, Size(), header->class_defs_size_);
      return false;
    }
    // Read dex_layout_sections_offset.
    uint32_t dex_layout_sections_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after dex layout sections offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0
        ? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset)
        : nullptr;
    // Read method_bss_mapping_offset.
    uint32_t method_bss_mapping_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after method bss mapping offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const bool readable_method_bss_mapping_size =
        method_bss_mapping_offset != 0u &&
        method_bss_mapping_offset <= Size() &&
        IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) &&
        Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0);
    const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size
        ? reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset)
        : nullptr;
    if (method_bss_mapping_offset != 0u &&
        (UNLIKELY(method_bss_mapping == nullptr) ||
         UNLIKELY(method_bss_mapping->size() == 0u) ||
         UNLIKELY(Size() - method_bss_mapping_offset <
                  MethodBssMapping::ComputeSize(method_bss_mapping->size())))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or "
                                " truncated method bss mapping, offset %u of %zu, length %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                method_bss_mapping_offset, Size(),
                                method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u);
      return false;
    }
    if (kIsDebugBuild && method_bss_mapping != nullptr) {
      const MethodBssMappingEntry* prev_entry = nullptr;
      for (const MethodBssMappingEntry& entry : *method_bss_mapping) {
        CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size));
        CHECK_LT(entry.bss_offset, BssSize());
        CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size), entry.bss_offset);
        size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u;
        CHECK_LE(index_mask_span, entry.method_index);
        if (prev_entry != nullptr) {
          CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span);
        }
        prev_entry = &entry;
      }
      CHECK_LT(prev_entry->method_index,
               reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_);
    }
    uint8_t* current_dex_cache_arrays = nullptr;
    if (dex_cache_arrays != nullptr) {
      // All DexCache types except for CallSite have their instance counts in the
      // DexFile header. For CallSites, we need to read the info from the MapList.
      const DexFile::MapItem* call_sites_item = nullptr;
      // FindDexFileMapItem locates the call_sites_item entry in the map list.
      if (!FindDexFileMapItem(DexBegin(),
                              DexEnd(),
                              DexFile::MapItemType::kDexTypeCallSiteIdItem,
                              &call_sites_item)) {
        *error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map",
                                  GetLocation().c_str());
        return false;
      }
      size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_;
      DexCacheArraysLayout layout(pointer_size, *header, num_call_sites);
      if (layout.Size() != 0u) {
        if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) {
          *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with "
                                    "truncated dex cache arrays, %zu < %zu.",
                                    GetLocation().c_str(), i, dex_file_location.c_str(),
                                    static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays),
                                    layout.Size());
          return false;
        }
        current_dex_cache_arrays = dex_cache_arrays;
        dex_cache_arrays += layout.Size();
      }
    }
    std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str());
    // Create the OatDexFile and add it to the owning container.
    // It is built from what ReadOatDexFileData and FindDexFileMapItem returned above.
    OatDexFile* oat_dex_file = new OatDexFile(this,
                                              dex_file_location,
                                              canonical_location,
                                              dex_file_checksum,
                                              dex_file_pointer,
                                              lookup_table_data,
                                              method_bss_mapping,
                                              class_offsets_pointer,
                                              current_dex_cache_arrays,
                                              dex_layout_sections);
    oat_dex_files_storage_.push_back(oat_dex_file);
    // Add the location and canonical location (if different) to the oat_dex_files_ table.
    StringPiece key(oat_dex_file->GetDexFileLocation());
    oat_dex_files_.Put(key, oat_dex_file);
    if (canonical_location != dex_file_location) {
      StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation());
      oat_dex_files_.Put(canonical_key, oat_dex_file);
    }
  }
  if (dex_cache_arrays != dex_cache_arrays_end) {
    // We expect the bss section to be either empty (dex_cache_arrays and bss_end_
    // both null) or contain just the dex cache arrays and optionally some GC roots.
    *error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.",
                              GetLocation().c_str(),
                              static_cast<size_t>(bss_end_ - dex_cache_arrays));
    return false;
  }
  return true;
}
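The "read a field, then advance the oat pointer" pattern that Setup repeats can be sketched in isolation. This is not ART's ReadOatDexFileData; ReadValue, Record and ParseRecord are hypothetical names, and the record layout (a length-prefixed location followed by a checksum) is a cut-down version of the first fields Setup reads.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical bounds-checked reader: copies a T from *cursor and advances the cursor,
// or returns false without advancing when fewer than sizeof(T) bytes remain.
template <typename T>
bool ReadValue(const uint8_t** cursor, const uint8_t* end, T* value) {
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) {
    return false;
  }
  std::memcpy(value, *cursor, sizeof(T));  // memcpy tolerates unaligned source data.
  *cursor += sizeof(T);
  return true;
}

// Hypothetical record: uint32 location size, location bytes, uint32 checksum.
struct Record {
  std::string location;
  uint32_t checksum = 0;
};

bool ParseRecord(const uint8_t** cursor, const uint8_t* end, Record* out) {
  uint32_t location_size = 0;
  if (!ReadValue(cursor, end, &location_size)) {
    return false;  // Truncated before the size field.
  }
  if (location_size == 0 || static_cast<size_t>(end - *cursor) < location_size) {
    return false;  // Empty or truncated location.
  }
  out->location.assign(reinterpret_cast<const char*>(*cursor), location_size);
  *cursor += location_size;
  return ReadValue(cursor, end, &out->checksum);
}
```

Every read checks the remaining size first, which is why each field in Setup has its own "truncated after ..." error message.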
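There is one more way to open the file, ElfOatFile. As far as I can tell it uses ART's own ELF loader instead of dlopen, and the overall flow should be similar. I will analyze it when I have time.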
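Finally, to go over the flow once more, it is roughly:
PreLoad: walk all loaded ELF objects via their dl_phdr_info, count them, and store the count in shared_objects_before_.
LoadVdex: load the vdex file with VdexFile::Open; the vdex file also holds dex file data.
Load: call Dlopen to load the oat file and obtain dlopen_handle_.
ComputeFields: look up the symbol addresses with FindDynamicSymbolAddress, which fixes the bounds of the oat file in memory.
PreSetup: walk the loaded ELF objects again, find the one whose load segment contains begin_, and register each of its PT_LOAD segments through MemMap::MapDummy.
Setup: parse the oat file with ReadOatDexFileData and related functions, and assemble oat_dex_file.
With these steps, oat_dex_file is finally obtained from oat_file.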
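The overall shape of OpenOatFile, a fixed sequence of stages where any failure discards the half-built object, can be reduced to a small sketch. The classes below (FileBase, DlOpenLike) are hypothetical and keep only three stages; they only illustrate the unique_ptr plus release() pattern, not the real ART classes.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical miniature of the staged open.
class FileBase {
 public:
  virtual ~FileBase() = default;

  // Runs the stages in order; the object reaches the caller only if all of them succeed.
  template <typename SubType>
  static FileBase* Open(bool fail_load, std::string* error_msg) {
    std::unique_ptr<FileBase> ret(new SubType());
    ret->PreLoad();
    if (!ret->Load(fail_load, error_msg)) {
      return nullptr;  // unique_ptr destroys the partially set up object.
    }
    if (!ret->Setup(error_msg)) {
      return nullptr;
    }
    return ret.release();  // Ownership passes to the caller.
  }

  const std::vector<std::string>& stages() const { return stages_; }

 protected:
  virtual void PreLoad() = 0;
  virtual bool Load(bool fail, std::string* error_msg) = 0;
  bool Setup(std::string* /* error_msg */) {
    stages_.push_back("Setup");
    return true;
  }
  std::vector<std::string> stages_;  // Records which stages ran, for illustration only.
};

// Hypothetical subtype standing in for DlOpenOatFile or ElfOatFile.
class DlOpenLike : public FileBase {
 protected:
  void PreLoad() override { stages_.push_back("PreLoad"); }
  bool Load(bool fail, std::string* error_msg) override {
    if (fail) {
      *error_msg = "load failed";
      return false;
    }
    stages_.push_back("Load");
    return true;
  }
};
```

The subtype only changes how individual stages behave; the order of the stages lives in one place, which is what lets DlOpenOatFile and ElfOatFile share OpenOatFile.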
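Since there are parts I have not fully figured out myself, some mistakes are inevitable, and some statements are not worded well; I am an outsider to this field and not much of a writer. The overall flow should be right, though. Please point out my errors so that I can correct them soon.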
Reference: Lao Luo's "Android Journey" https://www.kancloud.cn/alex_wsc/androids/473622
void DlOpenOatFile::PreLoad() { #ifdef __APPLE__ UNUSED(shared_objects_before_); LOG(FATAL) << "Should not reach here."; UNREACHABLE(); #else // Count the entries in dl_iterate_phdr we get at this point in time.//遍历所有elf的phdr struct dl_iterate_context { static int callback(struct dl_phdr_info *info ATTRIBUTE_UNUSED, size_t size ATTRIBUTE_UNUSED, void *data) { // struct dl_phdr_info { // ElfW(Addr) dlpi_addr; /* Base address of object */ // const char *dlpi_name; /* (Null-terminated) name of // object */ // const ElfW(Phdr) *dlpi_phdr; /* Pointer to array of // ELF program headers // for this object */ // ElfW(Half) dlpi_phnum; /* # of items in dlpi_phdr */ // } reinterpret_cast<dl_iterate_context*>(data)->count++;//每次循环count自增 return 0; // Continue iteration. } size_t count = 0; } context; dl_iterate_phdr(dl_iterate_context::callback, &context);//遍历所有elf对象获得dl_phdr_info并调用callback,这里的callback就是count自增1 shared_objects_before_ = context.count; //把count最终值存储到shared_objects_before_ #endif }
3.然后 LoadVdex,最终调用了 VdexFile::Open,这里的vdex是8.0以后的新变化,原先存储在oat里的dexfile现在似乎被quickene后放在在vdex里,组合oat_file和vdex_才能获得完整的oat_dex_file
bool OatFileBase::LoadVdex(const std::string& vdex_filename, bool writable, bool low_4gb, std::string* error_msg) { vdex_ = VdexFile::Open(vdex_filename, writable, low_4gb, /* unquicken*/ false, error_msg);//打开并获得vdex_ if (vdex_.get() == nullptr) { *error_msg = StringPrintf("Failed to load vdex file '%s' %s", vdex_filename.c_str(), error_msg->c_str()); return false; } return true; }
vdex简单结构
vdex_file.h,包含dex_files和QuickeningInfo
// File format: // VdexFile::Header fixed-length header // // DEX[0] array of the input DEX files // DEX[1] the bytecode may have been quickened // ... // DEX[D] // QuickeningInfo // uint8[] quickening data // unaligned_uint32_t[2][] table of offsets pair: // uint32_t[0] contains code_item_offset // uint32_t[1] contains quickening data offset from the start // of QuickeningInfo // unalgined_uint32_t[D] start offsets (from the start of QuickeningInfo) in previous // table for each dex file
4.下面是 Load 函数,最终调用了Dlopen加载oat,获得dlopen_handle_
bool DlOpenOatFile::Load(const std::string& elf_filename, uint8_t* oat_file_begin, bool writable, bool executable, bool low_4gb, std::string* error_msg) { // Use dlopen only when flagged to do so, and when it's OK to load things executable. // TODO: Also try when not executable? The issue here could be re-mapping as writable (as // !executable is a sign that we may want to patch), which may not be allowed for // various reasons. if (!kUseDlopen) { *error_msg = "DlOpen is disabled."; return false; } if (low_4gb) { *error_msg = "DlOpen does not support low 4gb loading."; return false; } if (writable) { *error_msg = "DlOpen does not support writable loading."; return false; } if (!executable) { *error_msg = "DlOpen does not support non-executable loading."; return false; } // dlopen always returns the same library if it is already opened on the host. For this reason // we only use dlopen if we are the target or we do not already have the dex file opened. Having // the same library loaded multiple times at different addresses is required for class unloading // and for having dex caches arrays in the .bss section. if (!kIsTargetBuild) { if (!kUseDlopenOnHost) { *error_msg = "DlOpen disabled for host."; return false; } } bool success = Dlopen(elf_filename, oat_file_begin, error_msg);//调用Dlopen加载oat,获得dlopen_handle_ DCHECK(dlopen_handle_ != nullptr || !success); return success; }
看一下Dlopen,最终调用了android_dlopen_ext或者dlopen
bool DlOpenOatFile::Dlopen(const std::string& elf_filename, uint8_t* oat_file_begin, std::string* error_msg) { #ifdef __APPLE__ // The dl_iterate_phdr syscall is missing. There is similar API on OSX, // but let's fallback to the custom loading code for the time being. UNUSED(elf_filename, oat_file_begin); *error_msg = "Dlopen unsupported on Mac."; return false; #else { UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr)); if (absolute_path == nullptr) { *error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str()); return false; } #ifdef ART_TARGET_ANDROID android_dlextinfo extinfo = {}; // typedef struct { // uint64_t flags; // void* reserved_addr; // size_t reserved_size; // int relro_fd; // int library_fd; // } android_dlextinfo; extinfo.flags = ANDROID_DLEXT_FORCE_LOAD | // Force-load, don't reuse handle // (open oat files multiple // times). ANDROID_DLEXT_FORCE_FIXED_VADDR; // Take a non-zero vaddr as absolute // (non-pic boot image). if (oat_file_begin != nullptr) { // extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS; // Use the requested addr if extinfo.reserved_addr = oat_file_begin; // vaddr = 0. } // (pic boot image). 
dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);//这里oat_file_begin不为空如果调用android_dlopen_ext打开获得dlopen_handle_,在/bionic/libdl/libdl.c里 #else UNUSED(oat_file_begin); static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles"); MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_); dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);//如果没有oat_file_begin,直接调用dlopen从路径加载获得dlopen_handle_ if (dlopen_handle_ != nullptr) { if (!host_dlopen_handles_.insert(dlopen_handle_).second) {//把dlopen_handle_插入host_dlopen_handles_中 dlclose(dlopen_handle_); dlopen_handle_ = nullptr; *error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str()); return false; } } #endif // ART_TARGET_ANDROID } if (dlopen_handle_ == nullptr) { *error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror()); return false; } return true; #endif }
5.下面是 ComputeFields,它从begin开始,调用FindDynamicSymbolAddress定位各种符号地址oatdata,oatlastword,oatbss,oatbsslastword,oatbssmethods,oatbssroots,其中
oatdata,oatlastword定位了begin_和end_
bool OatFileBase::ComputeFields(uint8_t* requested_base, const std::string& file_path, std::string* error_msg) {//这个函数从begin开始,定位各种符号地址oatdata,oatlastword,oatbss,oatbsslastword,oatbssmethods,oatbssroots std::string symbol_error_msg; begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg); if (begin_ == nullptr) { *error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s", file_path.c_str(), symbol_error_msg.c_str()); return false; } if (requested_base != nullptr && begin_ != requested_base) { // Host can fail this check. Do not dump there to avoid polluting the output. if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) { PrintFileToLog("/proc/self/maps", LogSeverity::WARNING); } *error_msg = StringPrintf("Failed to find oatdata symbol at expected address: " "oatdata=%p != expected=%p. See process maps in the log.", begin_, requested_base); return false; } end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg); if (end_ == nullptr) { *error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s", file_path.c_str(), symbol_error_msg.c_str()); return false; } // Readjust to be non-inclusive upper bound. end_ += sizeof(uint32_t); bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg)); if (bss_begin_ == nullptr) { // No .bss section. bss_end_ = nullptr; } else { bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg)); if (bss_end_ == nullptr) { *error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str()); return false; } // Readjust to be non-inclusive upper bound. bss_end_ += sizeof(uint32_t); // Find bss methods if present. bss_methods_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg)); // Find bss roots if present. bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));//root跟gc有关 } return true; }
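6. Next is PreSetup. The main thing to understand here is how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr walks every ELF object currently loaded in the process and calls callback once per object, passing that object's dl_phdr_info.
Compared with the struct used in PreLoad above, this dl_iterate_context has several more fields:
begin_ comes from Begin(), i.e. it is the oat_file's begin_;
shared_objects_before is the number of loaded ELF objects that PreLoad counted with its own dl_iterate_phdr walk, before the oat file was opened;
shared_objects_seen counts the ELF objects visited so far in the current walk;
dlopen_mmaps_ is a vector holding the MemMap pointers that MapDummy creates for each loadable segment of the oat_file.
So PreSetup roughly does the following:
Declare a dl_iterate_context.
Walk the loaded ELF objects with dl_iterate_phdr. Each callback increments shared_objects_seen. While shared_objects_seen is less than shared_objects_before, the object was already loaded before our dlopen, so the callback returns 0 and the walk moves on. Once the counter reaches shared_objects_before, the logic below runs.
Loop over the dlpi_phnum program headers of the object. For each PT_LOAD segment, compute its runtime address as dlpi_addr + dlpi_phdr[i].p_vaddr and take its size from dlpi_phdr[i].p_memsz. If address <= begin_ < address + size, set contains_begin = true and break: this object is the oat_file that was just loaded.
If contains_begin is set, loop over dlpi_phdr again and call MemMap::MapDummy with the vaddr and memsz of every PT_LOAD segment. The linker has already mapped these segments; MapDummy only creates placeholder MemMap objects so that ART's mmap wrapper knows about the regions. The callback then returns 1, which stops the walk.
To be honest I have not fully understood this function. Corrections are welcome, and I will study it more carefully when I find the time.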
void DlOpenOatFile::PreSetup(const std::string& elf_filename) {
  // Ask the linker where it mmaped the file and notify our mmap wrapper of the regions.
#ifdef __APPLE__
  UNUSED(elf_filename);
  LOG(FATAL) << "Should not reach here.";
  UNREACHABLE();
#else
  struct dl_iterate_context {
    static int callback(struct dl_phdr_info *info, size_t /* size */, void *data) {
      /* struct dl_phdr_info {
           ElfW(Addr) dlpi_addr;
           const char* dlpi_name;
           const ElfW(Phdr)* dlpi_phdr;
           ElfW(Half) dlpi_phnum;
         } */
      auto* context = reinterpret_cast<dl_iterate_context*>(data);
      // shared_objects_seen is incremented here and compared with shared_objects_before.
      context->shared_objects_seen++;
      if (context->shared_objects_seen < context->shared_objects_before) {
        // While shared_objects_seen is below shared_objects_before, this object was
        // already loaded before our dlopen, so skip it. If another thread unloaded
        // an ELF object in the meantime, this can go wrong.
        // We haven't been called yet for anything we haven't seen before. Just continue.
        // Note: this is aggressively optimistic. If another thread was unloading a library,
        // we may miss out here. However, this does not happen often in practice.
        return 0;
      }

      // See whether this callback corresponds to the file which we have just loaded.
      // Scan the PT_LOAD segments until one contains begin_ (the value returned by
      // Begin(), i.e. the oat_file's begin_).
      bool contains_begin = false;
      for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD) {
          uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                                                      info->dlpi_phdr[i].p_vaddr);
          size_t memsz = info->dlpi_phdr[i].p_memsz;
          if (vaddr <= context->begin_ && context->begin_ < vaddr + memsz) {
            contains_begin = true;
            break;
          }
        }
      }
      // Add dummy mmaps for this file.
      if (contains_begin) {
        // Walk dlpi_phdr again and, for every PT_LOAD segment, create a dummy MemMap
        // from its vaddr and memsz. The segment itself is already mapped by the linker.
        for (int i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type == PT_LOAD) {
            uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                                                        info->dlpi_phdr[i].p_vaddr);
            size_t memsz = info->dlpi_phdr[i].p_memsz;
            MemMap* mmap = MemMap::MapDummy(info->dlpi_name, vaddr, memsz);
            // Store the new MemMap in dlopen_mmaps_.
            context->dlopen_mmaps_->push_back(std::unique_ptr<MemMap>(mmap));
          }
        }
        return 1;  // Stop iteration and return 1 from dl_iterate_phdr.
      }
      return 0;  // Continue iteration and return 0 from dl_iterate_phdr when finished.
    }
    const uint8_t* const begin_;  // from Begin(), i.e. the oat_file's begin_
    std::vector<std::unique_ptr<MemMap>>* const dlopen_mmaps_;
    const size_t shared_objects_before;  // ELF objects counted by PreLoad's dl_iterate_phdr walk
    size_t shared_objects_seen;          // ELF objects visited so far in this walk
  };  // end of struct dl_iterate_context
  dl_iterate_context context = { Begin(), &dlopen_mmaps_, shared_objects_before_, 0};
  // The callback registers a MemMap for each segment of the oat_file.
  if (dl_iterate_phdr(dl_iterate_context::callback, &context) == 0) {
    // Hm. Maybe our optimization went wrong. Try another time with shared_objects_before == 0
    // before giving up. This should be unusual.
    VLOG(oat) << "Need a second run in PreSetup, didn't find with shared_objects_before="
              << shared_objects_before_;
    dl_iterate_context context0 = { Begin(), &dlopen_mmaps_, 0, 0};
    if (dl_iterate_phdr(dl_iterate_context::callback, &context0) == 0) {
      // OK, give up and print an error.
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
      LOG(ERROR) << "File " << elf_filename << " loaded with dlopen but cannot find its mmaps.";
    }
  }
#endif
}
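To get a feel for the callback protocol outside ART, the following sketch uses dl_iterate_phdr to find the loaded ELF object whose PT_LOAD segment covers a given address, which is the same test PreSetup applies to begin_. It assumes Linux or Android, where link.h provides dl_iterate_phdr. FindContext, FindCallback and FindObjectContaining are illustrative names, not ART code, and the sketch leaves out the shared_objects_before shortcut.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // glibc declares dl_iterate_phdr only with this defined
#endif
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, ElfW, PT_LOAD

#include <cassert>
#include <cstddef>
#include <cstdint>

// State shared with the callback (hypothetical, for illustration only).
struct FindContext {
  uintptr_t target;      // address we are looking for
  size_t objects_seen;   // ELF objects visited so far
  size_t load_segments;  // PT_LOAD count of the matching object
  bool found;
};

// Called once per loaded ELF object. A non-zero return value stops the walk
// and becomes the return value of dl_iterate_phdr.
static int FindCallback(struct dl_phdr_info* info, size_t /* size */, void* data) {
  auto* ctx = static_cast<FindContext*>(data);
  ctx->objects_seen++;
  bool contains = false;
  size_t loads = 0;
  for (size_t i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) {
      continue;
    }
    loads++;
    // Runtime address of a segment = load bias (dlpi_addr) + p_vaddr.
    uintptr_t start = info->dlpi_addr + phdr.p_vaddr;
    if (start <= ctx->target && ctx->target - start < phdr.p_memsz) {
      contains = true;
    }
  }
  if (!contains) {
    return 0;  // keep walking
  }
  ctx->found = true;
  ctx->load_segments = loads;
  return 1;  // stop the walk
}

// Returns true if some loaded ELF object has a PT_LOAD segment covering addr.
inline bool FindObjectContaining(const void* addr, FindContext* out) {
  assert(out != nullptr);
  *out = FindContext{reinterpret_cast<uintptr_t>(addr), 0, 0, false};
  return dl_iterate_phdr(FindCallback, out) != 0;
}
```

Passing the address of any static object in the program should find the main executable. PreSetup does the same with begin_, then records each PT_LOAD segment of the matching object through MemMap::MapDummy.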
7. Next comes OatFileBase::Setup. It uses ReadOatDexFileData to walk the oat_file loaded above and build the oat_dex_file objects from which dex_file is later obtained. The resulting oat_file data structure combines information from both the oat file and the vdex file.
Setup
{
    GetOatHeader
    GetInstructionSetPointerSize
    GetOatDexFilesOffset                          // offset of the OatDexFile records
    GetDexFileCount
    ReadOatDexFileData&dex_file_location_size
    ResolveRelativeEncodedDexLocation
    ReadOatDexFileData&dex_file_checksum
    ReadOatDexFileData&dex_file_offset
    ReadOatDexFileData&class_offsets_offset
    ReadOatDexFileData&lookup_table_offset        // speeds up class lookup
    ReadOatDexFileData&dex_layout_sections_offset
    ReadOatDexFileData&method_bss_mapping_offset
    FindDexFileMapItem&call_sites_item            // call site identifiers
    new OatDexFile                                // built from the fields above so that GetBestOatFile can get it later
}
The source is fairly easy to follow. The key things to track are ReadOatDexFileData and how the oat pointer advances; the most important step is creating the oat_dex_file at the end.
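The "read a field, then advance the pointer" pattern that Setup relies on can be sketched as below. This is not ART's ReadOatDexFileData; ReadAndAdvance is a hypothetical stand-in that takes an explicit end pointer instead of an OatFile, and it uses memcpy as one portable way to handle unaligned data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reads one trivially copyable value at *cursor and advances the cursor.
// Returns false, leaving *cursor and *value untouched, if fewer than
// sizeof(T) bytes remain before `end`.
template <typename T>
bool ReadAndAdvance(const uint8_t** cursor, const uint8_t* end, T* value) {
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
  assert(cursor != nullptr && *cursor != nullptr && value != nullptr);
  assert(*cursor <= end);
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) {
    return false;  // truncated input
  }
  std::memcpy(value, *cursor, sizeof(T));  // memcpy tolerates unaligned sources
  *cursor += sizeof(T);
  return true;
}
```

Each successful call moves the cursor past the field it just read, so a sequence of calls consumes a record field by field, and a false return pinpoints the field at which the data ran out. That is why every ReadOatDexFileData call in Setup has its own "truncated after ..." error message.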
bool OatFileBase::Setup(const char* abs_dex_location, std::string* error_msg) {
  if (!GetOatHeader().IsValid()) {
    std::string cause = GetOatHeader().GetValidationErrorMessage();
    *error_msg = StringPrintf("Invalid oat header for '%s': %s",
                              GetLocation().c_str(),
                              cause.c_str());
    return false;
  }
  PointerSize pointer_size = GetInstructionSetPointerSize(GetOatHeader().GetInstructionSet());
  size_t key_value_store_size =
      (Size() >= sizeof(OatHeader)) ? GetOatHeader().GetKeyValueStoreSize() : 0u;
  if (Size() < sizeof(OatHeader) + key_value_store_size) {
    *error_msg = StringPrintf("In oat file '%s' found truncated OatHeader, "
                              "size = %zu < %zu + %zu",
                              GetLocation().c_str(),
                              Size(),
                              sizeof(OatHeader),
                              key_value_store_size);
    return false;
  }

  size_t oat_dex_files_offset = GetOatHeader().GetOatDexFilesOffset();
  if (oat_dex_files_offset < GetOatHeader().GetHeaderSize() || oat_dex_files_offset > Size()) {
    *error_msg = StringPrintf("In oat file '%s' found invalid oat dex files offset: "
                              "%zu is not in [%zu, %zu]",
                              GetLocation().c_str(),
                              oat_dex_files_offset,
                              GetOatHeader().GetHeaderSize(),
                              Size());
    return false;
  }
  // Jump to the OatDexFile records: the oat pointer now points at the first one.
  const uint8_t* oat = Begin() + oat_dex_files_offset;

  DCHECK_GE(static_cast<size_t>(pointer_size), alignof(GcRoot<mirror::Object>));
  if (!IsAligned<kPageSize>(bss_begin_) ||
      !IsAlignedParam(bss_methods_, static_cast<size_t>(pointer_size)) ||
      !IsAlignedParam(bss_roots_, static_cast<size_t>(pointer_size)) ||
      !IsAligned<alignof(GcRoot<mirror::Object>)>(bss_end_)) {
    *error_msg = StringPrintf("In oat file '%s' found unaligned bss symbol(s): "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(),
                              bss_begin_,
                              bss_methods_,
                              bss_roots_,
                              bss_end_);
    return false;
  }

  if ((bss_methods_ != nullptr && (bss_methods_ < bss_begin_ || bss_methods_ > bss_end_)) ||
      (bss_roots_ != nullptr && (bss_roots_ < bss_begin_ || bss_roots_ > bss_end_)) ||
      (bss_methods_ != nullptr && bss_roots_ != nullptr && bss_methods_ > bss_roots_)) {
    *error_msg = StringPrintf("In oat file '%s' found bss symbol(s) outside .bss or unordered: "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(),
                              bss_begin_,
                              bss_methods_,
                              bss_roots_,
                              bss_end_);
    return false;
  }

  uint8_t* after_arrays = (bss_methods_ != nullptr) ? bss_methods_ : bss_roots_;  // May be null.
  uint8_t* dex_cache_arrays = (bss_begin_ == after_arrays) ? nullptr : bss_begin_;
  uint8_t* dex_cache_arrays_end =
      (bss_begin_ == after_arrays) ? nullptr : (after_arrays != nullptr) ? after_arrays : bss_end_;
  DCHECK_EQ(dex_cache_arrays != nullptr, dex_cache_arrays_end != nullptr);
  uint32_t dex_file_count = GetOatHeader().GetDexFileCount();  // number of dex files
  oat_dex_files_storage_.reserve(dex_file_count);
  for (size_t i = 0; i < dex_file_count; i++) {
    // Each iteration reads one OatDexFile record. Every ReadOatDexFileData call
    // reads one field and advances the oat pointer. First: dex_file_location_size.
    uint32_t dex_file_location_size;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_location_size))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu truncated after dex file "
                                "location size",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    if (UNLIKELY(dex_file_location_size == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with empty location name",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    if (UNLIKELY(static_cast<size_t>(End() - oat) < dex_file_location_size)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with truncated dex file "
                                "location",
                                GetLocation().c_str(),
                                i);
      return false;
    }
    const char* dex_file_location_data = reinterpret_cast<const char*>(oat);
    oat += dex_file_location_size;

    std::string dex_file_location = ResolveRelativeEncodedDexLocation(
        abs_dex_location,
        std::string(dex_file_location_data, dex_file_location_size));

    // Read dex_file_checksum and advance the oat pointer.
    uint32_t dex_file_checksum;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_checksum))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated after "
                                "dex file checksum",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }

    // Read dex_file_offset and advance the oat pointer.
    uint32_t dex_file_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after dex file offsets",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with zero dex "
                                "file offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset > DexSize())) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u > %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                DexSize());
      return false;
    }
    if (UNLIKELY(DexSize() - dex_file_offset < sizeof(DexFile::Header))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u of %zu but the size of dex file header is %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                DexSize(),
                                sizeof(DexFile::Header));
      return false;
    }

    const uint8_t* dex_file_pointer = DexBegin() + dex_file_offset;
    if (UNLIKELY(!DexFile::IsMagicValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file magic '%s'",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    if (UNLIKELY(!DexFile::IsVersionValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file version '%s'",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    const DexFile::Header* header = reinterpret_cast<const DexFile::Header*>(dex_file_pointer);
    if (DexSize() - dex_file_offset < header->file_size_) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u and size %u truncated at %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                dex_file_offset,
                                header->file_size_,
                                DexSize());
      return false;
    }

    // Read class_offsets_offset and advance the oat pointer.
    uint32_t class_offsets_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after class offsets offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(class_offsets_offset > Size()) ||
        UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "class offsets, offset %u of %zu, class defs %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                class_offsets_offset,
                                Size(),
                                header->class_defs_size_);
      return false;
    }
    if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned "
                                "class offsets, offset %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                class_offsets_offset);
      return false;
    }
    const uint32_t* class_offsets_pointer =
        reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset);

    // Read lookup_table_offset and advance the oat pointer.
    // The lookup table speeds up class lookup.
    uint32_t lookup_table_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after lookup table offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const uint8_t* lookup_table_data = lookup_table_offset != 0u
        ? Begin() + lookup_table_offset
        : nullptr;
    if (lookup_table_offset != 0u &&
        (UNLIKELY(lookup_table_offset > Size()) ||
         UNLIKELY(Size() - lookup_table_offset <
                  TypeLookupTable::RawDataLength(header->class_defs_size_)))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "type lookup table, offset %u of %zu, class defs %u",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                lookup_table_offset,
                                Size(),
                                header->class_defs_size_);
      return false;
    }

    // Read dex_layout_sections_offset and advance the oat pointer.
    uint32_t dex_layout_sections_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after dex layout sections offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0
        ? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset)
        : nullptr;

    // Read method_bss_mapping_offset and advance the oat pointer.
    uint32_t method_bss_mapping_offset;
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after method bss mapping offset",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str());
      return false;
    }
    const bool readable_method_bss_mapping_size =
        method_bss_mapping_offset != 0u &&
        method_bss_mapping_offset <= Size() &&
        IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) &&
        Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0);
    const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size
        ? reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset)
        : nullptr;
    if (method_bss_mapping_offset != 0u &&
        (UNLIKELY(method_bss_mapping == nullptr) ||
         UNLIKELY(method_bss_mapping->size() == 0u) ||
         UNLIKELY(Size() - method_bss_mapping_offset <
                  MethodBssMapping::ComputeSize(method_bss_mapping->size())))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or "
                                " truncated method bss mapping, offset %u of %zu, length %zu",
                                GetLocation().c_str(),
                                i,
                                dex_file_location.c_str(),
                                method_bss_mapping_offset,
                                Size(),
                                method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u);
      return false;
    }
    if (kIsDebugBuild && method_bss_mapping != nullptr) {
      const MethodBssMappingEntry* prev_entry = nullptr;
      for (const MethodBssMappingEntry& entry : *method_bss_mapping) {
        CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size));
        CHECK_LT(entry.bss_offset, BssSize());
        CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size), entry.bss_offset);
        size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u;
        CHECK_LE(index_mask_span, entry.method_index);
        if (prev_entry != nullptr) {
          CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span);
        }
        prev_entry = &entry;
      }
      CHECK_LT(prev_entry->method_index,
               reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_);
    }

    uint8_t* current_dex_cache_arrays = nullptr;
    if (dex_cache_arrays != nullptr) {
      // All DexCache types except for CallSite have their instance counts in the
      // DexFile header. For CallSites, we need to read the info from the MapList.
      const DexFile::MapItem* call_sites_item = nullptr;
      // Read and parse call_sites_item through FindDexFileMapItem.
      if (!FindDexFileMapItem(DexBegin(),
                              DexEnd(),
                              DexFile::MapItemType::kDexTypeCallSiteIdItem,
                              &call_sites_item)) {
        *error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map",
                                  GetLocation().c_str());
        return false;
      }
      size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_;
      DexCacheArraysLayout layout(pointer_size, *header, num_call_sites);
      if (layout.Size() != 0u) {
        if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) {
          *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with "
                                    "truncated dex cache arrays, %zu < %zu.",
                                    GetLocation().c_str(),
                                    i,
                                    dex_file_location.c_str(),
                                    static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays),
                                    layout.Size());
          return false;
        }
        current_dex_cache_arrays = dex_cache_arrays;
        dex_cache_arrays += layout.Size();
      }
    }

    std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str());

    // Create the OatDexFile and add it to the owning container.
    // It is built from the values obtained above through ReadOatDexFileData
    // and FindDexFileMapItem.
    OatDexFile* oat_dex_file = new OatDexFile(this,
                                              dex_file_location,
                                              canonical_location,
                                              dex_file_checksum,
                                              dex_file_pointer,
                                              lookup_table_data,
                                              method_bss_mapping,
                                              class_offsets_pointer,
                                              current_dex_cache_arrays,
                                              dex_layout_sections);
    oat_dex_files_storage_.push_back(oat_dex_file);

    // Add the location and canonical location (if different) to the oat_dex_files_ table.
    StringPiece key(oat_dex_file->GetDexFileLocation());
    oat_dex_files_.Put(key, oat_dex_file);
    if (canonical_location != dex_file_location) {
      StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation());
      oat_dex_files_.Put(canonical_key, oat_dex_file);
    }
  }

  if (dex_cache_arrays != dex_cache_arrays_end) {
    // We expect the bss section to be either empty (dex_cache_arrays and bss_end_
    // both null) or contain just the dex cache arrays and optionally some GC roots.
    *error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.",
                              GetLocation().c_str(),
                              static_cast<size_t>(bss_end_ - dex_cache_arrays));
    return false;
  }
  return true;
}
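Many checks in Setup are written as "offset > Size()" followed by "Size() - offset < needed" rather than "offset + needed > Size()". The first form cannot overflow, which matters because the offsets come from an untrusted file. A minimal sketch of the same pattern follows; ArrayFitsInFile is a made-up helper name, not an ART function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `count` elements of `elem_size` bytes starting at `offset` fit
// inside a file of `file_size` bytes. Uses only a subtraction and a
// division, so no intermediate result can wrap around.
inline bool ArrayFitsInFile(size_t file_size, uint32_t offset, uint32_t count, size_t elem_size) {
  assert(elem_size != 0);
  if (offset > file_size) {
    return false;  // the start is already past the end of the file
  }
  return (file_size - offset) / elem_size >= count;
}
```

This mirrors the class_offsets_offset check, where elem_size corresponds to sizeof(uint32_t) and count to header->class_defs_size_. A naive offset + count * elem_size computed in 32 bits could wrap around for a huge count and wrongly pass.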
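There is another way to open the file, ElfOatFile. It goes through ART's own ELF loader (ElfFile) rather than the system linker's dlopen, and the overall flow should be similar. I will analyse it when I have time.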
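Finally, a recap of the flow. It goes roughly as follows:
PreLoad: walk all loaded ELF objects via their dl_phdr_info, count them, and store the count in shared_objects_before_.
LoadVdex: load the vdex file through VdexFile::Open. The vdex also stores dex file information.
Load: call Dlopen to load the oat_file and obtain dlopen_handle_.
ComputeFields: use FindDynamicSymbolAddress to locate the symbol addresses, which fixes the range the oat_file occupies in memory.
PreSetup: walk the loaded ELF objects again, skip those already counted in PreLoad, find the object whose PT_LOAD segment contains begin_, and register a dummy MemMap for each of its PT_LOAD segments.
Setup: parse the oat_file with ReadOatDexFileData and related functions and assemble the oat_dex_file objects.
After these steps, the oat_dex_file has been obtained from the oat_file.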
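The recap is a chain of stages in which a failing stage aborts the whole open and leaves a message in error_msg. The sketch below illustrates that control flow under simplified assumptions: every stage is modelled as a callable returning bool, whereas in ART PreLoad and PreSetup return void. Stage and RunStages are illustrative names only.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// One named stage; returns false and fills *error_msg on failure.
using Stage = std::pair<std::string, std::function<bool(std::string*)>>;

// Runs the stages in order and stops at the first failure, prefixing the
// error message with the name of the stage that failed.
inline bool RunStages(const std::vector<Stage>& stages, std::string* error_msg) {
  assert(error_msg != nullptr);
  for (const Stage& stage : stages) {
    if (!stage.second(error_msg)) {
      *error_msg = stage.first + " failed: " + *error_msg;
      return false;
    }
  }
  return true;
}
```

With stages named LoadVdex, Load and Setup, a failure in Load means Setup never runs, just as OpenOatFile returns nullptr as soon as one step fails.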
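I have not fully understood some parts myself, so there are bound to be mistakes, and some sentences are not phrased well; I am an outsider to this field and not much of a writer. The overall flow should be right, though. Please point out any problems so that I can fix them.
Reference: Lao Luo's (老罗) Android Journey https://www.kancloud.cn/alex_wsc/androids/473622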
file '%s' found OatDexFile #%zu for '%s' with dex file " "offset %u and size %u truncated at %zu", GetLocation().c_str(), i, dex_file_location.c_str(), dex_file_offset, header->file_size_, DexSize()); return false; } uint32_t class_offsets_offset; if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {//通过ReadOatDexFileData函数读取class_offsets_offset并调整oat指针 *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated " "after class offsets offset", GetLocation().c_str(), i, dex_file_location.c_str()); return false; } if (UNLIKELY(class_offsets_offset > Size()) || UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) { *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated " "class offsets, offset %u of %zu, class defs %u", GetLocation().c_str(), i, dex_file_location.c_str(), class_offsets_offset, Size(), header->class_defs_size_); return false; } if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) { *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned " "class offsets, offset %u", GetLocation().c_str(), i, dex_file_location.c_str(), class_offsets_offset); return false; } const uint32_t* class_offsets_pointer = reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset); uint32_t lookup_table_offset; if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {//通过ReadOatDexFileData函数读取lookup_table_offset并调整oat指针,lookup_table用于加速类的查找 *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated " "after lookup table offset", GetLocation().c_str(), i, dex_file_location.c_str()); return false; } const uint8_t* lookup_table_data = lookup_table_offset != 0u ? 
Begin() + lookup_table_offset : nullptr; if (lookup_table_offset != 0u && (UNLIKELY(lookup_table_offset > Size()) || UNLIKELY(Size() - lookup_table_offset < TypeLookupTable::RawDataLength(header->class_defs_size_)))) { *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated " "type lookup table, offset %u of %zu, class defs %u", GetLocation().c_str(), i, dex_file_location.c_str(), lookup_table_offset, Size(), header->class_defs_size_); return false; } uint32_t dex_layout_sections_offset; if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {//通过ReadOatDexFileData函数读取dex_layout_sections_offset并调整oat指针 *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated " "after dex layout sections offset", GetLocation().c_str(), i, dex_file_location.c_str()); return false; } const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0 ? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset) : nullptr; uint32_t method_bss_mapping_offset; if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {//通过ReadOatDexFileData函数读取method_bss_mapping_offset并调整oat指针 *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated " "after method bss mapping offset", GetLocation().c_str(), i, dex_file_location.c_str()); return false; } const bool readable_method_bss_mapping_size = method_bss_mapping_offset != 0u && method_bss_mapping_offset <= Size() && IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) && Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0); const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size ? 
reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset) : nullptr; if (method_bss_mapping_offset != 0u && (UNLIKELY(method_bss_mapping == nullptr) || UNLIKELY(method_bss_mapping->size() == 0u) || UNLIKELY(Size() - method_bss_mapping_offset < MethodBssMapping::ComputeSize(method_bss_mapping->size())))) { *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or " " truncated method bss mapping, offset %u of %zu, length %zu", GetLocation().c_str(), i, dex_file_location.c_str(), method_bss_mapping_offset, Size(), method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u); return false; } if (kIsDebugBuild && method_bss_mapping != nullptr) { const MethodBssMappingEntry* prev_entry = nullptr; for (const MethodBssMappingEntry& entry : *method_bss_mapping) { CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size)); CHECK_LT(entry.bss_offset, BssSize()); CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size), entry.bss_offset); size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u; CHECK_LE(index_mask_span, entry.method_index); if (prev_entry != nullptr) { CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span); } prev_entry = &entry; } CHECK_LT(prev_entry->method_index, reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_); } uint8_t* current_dex_cache_arrays = nullptr; if (dex_cache_arrays != nullptr) { // All DexCache types except for CallSite have their instance counts in the // DexFile header. For CallSites, we need to read the info from the MapList. 
//对于CallSites,必须从MapList中读取,他不存储在header中 const DexFile::MapItem* call_sites_item = nullptr; if (!FindDexFileMapItem(DexBegin(), //通过FindDexFileMapItem读取call_sites_item并解析 DexEnd(), DexFile::MapItemType::kDexTypeCallSiteIdItem, &call_sites_item)) { *error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map", GetLocation().c_str()); return false; } size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_; DexCacheArraysLayout layout(pointer_size, *header, num_call_sites); if (layout.Size() != 0u) { if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) { *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with " "truncated dex cache arrays, %zu < %zu.", GetLocation().c_str(), i, dex_file_location.c_str(), static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays), layout.Size()); return false; } current_dex_cache_arrays = dex_cache_arrays; dex_cache_arrays += layout.Size(); } } std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str()); // Create the OatDexFile and add it to the owning container. OatDexFile* oat_dex_file = new OatDexFile(this, //根据上面ReadOatDexFileData和FindDexFileMapItem获得的信息构建oat_dex_file dex_file_location, canonical_location, dex_file_checksum, dex_file_pointer, lookup_table_data, method_bss_mapping, class_offsets_pointer, current_dex_cache_arrays, dex_layout_sections); oat_dex_files_storage_.push_back(oat_dex_file); // Add the location and canonical location (if different) to the oat_dex_files_ table. 
StringPiece key(oat_dex_file->GetDexFileLocation()); oat_dex_files_.Put(key, oat_dex_file); if (canonical_location != dex_file_location) { StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation()); oat_dex_files_.Put(canonical_key, oat_dex_file); } } if (dex_cache_arrays != dex_cache_arrays_end) { // We expect the bss section to be either empty (dex_cache_arrays and bss_end_ // both null) or contain just the dex cache arrays and optionally some GC roots. *error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.", GetLocation().c_str(), static_cast<size_t>(bss_end_ - dex_cache_arrays)); return false; } return true; }
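To make the pointer movement in Setup concrete, here is a minimal sketch of the read-then-advance pattern. ReadValue, ToyRecord and ParseToyRecord are names I made up for illustration; they are not ART's ReadOatDexFileData, and the toy record (length, name bytes, checksum) only imitates the first few fields of an OatDexFile record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper (not ART code): copy one T from *cursor if at least
// sizeof(T) bytes remain before `end`, then advance the cursor past it.
template <typename T>
bool ReadValue(const uint8_t* end, const uint8_t** cursor, T* value) {
  assert(*cursor <= end);  // Precondition: the cursor never runs past the end.
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) {
    return false;  // Truncated input: the cursor is left untouched.
  }
  std::memcpy(value, *cursor, sizeof(T));  // memcpy tolerates unaligned data.
  *cursor += sizeof(T);
  return true;
}

// Toy record layout (an assumption for this sketch):
// uint32 name_size, name_size bytes of name, uint32 checksum.
struct ToyRecord {
  std::string name;
  uint32_t checksum;
};

bool ParseToyRecord(const uint8_t* begin, size_t size, ToyRecord* out) {
  const uint8_t* cursor = begin;
  const uint8_t* end = begin + size;
  uint32_t name_size = 0;
  if (!ReadValue(end, &cursor, &name_size)) {
    return false;  // Truncated before the length field.
  }
  if (name_size == 0 || static_cast<size_t>(end - cursor) < name_size) {
    return false;  // Empty or truncated name.
  }
  out->name.assign(reinterpret_cast<const char*>(cursor), name_size);
  cursor += name_size;  // Variable-length data is skipped by hand, as Setup does.
  return ReadValue(end, &cursor, &out->checksum);
}
```

Every read checks the remaining size first and only then moves the cursor, which is why each failure in Setup can report exactly which field was truncated.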
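There is also a second way to open the file, ElfOatFile. Instead of going through the system dynamic linker's dlopen, it loads the oat file with ART's own ELF loader (ElfFile). The overall flow should be similar; I will analyze it when I have time.
Finally, to recap the DlOpenOatFile flow: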
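PreLoad: iterates over all currently loaded ELF objects (one dl_phdr_info per object) and stores their count in shared_objects_before_.
LoadVdex: loads the vdex file through VdexFile::Open. The vdex file holds the input DEX files and their quickening info.
Load: calls Dlopen to load the oat file and obtains dlopen_handle_.
ComputeFields: uses FindDynamicSymbolAddress to resolve the oatdata, oatlastword and oatbss* symbols, which delimit the oat file's range in memory.
PreSetup: iterates over the loaded ELF objects again, finds the one whose PT_LOAD segment contains begin_ (the oat file that dlopen just loaded), and records each of its PT_LOAD segments with MemMap::MapDummy. The linker has already mapped these segments; MapDummy only registers them with ART's MemMap bookkeeping.
Setup: parses the oat file with ReadOatDexFileData and related functions and assembles each oat_dex_file.
Together, these steps turn the oat_file into its oat_dex_file objects.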
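The PreLoad step (counting the objects that are loaded before dlopen runs) can be tried in isolation. This is only a sketch under the assumption of a Linux or Android libc that provides dl_iterate_phdr in <link.h>; CountLoadedObjects is a hypothetical name, not a function from ART.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (glibc / bionic)

#include <cassert>
#include <cstddef>

// Hypothetical helper: count the ELF objects currently loaded in this process.
size_t CountLoadedObjects() {
  size_t count = 0;
  // A captureless lambda converts to the plain function pointer that
  // dl_iterate_phdr expects; `data` carries the address of `count`.
  auto callback = [](dl_phdr_info* /* info */, size_t /* size */, void* data) -> int {
    ++*static_cast<size_t*>(data);
    return 0;  // Returning 0 tells dl_iterate_phdr to keep iterating.
  };
  dl_iterate_phdr(callback, &count);
  assert(count > 0);  // The main program itself is always reported.
  return count;
}
```

Taking this count before dlopen and skipping that many callbacks afterwards is the optimization that shared_objects_before_ makes possible.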
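I have not fully figured out some parts myself, so there are bound to be mistakes, and some sentences are not phrased well; I am an outsider to this field and not much of a writer. The overall flow should be right, though. I hope the experts here will point out my errors so that I can correct them soon.
Reference: Luo's (老罗) Android journey series https://www.kancloud.cn/alex_wsc/androids/473622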
// File format:
//   VdexFile::Header    fixed-length header
//
//   DEX[0]              array of the input DEX files
//   DEX[1]              the bytecode may have been quickened
//   ...
//   DEX[D]
//   QuickeningInfo
//     uint8[]                     quickening data
//     unaligned_uint32_t[2][]     table of offsets pair:
//                                    uint32_t[0] contains code_item_offset
//                                    uint32_t[1] contains quickening data offset from the start
//                                                of QuickeningInfo
//     unaligned_uint32_t[D]       start offsets (from the start of QuickeningInfo) in previous
//                                 table for each dex file
4. Next is the Load function, which ends up calling Dlopen to load the oat file and obtain dlopen_handle_.
bool DlOpenOatFile::Load(const std::string& elf_filename,
                         uint8_t* oat_file_begin,
                         bool writable,
                         bool executable,
                         bool low_4gb,
                         std::string* error_msg) {
  // Use dlopen only when flagged to do so, and when it's OK to load things executable.
  // TODO: Also try when not executable? The issue here could be re-mapping as writable (as
  //       !executable is a sign that we may want to patch), which may not be allowed for
  //       various reasons.
  if (!kUseDlopen) {
    *error_msg = "DlOpen is disabled.";
    return false;
  }
  if (low_4gb) {
    *error_msg = "DlOpen does not support low 4gb loading.";
    return false;
  }
  if (writable) {
    *error_msg = "DlOpen does not support writable loading.";
    return false;
  }
  if (!executable) {
    *error_msg = "DlOpen does not support non-executable loading.";
    return false;
  }

  // dlopen always returns the same library if it is already opened on the host. For this reason
  // we only use dlopen if we are the target or we do not already have the dex file opened. Having
  // the same library loaded multiple times at different addresses is required for class unloading
  // and for having dex caches arrays in the .bss section.
  if (!kIsTargetBuild) {
    if (!kUseDlopenOnHost) {
      *error_msg = "DlOpen disabled for host.";
      return false;
    }
  }

  // Call Dlopen to load the oat file and obtain dlopen_handle_.
  bool success = Dlopen(elf_filename, oat_file_begin, error_msg);
  DCHECK(dlopen_handle_ != nullptr || !success);

  return success;
}
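Now look at Dlopen itself. It ends up calling android_dlopen_ext when built for an Android target (ART_TARGET_ANDROID) and plain dlopen otherwise.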
bool DlOpenOatFile::Dlopen(const std::string& elf_filename,
                           uint8_t* oat_file_begin,
                           std::string* error_msg) {
#ifdef __APPLE__
  // The dl_iterate_phdr syscall is missing.  There is similar API on OSX,
  // but let's fallback to the custom loading code for the time being.
  UNUSED(elf_filename, oat_file_begin);
  *error_msg = "Dlopen unsupported on Mac.";
  return false;
#else
  {
    UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr));
    if (absolute_path == nullptr) {
      *error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str());
      return false;
    }
#ifdef ART_TARGET_ANDROID
    android_dlextinfo extinfo = {};
    // For reference (my note, not part of the ART source):
    // typedef struct {
    //   uint64_t flags;
    //   void* reserved_addr;
    //   size_t reserved_size;
    //   int relro_fd;
    //   int library_fd;
    // } android_dlextinfo;
    extinfo.flags = ANDROID_DLEXT_FORCE_LOAD |                  // Force-load, don't reuse handle
                                                                //   (open oat files multiple
                                                                //    times).
                    ANDROID_DLEXT_FORCE_FIXED_VADDR;            // Take a non-zero vaddr as absolute
                                                                //   (non-pic boot image).
    if (oat_file_begin != nullptr) {                            //
      extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS;     // Use the requested addr if
      extinfo.reserved_addr = oat_file_begin;                   // vaddr = 0.
    }                                                           //   (pic boot image).
    // Android target: android_dlopen_ext (see /bionic/libdl/libdl.c) returns dlopen_handle_.
    // oat_file_begin, when non-null, only adds the fixed-address flag above.
    dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);
#else
    UNUSED(oat_file_begin);
    static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles");
    MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_);
    // Non-Android build: plain dlopen on the absolute path returns dlopen_handle_.
    dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);
    if (dlopen_handle_ != nullptr) {
      // Record the handle in host_dlopen_handles_; if it is already there, dlopen handed
      // back an existing library, which is treated as an error.
      if (!host_dlopen_handles_.insert(dlopen_handle_).second) {
        dlclose(dlopen_handle_);
        dlopen_handle_ = nullptr;
        *error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str());
        return false;
      }
    }
#endif  // ART_TARGET_ANDROID
  }
  if (dlopen_handle_ == nullptr) {
    *error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror());
    return false;
  }
  return true;
#endif
}
5. Next is ComputeFields. It calls FindDynamicSymbolAddress to resolve the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods and oatbssroots. Of these, oatdata and oatlastword determine begin_ and end_.
bool OatFileBase::ComputeFields(uint8_t* requested_base,
                                const std::string& file_path,
                                std::string* error_msg) {
  // Resolves the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods
  // and oatbssroots, starting with begin_.
  std::string symbol_error_msg;
  begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg);
  if (begin_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  if (requested_base != nullptr && begin_ != requested_base) {
    // Host can fail this check. Do not dump there to avoid polluting the output.
    if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) {
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
    }
    *error_msg = StringPrintf("Failed to find oatdata symbol at expected address: "
                              "oatdata=%p != expected=%p. See process maps in the log.",
                              begin_, requested_base);
    return false;
  }
  end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg);
  if (end_ == nullptr) {
    *error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s",
                              file_path.c_str(),
                              symbol_error_msg.c_str());
    return false;
  }
  // Readjust to be non-inclusive upper bound.
  end_ += sizeof(uint32_t);

  bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg));
  if (bss_begin_ == nullptr) {
    // No .bss section.
    bss_end_ = nullptr;
  } else {
    bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg));
    if (bss_end_ == nullptr) {
      *error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str());
      return false;
    }
    // Readjust to be non-inclusive upper bound.
    bss_end_ += sizeof(uint32_t);
    // Find bss methods if present.
    bss_methods_ =
        const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg));
    // Find bss roots if present. (The roots are related to GC.)
    bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));
  }

  return true;
}
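The "readjust to be non-inclusive upper bound" step is easy to get wrong by one word, so here is a small sketch of it. ByteRange and MakeRange are hypothetical names; the only thing borrowed from ComputeFields is that the "last word" symbol points at the final 32-bit word, so sizeof(uint32_t) is added to get an exclusive end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Half-open byte range [begin, end), mirroring how begin_/end_ are used.
struct ByteRange {
  const uint8_t* begin;
  const uint8_t* end;  // Exclusive upper bound.

  size_t Size() const { return static_cast<size_t>(end - begin); }
  bool Contains(const uint8_t* p) const { return begin <= p && p < end; }
};

// Hypothetical helper: `last_word` is the address of the final 32-bit word of
// the region (the role oatlastword plays), so the exclusive end is one word
// past it.
ByteRange MakeRange(const uint8_t* first, const uint8_t* last_word) {
  assert(first <= last_word);
  return ByteRange{first, last_word + sizeof(uint32_t)};
}
```

With a half-open range, Size() is simply end minus begin, and checks such as "offset > Size()" in Setup need no extra plus or minus one.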
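6. Next is PreSetup. The key here is to understand how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr walks every ELF object currently loaded in the process and calls callback once per object, passing that object's dl_phdr_info. The walk stops as soon as callback returns a non-zero value.
Compared with the struct of the same name in PreLoad above, this dl_iterate_context has several more fields:
begin_ comes from Begin(), i.e. it is the oat file's begin_;
shared_objects_before is the number of loaded ELF objects that PreLoad counted with dl_iterate_phdr;
shared_objects_seen counts the ELF objects visited so far in this dl_iterate_phdr walk;
dlopen_mmaps_ is a vector holding the MemMap pointers that MapDummy creates for each loadable (PT_LOAD) segment of the oat file.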
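So PreSetup roughly does the following:
Declare a dl_iterate_context.
Walk the loaded ELF objects with dl_iterate_phdr, incrementing shared_objects_seen on every callback. While shared_objects_seen is less than shared_objects_before, the object was already loaded before dlopen, so the callback returns 0 and moves on. The checks below only run once that count has been reached.
For each remaining object, loop over its dlpi_phnum program headers. For every PT_LOAD header, compute the segment's in-memory address (dlpi_addr + dlpi_phdr[i].p_vaddr) and size (dlpi_phdr[i].p_memsz). If vaddr <= begin_ < vaddr + memsz, set contains_begin = true: this object is the oat file that was just loaded, so break out of the loop and continue below.
Loop over dlpi_phdr again and, for each PT_LOAD header, call MemMap::MapDummy with the segment's vaddr and memsz. As the source comments say ("notify our mmap wrapper of the regions", "add dummy mmaps"), this does not map anything new; the linker already mapped the segment, and MapDummy only records the region. The callback then returns 1 to stop the walk.
To be honest, I have not fully understood this function either. Corrections are welcome, and I will study it more carefully when I have time.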
void DlOpenOatFile::PreSetup(const std::string& elf_filename) {
  // Ask the linker where it mmaped the file and notify our mmap wrapper of the regions.
#ifdef __APPLE__
  UNUSED(elf_filename);
  LOG(FATAL) << "Should not reach here.";
  UNREACHABLE();
#else
  struct dl_iterate_context {
    static int callback(struct dl_phdr_info *info, size_t /* size */, void *data) {
      // For reference (my note, not part of the ART source):
      // struct dl_phdr_info {
      //   ElfW(Addr) dlpi_addr;
      //   const char* dlpi_name;
      //   const ElfW(Phdr)* dlpi_phdr;
      //   ElfW(Half) dlpi_phnum;
      // };
      auto* context = reinterpret_cast<dl_iterate_context*>(data);
      // shared_objects_seen is incremented here and compared with shared_objects_before.
      context->shared_objects_seen++;
      if (context->shared_objects_seen < context->shared_objects_before) {
        // Still among the objects that were loaded before dlopen, so skip this one.
        // If another thread unloaded a library in the meantime, this can go wrong.
        // We haven't been called yet for anything we haven't seen before. Just continue.
        // Note: this is aggressively optimistic. If another thread was unloading a library,
        //       we may miss out here. However, this does not happen often in practice.
        return 0;
      }

      // See whether this callback corresponds to the file which we have just loaded.
      // contains_begin becomes true once a PT_LOAD segment contains begin_, the value
      // returned by Begin() for the oat file.
      bool contains_begin = false;
      for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD) {
          uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
              info->dlpi_phdr[i].p_vaddr);
          size_t memsz = info->dlpi_phdr[i].p_memsz;
          if (vaddr <= context->begin_ && context->begin_ < vaddr + memsz) {
            contains_begin = true;
            break;
          }
        }
      }
      // Add dummy mmaps for this file.
      if (contains_begin) {
        // For each PT_LOAD header, register the already-mapped segment (vaddr, memsz)
        // through MemMap::MapDummy.
        for (int i = 0; i < info->dlpi_phnum; i++) {
          if (info->dlpi_phdr[i].p_type == PT_LOAD) {
            uint8_t* vaddr = reinterpret_cast<uint8_t*>(info->dlpi_addr +
                info->dlpi_phdr[i].p_vaddr);
            size_t memsz = info->dlpi_phdr[i].p_memsz;
            MemMap* mmap = MemMap::MapDummy(info->dlpi_name, vaddr, memsz);
            // Store the new MemMap in dlopen_mmaps_.
            context->dlopen_mmaps_->push_back(std::unique_ptr<MemMap>(mmap));
          }
        }
        return 1;  // Stop iteration and return 1 from dl_iterate_phdr.
      }
      return 0;  // Continue iteration and return 0 from dl_iterate_phdr when finished.
    }
    const uint8_t* const begin_;  // From Begin(), i.e. the oat file's begin_.
    std::vector<std::unique_ptr<MemMap>>* const dlopen_mmaps_;
    const size_t shared_objects_before;  // Object count recorded by PreLoad.
    size_t shared_objects_seen;  // Objects visited so far in this walk.
  };  // End of struct dl_iterate_context.
  dl_iterate_context context = { Begin(), &dlopen_mmaps_, shared_objects_before_, 0};

  // The callback registers the oat file's PT_LOAD segments with MemMap.
  if (dl_iterate_phdr(dl_iterate_context::callback, &context) == 0) {
    // Hm. Maybe our optimization went wrong. Try another time with shared_objects_before == 0
    // before giving up. This should be unusual.
    VLOG(oat) << "Need a second run in PreSetup, didn't find with shared_objects_before="
              << shared_objects_before_;
    dl_iterate_context context0 = { Begin(), &dlopen_mmaps_, 0, 0};
    if (dl_iterate_phdr(dl_iterate_context::callback, &context0) == 0) {
      // OK, give up and print an error.
      PrintFileToLog("/proc/self/maps", LogSeverity::WARNING);
      LOG(ERROR) << "File " << elf_filename << " loaded with dlopen but cannot find its mmaps.";
    }
  }
#endif
}
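The containment test inside the callback can be tried outside ART. The sketch below assumes <link.h> with dl_iterate_phdr (glibc or bionic); FindContext, FindCallback and FindObjectContaining are names invented here. Unlike PreSetup, it only counts the PT_LOAD segments of the matching object instead of registering them with MemMap::MapDummy.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, ElfW, PT_LOAD

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical context passed through dl_iterate_phdr's `data` pointer.
struct FindContext {
  const uint8_t* address;  // Address to locate (plays the role of begin_).
  bool found;              // Set when some object contains the address.
  size_t load_segments;    // PT_LOAD count of the matching object.
};

int FindCallback(dl_phdr_info* info, size_t /* size */, void* data) {
  auto* ctx = static_cast<FindContext*>(data);
  bool contains = false;
  size_t loads = 0;
  for (int i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) {
      continue;  // Only loadable segments occupy memory.
    }
    ++loads;
    // Runtime address = load bias of the object + segment's virtual address.
    const uint8_t* vaddr = reinterpret_cast<const uint8_t*>(info->dlpi_addr + phdr.p_vaddr);
    if (vaddr <= ctx->address && ctx->address < vaddr + phdr.p_memsz) {
      contains = true;
    }
  }
  if (!contains) {
    return 0;  // Not this object: keep iterating.
  }
  ctx->found = true;
  ctx->load_segments = loads;
  return 1;  // Non-zero stops dl_iterate_phdr.
}

// Returns true if `address` lies inside a PT_LOAD segment of a loaded object.
bool FindObjectContaining(const void* address, size_t* load_segments) {
  assert(address != nullptr);
  FindContext ctx = {static_cast<const uint8_t*>(address), false, 0};
  dl_iterate_phdr(FindCallback, &ctx);
  *load_segments = ctx.load_segments;
  return ctx.found;
}
```

An address of code or global data in the running program is found this way, while a stack address is not, because the stack does not belong to any object's PT_LOAD segment.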
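7. Next comes OatFileBase::Setup. It mainly uses ReadOatDexFileData to parse the oat file loaded above and build the oat_dex_file objects, from which a dex_file is obtained later. The resulting oat_file data structure combines information from both the oat file and the vdex file.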
Setup
{
    GetOatHeader
    GetInstructionSetPointerSize
    GetOatDexFilesOffset                          // reaches the offset of the OatDexFile records
    GetDexFileCount
    ReadOatDexFileData & dex_file_location_size
    ResolveRelativeEncodedDexLocation
    ReadOatDexFileData & dex_file_checksum
    ReadOatDexFileData & dex_file_offset
    ReadOatDexFileData & class_offsets_offset
    ReadOatDexFileData & lookup_table_offset      // speeds up class lookup
    ReadOatDexFileData & dex_layout_sections_offset
    ReadOatDexFileData & method_bss_mapping_offset
    FindDexFileMapItem & call_sites_item          // call site identifiers
    new OatDexFile                                // built from the info above, later obtained via GetBestOatFile
}
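The source is fairly easy to follow. The main things to track are ReadOatDexFileData and how the oat pointer advances; the creation of the oat_dex_file at the end is the most important step.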
bool OatFileBase::Setup(const char* abs_dex_location, std::string* error_msg) {
  if (!GetOatHeader().IsValid()) {
    std::string cause = GetOatHeader().GetValidationErrorMessage();
    *error_msg = StringPrintf("Invalid oat header for '%s': %s",
                              GetLocation().c_str(), cause.c_str());
    return false;
  }
  PointerSize pointer_size = GetInstructionSetPointerSize(GetOatHeader().GetInstructionSet());
  size_t key_value_store_size =
      (Size() >= sizeof(OatHeader)) ? GetOatHeader().GetKeyValueStoreSize() : 0u;
  if (Size() < sizeof(OatHeader) + key_value_store_size) {
    *error_msg = StringPrintf("In oat file '%s' found truncated OatHeader, "
                              "size = %zu < %zu + %zu",
                              GetLocation().c_str(), Size(), sizeof(OatHeader),
                              key_value_store_size);
    return false;
  }

  size_t oat_dex_files_offset = GetOatHeader().GetOatDexFilesOffset();
  if (oat_dex_files_offset < GetOatHeader().GetHeaderSize() || oat_dex_files_offset > Size()) {
    *error_msg = StringPrintf("In oat file '%s' found invalid oat dex files offset: "
                              "%zu is not in [%zu, %zu]",
                              GetLocation().c_str(), oat_dex_files_offset,
                              GetOatHeader().GetHeaderSize(), Size());
    return false;
  }
  const uint8_t* oat = Begin() + oat_dex_files_offset;  // Jump to the OatDexFile records.

  DCHECK_GE(static_cast<size_t>(pointer_size), alignof(GcRoot<mirror::Object>));
  if (!IsAligned<kPageSize>(bss_begin_) ||
      !IsAlignedParam(bss_methods_, static_cast<size_t>(pointer_size)) ||
      !IsAlignedParam(bss_roots_, static_cast<size_t>(pointer_size)) ||
      !IsAligned<alignof(GcRoot<mirror::Object>)>(bss_end_)) {
    *error_msg = StringPrintf("In oat file '%s' found unaligned bss symbol(s): "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(), bss_begin_, bss_methods_, bss_roots_,
                              bss_end_);
    return false;
  }

  if ((bss_methods_ != nullptr && (bss_methods_ < bss_begin_ || bss_methods_ > bss_end_)) ||
      (bss_roots_ != nullptr && (bss_roots_ < bss_begin_ || bss_roots_ > bss_end_)) ||
      (bss_methods_ != nullptr && bss_roots_ != nullptr && bss_methods_ > bss_roots_)) {
    *error_msg = StringPrintf("In oat file '%s' found bss symbol(s) outside .bss or unordered: "
                              "begin = %p, methods_ = %p, roots = %p, end = %p",
                              GetLocation().c_str(), bss_begin_, bss_methods_, bss_roots_,
                              bss_end_);
    return false;
  }

  uint8_t* after_arrays = (bss_methods_ != nullptr) ? bss_methods_ : bss_roots_;  // May be null.
  uint8_t* dex_cache_arrays = (bss_begin_ == after_arrays) ? nullptr : bss_begin_;
  uint8_t* dex_cache_arrays_end =
      (bss_begin_ == after_arrays) ? nullptr : (after_arrays != nullptr) ? after_arrays : bss_end_;
  DCHECK_EQ(dex_cache_arrays != nullptr, dex_cache_arrays_end != nullptr);
  uint32_t dex_file_count = GetOatHeader().GetDexFileCount();  // Number of dex files.
  oat_dex_files_storage_.reserve(dex_file_count);
  for (size_t i = 0; i < dex_file_count; i++) {
    uint32_t dex_file_location_size;
    // Inside the loop, each ReadOatDexFileData call reads one field and advances the oat
    // pointer. First field: dex_file_location_size.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_location_size))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu truncated after dex file "
                                "location size",
                                GetLocation().c_str(), i);
      return false;
    }
    if (UNLIKELY(dex_file_location_size == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with empty location name",
                                GetLocation().c_str(), i);
      return false;
    }
    if (UNLIKELY(static_cast<size_t>(End() - oat) < dex_file_location_size)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu with truncated dex file "
                                "location",
                                GetLocation().c_str(), i);
      return false;
    }
    const char* dex_file_location_data = reinterpret_cast<const char*>(oat);
    oat += dex_file_location_size;

    std::string dex_file_location = ResolveRelativeEncodedDexLocation(
        abs_dex_location, std::string(dex_file_location_data, dex_file_location_size));

    uint32_t dex_file_checksum;
    // Read dex_file_checksum and advance the oat pointer.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_checksum))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated after "
                                "dex file checksum",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }

    uint32_t dex_file_offset;
    // Read dex_file_offset and advance the oat pointer.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_file_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after dex file offsets",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset == 0U)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with zero dex "
                                "file offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(dex_file_offset > DexSize())) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u > %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, DexSize());
      return false;
    }
    if (UNLIKELY(DexSize() - dex_file_offset < sizeof(DexFile::Header))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u of %zu but the size of dex file header is %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, DexSize(), sizeof(DexFile::Header));
      return false;
    }

    const uint8_t* dex_file_pointer = DexBegin() + dex_file_offset;
    if (UNLIKELY(!DexFile::IsMagicValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file magic '%s'",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    if (UNLIKELY(!DexFile::IsVersionValid(dex_file_pointer))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with invalid "
                                "dex file version '%s'",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_pointer);
      return false;
    }
    const DexFile::Header* header = reinterpret_cast<const DexFile::Header*>(dex_file_pointer);
    if (DexSize() - dex_file_offset < header->file_size_) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with dex file "
                                "offset %u and size %u truncated at %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                dex_file_offset, header->file_size_, DexSize());
      return false;
    }

    uint32_t class_offsets_offset;
    // Read class_offsets_offset and advance the oat pointer.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' truncated "
                                "after class offsets offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    if (UNLIKELY(class_offsets_offset > Size()) ||
        UNLIKELY((Size() - class_offsets_offset) / sizeof(uint32_t) < header->class_defs_size_)) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "class offsets, offset %u of %zu, class defs %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                class_offsets_offset, Size(), header->class_defs_size_);
      return false;
    }
    if (UNLIKELY(!IsAligned<alignof(uint32_t)>(class_offsets_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned "
                                "class offsets, offset %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                class_offsets_offset);
      return false;
    }
    const uint32_t* class_offsets_pointer =
        reinterpret_cast<const uint32_t*>(Begin() + class_offsets_offset);

    uint32_t lookup_table_offset;
    // Read lookup_table_offset and advance the oat pointer. The lookup table speeds up
    // class lookup.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &lookup_table_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after lookup table offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const uint8_t* lookup_table_data = lookup_table_offset != 0u
        ? Begin() + lookup_table_offset
        : nullptr;
    if (lookup_table_offset != 0u &&
        (UNLIKELY(lookup_table_offset > Size()) ||
         UNLIKELY(Size() - lookup_table_offset <
                  TypeLookupTable::RawDataLength(header->class_defs_size_)))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with truncated "
                                "type lookup table, offset %u of %zu, class defs %u",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                lookup_table_offset, Size(), header->class_defs_size_);
      return false;
    }

    uint32_t dex_layout_sections_offset;
    // Read dex_layout_sections_offset and advance the oat pointer.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &dex_layout_sections_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after dex layout sections offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const DexLayoutSections* const dex_layout_sections = dex_layout_sections_offset != 0
        ? reinterpret_cast<const DexLayoutSections*>(Begin() + dex_layout_sections_offset)
        : nullptr;

    uint32_t method_bss_mapping_offset;
    // Read method_bss_mapping_offset and advance the oat pointer.
    if (UNLIKELY(!ReadOatDexFileData(*this, &oat, &method_bss_mapping_offset))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zd for '%s' truncated "
                                "after method bss mapping offset",
                                GetLocation().c_str(), i, dex_file_location.c_str());
      return false;
    }
    const bool readable_method_bss_mapping_size =
        method_bss_mapping_offset != 0u &&
        method_bss_mapping_offset <= Size() &&
        IsAligned<alignof(MethodBssMapping)>(method_bss_mapping_offset) &&
        Size() - method_bss_mapping_offset >= MethodBssMapping::ComputeSize(0);
    const MethodBssMapping* method_bss_mapping = readable_method_bss_mapping_size
        ? reinterpret_cast<const MethodBssMapping*>(Begin() + method_bss_mapping_offset)
        : nullptr;
    if (method_bss_mapping_offset != 0u &&
        (UNLIKELY(method_bss_mapping == nullptr) ||
         UNLIKELY(method_bss_mapping->size() == 0u) ||
         UNLIKELY(Size() - method_bss_mapping_offset <
                  MethodBssMapping::ComputeSize(method_bss_mapping->size())))) {
      *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with unaligned or "
                                " truncated method bss mapping, offset %u of %zu, length %zu",
                                GetLocation().c_str(), i, dex_file_location.c_str(),
                                method_bss_mapping_offset, Size(),
                                method_bss_mapping != nullptr ? method_bss_mapping->size() : 0u);
      return false;
    }
    if (kIsDebugBuild && method_bss_mapping != nullptr) {
      const MethodBssMappingEntry* prev_entry = nullptr;
      for (const MethodBssMappingEntry& entry : *method_bss_mapping) {
        CHECK_ALIGNED_PARAM(entry.bss_offset, static_cast<size_t>(pointer_size));
        CHECK_LT(entry.bss_offset, BssSize());
        CHECK_LE(POPCOUNT(entry.index_mask) * static_cast<size_t>(pointer_size), entry.bss_offset);
        size_t index_mask_span = (entry.index_mask != 0u) ? 16u - CTZ(entry.index_mask) : 0u;
        CHECK_LE(index_mask_span, entry.method_index);
        if (prev_entry != nullptr) {
          CHECK_LT(prev_entry->method_index, entry.method_index - index_mask_span);
        }
        prev_entry = &entry;
      }
      CHECK_LT(prev_entry->method_index,
               reinterpret_cast<const DexFile::Header*>(dex_file_pointer)->method_ids_size_);
    }

    uint8_t* current_dex_cache_arrays = nullptr;
    if (dex_cache_arrays != nullptr) {
      // All DexCache types except for CallSite have their instance counts in the
      // DexFile header. For CallSites, we need to read the info from the MapList.
      const DexFile::MapItem* call_sites_item = nullptr;
      // FindDexFileMapItem locates the call-site map item, if any.
      if (!FindDexFileMapItem(DexBegin(),
                              DexEnd(),
                              DexFile::MapItemType::kDexTypeCallSiteIdItem,
                              &call_sites_item)) {
        *error_msg = StringPrintf("In oat file '%s' could not read data from truncated DexFile map",
                                  GetLocation().c_str());
        return false;
      }
      size_t num_call_sites = call_sites_item == nullptr ? 0 : call_sites_item->size_;
      DexCacheArraysLayout layout(pointer_size, *header, num_call_sites);
      if (layout.Size() != 0u) {
        if (static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays) < layout.Size()) {
          *error_msg = StringPrintf("In oat file '%s' found OatDexFile #%zu for '%s' with "
                                    "truncated dex cache arrays, %zu < %zu.",
                                    GetLocation().c_str(), i, dex_file_location.c_str(),
                                    static_cast<size_t>(dex_cache_arrays_end - dex_cache_arrays),
                                    layout.Size());
          return false;
        }
        current_dex_cache_arrays = dex_cache_arrays;
        dex_cache_arrays += layout.Size();
      }
    }

    std::string canonical_location = DexFile::GetDexCanonicalLocation(dex_file_location.c_str());

    // Create the OatDexFile and add it to the owning container.
    // It is built from the values gathered above by ReadOatDexFileData and FindDexFileMapItem.
    OatDexFile* oat_dex_file = new OatDexFile(this,
                                              dex_file_location,
                                              canonical_location,
                                              dex_file_checksum,
                                              dex_file_pointer,
                                              lookup_table_data,
                                              method_bss_mapping,
                                              class_offsets_pointer,
                                              current_dex_cache_arrays,
                                              dex_layout_sections);
    oat_dex_files_storage_.push_back(oat_dex_file);

    // Add the location and canonical location (if different) to the oat_dex_files_ table.
StringPiece key(oat_dex_file->GetDexFileLocation()); oat_dex_files_.Put(key, oat_dex_file); if (canonical_location != dex_file_location) { StringPiece canonical_key(oat_dex_file->GetCanonicalDexFileLocation()); oat_dex_files_.Put(canonical_key, oat_dex_file); } } if (dex_cache_arrays != dex_cache_arrays_end) { // We expect the bss section to be either empty (dex_cache_arrays and bss_end_ // both null) or contain just the dex cache arrays and optionally some GC roots. *error_msg = StringPrintf("In oat file '%s' found unexpected bss size bigger by %zu bytes.", GetLocation().c_str(), static_cast<size_t>(bss_end_ - dex_cache_arrays)); return false; } return true; }
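The pattern that Setup repeats for every field (check that enough bytes remain, read the value, move the oat pointer forward) can be sketched in isolation. The code below is a minimal sketch and not ART's implementation: `ReadField`, `ToyRecord`, `EncodeToyRecord` and `ParseToyRecord` are hypothetical names, and the toy record only keeps the first two OatDexFile fields (location size plus location bytes, then the checksum).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for ReadOatDexFileData: copy one T from *cursor and
// advance the cursor. Returns false when fewer than sizeof(T) bytes remain.
template <typename T>
bool ReadField(const uint8_t* end, const uint8_t** cursor, T* value) {
  if (static_cast<size_t>(end - *cursor) < sizeof(T)) {
    return false;  // truncated record
  }
  std::memcpy(value, *cursor, sizeof(T));  // memcpy tolerates unaligned input
  *cursor += sizeof(T);
  return true;
}

// Toy record layout (assumption for illustration only):
// u32 location_size, location bytes, u32 checksum.
struct ToyRecord {
  std::string location;
  uint32_t checksum = 0;
};

// Builds the byte image of a toy record in host byte order.
std::vector<uint8_t> EncodeToyRecord(const std::string& location, uint32_t checksum) {
  std::vector<uint8_t> bytes;
  const uint32_t size = static_cast<uint32_t>(location.size());
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&size);
  bytes.insert(bytes.end(), p, p + sizeof(size));
  bytes.insert(bytes.end(), location.begin(), location.end());
  p = reinterpret_cast<const uint8_t*>(&checksum);
  bytes.insert(bytes.end(), p, p + sizeof(checksum));
  return bytes;
}

// Parses one toy record, validating every step the way Setup does.
bool ParseToyRecord(const uint8_t* begin, const uint8_t* end, ToyRecord* out) {
  const uint8_t* cursor = begin;
  uint32_t location_size = 0;
  if (!ReadField(end, &cursor, &location_size)) {
    return false;  // truncated after location size
  }
  if (location_size == 0u || static_cast<size_t>(end - cursor) < location_size) {
    return false;  // empty or truncated location
  }
  out->location.assign(reinterpret_cast<const char*>(cursor), location_size);
  cursor += location_size;  // variable-length field: advance by hand
  return ReadField(end, &cursor, &out->checksum);
}
```

Because every read goes through the same bounds check, a truncated file is caught at the first field that no longer fits, which is why each error message in Setup says "truncated after ...".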
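There is also the ElfOatFile way of opening the file. As far as I can tell, it does not go through dlopen but uses ART's own ELF loader (ElfFile); the overall flow should be similar, and I will analyze it when I have time.
To wrap up, here is the whole flow once more: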
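PreLoad: walks all currently loaded ELF objects with dl_iterate_phdr (one dl_phdr_info per object) and stores their count in shared_objects_before_.
LoadVdex: loads the vdex file through VdexFile::Open; the vdex file also holds dex file data.
Load: calls Dlopen to load the oat file and obtains dlopen_handle_.
ComputeFields: resolves the symbols oatdata, oatlastword, oatbss, oatbsslastword, oatbssmethods and oatbssroots with FindDynamicSymbolAddress; oatdata and oatlastword give begin_ and end_, which delimit the oat file in memory.
PreSetup: walks the loaded ELF objects again, skips the ones already counted in PreLoad, finds the object whose PT_LOAD segment contains begin_ (the oat file just loaded), and records a MemMap::MapDummy entry in dlopen_mmaps_ for each of its PT_LOAD segments. The linker has already mapped these segments; this step only registers the regions with ART's MemMap bookkeeping.
Setup: parses the oat file with ReadOatDexFileData and related helpers and assembles the oat_dex_file objects.
With these steps, we finally get oat_dex_file from oat_file.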
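The two dl_iterate_phdr passes (PreLoad counting objects, PreSetup locating the object that contains begin_) can be tried outside ART. The following is a minimal sketch for Linux or Android, assuming `<link.h>` is available; `CountLoadedObjects`, `FindContext` and `LoadSegmentsOfObjectContaining` are hypothetical names, and instead of creating MemMap entries the sketch only counts the PT_LOAD segments of the matching object.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, ElfW (Linux / bionic)

#include <cassert>
#include <cstddef>
#include <cstdint>

// PreLoad-style pass: count the ELF objects the linker currently has loaded.
static int CountCallback(dl_phdr_info* /* info */, size_t /* size */, void* data) {
  ++*static_cast<size_t*>(data);
  return 0;  // returning 0 keeps the iteration going
}

size_t CountLoadedObjects() {
  size_t count = 0;
  dl_iterate_phdr(CountCallback, &count);
  return count;
}

// PreSetup-style pass: find the object with a PT_LOAD segment containing
// `target` and report how many PT_LOAD segments that object has.
struct FindContext {
  uintptr_t target;
  size_t load_segments;
};

static int FindCallback(dl_phdr_info* info, size_t /* size */, void* data) {
  auto* context = static_cast<FindContext*>(data);
  bool contains_target = false;
  size_t loads = 0;
  for (int i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) {
      continue;
    }
    ++loads;
    // Runtime address of the segment = load bias + virtual address in the file.
    const uintptr_t start = info->dlpi_addr + phdr.p_vaddr;
    if (start <= context->target && context->target < start + phdr.p_memsz) {
      contains_target = true;
    }
  }
  if (!contains_target) {
    return 0;  // not this object, continue with the next one
  }
  context->load_segments = loads;
  return 1;  // non-zero stops dl_iterate_phdr and becomes its return value
}

// Returns 0 when no loaded object contains `address`.
size_t LoadSegmentsOfObjectContaining(const void* address) {
  FindContext context = {reinterpret_cast<uintptr_t>(address), 0};
  if (dl_iterate_phdr(FindCallback, &context) == 0) {
    return 0;
  }
  return context.load_segments;
}
```

ART additionally passes shared_objects_before into the second pass so that the callback can skip the objects it already saw; the sketch leaves that optimization out and checks every object.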
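I have not fully understood every part myself, so there are bound to be mistakes, and some statements are not worded well; I am an outsider to this field and not much of a writer. The overall flow should be right, though. Please point out any problems so that I can fix them soon.
Reference: 老罗's 安卓之旅 (Android Journey) https://www.kancloud.cn/alex_wsc/androids/473622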
还有一种
打开 ElfOatFile 的方式,应该是调用了系统自己的elf加载器,大致流程应该类似,菜鸟有空在慢慢分析,
最后再梳理一下流程
,大致如下:
PreLoad,遍历所有加载的elf对象获得dl_phdr_info,计算所有elf的个数存储在shared_objects_before_中
LoadVdex,通过VdexFile::Open加载vdex文件,vdex里面也存储了一些dex文件信息
Load,调用Dlopen加载oat_file,获得dlopen_handle_
ComputeFields,从begin开始,通过FindDynamicSymbolAddress定位各种符号地址,也就界定了oat_file在内存中的范围
PreSetup,
再次遍历所有加载的elf对象,在最后一个elf对象的load段之后,通过mmap映射oat_file的segment到内存
Setup,通过 ReadOatDexFileData等函数解析oat_file信息,组装
oat_dex_file
根据以上几步,最终通过oat_file获得了oat_dex_file.
由于菜鸟有些地方也没搞太明白,中间免不了有一些错误,有些语句也叙述的不够恰当,毕竟外行而且语文不咋地,但大致流程应该没问题,希望各位大佬指出我的问题,我好早日改正。
参考:老罗大佬的安卓之旅 https://www.kancloud.cn/alex_wsc/androids/473622
bool DlOpenOatFile::Dlopen(const std::string& elf_filename, uint8_t* oat_file_begin, std::string* error_msg) { #ifdef __APPLE__ // The dl_iterate_phdr syscall is missing. There is similar API on OSX, // but let's fallback to the custom loading code for the time being. UNUSED(elf_filename, oat_file_begin); *error_msg = "Dlopen unsupported on Mac."; return false; #else { UniqueCPtr<char> absolute_path(realpath(elf_filename.c_str(), nullptr)); if (absolute_path == nullptr) { *error_msg = StringPrintf("Failed to find absolute path for '%s'", elf_filename.c_str()); return false; } #ifdef ART_TARGET_ANDROID android_dlextinfo extinfo = {}; // typedef struct { // uint64_t flags; // void* reserved_addr; // size_t reserved_size; // int relro_fd; // int library_fd; // } android_dlextinfo; extinfo.flags = ANDROID_DLEXT_FORCE_LOAD | // Force-load, don't reuse handle // (open oat files multiple // times). ANDROID_DLEXT_FORCE_FIXED_VADDR; // Take a non-zero vaddr as absolute // (non-pic boot image). if (oat_file_begin != nullptr) { // extinfo.flags |= ANDROID_DLEXT_LOAD_AT_FIXED_ADDRESS; // Use the requested addr if extinfo.reserved_addr = oat_file_begin; // vaddr = 0. } // (pic boot image). 
dlopen_handle_ = android_dlopen_ext(absolute_path.get(), RTLD_NOW, &extinfo);//这里oat_file_begin不为空如果调用android_dlopen_ext打开获得dlopen_handle_,在/bionic/libdl/libdl.c里 #else UNUSED(oat_file_begin); static_assert(!kIsTargetBuild || kIsTargetLinux, "host_dlopen_handles_ will leak handles"); MutexLock mu(Thread::Current(), *Locks::host_dlopen_handles_lock_); dlopen_handle_ = dlopen(absolute_path.get(), RTLD_NOW);//如果没有oat_file_begin,直接调用dlopen从路径加载获得dlopen_handle_ if (dlopen_handle_ != nullptr) { if (!host_dlopen_handles_.insert(dlopen_handle_).second) {//把dlopen_handle_插入host_dlopen_handles_中 dlclose(dlopen_handle_); dlopen_handle_ = nullptr; *error_msg = StringPrintf("host dlopen re-opened '%s'", elf_filename.c_str()); return false; } } #endif // ART_TARGET_ANDROID } if (dlopen_handle_ == nullptr) { *error_msg = StringPrintf("Failed to dlopen '%s': %s", elf_filename.c_str(), dlerror()); return false; } return true; #endif }
5.下面是 ComputeFields,它从begin开始,调用FindDynamicSymbolAddress定位各种符号地址oatdata,oatlastword,oatbss,oatbsslastword,oatbssmethods,oatbssroots,其中
oatdata,oatlastword定位了begin_和end_
bool OatFileBase::ComputeFields(uint8_t* requested_base, const std::string& file_path, std::string* error_msg) {//这个函数从begin开始,定位各种符号地址oatdata,oatlastword,oatbss,oatbsslastword,oatbssmethods,oatbssroots std::string symbol_error_msg; begin_ = FindDynamicSymbolAddress("oatdata", &symbol_error_msg); if (begin_ == nullptr) { *error_msg = StringPrintf("Failed to find oatdata symbol in '%s' %s", file_path.c_str(), symbol_error_msg.c_str()); return false; } if (requested_base != nullptr && begin_ != requested_base) { // Host can fail this check. Do not dump there to avoid polluting the output. if (kIsTargetBuild && (kIsDebugBuild || VLOG_IS_ON(oat))) { PrintFileToLog("/proc/self/maps", LogSeverity::WARNING); } *error_msg = StringPrintf("Failed to find oatdata symbol at expected address: " "oatdata=%p != expected=%p. See process maps in the log.", begin_, requested_base); return false; } end_ = FindDynamicSymbolAddress("oatlastword", &symbol_error_msg); if (end_ == nullptr) { *error_msg = StringPrintf("Failed to find oatlastword symbol in '%s' %s", file_path.c_str(), symbol_error_msg.c_str()); return false; } // Readjust to be non-inclusive upper bound. end_ += sizeof(uint32_t); bss_begin_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbss", &symbol_error_msg)); if (bss_begin_ == nullptr) { // No .bss section. bss_end_ = nullptr; } else { bss_end_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbsslastword", &symbol_error_msg)); if (bss_end_ == nullptr) { *error_msg = StringPrintf("Failed to find oatbasslastword symbol in '%s'", file_path.c_str()); return false; } // Readjust to be non-inclusive upper bound. bss_end_ += sizeof(uint32_t); // Find bss methods if present. bss_methods_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssmethods", &symbol_error_msg)); // Find bss roots if present. bss_roots_ = const_cast<uint8_t*>(FindDynamicSymbolAddress("oatbssroots", &symbol_error_msg));//root跟gc有关 } return true; }
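6. Next is PreSetup. The main thing here is to understand how the two structs dl_iterate_context and dl_phdr_info relate to the dl_iterate_phdr call.
dl_iterate_phdr
walks every ELF object currently loaded in the process, obtains a dl_phdr_info for each one, and invokes the callback on each ELF object.
Compared with the struct in PreLoad above, this dl_iterate_context has several more fields:
begin_ is obtained through Begin(), i.e. it is the oat_file's begin_;
shared_objects_before is the number of loaded ELF objects that the PreLoad dl_iterate_context counted through dl_iterate_phdr;
shared_objects_seen counts the loaded ELF objects visited so far by this dl_iterate_context during its own dl_iterate_phdr pass;
the dlopen_mmaps_ vector stores the MemMap pointers obtained by mapping each loadable segment of the oat_file into memory with MapDummy.
So PreSetup roughly does the following:
Declare a dl_iterate_context struct.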
Iterate over the loaded ELF objects with dl_iterate_phdr; every callback invocation increments shared_objects_seen by 1.
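While shared_objects_seen is less than shared_objects_before, the ELF objects have not all been walked yet, so the callback returns and the loop continues; only when the traversal reaches the last ELF object does the logic below run.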
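The counting scheme described above can be tried out with a small sketch. It is a simplified model of my own rather than the ART code: it assumes a Linux or Android libc that provides dl_iterate_phdr in <link.h>, and IterateContext, CountLoadedObjects, ScanLoadedObjects and the inspected field are hypothetical names (the real dl_iterate_context also carries begin_ and dlopen_mmaps_, which are left out here).

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info

#include <cassert>
#include <cstddef>

// Simplified, hypothetical version of the PreSetup context.
struct IterateContext {
  size_t shared_objects_before = 0;  // count taken by the first (PreLoad-style) pass
  size_t shared_objects_seen = 0;    // objects visited so far in this pass
  size_t inspected = 0;              // objects that got past the "seen < before" check
};

// First pass: only count the loaded ELF objects.
int CountCallback(dl_phdr_info* /*info*/, size_t /*size*/, void* data) {
  assert(data != nullptr);
  ++*static_cast<size_t*>(data);
  return 0;  // 0 tells dl_iterate_phdr to keep iterating
}

// Second pass: skip objects while seen < before, then "inspect" the rest.
int SkipSeenCallback(dl_phdr_info* /*info*/, size_t /*size*/, void* data) {
  assert(data != nullptr);
  auto* ctx = static_cast<IterateContext*>(data);
  ctx->shared_objects_seen++;
  if (ctx->shared_objects_seen < ctx->shared_objects_before) {
    return 0;  // still among the objects counted earlier, just continue
  }
  // The real callback would examine the object's segments here.
  ctx->inspected++;
  return 0;
}

size_t CountLoadedObjects() {
  size_t count = 0;
  dl_iterate_phdr(CountCallback, &count);
  return count;
}

IterateContext ScanLoadedObjects(size_t shared_objects_before) {
  IterateContext ctx;
  ctx.shared_objects_before = shared_objects_before;
  dl_iterate_phdr(SkipSeenCallback, &ctx);
  return ctx;
}
```

If nothing is loaded between the two passes, only the last object gets past the check, which matches the "until the last ELF object" wording above. When a dlopen happens in between, the newly loaded objects are the ones that get inspected in addition.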
Last edited on 2020-3-18 09:52 by 挤蹭菌衣