[Original] From Java to Native: How Are the Functions in a .so Loaded Step by Step? - Android Security - Kanxue Security Community

Posted: 6 days ago · 835

Author: xianyuuuan

so？what can i do?

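0. The loading flow of a .so

The process naturally starts in the Java layer, which calls into the .so file. This post focuses on the native-layer loading flow, so the Java side is not covered here.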

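1. Understanding dlopen / android_dlopen_ext()

"Dynamic Link": when it comes to dynamically loaded libraries, Windows comes to mind, where DLL files are loaded at run time. On Linux the equivalent is the .so file, and the key function for loading one is dlopen.

The dlopen family of functions

dlopen: opens a library and loads it into memory. It is mainly used to load symbols that are not known at compile time. With this mechanism, modules can be added to or removed from a system without recompiling the program.

Function prototype

`void* dlopen(const char* filename, int flag);`

The first parameter is the path of the library to load, i.e. the .so file. The second is a set of flags: RTLD_NOW resolves all symbols immediately, RTLD_LAZY resolves them only when they are first needed, and RTLD_GLOBAL makes the library's symbols available to libraries loaded later. The handle returned by dlopen is then passed as the first argument of dlsym() to obtain the address of a symbol in the library.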
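The dlopen/dlsym pairing described above can be sketched as follows. This is a minimal illustration, not code from the linker: `lookup_symbol` is a hypothetical helper name, and passing NULL as the path (which yields a handle for the main program) is used only so the sketch does not depend on any particular library being installed.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: open `lib_path` (NULL means the main program)
 * and resolve `symbol` in it. Returns NULL if either step fails. */
static void* lookup_symbol(const char* lib_path, const char* symbol) {
    assert(symbol != NULL);

    /* RTLD_NOW: resolve all undefined symbols before dlopen returns. */
    void* handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    /* The handle returned by dlopen is the first argument of dlsym. */
    void* addr = dlsym(handle, symbol);
    if (addr == NULL) {
        fprintf(stderr, "dlsym failed for %s\n", symbol);
    }

    /* The handle is deliberately left open so the address stays valid. */
    return addr;
}
```

A caller would cast the result to the expected function type, for example `size_t (*fn)(const char*) = (size_t (*)(const char*))lookup_symbol(NULL, "strlen");`. On older glibc versions the program has to be linked with `-ldl`.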

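The android_dlopen_ext() function

This function is specific to the Android platform. It is an extended version of dlopen that adds further capabilities.

Function prototype

`void* android_dlopen_ext(const char* path, int flag, const android_dlextinfo* extinfo);`

Compared with dlopen, it takes an additional third parameter, a pointer to an android_dlextinfo structure.

The structure definition in the source shows roughly what has been added. Most of the extra features are switched on through the ANDROID_DLEXT_* bits in the structure's flags field.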

typedef struct {
  /** A bitmask of `ANDROID_DLEXT_` enum values. */
  uint64_t flags;

  /** Used by `ANDROID_DLEXT_RESERVED_ADDRESS` and `ANDROID_DLEXT_RESERVED_ADDRESS_HINT`. */
  void*   _Nullable reserved_addr;
  /** Used by `ANDROID_DLEXT_RESERVED_ADDRESS` and `ANDROID_DLEXT_RESERVED_ADDRESS_HINT`. */
  size_t  reserved_size;

  /** Used by `ANDROID_DLEXT_WRITE_RELRO` and `ANDROID_DLEXT_USE_RELRO`. */
  int     relro_fd;

  /** Used by `ANDROID_DLEXT_USE_LIBRARY_FD`. */
  int     library_fd;
  /** Used by `ANDROID_DLEXT_USE_LIBRARY_FD_OFFSET` */
  off64_t library_fd_offset;

  /** Used by `ANDROID_DLEXT_USE_NAMESPACE`. */
  struct android_namespace_t* _Nullable library_namespace;
} android_dlextinfo;

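The flow after dlopen

2. The do_dlopen function

This analysis is based on the Android 4 source. Newer Android versions have considerably more code in this path, but the underlying principle is similar.

The dlopen function above is only the entry point. The real core logic is implemented in do_dlopen.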

void* dlopen(const char* filename, int flags) {
  ScopedPthreadMutexLocker locker(&gDlMutex);
  soinfo* result = do_dlopen(filename, flags);
  if (result == NULL) {
    __bionic_format_dlerror("dlopen failed", linker_get_error_buffer());
    return NULL;
  }
  return result;
}

soinfo* do_dlopen(const char* name, int flags) {
  // Reject any flag bits other than the four supported ones.
  if ((flags & ~(RTLD_NOW|RTLD_LAZY|RTLD_LOCAL|RTLD_GLOBAL)) != 0) {
    DL_ERR("invalid flags to dlopen: %x", flags);
    return NULL;
  }
  // Make the soinfo pool readable and writable so that find_library
  // can create and modify soinfo structures.
  set_soinfo_pool_protection(PROT_READ | PROT_WRITE);
  soinfo* si = find_library(name); // returns the soinfo of the requested library
  if (si != NULL) {
    si->CallConstructors(); // run the library's constructors (DT_INIT / DT_INIT_ARRAY)
  }
  // From here on the pool is read-only again.
  set_soinfo_pool_protection(PROT_READ);
  return si;
}

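do_dlopen returns a soinfo pointer. A soinfo structure describes one loaded .so; all soinfo structures of the process are chained together through their next pointers into the list of loaded libraries (solist). The value returned here, si, is the soinfo of the library that was requested.

The two main calls involved are find_library and CallConstructors, both described below.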
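The flag check at the top of do_dlopen is a plain bit-mask test: clear every supported bit and see whether anything is left. A minimal sketch of just that test, where `dlopen_flags_are_valid` is a hypothetical name introduced here for illustration:

```c
#include <dlfcn.h>
#include <stdbool.h>

/* Mirrors the mask test in do_dlopen: any bit outside the four
 * supported RTLD_* flags makes the request invalid. */
static bool dlopen_flags_are_valid(int flags) {
    const int supported = RTLD_NOW | RTLD_LAZY | RTLD_LOCAL | RTLD_GLOBAL;
    return (flags & ~supported) == 0;
}
```

The numeric values of the RTLD_* constants differ between platforms, so the sketch only relies on the names from `<dlfcn.h>`.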

3. The find_library function

find_library takes the library name and calls further functions in turn; the comments describe the flow.

static soinfo *find_loaded_library(const char *name)
{
    soinfo *si;
    const char *bname;

    // TODO: don't use basename only for determining libraries
    // http://code.google.com/p/android/issues/detail?id=6670

    bname = strrchr(name, '/'); // find the last '/' to isolate the file name
    bname = bname ? bname + 1 : name;

    for (si = solist; si != NULL; si = si->next) { // walk the list of loaded libraries
        if (!strcmp(bname, si->name)) {
            return si;
        }
    }
    return NULL;
}

static soinfo* find_library_internal(const char* name) {
  if (name == NULL) {
    return somain;  // no name given: return the soinfo of the main executable
  }

  soinfo* si = find_loaded_library(name);
  if (si != NULL) {
    if (si->flags & FLAG_LINKED) { // test the FLAG_LINKED bit of the flags field
      return si; // already loaded and linked: return it directly
    }
    // The library is in the list but not linked yet, which means it is still
    // in the middle of being loaded. Reaching it again indicates a circular
    // dependency, so this is reported as a recursive link error.
    DL_ERR("OOPS: recursive link to \"%s\"", si->name);
    return NULL;
  }

  TRACE("[ '%s' has not been loaded yet.  Locating...]", name);
  // Not loaded yet: load it with load_library (covered below).
  si = load_library(name);
  if (si == NULL) { // loading failed
    return NULL;
  }

  // At this point we know that whatever is loaded @ base is a valid ELF
  // shared library whose segments are properly mapped in.
  // Trace the base address, size and name.
  TRACE("[ init_library base=0x%08x sz=0x%08x name='%s' ]",
        si->base, si->size, si->name);

  // soinfo_link_image parses the library's dynamic section and links the image.
  // On failure the mapping is removed and the soinfo is freed.
  if (!soinfo_link_image(si)) {
    munmap(reinterpret_cast<void*>(si->base), si->size);
    soinfo_free(si);
    return NULL;
  }

  return si;
}

static soinfo* find_library(const char* name) {
  soinfo* si = find_library_internal(name);
  if (si != NULL) {
    si->ref_count++;
  }
  return si;
}
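The lookup logic above (compare by basename, reject entries that are not linked yet, bump the reference count on success) can be tried out in isolation. The following is a reduced sketch under stated assumptions: `mini_soinfo`, `mini_find_loaded` and `mini_find_library` are hypothetical stand-ins, the `linked` field replaces the FLAG_LINKED bit, and actually loading a missing library is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the linker's soinfo. */
typedef struct mini_soinfo {
    const char* name;          /* basename, e.g. "libfoo.so" */
    int linked;                /* stands in for FLAG_LINKED */
    int ref_count;
    struct mini_soinfo* next;  /* next entry in the loaded-library list */
} mini_soinfo;

/* Walk the list and compare each entry against the basename of `path`. */
static mini_soinfo* mini_find_loaded(mini_soinfo* list, const char* path) {
    assert(path != NULL);
    const char* slash = strrchr(path, '/');
    const char* bname = slash ? slash + 1 : path;
    for (mini_soinfo* si = list; si != NULL; si = si->next) {
        if (strcmp(bname, si->name) == 0) {
            return si;
        }
    }
    return NULL;
}

/* Like find_library: a successful lookup bumps the reference count.
 * Entries that are present but not yet linked are rejected. */
static mini_soinfo* mini_find_library(mini_soinfo* list, const char* path) {
    mini_soinfo* si = mini_find_loaded(list, path);
    if (si == NULL || !si->linked) {
        return NULL;
    }
    si->ref_count++;
    return si;
}
```

Because only the basename is compared, `/system/lib/libfoo.so` and `/data/app/libfoo.so` would resolve to the same entry in this sketch, which is the limitation the TODO comment in the original source points at.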

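The load_library() function [to be added]

Related topics: the ELF file format and how the file's segments are loaded.

4. The CallConstructors function

Using the dynamic section information parsed earlier by soinfo_link_image, this function walks the DT_NEEDED entries to get the names of all .so files the library depends on. Once every dependency has been initialized, it runs init_func and then init_array to initialize the library itself.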

void soinfo::CallConstructors() {
  if (constructors_called) {
    return;
  }

  // We set constructors_called before actually calling the constructors, otherwise it doesn't
  // protect against recursive constructor calls. One simple example of constructor recursion
  // is the libc debug malloc, which is implemented in libc_malloc_debug_leak.so:
  // 1. The program depends on libc, so libc's constructor is called here.
  // 2. The libc constructor calls dlopen() to load libc_malloc_debug_leak.so.
  // 3. dlopen() calls the constructors on the newly created
  //    soinfo for libc_malloc_debug_leak.so.
  // 4. The debug .so depends on libc, so CallConstructors is
  //    called again with the libc soinfo. If it doesn't trigger the early-
  //    out above, the libc constructor will be called again (recursively!).
  constructors_called = true;

  if ((flags & FLAG_EXE) == 0 && preinit_array != NULL) {
    // The GNU dynamic linker silently ignores these, but we warn the developer.
    PRINT("\"%s\": ignoring %d-entry DT_PREINIT_ARRAY in shared library!",
          name, preinit_array_count);
  }

  // Make sure every DT_NEEDED dependency has run its constructors first.
  if (dynamic != NULL) {
    for (Elf32_Dyn* d = dynamic; d->d_tag != DT_NULL; ++d) {
      if (d->d_tag == DT_NEEDED) {
        const char* library_name = strtab + d->d_un.d_val;
        TRACE("\"%s\": calling constructors in DT_NEEDED \"%s\"", name, library_name);
        find_loaded_library(library_name)->CallConstructors();
      }
    }
  }

  TRACE("\"%s\": calling constructors", name);

  // Finally, run this library's own initialization functions.
  // DT_INIT should be called before DT_INIT_ARRAY if both are present.
  CallFunction("DT_INIT", init_func);
  CallArray("DT_INIT_ARRAY", init_array, init_array_count, false);
}
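Two properties of CallConstructors are easy to miss when reading the source: dependencies are initialized before the library itself, and the constructors_called flag is set before recursing so that a dependency cycle cannot run a constructor twice. The sketch below models only those two properties. Everything in it (`mini_lib`, `mini_call_constructors`, the `init_log` array) is hypothetical, and the cycle is modelled as a plain dependency loop rather than the dlopen-inside-a-constructor case from the source comment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPS 4
#define MAX_LOG 16

/* Hypothetical reduced library record. */
typedef struct mini_lib {
    const char* name;
    struct mini_lib* needed[MAX_DEPS];  /* stands in for DT_NEEDED entries */
    size_t needed_count;
    int constructors_called;
} mini_lib;

/* Records the order in which "constructors" ran. */
static const char* init_log[MAX_LOG];
static size_t init_log_len = 0;

static void mini_call_constructors(mini_lib* lib) {
    assert(lib != NULL);
    if (lib->constructors_called) {
        return;  /* early out: already initialized or in progress */
    }
    /* Set the flag before recursing, otherwise a cycle would re-enter. */
    lib->constructors_called = 1;

    /* Dependencies first. */
    for (size_t i = 0; i < lib->needed_count; ++i) {
        mini_call_constructors(lib->needed[i]);
    }

    /* Then this library's own DT_INIT / DT_INIT_ARRAY, modelled as a log entry. */
    assert(init_log_len < MAX_LOG);
    init_log[init_log_len++] = lib->name;
}
```

With an app that needs libc, and libc and a debug library that need each other, calling `mini_call_constructors` on the app logs each library exactly once, with the app last.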

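5. The init and init_array functions

Overview

These functions are run automatically when a .so file is loaded and are used for initialization; init runs before the init_array functions. Because they run so early in the native layer, hooking them can be used to bypass some key detection checks.

Precisely because they run so early and only once per load, a hook installed directly on them after the library has been loaded arrives too late. From the flow above we know that both are invoked from CallConstructors, so the hook can be placed on the relevant native functions in the loading path instead, for example by hooking during the android_dlopen_ext call.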
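To see why code in init_array runs before anything else in a library gets a chance to be hooked, it helps to look at how such functions are written. With GCC and Clang, a function marked `__attribute__((constructor))` is registered as a load-time constructor (on ELF targets, through the init_array mechanism) and the loader calls it automatically. The sketch below is an illustration only: the function names and the `init_order` log are hypothetical, and the priorities 101 and 102 are arbitrary values chosen to make the order deterministic.

```c
#include <assert.h>

/* Records which constructor ran in which position. */
static int init_order[2];
static int init_count = 0;

/* Lower priority numbers run first; values up to 100 are reserved. */
__attribute__((constructor(101)))
static void early_environment_check(void) {
    assert(init_count < 2);
    init_order[init_count++] = 1;  /* e.g. where an anti-debug check might sit */
}

__attribute__((constructor(102)))
static void later_setup(void) {
    assert(init_count < 2);
    init_order[init_count++] = 2;
}
```

In an executable both functions have already run by the time main starts. In a shared library they run while the library is being loaded, that is inside the CallConstructors step discussed above, before dlopen returns to the caller.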


Last edited by xianyuuuan 5 days ago. Reason: original

#Fundamentals #ReverseEngineering #HookInjection
