Original link: https://darvincitech.wordpress.com/2020/03/01/yet-another-tamper-detection-in-android/
Android apps are signed with the developer's private key before they are uploaded to the Play Store. Every private key has an associated public certificate that devices and services use to verify that the app comes from a trusted source. App updates rely on the same mechanism: an update is accepted only if it is signed by the same developer who signed the already installed app.
However, it is trivial to obtain an APK that has been published on the Play Store. One can pull the APK from a rooted device, or download it from one of the handful of websites that crawl the Play Store and archive different versions of the published APKs. Once you have the APK, tampering with the application and repackaging it with your own certificate is a piece of cake, thanks to apktool and decompilers.
On the defensive side, application developers try to protect their app by checking at runtime whether the certificate that signed the running app is the one they originally signed it with. This is an effective mechanism, but its downside is the reliance on system APIs to get the application's signing certificate. In static analysis, the system APIs used to fetch the signing certificate show up and reveal where such checks are made, which leads to those defenses being patched out. In dynamic analysis, there are ready-made hooking scripts that can quickly bypass such checks. There are RASP (Runtime Application Self-Protection) providers who offer better mechanisms to protect against tampering, but I am not going to delve into that.
In this post, I am going to focus on doing the tamper checks in C in an effective way that reduces the dependency on system APIs. This is not completely new, as it has already been attempted in other open source projects.
These are the requirements I set for this project:
a. Verify the hash of the signing certificate by comparing it with a precomputed hash inside the C code.
b. Detect patching of the native code and of the precomputed hash, which is stored in the read-only section of the binary.
c. Detect dynamic tampering (hooking) of the native code.
Verifying hash of signing certificate in C
The signing certificate that is used to verify the APK signature is stored in CERT.RSA, CERT.EC, etc. Compute the hash of CERT.RSA with the openssl command openssl dgst -sha256 CERT.RSA and copy the output into the variable APK_SIGNER_HASH in native-lib.c. You only need to do this once, as long as you do not change the certificate attributes. At runtime, I use the open source libzip library to unzip the APK and read CERT.RSA, then compute the hash of that file with the mbedtls library and compare it with the APK_SIGNER_HASH embedded in the code. The ideal way to do this would be to parse the PKCS#7 structure, extract the public key and hash only that. I was a bit lazy on this part and chose to hash the entire file instead.
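The comparison step can be sketched as follows. This is a minimal illustration, not the PoC's code: it assumes the SHA-256 digest of CERT.RSA has already been computed (with mbedtls in the PoC) and that APK_SIGNER_HASH holds the hex string printed by openssl. The function name signer_hash_matches is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Convert one hex character to its value, or -1 if it is not a hex digit. */
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: returns 1 when `digest` (raw bytes) equals the
 * hex string `expected_hex`, and 0 on any mismatch or malformed input.
 * The loop does not stop at the first differing byte, so its running
 * time does not reveal where the mismatch is.
 */
int signer_hash_matches(const uint8_t *digest, size_t digest_len,
                        const char *expected_hex) {
    if (digest == NULL || expected_hex == NULL) return 0;
    if (strlen(expected_hex) != digest_len * 2) return 0;

    uint8_t diff = 0;
    for (size_t i = 0; i < digest_len; i++) {
        int hi = hex_value(expected_hex[2 * i]);
        int lo = hex_value(expected_hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return 0;   /* not a hex string */
        diff |= (uint8_t)(digest[i] ^ (uint8_t)((hi << 4) | lo));
    }
    return diff == 0;
}
```

With a 32-byte SHA-256 digest in a buffer, the call would look like signer_hash_matches(digest, 32, APK_SIGNER_HASH).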
Detect static and dynamic tampering of native code
Static tampering refers to patching the binary; dynamic tampering refers to hooking the native library at runtime.
Now, how would an adversary bypass the signing certificate check?
a. Patch the hash value with the hash of their own developer certificate.
b. Patch the code that compares the newly computed hash with the precomputed hash so that it always returns success.
c. Hook the instructions that compare the newly computed hash with the precomputed hash so that they always return success.
There can be other attacks, such as hooking the libc APIs used by libzip so that they point to a different RSA file. As discussed in my other blog post, disk-to-memory checks on the libc library can be effective against that, but they do not help against static tampering of my own native library.
I looked for solutions in open source and came across the way the OpenSSL FIPS and BoringSSL FIPS modules perform integrity checks on specific portions of the crypto library. In fact, these libraries have already solved the problem of static and dynamic tampering. Inspired by their approach, I set out to do the same for this project. My native code is just one C file (let's call it core); it depends on mbedtls for crypto (HMAC, SHA-256) and on the libzip static library for unzipping the APK file. I create marker files that mark the start and end of the text and rodata sections.
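A rough idea of what such marker files could contain is sketched below. All identifiers and file names here are hypothetical, and the two "files" are shown in one listing for brevity. The markers only bracket the other code if the start marker object comes first and the end marker object comes last on the link line, which the C code itself cannot enforce.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- marker_start.c (hypothetical): listed FIRST on the link line ---- */
const uint8_t rodata_start_marker[8] = {'R', 'O', '_', 'S', 'T', 'A', 'R', 'T'};
void text_start_marker(void) {}

/* ---- marker_end.c (hypothetical): listed LAST on the link line ---- */
const uint8_t rodata_end_marker[8] = {'R', 'O', '_', 'E', 'N', 'D', '_', '_'};
void text_end_marker(void) {}

/*
 * Used by core: number of bytes between two marker addresses.
 * Returns 0 when the markers are misordered, so a caller can treat
 * an unexpected layout as a failure instead of hashing garbage.
 */
size_t region_size(const void *start, const void *end) {
    uintptr_t s = (uintptr_t)start;
    uintptr_t e = (uintptr_t)end;
    return e > s ? (size_t)(e - s) : 0;
}
```

In core, the rodata span would then be region_size(rodata_start_marker, rodata_end_marker). For the text span the function addresses have to be cast to object pointers, which ISO C does not guarantee but which works on the POSIX-style targets Android uses.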
In the core, compute the HMAC over the text and rodata sections and compare it with the precomputed values. Wait! How do I know the HMAC of the text and rodata sections before the binary is built? This brings us to an additional post-link step that injects the HMACs after the shared library is built. For now, the HMAC key, the text HMAC and the rodata HMAC are declared as const variables (not static const) that hold dummy values. With this setup, build the markers, core and mbedtls as object files. When linking, put the start marker first, followed by core, mbedtls and the libzip static library, and put the end marker last. The final shared library then shows the markers sandwiching the other objects in both the text and rodata sections. Note the addresses of text start and text end in the nm output below; they give the span of the text section.
Figure: output of nm -D libnative-lib.so, showing the addresses of the text start and text end markers.
In the rodata section, the rodata HMAC is placed before the rodata start marker, because the computed HMAC cannot lie inside the region it covers.
Figure: output of objdump -s libnative-lib.so at the rodata start marker and at the rodata end marker.
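The runtime check described above can be sketched like this. It is an illustration under assumptions, not the PoC's implementation: the MAC is passed in as a callback so that the listing stays independent of any library, whereas the PoC uses HMAC-SHA256 from mbedtls. The names mac_fn, region_is_intact, toy_mac and demo_detects_patch are hypothetical, and toy_mac is a non-cryptographic stand-in that exists only to make the demo runnable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 32  /* size of an HMAC-SHA256 tag */

/* Hypothetical callback type standing in for the real HMAC routine. */
typedef void (*mac_fn)(const uint8_t *key, size_t key_len,
                       const uint8_t *data, size_t data_len,
                       uint8_t out[MAC_LEN]);

/* Compare two buffers without stopping at the first differing byte. */
static int ct_equal(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++) diff |= (uint8_t)(a[i] ^ b[i]);
    return diff == 0;
}

/* Returns 1 if the MAC over [start, end) equals `expected`, else 0. */
int region_is_intact(mac_fn mac, const uint8_t *key, size_t key_len,
                     const uint8_t *start, const uint8_t *end,
                     const uint8_t *expected) {
    if (mac == NULL || key == NULL || expected == NULL) return 0;
    if ((uintptr_t)end <= (uintptr_t)start) return 0;  /* bad markers */

    uint8_t actual[MAC_LEN];
    size_t len = (size_t)((uintptr_t)end - (uintptr_t)start);
    mac(key, key_len, start, len, actual);
    return ct_equal(actual, expected, MAC_LEN);
}

/* NOT cryptographic: an FNV-1a style checksum used only for the demo. */
static void toy_mac(const uint8_t *key, size_t key_len,
                    const uint8_t *data, size_t data_len,
                    uint8_t out[MAC_LEN]) {
    uint32_t acc = 2166136261u;
    for (size_t i = 0; i < key_len; i++) acc = (acc ^ key[i]) * 16777619u;
    for (size_t i = 0; i < data_len; i++) acc = (acc ^ data[i]) * 16777619u;
    memset(out, 0, MAC_LEN);
    out[0] = (uint8_t)(acc >> 24);
    out[1] = (uint8_t)(acc >> 16);
    out[2] = (uint8_t)(acc >> 8);
    out[3] = (uint8_t)acc;
}

/* Demo: the check passes on an untouched buffer and fails after a patch. */
int demo_detects_patch(void) {
    static const uint8_t key[4] = {1, 2, 3, 4};
    uint8_t region[64];
    uint8_t expected[MAC_LEN];

    for (size_t i = 0; i < sizeof region; i++) region[i] = (uint8_t)i;
    toy_mac(key, sizeof key, region, sizeof region, expected); /* "post-link" value */

    int before = region_is_intact(toy_mac, key, sizeof key,
                                  region, region + sizeof region, expected);
    region[10] ^= 0x01;  /* simulate a one-byte patch */
    int after = region_is_intact(toy_mac, key, sizeof key,
                                 region, region + sizeof region, expected);
    return before == 1 && after == 0;
}
```

In the real library, start and end would be the marker addresses, and key and expected would be the const variables that the post-link step fills in.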
The work is only half done at this point. In the post-link step, the HMAC needs to be computed over the text and rodata sections. HMAC is used because it provides both integrity and authenticity of the data. The code that injects the HMAC is written in Go, primarily because I wanted to learn Go and because I had a reference implementation in BoringSSL FIPS. In the injecthash Go file, a random HMAC key is generated. With this key, the HMAC is computed over the text section (the span from text start to text end) and over the rodata section (the span from rodata start to rodata end). The HMAC key, the text HMAC and the rodata HMAC are then injected into the native library. Finally, the strip step removes the const symbols in rodata. Voila! The binary is ready to detect tampering.
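To make the injection idea concrete, here is a small sketch in C (the actual tool in the PoC is written in Go). The post does not say how the tool locates the dummy values inside the library, so this sketch assumes one possible approach: each dummy value is a unique byte pattern that is searched for in the file image and overwritten in place. The names find_unique, inject_value and demo_inject are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the offset of the only occurrence of `needle` in `image`,
 * or -1 if it is absent or occurs more than once. Refusing ambiguous
 * matches avoids overwriting unrelated bytes that happen to look like
 * the placeholder.
 */
long find_unique(const uint8_t *image, size_t image_len,
                 const uint8_t *needle, size_t needle_len) {
    if (needle_len == 0 || image_len < needle_len) return -1;
    long found = -1;
    for (size_t i = 0; i + needle_len <= image_len; i++) {
        if (memcmp(image + i, needle, needle_len) == 0) {
            if (found >= 0) return -1;  /* second match: ambiguous */
            found = (long)i;
        }
    }
    return found;
}

/* Overwrite the placeholder with the real value. Returns 0 on success. */
int inject_value(uint8_t *image, size_t image_len,
                 const uint8_t *placeholder, const uint8_t *value,
                 size_t len) {
    long off = find_unique(image, image_len, placeholder, len);
    if (off < 0) return -1;
    memcpy(image + off, value, len);
    return 0;
}

/* Demo on a tiny fake "library image" with a 4-byte placeholder. */
int demo_inject(void) {
    uint8_t image[] = {'t', 'e', 'x', 't', 0xAA, 0xAA, 0xAA, 0xAA, 'e', 'n', 'd'};
    const uint8_t placeholder[4] = {0xAA, 0xAA, 0xAA, 0xAA};
    const uint8_t mac[4] = {0xDE, 0xAD, 0xBE, 0xEF};

    if (inject_value(image, sizeof image, placeholder, mac, 4) != 0) return 0;
    /* The value landed at offset 4 and the neighbouring bytes are intact. */
    return memcmp(image + 4, mac, 4) == 0 && image[3] == 't' && image[8] == 'e';
}
```

A real tool would read the whole .so into memory, inject the key, compute the two HMACs over the marked spans, inject those, and write the file back.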
Performance
Computing the HMAC over the text and rodata sections is expensive, especially when the binary is several MB in size. For huge binaries, the markers can be adjusted so that they protect only the sensitive portions of the code. The code that verifies the HMAC can also be placed so that it runs only during sensitive operations.
Proof of Concept
A PoC for the above mechanisms is available in my GitHub project. The same concept of self-integrity checks can be adapted to any application with a similar binary structure. The PoC can be improved by adding obfuscation, reducing the dependency on libc APIs, and much more. However, it has to be noted that this code can be tampered with as well; it just takes more time and effort than bypassing the existing signer certificate hash checks done in Java. After all, security is all about delaying an attack.