Josef Ressel Center for User-friendly Secure Mobile Environments
Android Security Symposium 2017
Statistical deobfuscation of Android applications
About the speaker
Petar Tsankov
Abstract
In this talk, I will present DeGuard (www.apk-deguard.com), a new system for
deobfuscating Android APKs based on probabilistic learning from large code bases.
DeGuard learns a probabilistic model from thousands of non-obfuscated Android
applications and uses this model to deobfuscate new, unseen Android APKs.
DeGuard effectively reverses the process of layout obfuscation, the most common
obfuscation mechanism for Android applications, which renames key program elements
such as classes, packages, and methods, thus making it difficult to understand what
the application does.
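To make the effect of layout obfuscation concrete, here is a minimal Python sketch
of the renaming step. It is a simplified illustration and not ProGuard's actual
algorithm: the function names, the sample identifiers, and the flat naming scheme
are assumptions made for this example, and the kept name stands in for elements
that an obfuscator must leave unchanged (for instance, methods that override
framework methods).

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as meaningless replacement names."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def obfuscate_layout(identifiers, keep=()):
    """Map each identifier to a short name; identifiers in `keep` stay as-is.

    Hypothetical helper for illustration only. A real obfuscator also
    rewrites every reference and typically reuses names across scopes.
    """
    names = short_names()
    mapping = {}
    for ident in identifiers:
        mapping[ident] = ident if ident in keep else next(names)
    return mapping


# Sample identifiers are invented for this sketch.
mapping = obfuscate_layout(
    ["DatabaseHelper", "openConnection", "userPassword", "onCreate"],
    keep={"onCreate"},
)
for original, renamed in mapping.items():
    print(f"{original} -> {renamed}")
```

The descriptive names disappear while program behavior is unchanged. Recovering
likely original names from the renamed program is the task DeGuard addresses.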
To make this possible, DeGuard casts the layout deobfuscation problem for Android
APKs as a structured prediction problem in a probabilistic graphical model. I will describe
DeGuard's probabilistic model, along with the rich set of features and constraints
for Android that ensure both semantic equivalence and high prediction accuracy. I
will present experiments that demonstrate that DeGuard is useful in practice: it
recovers 79.1% of the program element names obfuscated with ProGuard, it predicts
third-party libraries with an accuracy of 91.3%, and it reveals string decoders and
classes that handle sensitive data in Android malware.
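
The idea of predicting names jointly, subject to constraints, can be sketched in a
few lines of Python. This is a toy illustration under stated assumptions, not
DeGuard's model or inference procedure: the weights, relation labels, candidate
names, and the brute-force search are all invented for this example. In a learned
system the weights would come from training on non-obfuscated applications.

```python
import itertools

# Hypothetical pairwise weights: (name, relation, neighbor name) -> score.
WEIGHTS = {
    ("db", "field-type", "SQLiteDatabase"): 2.0,
    ("cursor", "field-type", "SQLiteDatabase"): 0.2,
    ("db", "receiver-of", "rawQuery"): 1.5,
    ("cursor", "assigned-from", "rawQuery"): 1.8,
    ("db", "assigned-from", "rawQuery"): 0.1,
}

CANDIDATES = ["db", "cursor"]   # names the model may predict
UNKNOWN = ["a", "b"]            # obfuscated elements to rename

# Edges link an obfuscated element to a known (non-obfuscated) name.
EDGES = [
    ("a", "field-type", "SQLiteDatabase"),
    ("a", "receiver-of", "rawQuery"),
    ("b", "assigned-from", "rawQuery"),
]

# Constraint: two fields of the same class must not share a name,
# otherwise the renamed program would no longer be equivalent.
MUST_DIFFER = [("a", "b")]


def score(assignment):
    """Sum the weights of all edges under a candidate naming."""
    total = 0.0
    for node, relation, other in EDGES:
        other_name = assignment.get(other, other)
        total += WEIGHTS.get((assignment[node], relation, other_name), 0.0)
    return total


def predict():
    """Return the highest-scoring naming that satisfies all constraints."""
    best, best_score = None, float("-inf")
    for names in itertools.product(CANDIDATES, repeat=len(UNKNOWN)):
        assignment = dict(zip(UNKNOWN, names))
        if any(assignment[x] == assignment[y] for x, y in MUST_DIFFER):
            continue  # skip namings that violate a constraint
        current = score(assignment)
        if current > best_score:
            best, best_score = assignment, current
    return best, best_score


best, best_score = predict()
print(best, round(best_score, 2))
```

Exhaustive search is only feasible for a toy graph like this one. The point of the
sketch is that names are chosen together rather than one at a time, and that
constraints rule out namings that would change the program's meaning.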