Android Security Symposium 2017

Statistical deobfuscation of Android applications

About the speaker

Petar Tsankov

ETH Zurich, Software Reliability Lab, Zurich, Switzerland
Petar Tsankov is a security researcher at the Software Reliability Lab at ETH Zurich. The goal of his research is to make it easier for developers who are not security experts to build secure and reliable systems. Towards this goal, he combines novel techniques from program synthesis, machine learning, and probabilistic programming to build new practical systems that solve important problems in information security.

Abstract

In this talk, I will present DeGuard (www.apk-deguard.com),
a new system for deobfuscating Android APKs based on probabilistic learning from large
code bases. DeGuard learns a probabilistic model over thousands of non-obfuscated
Android applications and uses this model to deobfuscate new, unseen Android APKs.
DeGuard effectively reverses the process of layout obfuscation, the most common
obfuscation mechanism for Android applications, which renames key program elements
such as classes, packages, and methods, thus making it difficult to understand what
the application does.

To make this possible, DeGuard phrases the layout deobfuscation problem of Android
APKs as a structured prediction problem in a probabilistic graphical model. I will describe
DeGuard's probabilistic model, along with the rich set of features and constraints
for Android that ensure both semantic equivalence and high prediction accuracy. I
will present experiments that demonstrate that DeGuard is useful in practice: it
recovers 79.1% of the program element names obfuscated with ProGuard, it predicts
third-party libraries with an accuracy of 91.3%, and it reveals string decoders and
classes that handle sensitive data in Android malware.
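
To give a feel for what "structured prediction in a probabilistic graphical model" can mean here, the following is a minimal toy sketch, not DeGuard's actual implementation. It assumes two obfuscated fields (`a` and `b`), one known Android API method (`rawQuery`) whose name is not renamed, a hand-written table of feature weights standing in for a learned model, and a single constraint that fields in the same class must get distinct names. All names, relations, and weights are hypothetical, and the exhaustive search is only workable for tiny examples.

```python
from itertools import product

# Hypothetical "learned" weights: (name, relation, name) -> score.
# In a real system these would be estimated from non-obfuscated apps.
WEIGHTS = {
    ("db", "receiver-of", "rawQuery"): 0.9,
    ("cursor", "assigned-from", "rawQuery"): 0.8,
    ("db", "field-in-same-class", "cursor"): 0.4,
    ("cursor", "receiver-of", "rawQuery"): 0.1,
    ("db", "assigned-from", "rawQuery"): 0.1,
}

# Candidate names the toy model may assign to obfuscated elements.
CANDIDATES = ["db", "cursor"]

# Unknown nodes are obfuscated elements; "rawQuery" is a known node.
UNKNOWN = ["a", "b"]
EDGES = [
    ("a", "receiver-of", "rawQuery"),
    ("b", "assigned-from", "rawQuery"),
    ("a", "field-in-same-class", "b"),
]

# Constraint (assumed): fields of the same class need distinct names,
# otherwise the renamed program would no longer be equivalent.
DISTINCT = [("a", "b")]


def score(assignment):
    """Sum the weights of all edges under a candidate naming."""
    def name(node):
        # Known nodes keep their own name.
        return assignment.get(node, node)
    return sum(WEIGHTS.get((name(x), rel, name(y)), 0.0)
               for x, rel, y in EDGES)


def satisfies_constraints(assignment):
    return all(assignment[x] != assignment[y] for x, y in DISTINCT)


def map_inference():
    """Brute-force search for the highest-scoring valid naming."""
    best, best_score = None, float("-inf")
    for names in product(CANDIDATES, repeat=len(UNKNOWN)):
        assignment = dict(zip(UNKNOWN, names))
        if not satisfies_constraints(assignment):
            continue
        s = score(assignment)
        if s > best_score:
            best, best_score = assignment, s
    return best, best_score


if __name__ == "__main__":
    print(map_inference())
```

In this toy setting the search prefers naming `a` as `db` and `b` as `cursor`, because that joint assignment agrees with all three weighted relations at once. The point of the sketch is that names are predicted jointly rather than one element at a time, and that constraints rule out namings that would break the program.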

Slides

Get the slides here.

Video