As Android malware grows and evolves, deep learning has been introduced into malware detection, resulting in great effectiveness. Recent work is considering hybrid models and multi-view learning. However, they use only simple features, limiting the accuracy of these approaches in practice. This study proposes DeepCatra, a multi-view learning approach for Android malware detection, whose model consists of a bidirectional LSTM (BiLSTM) and a graph neural network (GNN) as subnets. The two subnets rely on features extracted from statically computed call traces leading to critical APIs derived from public vulnerabilities. For each Android app, DeepCatra first constructs its call graph and computes call traces reaching critical APIs. Then, temporal opcode features used by the BiLSTM subnet are extracted from the call traces, while flow graph features used by the GNN subnet are constructed from all call traces and inter-component communications. We evaluate the effectiveness of DeepCatra by comparing it with several state-of-the-art detection approaches. Experimental results on over 18,000 real-world apps and prevalent malware show that DeepCatra achieves considerable improvement, for example, 2.7%–14.6% on the F1 measure, which demonstrates the feasibility of DeepCatra in practice.
Read full abstract