Malware Instances Research Articles

We study malware open-set recognition (MOSR), the challenging task of recognizing malware from both known and novel, unknown families. Previous work usually assumes a closed-set scenario in which every malware family is known to the classifier, i.e., the testing families are a subset of, or at most identical to, the training families. However, novel malware families frequently emerge in real-world applications, so malware instances must be recognized in an open-set scenario in which the test set also contains unknown families. This setting has received little thorough investigation in the cyber-security domain. One practical solution for MOSR is to jointly classify known families and detect unknown ones with a single classifier (e.g., a neural network), using the variance of the predicted probability distribution over known families. However, conventional well-trained classifiers tend to output overly high recognition probabilities, especially when the feature distributions of instances are similar to each other, e.g., unknown versus known malware families, and this dramatically degrades the recognition of novel unknown families. To address this problem and construct an applicable MOSR system, we propose a novel model that conservatively synthesizes malware instances to mimic unknown malware families and support more robust training of the classifier. More specifically, we build upon generative adversarial networks to explore and obtain marginal malware instances that lie close to known families yet fall into the mimicked unknown ones. These instances guide the classifier to lower and flatten the recognition probabilities of unknown families and relatively raise those of known ones, rectifying both classification and detection performance. A cooperative training scheme involving classification, synthesis, and rectification is further constructed to facilitate training and jointly improve model performance. Moreover, we build a new large-scale malware dataset, named MAL-100, to fill the gap left by the lack of a large open-set malware benchmark. Experimental results on two widely used malware datasets and our MAL-100 demonstrate the effectiveness of our model compared with other representative methods.
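
The single-classifier baseline mentioned in this abstract, which rejects an instance as unknown based on the variance of its predicted probability distribution, might look roughly like the sketch below. This is an illustrative assumption, not the authors' model: the function names, the `UNKNOWN` label, and the threshold value of 0.05 are all hypothetical.

```python
import math

UNKNOWN = -1  # hypothetical label for "not one of the known families"


def softmax(logits):
    """Turn raw classifier scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def variance(probs):
    """Population variance of the predicted probabilities."""
    mean = sum(probs) / len(probs)
    return sum((p - mean) ** 2 for p in probs) / len(probs)


def recognize(logits, var_threshold=0.05):
    """Return a known-family index, or UNKNOWN for a flat distribution.

    A flat (low-variance) distribution means no known family stands out,
    so the instance is treated as coming from an unknown family.
    The threshold is an assumed value that would need tuning on real data.
    """
    probs = softmax(logits)
    if variance(probs) < var_threshold:
        return UNKNOWN
    return max(range(len(probs)), key=probs.__getitem__)


confident = recognize([8.0, 0.5, 0.2])  # one family clearly dominates
ambiguous = recognize([1.0, 1.1, 0.9])  # nearly uniform scores
```

The abstract's point is that an over-confident classifier also produces peaked distributions for unknown families, so a rule like this one rarely fires unless training explicitly flattens those outputs.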


Android, the most popular mobile operating system, has attracted millions of users around the world. Meanwhile, the number of new Android malware instances has grown exponentially in recent years. On the one hand, existing Android malware detection systems have shown that distilling program semantics into a graph representation and detecting malicious programs by graph matching can achieve high accuracy. However, these traditional graph-based approaches always perform expensive program analysis and scale poorly. On the other hand, social network analysis is highly scalable and has therefore been applied to large-scale malware detection. However, the social-network-analysis-based method considers only simple semantic information (i.e., centrality) to achieve market-wide mobile malware scanning, which may limit detection effectiveness when benign apps behave similarly to malware. In this article, we aim to combine the high accuracy of the traditional graph-based method with the high scalability of the social-network-analysis-based method for Android malware detection. Instead of using traditional heavyweight static analysis, we treat function call graphs of apps as complex social networks and apply social-network-based centrality analysis to unearth the central nodes within call graphs. After obtaining the central nodes, we compute the average intimacies between sensitive API calls and central nodes to represent the semantic features of the graphs. We implement our approach in a tool called IntDroid and evaluate it on a dataset of 3,988 benign samples and 4,265 malicious samples. Experimental results show that IntDroid detects Android malware with an F-measure of 97.1% while maintaining a true-positive rate of 99.1%. Although IntDroid is not as scalable as a social-network-analysis-based method (i.e., MalScan), it is more than six times faster than MaMaDroid, a traditional graph-based method. Moreover, in a corpus of apps collected from the Google Play market, IntDroid identifies 28 zero-day malware samples that evade detection by existing tools, one of which has been downloaded and installed by more than ten million users. This app has also been flagged as malware by six anti-virus scanners on VirusTotal, one of which is Symantec Mobile Insight.
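
The pipeline described in this abstract (find central nodes in the call graph, then average an intimacy score between each sensitive API call and those nodes) can be sketched as follows. The abstract does not define intimacy, so this sketch substitutes an assumed stand-in, `1 / (1 + shortest-path distance)`, and uses plain degree centrality; the toy call graph, the function names, and `top_k` are likewise hypothetical and not taken from IntDroid.

```python
from collections import deque


def degree_centrality(graph):
    """Degree centrality (in-degree plus out-degree) for a directed call graph."""
    nodes = set(graph) | {c for callees in graph.values() for c in callees}
    degree = {n: 0 for n in nodes}
    for caller, callees in graph.items():
        for callee in callees:
            degree[caller] += 1
            degree[callee] += 1
    scale = max(len(nodes) - 1, 1)
    return {n: d / scale for n, d in degree.items()}


def undirected(graph):
    """Symmetric adjacency view, so distance ignores call direction."""
    adj = {}
    for caller, callees in graph.items():
        adj.setdefault(caller, set())
        for callee in callees:
            adj[caller].add(callee)
            adj.setdefault(callee, set()).add(caller)
    return adj


def distances_from(adj, source):
    """Breadth-first search: hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def average_intimacy_features(graph, sensitive_apis, top_k=2):
    """One feature per sensitive API: mean assumed intimacy to the central nodes."""
    centrality = degree_centrality(graph)
    central = sorted(centrality, key=lambda n: (-centrality[n], n))[:top_k]
    adj = undirected(graph)
    features = {}
    for api in sensitive_apis:
        dist = distances_from(adj, api) if api in adj else {}
        # Assumed intimacy: closer nodes score higher, unreachable nodes score 0.
        scores = [1.0 / (1 + dist[c]) if c in dist else 0.0 for c in central]
        features[api] = sum(scores) / len(scores) if scores else 0.0
    return features


# Toy call graph (hypothetical): caller -> set of callees.
call_graph = {
    "onCreate": {"initUi", "startSync"},
    "startSync": {"sendTextMessage", "getDeviceId"},
    "initUi": set(),
}
features = average_intimacy_features(
    call_graph, ["sendTextMessage", "getDeviceId", "getLine1Number"]
)
```

The resulting dictionary is a fixed-length feature vector keyed by sensitive API, which a standard classifier could consume. An API the app never calls, such as `getLine1Number` here, simply contributes a zero.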



Related Topics

Articles published on Malware Instances

Towards a semi-automatic classifier of malware through tweets for early warning threat detection

Malware traffic detection based on type II fuzzy recognition

DEF: Deep Ensemble Neural Network Classifier for Android Malware Detection

Malicious Code Detection Using Machine Learning

Securing Cyberspace: Exploring the Efficacy of SVM (Poly, Sigmoid) and ANN in Malware Analysis

EvadeDroid: A practical evasion attack on machine learning for black-box Android malware detection

A novel Android malware detection method with API semantics extraction

Malware Detection

A Survey of Malware Analysis Using Community Detection Algorithms

XMal: A lightweight memory-based explainable obfuscated-malware detector

Malware API Calls Detection Using Hybrid Logistic Regression and RNN Model

Conservative Novelty Synthesizing Network for Malware Recognition in an Open-Set Scenario

Attention-Based Cross-Modal CNN Using Non-Disassembled Files for Malware Classification

An inception V3 approach for malware classification using machine learning and transfer learning

Lightweight, Effective Detection and Characterization of Mobile Malware Families

IntDroid

CHybriDroid: A Machine Learning-Based Hybrid Technique for Securing the Edge Computing

Two Anatomists Are Better than One—Dual-Level Android Malware Detection

A Multilabel Fuzzy Relevance Clustering System for Malware Attack Attribution in the Edge Layer of Cyber-Physical Networks

A malware variants detection methodology with an opcode-based feature learning method and a fast density-based clustering algorithm
