Abstract

Attackers increasingly take advantage of naive users who treat non-executable files casually, as if they were benign. Such users often open non-executable files even though these files can conceal and perform malicious operations. Existing defensive solutions used by organizations prevent executable files from entering organizational networks via web browsers or email messages. Consequently, recent advanced persistent threat attacks tend to leverage non-executable files such as portable document format (PDF) documents, which organizations use daily. Machine learning (ML) methods have recently been applied to detect malicious PDF files; however, these techniques lack an essential element: they cannot be efficiently updated daily. In this study we present an active learning (AL) based framework, specifically designed to help anti-virus vendors focus their analytical efforts on acquiring novel malicious content. This focus is accomplished by identifying and acquiring both new PDF files that are most likely malicious and informative benign PDF documents. These files are used for retraining and enhancing the knowledge stores of both the detection model and the anti-virus. We propose two AL-based methods: exploitation and combination. Our methods are evaluated over 10 days and compared to an existing AL method (SVM-margin) and to random sampling. The results indicate that on the last day of the experiment, combination outperformed all of the other methods, enriching the signature repository of the anti-virus with almost seven times more new malicious PDF files, while improving the detection model’s capabilities further each day. At the same time, it reduces security experts’ efforts by 75%. Despite this significant reduction, the results also indicate that our framework detects new malicious PDF files better than leading anti-virus tools commonly used by organizations for protection against malicious PDF files.
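The difference between selecting files near the decision boundary and selecting files that are most likely malicious can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each unlabeled PDF already has a signed classifier score (positive meaning the malicious side, with magnitude standing in for distance from the boundary), and the function names, file names, and scores are all hypothetical. The combination method is not sketched because the abstract does not define it.

```python
from typing import Dict, List


def select_margin(scores: Dict[str, float], budget: int) -> List[str]:
    """Uncertainty-style selection: pick the files whose score is closest
    to the decision boundary (smallest absolute value)."""
    return sorted(scores, key=lambda name: abs(scores[name]))[:budget]


def select_most_likely_malicious(scores: Dict[str, float], budget: int) -> List[str]:
    """Exploitation-style selection (assumed reading of the abstract): pick
    the files scored most confidently on the malicious side."""
    malicious_side = [name for name in scores if scores[name] > 0]
    return sorted(malicious_side, key=lambda name: scores[name], reverse=True)[:budget]


# Hypothetical signed scores for five unlabeled files (positive = malicious side).
scores = {"a.pdf": 2.4, "b.pdf": -0.1, "c.pdf": 0.3, "d.pdf": -1.9, "e.pdf": 1.1}

margin_pick = select_margin(scores, 2)                 # files the model is least sure about
exploit_pick = select_most_likely_malicious(scores, 2) # files most likely to be malicious
print(margin_pick, exploit_pick)
```

With a fixed daily labeling budget, the two strategies send different files to the human expert: the first favors files that are informative for the model, the second favors files likely to yield new malicious samples.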

Highlights

  • Cyber-attacks aimed at organizations have increased since 2009, with 91% of all organizations hit by cyber-attacks in 2013 [1]. Attacks aimed at organizations usually include harmful activities such as stealing confidential information, spying on and monitoring an organization, and disrupting an organization’s actions. [1] http://www.humanipo.com/news/37983/91-of-organisations-hit-by-cyber attacks-in-2013/

  • Before we provide a review of existing techniques and known methods of attack, it is worthwhile to mention that Adobe Reader version X, released in 2011, offers a new feature called Protected Mode Adobe Reader (PMAR)

  • The number of new malicious portable document format (PDF) files is 128, since the initial detection model was trained on an initial set of 574 labeled PDF files that contained 128 malicious files


Summary

Introduction

Cyber-attacks aimed at organizations have increased since 2009, with 91% of all organizations hit by cyber-attacks in 2013 [1]. Attacks aimed at organizations usually include harmful activities such as stealing confidential information, spying on and monitoring an organization, and disrupting an organization’s actions. Email has become a very attractive platform from which to initiate cyber-attacks against organizations. Attackers often use social engineering to encourage recipients to click a link or open a malicious web page or attachment. Before we provide a review of existing techniques and known methods of attack, it is worthwhile to mention that Adobe Reader version X, released in 2011, offers a new feature called Protected Mode Adobe Reader (PMAR). Protected mode uses a sandbox technique to create an isolated environment in which the Acrobat Reader rendering agent runs while reading a PDF file. Most organizations, however, are not up to date with the newest versions of software, including PDF readers, so they remain exposed to many well-known attacks that exploit vulnerabilities in previous versions of Adobe Reader. In order to explain how PDF files can be exploited when created or manipulated by an attacker, we first describe the structure of a viable PDF file.

[1] http://www.humanipo.com/news/37983/91-of-organisations-hit-by-cyber attacks-in-2013/

