Abstract

Malware detection is a paramount priority in preventing malware attacks. Detection methods fall into three categories: static analysis, dynamic analysis, and hybrids of the two. Static analysis is fast and effective at detecting previously seen malware, whereas dynamic analysis can be more accurate and more robust against zero-day or polymorphic malware, but at the cost of a high computational load. That load translates into an often-prohibitive dollar cost for the server farm needed to handle all incoming traffic at an organization’s network entry point. Most modern defenses use a hybrid approach, combining static and dynamic analysis to maximize their chances of detecting malware. However, current hybrid approaches are suboptimal. We propose a two-phase hybrid detection tool that exploits the strengths of both kinds of analysis while minimizing their weaknesses. The first phase is a static tool, which we call a “static-hybrid” tool, that uses machine learning and static analysis to sort incoming programs into three buckets: definitely benign, definitely malicious, and needs further analysis. Only the small fraction of programs in the third bucket is run on the dynamic analyzer. Our system approaches the accuracy of a dynamic-only system at only a small fraction of its computational cost, while maintaining real-time detection timeliness similar to that of a static-only system, thus achieving the best of both approaches. A key feature of our system is that the first (static) phase can run in active mode, i.e., it blocks malware in real time, which is possible because it mistakenly blocks benign programs as malicious at a rate of only 0.08% (all results are for our salient configuration). The second (dynamic) phase runs in passive mode, i.e., it sends alerts for suspected malware without blocking it, and has a higher false positive rate of 0.75%.
The first phase blocks 88.98% of malware, and the second phase raises the overall detection rate to 98.73%. Because alerts are generated only for the small fraction of malware that is missed by the first phase but caught by the second, our system reduces alerts by 9.5X compared with any highly accurate system running by itself in the passive mode typically seen in practice. Because only the 3.63% of programs that need further analysis are sent to the second phase, the computational load for dynamic analysis is reduced by 100/3.63 = 27.5X.
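
The three-bucket routing and the load-reduction arithmetic described above can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the idea of a single static maliciousness score in [0, 1], and the two threshold values are all assumptions introduced here. The only figure taken from the abstract is the 3.63% of programs forwarded to the dynamic phase.

```python
from enum import Enum


class Verdict(Enum):
    BENIGN = "definitely benign"
    MALICIOUS = "definitely malicious"
    NEEDS_DYNAMIC = "needs further analysis"


def triage(static_score: float,
           benign_max: float = 0.05,
           malicious_min: float = 0.95) -> Verdict:
    """Route a program by a hypothetical static-phase score in [0, 1].

    The thresholds are placeholders, not values from the paper.
    """
    if not 0.0 <= benign_max < malicious_min <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= benign_max < malicious_min <= 1")
    if static_score <= benign_max:
        return Verdict.BENIGN          # allowed through, no dynamic analysis
    if static_score >= malicious_min:
        return Verdict.MALICIOUS       # blocked in real time (active mode)
    return Verdict.NEEDS_DYNAMIC       # forwarded to the dynamic analyzer


def dynamic_load_reduction(fraction_forwarded: float) -> float:
    """Factor by which dynamic-analysis load shrinks when only a fraction
    of all incoming programs reaches the second phase."""
    if not 0.0 < fraction_forwarded <= 1.0:
        raise ValueError("fraction_forwarded must be in (0, 1]")
    return 1.0 / fraction_forwarded


if __name__ == "__main__":
    for score in (0.01, 0.50, 0.99):
        print(score, triage(score).value)
    # 3.63% of programs forwarded, as reported in the abstract.
    print(round(dynamic_load_reduction(0.0363), 1))
```

Widening the gap between the two thresholds would send more programs to the dynamic analyzer and trade computational cost for accuracy; the abstract does not state how the authors chose their operating point.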
