Hash-AV: fast virus signature scanning by cache-resident filters
Fast virus scanning is becoming increasingly important in today's Internet. While Moore's law continues to double CPU cycle speed, virus scanning applications fail to ride on the performance wave due to their frequent random memory accesses. This paper proposes Hash-AV, a virus scanning "booster" technique that aims to take advantage of improvements in CPU performance. Using a set of hash functions and a Bloom filter array that fits in CPU second-level (L2) caches, Hash-AV determines the majority of "no-match" cases without accesses to main memory. Experiments show that Hash-AV improves the performance of the open-source virus scanner Clam-AV by a factor of 2.5 to 10. The key to Hash-AV's success lies in a set of "bad but cheap" hash functions that are used as initial hashes. The speed of Hash-AV makes it well suited for "on-access" virus scanning, providing greater protection to the user. By intercepting system calls and wrapping glibc libraries, we have implemented an "on-access" version of Hash-AV+Clam-AV. The on-access scanner can examine input data at a throughput of over 200 Mb/s, making it suitable for network-based virus scanning.
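The filtering idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `CacheResidentPrefilter`, the 8-byte window, the 8 KiB bit array, and both hash functions are assumptions chosen for illustration, and a Python dictionary stands in for the full Clam-AV scanner that would normally live in main memory. The point it shows is the ordering: a very cheap hash is tested first, a stronger hash only if the cheap one hits, and the exact check only if both hit.

```python
import hashlib


class CacheResidentPrefilter:
    """Two-stage Bloom-style prefilter in front of an exact signature check."""

    def __init__(self, signatures, num_bits=1 << 16, window=8):
        if num_bits % 8:
            raise ValueError("num_bits must be a multiple of 8")
        self.window = window
        self.num_bits = num_bits
        # Small bit array (8 KiB by default) meant to stay cache-resident.
        self.bits = bytearray(num_bits // 8)
        # Hypothetical stand-in for the full scanner's signature database.
        self.exact = {}
        for sig in signatures:
            if len(sig) < window:
                raise ValueError("signature shorter than the window")
            frag = sig[:window]
            self._set(self._cheap(frag))
            self._set(self._strong(frag))
            self.exact.setdefault(frag, []).append(sig)

    def _cheap(self, frag):
        # "Bad but cheap" (assumed form): looks only at the first four bytes.
        return int.from_bytes(frag[:4], "little") % self.num_bits

    def _strong(self, frag):
        # Better-mixing hash, computed only when the cheap test passes.
        digest = hashlib.blake2b(frag, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.num_bits

    def _set(self, pos):
        self.bits[pos >> 3] |= 1 << (pos & 7)

    def _test(self, pos):
        return bool(self.bits[pos >> 3] & (1 << (pos & 7)))

    def scan(self, data):
        """Return (offset, signature) pairs confirmed by the exact check."""
        hits = []
        for i in range(len(data) - self.window + 1):
            frag = data[i:i + self.window]
            if not self._test(self._cheap(frag)):
                continue  # most "no-match" windows should stop here
            if not self._test(self._strong(frag)):
                continue
            # Only now consult the exact (main-memory) structure.
            for sig in self.exact.get(frag, ()):
                if data.startswith(sig, i):
                    hits.append((i, sig))
        return hits
```

Because every filter hit is verified against the exact table, false positives in the bit array cost time but never produce a wrong report; `CacheResidentPrefilter([b"EVILCODE1234"]).scan(data)` returns only offsets where the full signature is present.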
- Research Article
61
- 10.1504/ijsn.2007.012824
- Jan 1, 2007
- International Journal of Security and Networks
Fast virus scanning is becoming increasingly important in today's internet. While Moore's law continues to double CPU cycle speed, virus scanning applications fail to ride on the performance wave due to their frequent random memory accesses. This paper proposes Hash-AV, a virus scanning 'booster' technique that aims to take advantage of improvements in CPU performance. Using a set of hash functions and a Bloom filter array that fits in CPU second-level (L2) caches, Hash-AV determines the majority of 'no-match' cases without accesses to main memory. Experiments show that Hash-AV improves the performance of the open-source virus scanner Clam-AV by a factor of 2–10. The key to Hash-AV's success lies in a set of 'bad but cheap' hash functions that are used as initial hashes. The speed of Hash-AV makes it well suited for 'on-access' virus scanning, providing greater protections to the user. Through intercepting system calls and wrapping glibc libraries, we have implemented an 'on-access' version for Hash-AV+Clam-AV. The on-access scanner can examine input data at a throughput of over 200 Mb/s, making it suitable for network-based virus scanning.
- Conference Article
53
- 10.1109/ctc.2014.11
- Nov 1, 2014
Cybercrime continues to be a growing challenge, and malware, which has existed since the very early days of the Internet, is one of the most serious security threats today. Cyber criminals continue to develop and advance their malicious attacks. Unfortunately, existing techniques for detecting malware and analysing code samples are insufficient and have significant limitations. For example, most malware detection studies have focused only on detection and neglected the variants of the code. Investigating malware variants allows antivirus products and governments to detect these new attacks more easily, attribute them, predict such or similar attacks in the future, and carry out further analysis. The focus of this paper is performing similarity measures between different malware binaries of the same variant, utilizing data mining concepts in conjunction with hashing algorithms. In this paper, we investigate and evaluate using the Trend Locality Sensitive Hashing (TLSH) algorithm to group binaries that belong to the same variant together, utilizing the k-NN algorithm. Two Zeus variants, TSPY_ZBOT and MAL_ZBOT, were tested to assess the effectiveness of the proposed approach. We compare TLSH to related hashing methods (SSDEEP, SDHASH and NILSIMSA) that are currently used for this purpose. Experimental evaluation demonstrates that our method can effectively detect variants of malware and is resilient to common obfuscations used by cyber criminals. Our results show that TLSH and SDHASH provide the highest accuracy, scoring F-measures of 0.989 and 0.999 respectively.
- Research Article
34
- 10.1109/tifs.2012.2206028
- Oct 1, 2012
- IEEE Transactions on Information Forensics and Security
In the last several decades, the arms race between malware writers and antivirus programmers has become more and more severe. The simplest way for users to secure their computers is to install antivirus software. As antivirus software becomes more sophisticated and powerful, evading its detection becomes an important part of malware. As a result, malware writers have developed various approaches to increase the survivability and concealment of their malware. One of these approaches is to terminate antivirus software right after the malware starts executing. In this paper, we propose a mechanism, called ANtivirus Software Shield (ANSS), to prevent antivirus software from being terminated without the knowledge of its users. ANSS uses System Service Descriptor Table (SSDT) hooking to intercept specific Windows APIs and analyzes them to filter out hazardous API calls that would terminate antivirus software. In experiments with several pieces of malware that can terminate various brands of antivirus applications, the results show that ANSS can protect antivirus software from being terminated by them with at most 0.42% CPU performance overhead and 1.77% memory write performance overhead.
- Conference Article
9
- 10.1109/icacca.2016.7578894
- Apr 1, 2016
Malware is one of the most serious security threats on the Internet today. Malware authors employ a variety of techniques to evade security detection, but most of these techniques are discovered and blocked by antivirus programs. Still, there are some evasion techniques that are not exploited in the wild and are effective against antivirus programs. This paper studies the workings of the Self-Extracting Archive (SFX) and how it can be used for malicious purposes. We also present the concept of Silent SFX, a technique to silently deploy malware onto a target machine, bypassing all runtime-based antivirus scans. In addition, we analyze the antivirus reports produced before and after applying this technique and provide suitable countermeasures to mitigate this type of malware attack.
- Conference Article
6
- 10.1109/ic3i.2014.7019657
- Nov 1, 2014
An operating system (OS) is software that manages computer hardware and software resources by providing services to computer programs. One important user expectation of the operating system is that it defends information from unauthorized access, disclosure, modification, inspection, recording or destruction. An operating system is always vulnerable to attacks by malware such as computer viruses, worms, Trojan horses, backdoors, ransomware, spyware, adware, scareware and more. Anti-virus software was therefore created to ensure security against prominent computer viruses by applying a dictionary-based approach. However, anti-virus programs are not guaranteed to provide security against the new viruses proliferating every day. To address this issue and to secure the computer system, our proposed expert system has the administrator classify processes as wanted or unwanted for execution. The expert system maintains a database that consists of the hash codes of the processes that are to be allowed. These hash codes are generated using the MD5 message-digest algorithm, a widely used cryptographic hash function. The administrator approves the wanted processes that may be executed on a client in a Local Area Network, implemented with a client-server architecture, and only processes that match entries in the database table are executed, so many malicious processes are prevented from infecting the operating system. An added advantage of the proposed expert system is that it limits CPU usage and minimizes resource utilization. Thus our system ensures data and information security along with increased performance of the operating system.
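The allow-list check described in the abstract above can be sketched as follows. This is an illustration only, not the paper's system: the class `ProcessAllowlist` and its method names are hypothetical, an in-memory set stands in for the database table, and the executable image is passed as bytes rather than read from a client machine. MD5 is used because the abstract names it.

```python
import hashlib


def md5_hex(image_bytes):
    """Return the MD5 digest of an executable image as a hex string."""
    return hashlib.md5(image_bytes).hexdigest()


class ProcessAllowlist:
    """Hypothetical stand-in for the administrator's table of approved hashes."""

    def __init__(self):
        self._approved = set()

    def approve(self, image_bytes):
        # Administrator step: record the hash of a wanted process image.
        digest = md5_hex(image_bytes)
        self._approved.add(digest)
        return digest

    def may_execute(self, image_bytes):
        # Client step: allow execution only if the hash is in the table.
        return md5_hex(image_bytes) in self._approved
```

Any change to the image, even a single appended byte, yields a different digest, so a modified or unknown executable fails `may_execute` unless the administrator approves it again.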
- Research Article
4
- 10.1016/j.jksuci.2017.09.009
- Sep 28, 2017
- Journal of King Saud University - Computer and Information Sciences
Inner interruption discovery and defense system by using data mining
- Conference Article
7
- 10.1109/iccd.2017.53
- Nov 1, 2017
In recent times, applications like web-based search, antivirus scanners, cloud computing, social media applications, and network applications are extremely common. The hash table is a heavily used data structure in such applications. Modern microprocessors have several special function units (SFUs) such as a floating point unit, a memory management unit, and a cryptography unit. However, hashing is typically performed in software, which reduces the performance of such applications. In this paper, we propose an FPGA-based implementation of a hash unit (a hash function and a hash table). The FPGA-based hash unit is implemented as a coprocessor for a CPU. The CPU and the FPGA communicate through a PCI Express (PCIe) interface. The hash table in our hash unit is implemented as a content-addressable memory (CAM), to enhance the speed of hash operations. The hash unit (HU) coprocessor is tested in the context of a virus checking application, where the hashing operation only requires membership checks. Our HU can be used in other hashing applications as well; we use virus checking as a representative application. Hashing operations are performed in a batch on the FPGA, to provide better utilization of the PCIe bus. We demonstrate a significant speedup of up to 7.3× for our FPGA-based hash unit implementation compared to a software-based hashing implementation. This speedup is for the entire virus checking application (not just the hash lookup portion of the virus checking application).
- Conference Article
- 10.1109/cit.2012.201
- Oct 1, 2012
Along with the rapid development of computer technology, people's lives are increasingly dependent on computers. At the same time, computer systems face increasingly complex and diverse sabotage and attacks. The destruction caused by computer viruses is the most widespread and severe among them, so studying anti-virus technology is urgent. The virus scanning engine is the kernel of anti-virus software; it uses a signature database to identify known viruses. The pattern matching algorithm is the core algorithm of the entire anti-virus software. This paper first introduces some background knowledge of anti-virus software, pattern matching algorithms and hash algorithms. Then it proposes a new type of multi-pattern matching algorithm with automata based on frequently matching hash values. Combining the fast calculation of hash functions with the parallel pattern matching of automata, it has significant performance advantages for virus signature matching. With minor modifications it can also be applied to similar settings, such as gene sequence alignment, where patterns are also very long.
- Research Article
18
- 10.1587/transinf.e94.d.2150
- Jan 1, 2011
- IEICE Transactions on Information and Systems
With the rapid development and proliferation of the Internet, cyber attacks are continually emerging and evolving. Malware - a generic term for computer viruses, worms, trojan horses, spyware, adware, and bots - is a particularly lethal security threat. To cope with this threat appropriately, we need to identify the tendencies and characteristics of malware and analyze its behavior, including its classification. In previous work on classification technologies, malware has been classified using data from dynamic analysis or code analysis. However, these works have not succeeded in obtaining efficient classification with high accuracy. In this paper, we propose a new classification method to cluster malware more effectively and more accurately. We first perform dynamic analysis to automatically obtain the execution traces of malware. Then, we classify malware into clusters using behavioral characteristics derived from Windows API calls in parallel threads. We evaluated our classification method using 2,312 malware samples with different hash values. The samples, which were classified into 1,221 groups by the results of three types of antivirus software, were classified into 93 clusters. 90% of the samples used in the experiment were classified into 20 clusters at most. Moreover, 39 malware samples were found to have characteristics different from the other samples, suggesting that these may be new types of malware. The kinds of Windows API calls confirmed that the samples classified into the same cluster had the same characteristics. We showed that antivirus software gives different names to malware that has the same behavior.
- Conference Article
1
- 10.2316/p.2012.789-045
- Jan 1, 2012
Malicious cloaking on web sites has become a problem. The content of a web site can change according to the IP address a user comes from. Malware on malicious sites can be temporarily removed when the sites' contents are scanned by virus detectors such as cyber police or search engines, so that the malware is sent only to victims such as ordinary PCs and smartphones. In particular, anti-virus software is generally not installed on smartphones, since it requires more CPU performance and memory resources, especially when generic or heuristic analysis is used to detect malware. There are services that scan for malware on behalf of users' computers by monitoring the information passed from a server to those computers. However, such services expose users' private information and, in business use, can leak business secrets. Therefore, we propose a system that prevents users' computers from receiving malware from cloaked web sites while keeping their private information and secrets safe.
- Conference Article
1
- 10.1145/3372780.3378166
- Mar 30, 2020
The end of Moore's law has been proclaimed on many occasions and it's probably safe to say that we are now working in the post-Moore era. But no one is ready to slow down just yet. We can view Gordon Moore's observation on transistor densification as just one aspect of a longer-term underlying technological trend - the Law of Accelerating Returns articulated by Kurzweil. Arguably, companies became somewhat complacent in the Moore era, happy to settle for the gains brought by each new process node. Although we can expect scaling to continue, albeit at a slower pace, the end of Moore's Law delivers a stronger incentive to push other trends of technology progress harder. Some exciting new technologies are now emerging, such as multi-chip 3D integration, storage-class memory and silicon photonics. Moreover, we are also entering a golden age of computer architecture innovation. One of the key drivers is the pursuit of domain-specific architectures as proclaimed by Turing award winners John Hennessy and David Patterson. A good example is Xilinx's AI Engine, one of the important features of the Versal ACAP (adaptive compute acceleration platform) [1]. Today, the explosion of AI workloads is one of the most powerful drivers shifting our attention to find faster ways of moving data into, across, and out of accelerators. Features such as massive parallel processing elements, the use of domain specific accelerators, the dense interconnect between distributed on-chip memories and processing elements, are examples of the ways chip makers are looking beyond scaling to achieve next-generation performance gains. Next, the growing demands of scaling-out hyperscale datacenter applications drive much of the new architecture developments. 
Given a high diversification of workloads that invoke massive compute and data movement, datacenter architectures are moving away from rigid CPU-centric structures and instead prioritize adaptability and configurability to optimize resources such as memory and connectivity of accelerators assigned to individual workloads. There is no longer a single figure of merit. It's not all about Tera-OPS. Other metrics such as transfers-per-second and latency come to the fore as demands become more real-time; autonomous vehicles being an obvious and important example. Moreover, the transition to 5G will result in solutions that operate across the traditional boundaries between the cloud and edge and embedded platforms that are obviously power-conscious and cost-sensitive. Future workloads will require agile software flows that accommodate the spread of functions across edge and cloud. Another industry megatrend that will drive technology requirements especially in encryption, data storage and communication, is Blockchain. To some, it may already have a bad reputation, tarnished by association with the anarchy of cryptocurrency, but it will be more widely relevant than many of us realize. Who could have foreseen the development of today's Internet when ARPANET first appeared as a simple platform for distributed computing and sending email? Through projects such as the open-source Hyperledger, Blockchain technology could be game-changing as a platform for building trust in transactions executed over the Internet. We may soon be talking in terms of the Trusted Internet. The predictability of Moore's law may have become rather too comfortable and slow. The future requires maximizing the flexibility, agility, and efficiency of new technologies. With Moore's Law now mostly behind us, new adaptable and scalable architectures will allow us to further provide exponential return from technology in order to create a more adaptable and intelligent world.
- Research Article
2
- 10.18372/2225-5036.25.13824
- Aug 30, 2019
- Ukrainian Scientific Journal of Information Security
Signature-based security tools such as network intrusion detection systems, anti-virus scanners, filters against network worms and other similar systems perform, in real time, the computation-intensive task of multi-pattern string matching against tens of thousands or even millions of predefined malicious patterns. Due to rising traffic rates, the increasing number and sophistication of attacks and the collapse of Moore's law for sequential processing, traditional software solutions can no longer meet the high requirements of today’s security challenges. Therefore, designers pay more attention to hardware approaches to accelerate pattern matching. Reconfigurable devices based on Field Programmable Gate Arrays (FPGA), which combine the flexibility of software with near-ASIC performance, have become increasingly popular for this purpose. The state-of-the-art solutions in this area from around the world were analyzed. There are three main approaches to pattern matching using FPGA. The techniques (and underlying technologies) of these approaches are: content addressable memory (based on digital comparators), Bloom filter (based on hash functions) and the Aho-Corasick algorithm (based on finite automata). None of them shows clear advantages over the others. In this article, we propose a set of methods to increase the effectiveness of reconfigurable security tools by synthesizing optimal recognition modules that maximize the benefits of each approach. The Parallel Combination Method divides a set of patterns between several matching blocks that use different approaches to better fit each of them. The Sequential Cascading Method processes patterns in parts: if the first fragment does not match, the rest can be ignored. The Vertical Join Method couples together different approaches or techniques in a single unit to provide higher efficiency of the resulting device. The optimization procedure maximizes efficiency gains for each method. 
The methods and methodologies presented in this study will allow developers to create more efficient reconfigurable tools for information security systems.
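The Sequential Cascading Method named in the abstract above ("if the first fragment does not match, the rest can be ignored") can be illustrated in software. This is only an analogy: the paper targets FPGA matching blocks, and the function names `build_cascade` and `cascade_match` and the 4-byte head length are assumptions made for this sketch.

```python
def build_cascade(patterns, head_len=4):
    """Split each pattern into a short head and the remaining tail."""
    table = {}
    for pattern in patterns:
        if len(pattern) < head_len:
            raise ValueError("pattern shorter than head_len")
        table.setdefault(pattern[:head_len], []).append(pattern[head_len:])
    return table


def cascade_match(data, table, head_len=4):
    """Return (offset, pattern) pairs; tails are compared only after a head hit."""
    hits = []
    for i in range(len(data) - head_len + 1):
        tails = table.get(data[i:i + head_len])
        if tails is None:
            continue  # first fragment did not match: ignore the rest
        for tail in tails:
            if data.startswith(tail, i + head_len):
                hits.append((i, data[i:i + head_len] + tail))
    return hits
```

The first stage is small and is evaluated at every offset, while the second stage runs only on the few offsets that survive it, which is the property the cascading idea relies on.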
- Conference Article
14
- 10.1109/iccke.2013.6682867
- Oct 1, 2013
Malicious software, also called malware, is one of the major threats on the Internet today. Despite various antivirus programs, thousands of Internet hosts are infected daily with malware, such as viruses, worms, and Trojan horses. By using a variety of obfuscation techniques, polymorphic malware can easily evade signature-based detection techniques by continually changing its appearance or patterns. However, all polymorphic malware samples in the same malware family often follow the same behavioral pattern, which can be used to generate a behavioral signature. In this paper, we propose MalHunter, a novel method based on sequence clustering and sequence alignment for the automatic generation of behavioral signatures for polymorphic malware detection. We first generate a set of behavioral sequences for different samples of a polymorphic malware, each of which represents a thread's behavior. We then group similar behavioral sequences into the same cluster and generate an alignment pattern for each cluster. We finally build a multiple behavioral signature for the polymorphic malware. MalHunter stores fewer signatures in the signature database because it generates one multiple behavioral signature for the different samples of each polymorphic malware. The experimental results on a malware collection suggest that MalHunter is both precise and succinct for effective matching and detection of polymorphic malware.
- Conference Article
254
- 10.1145/968280.968305
- Feb 22, 2004
Moore's Law states that the number of transistors on a device doubles every two years; however, it is often (mis)quoted based on its impact on CPU performance. This important corollary of Moore's Law states that improved clock frequency plus improved architecture yields a doubling of CPU performance every 18 months. This paper examines the impact of Moore's Law on the peak floating-point performance of FPGAs. Performance trends for individual operations are analyzed as well as the performance trend of a common instruction mix (multiply accumulate). The important result is that peak FPGA floating-point performance is growing significantly faster than peak floating-point performance for a CPU.
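As a worked illustration of the two doubling periods quoted in the abstract above (24 months for transistor count, 18 months for CPU performance), the growth factor after t months with doubling period T is 2^(t/T). The six-year (72-month) horizon below is an arbitrary choice for illustration and does not come from the paper.

```latex
\[
  G(t) = 2^{t/T}, \qquad
  G_{\text{transistors}}(72) = 2^{72/24} = 2^{3} = 8, \qquad
  G_{\text{CPU}}(72) = 2^{72/18} = 2^{4} = 16 .
\]
```

Over the same six years the two readings of the law therefore differ by a factor of two, which is why the distinction between the transistor-count statement and the performance corollary matters when comparing trends.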
- Conference Article
1
- 10.1109/icct50939.2020.9295852
- Oct 28, 2020
Named Data Networking (NDN) is a new architecture for the future Internet that is well suited to the rapidly growing mobile requirements and information-dependent applications that dominate today's Internet. However, the current system for accessing verification data is not safe enough to prevent data leakage, because there is no strong method to stop arbitrary devices or users from accessing it. We propose a lightweight verification scheme based on hash functions and fine-grained access control based on the Schnorr signature to address the issue seamlessly. The proposed scheme is scalable and protects data confidentiality in an NDN network.