Abstract

YARA rules are used in cybersecurity to scan for malware, often in their default form, where rules are created either manually or automatically. Creating YARA rules that enable analysts to label files as suspected malware is a highly technical skill requiring expertise in cybersecurity. Therefore, whether rules are created manually or automatically, it is desirable to improve both the performance and the detection outcomes of the process. In this paper, two methods are proposed, utilising fuzzy hashing and fuzzy rules, to increase the effectiveness of YARA rules without escalating their complexity and overheads. The first proposed method, referred to as enhanced YARA rules in this paper, utilises fuzzy hashing: if the existing YARA rules fail to detect the inspected file as malware, the file is subjected to fuzzy hashing to assess whether this technique identifies it as malware. The second proposed method, called embedded YARA rules, utilises fuzzy hashing and fuzzy rules to improve the outcomes further. Fuzzy rules accommodate circumstances where data are imprecise or uncertain, generating a probabilistic outcome that indicates the likelihood of a file being malware. The paper discusses the success of the proposed enhanced YARA rules and embedded YARA rules through several experiments on the collected malware and goodware corpus and through their comparative evaluation against standard YARA rules.

Highlights

  • YARA is an established malware analysis technique that detects malware based on string and signature matching [47]

  • This paper proposes a second technique called embedded YARA rules, in which all the information generated by YARA rules during the execution phase is captured and utilised by fuzzy rules to enhance YARA rules instead of focusing on rule optimisation

  • Fuzzy hashing attempts to find structural similarity between two files in their entirety in circumstances where the selected Indicator of Compromise (IoC) strings cannot be found in the sample [34]


Summary

Introduction

YARA is an established malware analysis technique that detects malware based on string and signature matching [47]. The proposed embedded approach extends the rule triggering condition of String Matching with an additional condition of Fuzzy Hash Matching [33], to demonstrate an initial concept of embedding (see Fig. 7). It can be customised further with a number of parameters, multiple conditions and op-codes, depending on the specific requirements of the malware analysis. The results included two fuzzy categories, Likely Malware and Less Likely Malware, which were not possible using basic or enhanced YARA rules alone [33].
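
The combination of a string-matching condition with a similarity-based condition and graded outcomes can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the byte-string check stands in for a YARA string condition, Python's `difflib.SequenceMatcher` stands in for a real fuzzy hashing tool, and the function names, the sample data, the thresholds, and the outer labels "Malware" and "Not Malware" are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical IoC strings and known-malware sample (not taken from the paper).
IOC_STRINGS = [b"evil_payload", b"encrypt_all_files"]
KNOWN_MALWARE = [b"header::encrypt_all_files::key=abc123::evil_payload::footer"]


def string_match(data: bytes) -> bool:
    """Stand-in for a YARA string condition: true if any IoC string occurs."""
    return any(ioc in data for ioc in IOC_STRINGS)


def similarity(data: bytes) -> float:
    """Stand-in for fuzzy hash matching: best similarity (0-100) to known samples."""
    return max(
        100.0 * SequenceMatcher(None, data, known).ratio()
        for known in KNOWN_MALWARE
    )


def classify(data: bytes, high: float = 70.0, low: float = 30.0) -> str:
    """Combine both conditions into a graded label (thresholds are assumed)."""
    if string_match(data):
        return "Malware"
    score = similarity(data)
    if score >= high:
        return "Likely Malware"
    if score >= low:
        return "Less Likely Malware"
    return "Not Malware"


if __name__ == "__main__":
    # A variant whose IoC strings were altered but whose structure is unchanged.
    variant = b"header::encrypt_all_f1les::key=abc123::ev1l_payload::footer"
    print(string_match(variant), classify(variant))
```

In this sketch the altered variant no longer satisfies the string condition, yet its overall structure remains close to the known sample, so the similarity condition still flags it. That is the kind of case the fuzzy categories are meant to capture.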

