Abstract

As computer viruses evolve, the number of file samples that need to be analyzed keeps growing. An automatic and robust tool is needed to classify these samples quickly and efficiently. Inspired by the human immune system, we developed a local-concentration-based virus detection method that concatenates a number of two-element local concentration vectors into a single feature vector. In contrast to existing data mining techniques, the new method does not memorize exact file content for virus detection but follows a non-signature paradigm, so it can detect some previously unknown viruses and withstand techniques, such as obfuscation, that are used to bypass signatures. The model first extracts the viral tendency of each fragment and identifies a set of statical structural detectors, then applies an information-theoretic preprocessing step to remove redundancy from the detector set and generate ‘self’ and ‘nonself’ detector libraries. Finally, ‘self’ and ‘nonself’ local concentrations are computed from these libraries and assembled into a feature vector of two-element local concentrations for efficient virus detection. Several standard data mining classifiers, including K-nearest neighbor (KNN), radial basis function (RBF) neural networks, and support vector machines (SVM), are used to classify the local concentration vector as belonging to a benign or a malicious program and to verify the effectiveness and robustness of the approach. Experimental results show that the proposed approach is not only much faster but also achieves an accuracy of around 98%.
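
As an illustration only, the construction of a local concentration feature vector might be sketched as follows. The fragment length `n`, the split into equal-sized windows, the use of plain byte-string sets as detector libraries, and all function and parameter names are assumptions made for this sketch; none of them is taken from the paper. The viral-tendency extraction, the information-theoretic redundancy removal, and the classifiers are omitted.

```python
from typing import List, Set


def fragments(data: bytes, n: int) -> List[bytes]:
    """Return all overlapping n-byte fragments of data (hypothetical helper)."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def local_concentrations(
    data: bytes,
    self_lib: Set[bytes],
    nonself_lib: Set[bytes],
    n: int = 2,
    num_windows: int = 2,
) -> List[float]:
    """Build a feature vector of length 2 * num_windows.

    Assumption: the file is split into num_windows contiguous local areas,
    and each area contributes a pair (self concentration, nonself
    concentration), i.e. the fraction of its fragments found in each library.
    """
    frags = fragments(data, n)
    features: List[float] = []
    for w in range(num_windows):
        start = w * len(frags) // num_windows
        end = (w + 1) * len(frags) // num_windows
        window = frags[start:end]
        if not window:
            # An empty local area contributes zero concentrations.
            features.extend([0.0, 0.0])
            continue
        self_hits = sum(1 for f in window if f in self_lib)
        nonself_hits = sum(1 for f in window if f in nonself_lib)
        features.extend([self_hits / len(window), nonself_hits / len(window)])
    return features


# Toy detector libraries and sample, invented purely for illustration.
toy_self_lib = {b"AA", b"BB"}
toy_nonself_lib = {b"CC", b"DD"}
toy_features = local_concentrations(b"AABBCCDD", toy_self_lib, toy_nonself_lib)
print(toy_features)
```

A vector produced this way has a fixed length regardless of file size, which is what allows it to be passed to a standard classifier such as KNN or an SVM.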
