Protocol fuzzers are widely used for finding vulnerabilities and security bugs in the program. The main techniques used by protocol fuzzers can be divided into 2 categories: generation-based and mutation-based fuzzing. The generation-based fuzzing generates data messages using an official specification (i.e., grammar), while the mutation-based fuzzing performs random transformations on a prepared message. But these two types of fuzzing techniques are ineffective or inefficient when testing industrial control system (ICS), because many ICS protocols are unknown, undocumented or proprietary. The generation-based fuzzing cannot work well without specifications, while the mutation-based fuzzing cannot achieve high branch coverage. In this paper, we present Miff (abbreviation of the system using “M”P tree to “i”mprove per“f”ormance of “f”uzzer) that aims at automatically abstracting data models from ICS messages. The data model generated by Miff can be used to direct protocol fuzzers to test ICS. Miff has three processing stages: (1) by instrumenting and monitoring program execution, Miff obtains the execution context information, builds memory propagation (MP) tree for every byte in the message, and identifies protocol field boundaries based on the similarity between MP trees; (2) by using information-theoretic measures, Miff infers the type of every field; (3) according to analysis results of the first two stages, Miff decides the mutation strategy for every field, which combines with the field boundary and type information to form the data model. We have implemented a prototype of Miff and applied it into 4 open-source protocol fuzzers. Our experimental results show that, Miff can enable the generation-based fuzzing to test ICS even if the specification is absent, and improve the performance of the mutation-based fuzzing to achieve higher branch coverage with less test cases.
Read full abstract