Learning models from execution traces has been considered in several domains, such as traffic classification, malware detection, and software engineering, since traces reveal which processes were actually executed. Behavioral model learning, and classification based on the learned models, are used to overcome the shortcomings of models derived from non-behavioral features and to improve the resulting classifications. So far, no general method has been proposed to automatically derive behavioral models. To this end, we assume that the model of an application can be defined abstractly in terms of how it executes the components it depends on, which are well known in the domain. To derive such models automatically, we extend passive automata learning by considering the behavior of the dependent components in addition to the observed behaviors. The state-merging algorithm of the learning process is equipped with a new equivalence relation, which aggregates states modulo the counter abstraction of symmetry reduction. To extend the generated model to cover unobserved behaviors, we leverage complex event processing to complete the model with the unseen interleavings of actions that arise from the concurrent execution of components. The derived models, specified as parameterized transition systems, can distinguish different executions of the instances of each component by assigning a unique symbolic identifier to each instantiation and parameterizing actions with these identifiers. The learned models are used to distinguish the executions of applications in an interleaved execution trace of different systems. The detection procedure is more complicated for parametric models because the information in the trace must be related to the symbolic identifiers that serve as parameters. We use runtime verification techniques in a novel three-step approach to improve the performance of matching a trace against the models.
To illustrate the applicability of our approach, we apply it to traffic classification in the network domain and evaluate it on several real applications. To demonstrate its effectiveness in this domain, we compare it with related approaches in terms of true positive rate, false positive rate, and test time. Our results indicate that the technique avoids including invalid traces, so that unobserved behaviors are covered with acceptable precision.

• We extend passive automata learning to multi-component applications.
• The learning approach aggregates states while preserving the behavior of components.
• We provide an automata-based classifier built on runtime verification techniques.
• We compare our classifier with statistical classifiers for network applications.
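
The idea of aggregating states modulo counter abstraction can be illustrated with a minimal sketch. It is not the paper's algorithm: the state representation (a mapping from instance identifier to a pair of component type and local state) and the names `counter_abstraction` and `equivalent` are assumptions made only for illustration. Two global states are treated as equivalent when they contain the same number of instances of each component type in each local state, regardless of which identifiers those instances carry.

```python
from collections import Counter

# Hypothetical state representation (an assumption, not taken from the paper):
# a global state maps an instance identifier to (component type, local state).

def counter_abstraction(state):
    """Forget instance identifiers and keep only how many instances of
    each component type are in each local state."""
    counts = Counter(state.values())
    return frozenset(counts.items())

def equivalent(s1, s2):
    """Two states are merge candidates if their counter abstractions agree."""
    return counter_abstraction(s1) == counter_abstraction(s2)

# Same counts, different identifiers: equivalent under the abstraction.
s1 = {"c1": ("Client", "waiting"), "c2": ("Client", "done")}
s2 = {"c7": ("Client", "done"), "c9": ("Client", "waiting")}

# Different counts (two waiting clients): not equivalent.
s3 = {"c1": ("Client", "waiting"), "c2": ("Client", "waiting")}

print(equivalent(s1, s2))
print(equivalent(s1, s3))
```

A real state-merging learner would also have to check that merged states agree on their outgoing behavior; the sketch covers only the identifier-forgetting step.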