Abstract

Static code analysis of software systems has proven beneficial for a broad range of domains, including security assessments, coding practice, error detection, and others. However, as modern systems have grown in complexity and heterogeneity over the past few decades, advances in development frameworks have dominated. Rather than involving low-level language constructs, these frameworks typically focus on software components, including data entities, controllers, and endpoints. As a result, current code analysis approaches have become unsuitable for analyzing these modern systems due to their focus on low-level constructs in a single language. Thus, code analysis has become a far more complicated endeavor thanks to the plethora of languages, frameworks, and design approaches in modern software development. This paper presents a novel approach to solving the problem of being tied to a single language and its low-level constructs. The system’s source code is transformed into an intermediate representation called a language-agnostic abstract-syntax tree. This system representation is then assessed by generalized component parsers that extract relevant high-level information, such as components, from low-level structures. The design of the approach is presented here in detail, along with its evaluation in a case study involving two large, heterogeneous, cloud-native system benchmarks (Java and C++ microservices). The study demonstrates a unified identification approach to determine system data entities and endpoints. Utilizing higher-level constructs, such as components, can advance the current practice of system analysis to better face broader problems introduced by modern system development practices.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call