What Does Control Flow Really Look Like? Eyeballing the Cyclomatic Complexity Metric

Jurgen J Vinju, Michael W Godfrey

doi:10.1109/scam.2012.17

Abstract

Assessing the understandability of source code remains an elusive yet highly desirable goal for software developers and their managers. While many metrics have been suggested and investigated empirically, the McCabe cyclomatic complexity metric (CC), which is based on control flow complexity, seems to hold enduring fascination within both industry and the research community despite its known limitations. In this work, we introduce the ideas of Control Flow Patterns (CFPs) and Compressed Control Flow Patterns (CCFPs), which eliminate some repetitive structure from control flow graphs in order to emphasize high-entropy graphs. We examine eight well-known open source Java systems by grouping the CFPs of the methods into equivalence classes and exploring the results. We observed several surprising outcomes: first, the number of unique CFPs is relatively low; second, CC often does not accurately reflect the intricacies of Java control flow; and third, methods with high CC often have very low entropy, suggesting that they may be relatively easy to understand. These findings challenge the widely held belief that there is a clear-cut causal relationship between CC and understandability, and suggest that CC and similar measures need to be reconsidered as metrics for code understandability.

Highlights

Understandability of source code is an important quality attribute of software systems.
We introduce the notions of abstract control flow patterns (CFPs) and compressed control flow patterns (CCFPs), which allow us to produce statistical evidence that the cyclomatic complexity (CC) metric does not adequately model the likely complexity of control flow in Java methods.
We introduce an abstraction that de-emphasizes control structures that occur repeatedly at the same structural level: a compressed control flow pattern (CCFP) is a control flow pattern where each list e1, e2, ..., en of n > 1 consecutive tree nodes in the pattern that are structurally equal is replaced by a single node.
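
The CCFP definition above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `compress` and `size` helpers, and the example method shape are all hypothetical names introduced here. The sketch also assumes that children are compressed before siblings are compared (bottom-up), which the definition itself does not spell out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """A control flow pattern node: a construct kind plus nested constructs."""
    kind: str
    children: Tuple["Node", ...] = ()


def compress(node: Node) -> Node:
    """Collapse each run of structurally equal consecutive siblings to one node."""
    kept = []
    for child in node.children:
        compressed_child = compress(child)  # assumption: compress bottom-up
        # Frozen dataclasses compare field by field, i.e. structurally.
        if not kept or kept[-1] != compressed_child:
            kept.append(compressed_child)
    return Node(node.kind, tuple(kept))


def size(node: Node) -> int:
    """Number of nodes in the pattern."""
    return 1 + sum(size(c) for c in node.children)


# Hypothetical method body: three guard ifs, a loop with two ifs, a return.
cfp = Node("method", (
    Node("if"), Node("if"), Node("if"),
    Node("for", (Node("if"), Node("if"))),
    Node("return"),
))
ccfp = compress(cfp)
print(size(cfp), size(ccfp))  # prints "8 5"
```

Only consecutive equal siblings are merged, so a pattern such as if, for, if keeps all three nodes. Repetition therefore shrinks the pattern, while alternation between different constructs does not.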

Summary

Introduction

Understandability of source code is an important quality attribute of software systems. While some recent studies have suggested that many metrics correlate strongly with LOC [3] (which is trivial to measure), the cyclomatic complexity metric (CC) continues to be widely used as a measure of the likely understandability of source code. It is an integral part of many metric tool suites, both open source and commercial. It is therefore worthwhile to investigate whether CC is a reasonable measure of the understandability of source code.
