Recovering software architecture from the names of source files

Nicolas Anquetil, Timothy C Lethbridge

doi:10.1002/(sici)1096-908x(199905/06)11:3<201::aid-smr192>3.0.co;2-1


Abstract

We discuss how to extract a useful set of subsystems from a set of existing source-code file names. This problem is challenging because many legacy systems use thousands of file names, including some that are very short and cryptic. At the same time, the problem is important because software maintainers often find it difficult to understand such systems. We propose a general algorithm to cluster files based on their names, and a set of alternative methods for implementing the algorithm. One of the key tasks is picking candidate words to try to identify in file names. We do this by (a) iteratively decomposing file names, (b) finding common substrings, and (c) choosing words in routine names, in an English dictionary or in source-code comments. In addition, we investigate generating abbreviations from the candidate words in order to find matches in file names, as well as how to split file names into components when they contain no word markers. To compare and evaluate our five approaches, we present two experiments. The first compares the ‘concepts’ found in each file name by each method with the results of manually decomposing file names. The second experiment compares automatically generated subsystems with subsystem examples proposed by experts. We conclude that two methods are most effective: extracting concepts using common substrings and extracting those concepts that relate to the names of routines in the files. Copyright © 1999 John Wiley & Sons, Ltd.
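To make the common-substring idea in (b) concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function names, the `min_len` and `min_files` thresholds, the rule for discarding shorter substrings, and the example file names are all assumptions made for illustration. It collects every substring shared by at least two file names, keeps only the longest substring for each group of files, and treats each surviving substring as a candidate concept that clusters those files.

```python
from collections import defaultdict


def substrings(name, min_len):
    """Return every substring of `name` that has at least `min_len` characters."""
    return {
        name[i:j]
        for i in range(len(name))
        for j in range(i + min_len, len(name) + 1)
    }


def candidate_concepts(file_names, min_len=3, min_files=2):
    """Map each maximal shared substring to the set of file names containing it.

    Assumption: a substring counts as a candidate concept when at least
    `min_files` names contain it. A substring is dropped when a longer
    shared substring contains it and covers exactly the same files.
    """
    files_by_substring = defaultdict(set)
    for name in file_names:
        for sub in substrings(name, min_len):
            files_by_substring[sub].add(name)

    shared = {
        sub: files
        for sub, files in files_by_substring.items()
        if len(files) >= min_files
    }

    maximal = {}
    for sub, files in shared.items():
        covered = any(
            sub != other and sub in other and files == other_files
            for other, other_files in shared.items()
        )
        if not covered:
            maximal[sub] = files
    return maximal


# Hypothetical file names (extensions already stripped).
names = ["listaddq", "listdelq", "dbopen", "dbclose"]
for concept, files in sorted(candidate_concepts(names, min_len=2).items()):
    print(concept, sorted(files))
```

Enumerating all substrings is quadratic in the length of each name, which is acceptable for short file names but would need a more careful data structure for a system with thousands of files.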


Journal: Journal of Software Maintenance: Research and Practice
Publication Date: May 1, 1999
Citations: 104

