Source Code Entities Research Articles

Information Retrieval (IR) approaches, such as Latent Semantic Indexing (LSI) and the Vector Space Model (VSM), are commonly applied to recover software traceability links. Recently, an approach based on developers' eye gazes was proposed to retrieve traceability links. This paper presents a comparative study of IR-based and eye-gaze-based approaches. In addition, it reports on the possibility of using eye gaze links as an alternative benchmark to commits. In the study, developers were asked to perform bug-localization tasks on the open source subject system JabRef. The iTrace environment, an eye-tracking-enabled Eclipse plugin, was used to collect eye gaze data. During the data collection phase, an eye tracker was used to gather the source code entities (SCEs) that developers looked at while solving these tasks. We present an algorithm that uses the collected gaze dataset to produce candidate traceability links related to the tasks. In the evaluation phase, we compared the results of our algorithm with the results of an IR technique in two different contexts. In the first context, precision and recall values are reported for both the IR and eye gaze approaches based on commits. In the second context, another set of developers was asked to rate the candidate links from each of the two techniques in terms of how useful they were in fixing the bugs. The eye gaze approach outperforms standard LSI and VSM approaches, reporting 55% precision and 67% recall on average across all tasks when compared to how the developers actually fixed the bug. In the second context, the usefulness results show that links generated by our algorithm were considered significantly more useful for fixing the bug than those of the IR technique in a majority of tasks. We discuss the implications of this radically different method of deriving traceability links.
Techniques for feature location and bug localization are commonly evaluated on benchmarks formed from commits, as is done in the evaluation phase of this study. Although commits are a reasonable source, they only capture entities that were eventually changed to fix a bug or resolve a feature. We investigate another type of benchmark based on eye tracking data, namely links generated from the bug-localization tasks given to the developers in the data collection phase. The source code entities that IR methods recommend as relevant to the bugs under study are evaluated against both commits and links generated from eye gaze. The results of the benchmarking phase show that eye tracking could form an effective (complementary) benchmark and add another interesting perspective to the evaluation of bug-localization techniques.
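
As an illustration of the commit-based evaluation described above, the following is a minimal sketch of how precision and recall could be computed for a set of candidate links against a benchmark. It is not the authors' implementation: the function name precision_recall and the entity names in the example are hypothetical, and links are assumed to be represented simply as source code entity names.

```python
def precision_recall(candidate_links, benchmark_links):
    """Return (precision, recall) of candidate links against a benchmark.

    Both arguments are iterables of source code entity names. The
    benchmark could be the entities changed in a bug-fixing commit,
    or the entities in links derived from eye gaze (assumed representation).
    """
    candidates = set(candidate_links)
    benchmark = set(benchmark_links)
    correct = candidates & benchmark

    # Precision: fraction of recommended entities that are in the benchmark.
    precision = len(correct) / len(candidates) if candidates else 0.0
    # Recall: fraction of benchmark entities that were recommended.
    recall = len(correct) / len(benchmark) if benchmark else 0.0
    return precision, recall


# Hypothetical entity names, for illustration only.
recommended = ["EntryEditor.save", "BibEntry.setField", "Util.trim"]
changed_in_commit = ["EntryEditor.save", "BibEntry.setField",
                     "BibEntry.getField", "FieldParser.parse"]

precision, recall = precision_recall(recommended, changed_in_commit)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Swapping the second argument between a commit-based set and a gaze-based set is one way to picture the two benchmarks the abstract compares.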


Context: Software networks are directed graphs of static dependencies between source code entities (functions, classes, modules, etc.). These structures can be used to investigate the complexity and evolution of large-scale software systems and to compute metrics associated with software design. The extraction of software networks is also the first step in reverse engineering activities.
Objective: The aim of this paper is to present SNEIPL, a novel approach to the extraction of software networks that is based on a language-independent, enriched concrete syntax tree representation of the source code.
Method: The applicability of the approach is demonstrated by the extraction of software networks representing real-world, medium to large software systems written in different languages that belong to different programming paradigms. To investigate the completeness and correctness of the approach, class collaboration networks (CCNs) extracted from real-world Java software systems are compared to CCNs obtained by other tools. Namely, we used Dependency Finder, which extracts entity-level dependencies from Java bytecode, and Doxygen, which realizes a language-independent fuzzy parsing approach to dependency extraction. We also compared SNEIPL to fact extractors present in language-independent reverse engineering tools.
Results: Our approach to dependency extraction is validated on six real-world medium to large-scale software systems written in Java, Modula-2, and Delphi. The results of the comparative analysis involving ten Java software systems show that the networks formed by SNEIPL are highly similar to those formed by Dependency Finder and more precise than the comparable networks formed with the help of Doxygen. Regarding the comparison with language-independent reverse engineering tools, SNEIPL provides both language-independent extraction and representation of fact bases.
Conclusion: SNEIPL is a language-independent extractor of software networks and consequently enables language-independent network-based analysis of software systems, computation of software design metrics, and extraction of fact bases for reverse engineering activities.
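
To make the notion of a software network concrete, here is a minimal sketch of a class collaboration network stored as a directed graph, with simple degree metrics and an edge-set comparison between two extracted networks. It does not reproduce SNEIPL: the function names, the class names in the example, and the use of Jaccard similarity over edges as the comparison measure are all assumptions made for illustration, since the abstract does not state how similarity was measured.

```python
def build_network(dependencies):
    """Build a directed graph as {node: set(of targets)} from (source, target) pairs."""
    graph = {}
    for source, target in dependencies:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # keep nodes with no outgoing edges
    return graph


def degrees(graph):
    """Return (in_degree, out_degree) dictionaries for every node."""
    out_degree = {node: len(targets) for node, targets in graph.items()}
    in_degree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    return in_degree, out_degree


def edge_jaccard(graph_a, graph_b):
    """Jaccard similarity of two edge sets (an assumed comparison measure)."""
    edges_a = {(s, t) for s, targets in graph_a.items() for t in targets}
    edges_b = {(s, t) for s, targets in graph_b.items() for t in targets}
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 1.0


# Hypothetical dependencies reported by two different extractors.
network_a = build_network([("Compiler", "Parser"), ("Parser", "Lexer"), ("Parser", "Ast")])
network_b = build_network([("Compiler", "Parser"), ("Parser", "Lexer"), ("Compiler", "Ast")])

in_degree, out_degree = degrees(network_a)
print(in_degree, out_degree)
print(edge_jaccard(network_a, network_b))
```

Adjacency sets keep the sketch dependency-free; a graph library would be a reasonable substitute for larger networks.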


Articles published on Source Code Entities

JEMMA: An extensible Java dataset for ML4Code applications

To automatically map source code entities to architectural modules with Naive Bayes

Finding needles in a haystack: Leveraging co-change dependencies to recommend refactorings

The impact of IR-based classifier configuration on the performance and the effort of method-level bug localization

Comparing learning to rank techniques in hybrid bug localization

Case study on which relations to use for clustering-based software architecture recovery

Eye movements in software traceability link recovery

Code Basket: Making Developers' Mental Model Visible and Explorable

Empirical investigation of SEA-based dependence cluster properties

An empirical study on the importance of source code entities for requirements traceability

A language-independent approach to the extraction of dependencies between source code entities

The Impact of Classifier Configuration and Classifier Combination on Bug Localization

Configuring latent Dirichlet allocation based feature location

Ring: A unifying meta-model and infrastructure for Smalltalk source code analysis tools

Assigning change requests to software developers

Analyzing the co-evolution of comments and source code

Towards an Integrated View on Architecture and its Evolution

Using origin analysis to detect merging and splitting of source code entities
