Large Software Development Research Articles

Software developers in big and medium-sized companies work with codebases of millions of lines of code. Assuring the quality of this code has shifted from simple defect management to proactive assurance of internal code quality. Although static code analysis and code reviews have been at the forefront of research and practice in this area, code reviews are still an effort-intensive and interpretation-prone activity. The aim of this research is to support code reviews by automatically recognizing violations of company-specific coding guidelines in large-scale, industrial source code. In our action research project, we constructed a machine-learning-based tool for code analysis with which software developers and architects in big and medium-sized companies can use a few examples of source code lines violating code/design guidelines (up to 700 lines of code) to train decision-tree classifiers to find similar violations in their codebases (up to 3 million lines of code). Our action research project consisted of (i) understanding the challenges of two large software development companies, (ii) applying the machine-learning-based tool to detect violations of Sun’s and Google’s coding conventions in the code of three large open source projects implemented in Java, (iii) evaluating the tool on an evolving industrial codebase, and (iv) finding the best learning strategies to reduce the cost of training the classifiers. We achieved an average accuracy of over 99% and an average F-score of 0.80 for the open source projects when using ca. 40K lines for training the tool. We obtained a similar average F-score of 0.78 for the industrial code, this time using only up to 700 lines of code as a training dataset. Finally, we observed that the tool performed visibly better for rules that require understanding only a single line of code or the context of a few lines (often reaching an F-score of 0.90 or higher).
Based on these results, we observe that this approach can give modern software development companies the ability to use examples to teach an algorithm to recognize violations of code/design guidelines, and thus to increase the number of reviews conducted before the product release. This, in turn, leads to increased quality of the final software.
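
The abstract reports both an accuracy above 99% and an F-score of 0.80. The gap between the two is typical when violating lines are rare, and the short sketch below illustrates it. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not data from the study, and the function names are assumptions.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(tp, fp, fn, tn):
    """Fraction of all lines that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical example: 100,000 lines of code, 1,000 of which violate a rule.
tp = 800      # violating lines correctly flagged
fp = 200      # clean lines wrongly flagged
fn = 200      # violating lines that were missed
tn = 98_800   # clean lines correctly left alone

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
acc = accuracy(tp, fp, fn, tn)

print(f"accuracy={acc:.3f} precision={precision:.2f} "
      f"recall={recall:.2f} F1={f1:.2f}")
```

Because the clean lines dominate the total, accuracy stays very high even though one in five violations is missed, which is why the F-score is the more informative figure for this kind of task.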


Maintaining a software system includes tasks such as fixing defects, adding new features, or modifying the software (software changes) to accommodate different environments. The modified software system then needs to be tested to ensure that the changes do not have any adverse effects on the previously validated code. Regression testing is one of the approaches that software testers use to test the software system. The traditional regression testing strategy was to repeat all the previous tests and retest all the features of the program, even for small modifications. For programs with thousands of lines of code (LOC), retesting the entire system after every change is expensive. This practice is becoming increasingly difficult because of the demand for testing new functionality and correcting errors with limited resources. Numerous techniques and tools have been proposed and developed to reduce the costs of regression testing and to aid regression testing processes, such as test suite reduction, test case prioritization, and work on the thresholds and weightings used in regression testing. However, there is still a need to study software traceability models for coverage analysis of software changes during regression testing, and test effort estimation for regression testing. Hence, this paper describes a proposal for improving software changes with a hybrid traceability model and test effort estimation during regression testing. We explain our proposed work, including the problem background, the intended research objectives, the literature review, and the plan for future implementation. This study is expected to contribute to developing a hybrid traceability model for large software development projects that supports software changes during regression testing with a test estimation approach, and to reduce operational cost during software maintenance.
It is also hoped that an efficient and improved solution to regression testing can be realized, benefiting software testers and project managers who manage software maintenance tasks, since maintenance is a critical part of software project development.
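
The general idea of using traceability links to avoid rerunning every test after a change can be sketched as follows. This is a minimal illustration of traceability-based test selection with a naive effort estimate, not the hybrid traceability model proposed in the paper, which the abstract does not specify. All file names, test names, and durations are hypothetical.

```python
from typing import Dict, Iterable, Set


def select_tests(trace: Dict[str, Set[str]], changed: Iterable[str]) -> Set[str]:
    """Return the tests linked to any changed code unit."""
    selected: Set[str] = set()
    for unit in changed:
        # Units without traceability links contribute no tests.
        selected |= trace.get(unit, set())
    return selected


def estimate_effort(tests: Iterable[str],
                    minutes_per_test: Dict[str, float],
                    default: float = 1.0) -> float:
    """Sum the expected run time (in minutes) of the given tests."""
    return sum(minutes_per_test.get(t, default) for t in tests)


# Hypothetical traceability map: code unit -> tests that cover it.
trace = {
    "billing.py": {"test_invoice", "test_tax"},
    "auth.py": {"test_login"},
    "report.py": {"test_invoice", "test_export"},
}

# Hypothetical historical run time per test, in minutes.
minutes = {
    "test_invoice": 3.0,
    "test_tax": 1.5,
    "test_login": 2.0,
    "test_export": 4.0,
}

selected = select_tests(trace, ["billing.py"])
effort = estimate_effort(selected, minutes)
full_effort = estimate_effort(minutes.keys(), minutes)

print(sorted(selected), effort, full_effort)
```

In this toy case a change to one file selects two of the four tests, so the estimated effort is lower than a full retest. A real model would also need to account for indirect dependencies between code units.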


Articles published on Large Software Development

Recognizing lines of code violating company-specific coding guidelines using machine learning

How Technology Support for Contextualization Affects Enterprise Social Media Use: A Media System Dependency Perspective

Does every formal peer review really need to take place? An industrial case study

Coordination Challenges in Large-Scale Software Development: A Case Study of Planning Misalignment in Hybrid Settings

A Review for Improving Software Change using Traceability Model with Test Effort Estimation

Task assignment to distributed teams aided by a hybrid methodology of verbal decision analysis

FACIA: A Fully Automatic Change Impact Analysis Method for Large Scale Requirements

Introduction to the Special Issue on “International Conference on Software Reuse 2015”

Industrial experiences from evolving measurement systems into self‐healing systems for improved availability

Transition of organizational roles in Agile transformation process: A grounded theory approach

Collaboration in OSS Communities: Who Solves Whose Problems?

Assessing the impact of meta-model evolution: a measure and its automotive application

Rendex: A method for automated reviews of textual requirements

Software Project Management Observes: Fiasco V/S Victory

An IT Project as a Plaything of its Organizational Environment: Long-term Challenges in Financial Services

The Best Software Development Teams Might be Temporary

Coordination in multi-team programmes: An investigation of the group mode in large-scale agile software development

A Review of Scaling Agile Methods in Large Software Development

A Linguistic Model in Component Oriented Programming

Large Scale Agile Adoption Model from Management Perspective
