Articles published on Java Files
31 Search results
- Research Article
1
- 10.65521/ijacect.v14i1.166
- Apr 14, 2025
- International Journal on Advanced Computer Engineering and Communication Technology
- Y Rokesh + 4 more
Design patterns serve as reusable solutions to common software development challenges, enabling better organization, maintainability, and scalability of code. Despite their advantages, selecting the right design pattern during development can be a complex and subjective task, particularly for beginner programmers. This paper introduces an intelligent approach that utilizes machine learning to automatically recommend appropriate design patterns based on the structural characteristics of Java source code. The proposed framework begins by analyzing Java files to extract key structural features, which are then encoded into numerical vectors. These vectors capture the essential aspects of the code design and are used as input to machine learning algorithms such as Support Vector Machine (SVM), Decision Tree, and Random Forest. Additionally, an ontology-based similarity ranking method is employed to enhance the precision of predictions by measuring the closeness between the input code and existing pattern examples in the dataset. To make the solution user-friendly and accessible, the model is deployed as a RESTful API. Users can submit their source code through a web interface and receive instant feedback on the most likely design pattern classification, along with model confidence levels. Experimental evaluations indicate that the Random Forest model consistently delivers high accuracy in predicting one of the 13 predefined design patterns, outperforming the other classifiers tested. This system not only supports developers in making more informed design decisions but also contributes to the automation of software architecture practices. The integration of machine learning with software engineering principles creates a valuable resource for both academic research and industry application.
- Research Article
1
- 10.1007/s10664-025-10636-y
- Apr 5, 2025
- Empirical Software Engineering
- Aurora Papotti + 2 more
Slicing is a fault localization technique that has been proposed to support debugging and program comprehension. Yet, its empirical effectiveness during code inspection by humans has received limited attention. The goal of our study is two-fold. First, we aim to define what it means for a code reviewer to identify the vulnerable lines correctly. Second, we investigate whether reducing the number of to-be-inspected lines by method-level slicing supports code reviewers in detecting security vulnerabilities. We propose a novel approach based on the notion of a δ-neighborhood (intuitively based on the idea of the context size of the command git diff) to define correctly identified lines. Then, we conducted a multi-year controlled experiment (2017-2023) in which MSc students attending security courses (n=236) were tasked with identifying vulnerable lines in original or sliced Java files from Apache Tomcat. We provide perfect seed lines for a slicing algorithm to control for confounding factors. Each treatment differs in the pair (Vulnerability, Original/Sliced) with a balanced design with vulnerabilities from the OWASP Top 10 2017: A1 (Injection), A5 (Broken Access Control), A6 (Security Misconfiguration), and A7 (Cross-Site Scripting). To generate smaller slices for human consumption, we used a variant of intra-procedural thin slicing. We report the results for δ=0 which corresponds to exactly matching the vulnerable ground truth lines, and δ=3 which represents the scenario of identifying the vulnerable area. For both cases, we found that slicing helps in ‘finding something’ (the participant has found at least some vulnerable lines) as opposed to ‘finding nothing’. For the case of δ=0 analyzing a slice and analyzing the original file are statistically equivalent from the perspective of lines found by those who found something. With δ=3 slicing helps to find more vulnerabilities compared to analyzing an original file, as we would normally expect. 
Given the type of population, additional experiments are necessary before the results can be generalized to experienced developers.
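The δ-neighborhood idea in this abstract can be illustrated with a minimal sketch. It assumes that a line marked by a reviewer counts as correctly identified when it lies within δ lines of some ground-truth vulnerable line; the study's exact definition may differ, and the function name and line numbers below are hypothetical.

```python
def lines_within_delta(marked, ground_truth, delta):
    """Return the marked lines lying within `delta` lines of a ground-truth line.

    Assumption: this is one plausible reading of a delta-neighborhood,
    not the definition used in the study.
    """
    return {m for m in marked if any(abs(m - g) <= delta for g in ground_truth)}


# Hypothetical data: ground-truth vulnerable lines and a reviewer's marks.
truth = {120, 121}
marked = {118, 121, 140}

exact = lines_within_delta(marked, truth, 0)  # delta = 0: exact matches only
area = lines_within_delta(marked, truth, 3)   # delta = 3: the vulnerable area

print(sorted(exact), sorted(area))
```

With δ=0 only line 121 qualifies, while δ=3 also accepts line 118, which is two lines away from a ground-truth line.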
- Research Article
34
- 10.1145/3643762
- Jul 12, 2024
- Proceedings of the ACM on Software Engineering
- Nalin Wadhwa + 7 more
As software projects progress, quality of code assumes paramount importance as it affects reliability, maintainability and security of software. For this reason, static analysis tools are used in developer workflows to flag code quality issues. However, developers need to spend extra effort to revise their code to improve code quality based on the tool findings. In this work, we investigate the use of (instruction-following) large language models (LLMs) to assist developers in revising code to resolve code quality issues. We present a tool, CORE (short for COde REvisions), architected using a pair of LLMs organized as a duo comprised of a proposer and a ranker. Providers of static analysis tools recommend ways to mitigate the tool warnings and developers follow them to revise their code. The proposer LLM of CORE takes the same set of recommendations and applies them to generate candidate code revisions. The candidates which pass the static quality checks are retained. However, the LLM may introduce subtle, unintended functionality changes which may go undetected by the static analysis. The ranker LLM evaluates the changes made by the proposer using a rubric that closely follows the acceptance criteria that a developer would enforce. CORE uses the scores assigned by the ranker LLM to rank the candidate revisions before presenting them to the developer.
We conduct a variety of experiments on two public benchmarks to show the ability of CORE: (1) to generate code revisions acceptable to both static analysis tools and human reviewers (the latter evaluated with a user study on a subset of the Python benchmark), (2) to reduce human review effort by detecting and eliminating revisions with unintended changes, (3) to readily work across multiple languages (Python and Java), static analysis tools (CodeQL and SonarQube) and quality checks (52 and 10 checks, respectively), and (4) to achieve a fix rate comparable to a rule-based automated program repair tool but with much less engineering effort (on the Java benchmark). CORE could revise 59.2% of Python files (across 52 quality checks) so that they pass scrutiny by both a tool and a human reviewer. The ranker LLM reduced false positives by 25.8% in these cases. CORE produced revisions that passed the static analysis tool in 76.8% of Java files (across 10 quality checks), comparable to the 78.3% of a specialized program repair tool, with significantly less engineering effort. We release code, data, and supplementary material publicly at http://aka.ms/COREMSRI .
- Research Article
3
- 10.1016/j.infsof.2024.107429
- Mar 1, 2024
- Information and Software Technology
- Alan Liu + 3 more
Prevalence and severity of design anti-patterns in open source programs—A large-scale study
- Research Article
1
- 10.2478/acss-2023-0022
- Dec 1, 2023
- Applied Computer Systems
- Farshad Ghassemi Toosi
Source code constitutes the static and human-readable component of a software system. It comprises an array of artifacts and features that collectively execute a specific set of tasks. Coding behaviours and patterns are formulated through the orchestrated utilization of distinct features in a specified sequence, fostering inter-dependencies among these features. This study seeks to explore the presence of specific coding behaviours and patterns within Java, which could potentially unveil the extent to which developers endeavour to leverage the facilities and services that exist in the programming language aggregatively. In pursuit of investigating behaviours and patterns, 436 open-source Java projects are selected, each having more than 150 Java files (Classes and Interfaces), in a semi-randomized manner. For every project, 39 features have been chosen, and the frequency of each individual feature has been independently assessed. By employing linear regression, the interrelationships among all features across the complete array of projects are scrutinized. This analysis intends to uncover the manifestation of distinct coding behaviours and patterns. Based on the selected features, preliminary findings suggest a notable collective incorporation of diverse coding behaviours among programmers, encompassing Encapsulation and Polymorphism. The findings also point to a distinct preference for using a specific commenting mechanism, JavaDoc, and the potential existence of Code-Clone and dead code. Overall, the results indicate a clear tendency among programmers to strongly adhere to the fundamental principles of Object-Oriented programming. However, certain less obvious attributes of object-oriented languages appear to receive relatively less attention from programmers.
- Research Article
- 10.56726/irjmets40203
- Jun 19, 2023
- International Research Journal of Modernization in Engineering Technology and Science
- Aprajeeta Singh + 3 more
A banking system is a valuable tool that lets customers, banks, and financial institutions carry out many banking activities online. Every day, banks must perform many user-related activities that would otherwise require extensive infrastructure and a large staff. A banking system lets banks perform these activities more simply, without involving employees; examples include online banking, mobile banking, and ATM banking. However, a banking system needs to be secure and reliable, because every task it performs involves customers' money. Authentication and validation of user access are especially important tasks in banking systems. In this project, I try to show the working of a bank account system and cover the basic functionality of a Bank Account Management System. The project aims to address the financial needs of customers in a banking environment by providing various ways to perform banking tasks, and to give the user's workspace additional functionality that a conventional banking project does not provide. The main aim of this project is to develop a system application for a bank. The project has been developed to carry out processes easily and quickly, which is not possible with manual systems. It is developed in the Java language and uses the file system as its database. Creating and managing requirements is a challenge in IT, systems, and product development projects.
- Research Article
8
- 10.3390/info14020081
- Jan 31, 2023
- Information
- Yahya Tashtoush + 5 more
Code readability and software complexity are considered essential components of software quality. They significantly impact software metrics, such as reusability and maintenance. The maintainability process consumes a high percentage of the software lifecycle cost, which is considered a very costly phase and should be given more focus and attention. For this reason, the importance of code readability and software complexity is addressed by considering the most time-consuming component in all software maintenance activities. This paper empirically studies the relationship between code readability and software complexity using various readability and complexity metrics and machine learning algorithms. The results are derived from an analysis dataset containing roughly 12,180 Java files, 25 readability features, and several complexity metric variables. Our study empirically shows how these two attributes affect each other. The code readability affects software complexity with 90.15% effectiveness using a decision tree classifier. In addition, the impact of software complexity on the readability of code using the decision tree classifier has a 90.01% prediction accuracy.
- Research Article
8
- 10.1109/tnsm.2022.3209317
- Dec 1, 2022
- IEEE Transactions on Network and Service Management
- Shaozhi Dai + 6 more
Variable logging plays a vital role in software service management. Developers usually print a set of selected variables in logs to record software system status. Due to the lack of strict logging instructions and domain-specific knowledge, it is challenging for developers to decide which variables to log. Therefore, a technology that enables developers to log high-quality log variables is desirable. There are two reasons that make such a technology feasible. First, there exists semantic relevance between logged variables and other code statements. Second, the structural relationship between variables helps the technology learn more information. In this paper, we propose a novel method to recommend variables to log: given a code snippet that needs to be followed by a logging statement, our method will tag every token in this code snippet to indicate whether it should be logged. Our method utilizes a pre-trained model to encode semantic information and a graph neural network to encode graph structure information. Given a code snippet without logging statements, our method first extracts graph structure information with the graph neural network, then fuses the graph structure information with semantic information extracted by the pre-trained model to recommend logging variables. We use the Java files of nine open-source projects to evaluate our method. The experimental results demonstrate that our method outperforms other baseline methods in terms of Hits@1, MRR, and MAP, which indicates that the quality of both the first recommended variable and all recommended variables is superior to that of the baseline models. Moreover, this improvement results from encoding better semantic information and incorporating graph structure information.
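For readers unfamiliar with the ranking metrics named in this abstract, the following minimal sketch computes Hits@1 and MRR using their standard definitions. The variable names and the two example snippets are hypothetical and are not taken from the paper's data.

```python
def hits_at_1(ranked_lists, relevant_sets):
    """Fraction of queries whose top-ranked item is relevant."""
    hits = sum(
        1 for ranked, relevant in zip(ranked_lists, relevant_sets)
        if ranked and ranked[0] in relevant
    )
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item (0 if none is found)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Hypothetical recommendations for two code snippets, best candidate first.
recommended = [["userId", "count"], ["tmp", "path", "size"]]
# Hypothetical variables that were actually logged in each snippet.
logged = [{"userId"}, {"path"}]

h1 = hits_at_1(recommended, logged)
mrr = mean_reciprocal_rank(recommended, logged)
print(h1, mrr)
```

In this toy case the first snippet's top recommendation is correct and the second snippet's correct variable appears at rank 2, so Hits@1 is 0.5 and MRR is 0.75.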
- Research Article
2
- 10.1186/s44147-022-00155-8
- Nov 8, 2022
- Journal of Engineering and Applied Science
- Eman Hosam + 3 more
In programming learning environments, the pressure of delivering many programming assignments makes plagiarism the easiest solution. This seriously threatens the learning process; therefore, automatic, fast, and accurate detection of source code plagiarism becomes essential. To detect whether a pair of Java files is plagiarized, this paper proposes four classification feature sets: (i) structural histogram features, histogram-based features for summarizing similarity matrices; (ii) lexical per-class features, extracted from a lexical similarity matrix between the classes of the two compared files based on character 3-grams; (iii) structural counting features, twelve counting features representing the code structure; and (iv) modified original features, a set of modifications on the features of the baseline used. The results show that the best feature sets in F-measure are the structural histogram features and the lexical per-class features combined, which improve the F-measure by 4% compared to the baseline. The added features slow down the execution time. However, the approach is still efficient, given that it can classify 70k pairs in 23 min. In addition, we partially re-annotated the SOurce COde Re-use dataset. After the re-annotation, the F-measure of both the baseline and our work is improved, and our work achieves an F-measure of 93.6%, which is 7.5% higher than the new F-measure of the baseline. In addition, some remarks and recommendations are provided for using the SOurce COde Re-use dataset as a benchmark.
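A lexical similarity matrix between the classes of two files, based on character 3-grams, could be sketched roughly as follows. The abstract does not state which similarity measure is used, so Jaccard similarity over 3-gram sets is an assumption here, and the class sources are hypothetical stand-ins.

```python
def char_ngrams(text, n=3):
    """Set of all character n-grams in `text` (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(classes_a, classes_b):
    """Rows: classes of file A, columns: classes of file B."""
    return [
        [jaccard(char_ngrams(x), char_ngrams(y)) for y in classes_b]
        for x in classes_a
    ]


# Hypothetical class bodies from two submitted Java files.
file_a = ["class A { int x; }", "class B { void run() {} }"]
file_b = ["class A { int x; }", "class C { String s; }"]

matrix = similarity_matrix(file_a, file_b)
for row in matrix:
    print([round(v, 2) for v in row])
```

Each cell compares one class from the first file with one class from the second; identical class bodies score 1.0, and summary features (for example histograms) could then be derived from the matrix.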
- Research Article
12
- 10.1145/3418206
- Oct 22, 2021
- ACM Transactions on Internet Technology
- Farhan Ullah + 3 more
Software piracy is the act of illegally stealing and distributing commercial software either for revenue or identity theft. Pirated applications published on Android app stores by clone scammers harm developers and their users. The scammers usually generate pirated versions of the same applications and publish them in different open-source app stores. There is no centralized system between these app stores to prevent scammers from publishing pirated applications. As most of the app stores are hosted on cloud storage, a cloud-based interaction system can prevent scammers from publishing pirated applications. In this paper, we propose an IoT-based cloud architecture for clone detection using program dependency analysis. First, the newly submitted APK and possible original files are selected from app stores. The APK Extractor and JDEX decompiler extract APK and DEX files for Java source code analysis. The dependency graphs of Java files are generated to extract a set of weighted features. The Stacked-Long Short-Term Memory (S-LSTM) deep learning model is designed to predict possible clones. Experimental results have shown that the proposed approach can achieve an average accuracy of 95.48% among clones from different application stores.
- Research Article
19
- 10.17762/turcomat.v12i10.4931
- Apr 28, 2021
- Turkish Journal of Computer and Mathematics Education (TURCOMAT)
- Aqeel Nawaz
Android is the operating system of the modern world. Today, tech-savvy people across the world give first preference to Android devices for their personal and official use. Because of the growing use of Android devices, attackers are turning their attention toward Android applications. Given this alarming increase in Android malware attacks, there is a need to develop a defence mechanism against such attacks that is effective and cost-efficient. State-of-the-art malware detection techniques perform static, dynamic or hybrid analysis. Static analysis involves examining the source code of malware samples without executing them. Dynamic analysis, in contrast, monitors the run-time behaviour of an application during its actual execution. Static analysis is a straightforward way to analyze malware samples on the Android platform. In this research, we perform hybrid analysis using four different categories of Android application features, including permissions, intents, and network features. We extract permissions and intents from the manifest file, while network-based features are extracted from Java files. Our results show that a highest precision of 0.99 can be achieved by performing feature selection using the Info Gain method. Through feature selection and the results achieved with the selected features, we find that permissions are the most relevant features among the feature categories. We have observed that the Ensemble method performs best among all four machine learning classifiers. We have also seen that network features (IP addresses, email addresses, URLs) are relevant and effective features for malware detection in the proposed framework.
- Research Article
1
- 10.14329/apjis.2020.30.3.457
- Sep 30, 2020
- Asia Pacific Journal of Information Systems
- Loveleen Kaur + 1 more
This study aims to extensively analyze the performance of various Machine Learning (ML) techniques for predicting version-to-version change-proneness of Java source code files. In this work, 17 object-oriented metrics have been utilized for predicting change-prone files using 31 ML techniques, and the proposed framework has been implemented on various consecutive releases of two Java-based software projects available as plug-ins. 10-fold and inter-release validation methods have been employed to validate the models, and statistical tests provide supplementary information regarding the reliability and significance of the results. The results of the experiments conducted in this article indicate that the ML techniques perform differently under the different validation settings. The results also confirm the proficiency of the selected ML techniques for developing change-proneness prediction models, which could aid software engineers in the initial stages of software development in classifying the change-prone Java files of a software system, in turn aiding in the trend estimation of change-proneness over future versions.
- Research Article
54
- 10.1016/j.jss.2020.110750
- Jul 23, 2020
- Journal of Systems and Software
- Valentina Lenarduzzi + 2 more
Some SonarQube issues have a significant but small effect on faults and changes. A large-scale empirical study
- Research Article
24
- 10.1186/s13635-019-0092-4
- Jun 13, 2019
- EURASIP Journal on Information Security
- Fitzroy D Nembhard + 2 more
Secure coding is crucial for the design of secure and efficient software and computing systems. However, many programmers avoid secure coding practices for a variety of reasons. Some of these reasons are lack of knowledge of secure coding standards, negligence, and poor performance of and usability issues with existing code analysis tools. Therefore, it is essential to create tools that address these issues and concerns. This article features the proposal, development, and evaluation of a recommender system that uses text mining techniques, coupled with IntelliSense technology, to recommend fixes for potential vulnerabilities in program code. The resulting system mines a large code base of over 1.6 million Java files using the MapReduce methodology, creating a knowledge base for a recommender system that provides fixes for taint-style vulnerabilities. Formative testing and a usability study determined that surveyed participants strongly believed that a recommender system would help programmers write more secure code.
- Research Article
1
- 10.14421/ijid.2018.07104
- Nov 28, 2018
- IJID (International Journal on Informatics for Development)
- Maria Ulfah Siregar + 1 more
This paper describes our research on implementing a scanner and parsers for Z specifications. Rather than coding them from scratch, we use tools that specialize in generating such components. These tools generate several Java files which can be integrated with a main program in Java. Our research produced a scanner and parser for Z specifications. These tools could help Z specifications to be studied further.
- Research Article
- 10.14419/ijet.v7i4.19.21990
- Nov 27, 2018
- International Journal of Engineering & Technology
- A Pandian + 4 more
Author identification is the task of identifying the author of a given text from a set of suspects. A central concern of this task is to define an appropriate representation of the text that captures the writing styles of authors. In this work, Weka-based machine learning tools are used to identify the author, with features extracted from documents represented using variable-size character n-grams. We wrote our own Java program to extract features such as the number of words and sentences from a poem, which are in turn fed as input to the Weka tool to identify the author. After testing the input with all the algorithms, the accuracy rates are recorded to see which algorithm gives the best accuracy. To find the author of an anonymous poem, the poem's features are extracted using the Java code, the output is stored in the Java file and given to the Weka tool, where it is tested with the algorithms, and the author name is then assigned to the anonymous poem.
- Research Article
11
- 10.1016/j.infsof.2018.09.002
- Sep 6, 2018
- Information and Software Technology
- Loveleen Kaur + 1 more
Cognitive complexity as a quantifier of version to version Java-based source code change: An empirical probe
- Research Article
51
- 10.1371/journal.pone.0187204
- Nov 2, 2017
- PLOS ONE
- Xinyu Yang + 4 more
Authorship attribution identifies the most likely author of a given sample among a set of known candidate authors. It can be applied not only to discover the original author of plain text, such as novels, blogs, emails, and posts, but also to identify the programmers of source code. Authorship attribution of source code is required in diverse applications, ranging from malicious code tracking to resolving authorship disputes and detecting software plagiarism. This paper proposes a new method to identify the programmer of Java source code samples with higher accuracy. To this end, it first introduces a back propagation (BP) neural network based on particle swarm optimization (PSO) into authorship attribution of source code. It begins by computing a set of defined feature metrics, including lexical and layout metrics and structure and syntax metrics, 19 dimensions in total. These metrics are then input to a neural network for supervised learning, the weights of which are output by the hybrid PSO and BP algorithm. The effectiveness of the proposed method is evaluated on a collected dataset of 3,022 Java files belonging to 40 authors. Experimental results show that the proposed method achieves 91.060% accuracy. A comparison with previous work on authorship attribution of Java source code shows that the proposed method outperforms the others overall, with an acceptable overhead.
- Research Article
9
- 10.17932/iau.ijemme.21460604.2017.7/1.1335-1354
- Jun 1, 2017
- International Journal of Electronics, Mechanical and Mechatronics Engineering
- Şenay Kocakoyun
In this work, an article has been prepared that can serve as a guide for researchers who want to start developing Android-based applications. Android is known as an operating system with more than one million applications. While Android sets the ground for different applications, people are aiming for unlimited entertainment in their lives and for learning while they are having fun. Mobile application development steps are described for those who want to develop a mobile app for their own business, blog, service or product, but have limited resources for it. The installation steps of the necessary software are covered. The article provides detailed information on Android SDK and Eclipse ADT installation, Android SDK directory settings, Android SDK and AVD manager settings, setting up a virtual device for Android, the hierarchical view and functions of Android project files, creating an Android project from scratch, and developing the application. Designs for building the NEU-CEIT Android application are described in detail along with the structures in XML and Java files. This work will guide researchers who want to develop applications in the Android Eclipse environment, without developing the application and running in the emulator.
- Research Article
1
- 10.21013/jte.v6.n3.p1
- Mar 28, 2017
- IRA-International Journal of Technology & Engineering (ISSN 2455-4480)
- M K Patil + 1 more
Modern programming languages, especially object-oriented languages, make it possible to create libraries of reusable components (e.g. class definitions). The majority of software companies design such components and reuse them wherever applicable. Maintaining such components (i.e. a class library) and accessing them at the right time and in the right form is challenging because of the large number of components in a library. Object-oriented programming supports the reusability of code. The major challenge in programming is to improve the learning quality and productivity of software developers, subject teachers and students. To support programming in Java, the researcher implemented a design retrieval algorithm that makes it possible to search through potentially reusable Java classes. The proposed work selects the appropriate descriptors of the input cases (.java files). It separates the code components automatically and stores them in the repository. The different levels of ambiguity in the selection of cases are controlled through a data preprocessing technique from data mining. A set of adjustments is applied to obtain the similarity of the code components.