Articles published on Code Generation
4939 Search results
Sort by Recency
- New
- Research Article
- 10.3390/informatics13020032
- Feb 11, 2026
- Informatics
- Mira Raheem + 4 more
Artificial intelligence (AI) has the potential to transform healthcare by supporting more accurate diagnoses and personalized treatments. However, its adoption in practice remains constrained by fragmented data sources, strict privacy rules, and the technical complexity of building reliable clinical systems. To address these challenges, we introduce a model-driven engineering (MDE) framework designed specifically for healthcare AI. The framework relies on formal metamodels, domain-specific languages (DSLs), and automated transformations to move from high-level specifications to running software. At its core is the Medical Interoperability Language (MILA), a graphical DSL that enables clinicians and data scientists to define queries and machine learning pipelines using shared ontologies. When combined with a federated learning architecture, MILA allows institutions to collaborate without exchanging raw patient data, ensuring semantic consistency across sites while preserving privacy. We evaluate this approach in a multi-center cancer immunotherapy study. The generated pipelines delivered strong predictive performance, with best-performing models achieving up to 98.5% accuracy on selected prediction tasks, while substantially reducing manual coding effort. These findings suggest that MDE principles—metamodeling, semantic integration, and automated code generation—can provide a practical path toward interoperable, reproducible, and reliable digital health platforms.
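The federated arrangement described in this abstract, where institutions collaborate without exchanging raw patient data, follows the general FedAvg pattern: each site trains locally and only model parameters are aggregated. The sketch below is illustrative only and is not the MILA pipeline; the three sites, the logistic-regression model, and all data are hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One site's logistic-regression update on its private data (never shared)."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))          # sigmoid predictions
        w -= lr * X.T @ (p - y) / len(y)      # gradient step on log-loss
    return w

def federated_average(global_w, site_data):
    """Aggregate per-site updates weighted by sample count (FedAvg)."""
    updates, sizes = [], []
    for X, y in site_data:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Three hypothetical institutions with private cohorts.
rng = np.random.default_rng(0)
sites = []
for _ in range(3):
    X = rng.normal(size=(100, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
    sites.append((X, y))

w = np.zeros(4)
for _ in range(10):                            # ten federation rounds
    w = federated_average(w, sites)
```

Only the weight vectors cross institutional boundaries; the raw `(X, y)` cohorts stay on site, which is the privacy property the abstract relies on.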
- New
- Research Article
- 10.38124/ijisrt/26jan1243
- Feb 6, 2026
- International Journal of Innovative Science and Research Technology
- Mahesh Kumar Damarched
Higher education institutions tend to manage decades-old legacy systems, including mainframes, COBOL-based Student Information Systems (SIS), and PeopleSoft Enterprise Resource Planning (ERP) platforms, that account for 60–80% of the IT budget, while simultaneously implementing artificial intelligence for student-facing experiences. This review studies the unexplored potential of Large Language Models (LLMs) as “intelligent copilots” for thorough legacy system modernization across the full lifecycle in higher education IT, including assessment, documentation, code translation, refactoring, testing, and optimization processes. By synthesizing recent literature (2023–2025) on LLM-enabled reverse engineering, code generation, and documentation automation, we argue that the real leverage lies at the intersection of “LLMs in education” and “LLMs for code modernization”, a convergence that published research has so far treated as two separate threads. Current modernization efforts tailored to higher education, such as AI Virtual Explorer for Research Discovery and Education (AI-VERDE), FernUni LLM Experimental Infrastructure (FLEXI), and other institutional AI gateways, are also reviewed in this study. The review proposes an end-to-end reference architecture that combines multi-agent workflows, Continuous Integration and Continuous Delivery/Deployment (CI/CD) validation, and Retrieval-Augmented Generation (RAG). Numerous studies report that LLM-assisted modernization yields 35–40% cost savings and 50% timeline reductions, allowing institutions to shift resources from maintenance to innovation. To unlock untapped technical value while empowering contemporary student and administrative experiences, this review suggests positioning LLMs as strategic enablers of dual transformation rather than merely productivity tools for educators. By using this integrated approach, universities can create sustainable digital ecosystems, operational resilience, and an unparalleled competitive advantage.
- New
- Research Article
- 10.54097/t9ty1b96
- Feb 4, 2026
- Academic Journal of Science and Technology
- Yongchuan Ren + 4 more
To address common challenges faced by discrete manufacturing enterprises during digital and intelligent transformation—such as inconsistent material coding conventions, difficulties in coordinating multi-source BOM data, and disruptions in end-to-end lifecycle traceability—this paper proposes an MES-based framework for unified code generation, management, and traceability. The proposed framework adopts a bidirectionally linked “one item, two codes” strategy and establishes a structured model consisting of a static material code and a dynamic batch code. Through configurable coding rules, the framework enables unique identification of object attributes and dynamic mapping to production batches. By integrating a rule-parsing engine with a full-chain indexing mechanism, the system achieves closed-loop process management spanning BOM import, automatic code assignment, and shop-floor execution. Engineering practice demonstrates that the proposed system effectively mitigates conflicts caused by multiple codes for the same item and significantly improves material identification efficiency and quality traceability accuracy in complex manufacturing scenarios.
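The “one item, two codes” strategy can be illustrated with a minimal sketch: a static material code derived from object attributes via a configurable rule, plus a dynamic batch code that maps the material to a production batch, with a full-chain index linking the two. The rule templates and field names below are hypothetical assumptions for illustration, not the paper's actual coding conventions.

```python
from datetime import date
from itertools import count

# Configurable rule: static material code built from object attributes.
MATERIAL_RULE = "{category}-{spec}-{seq:04d}"
# Configurable rule: dynamic batch code mapping a material to a production batch.
BATCH_RULE = "{material}-{day:%Y%m%d}-{lot:03d}"

_seq = count(1)
_lot = count(1)

def material_code(category, spec):
    """Static code: unique identification of the item's attributes."""
    return MATERIAL_RULE.format(category=category, spec=spec, seq=next(_seq))

def batch_code(material, day=None):
    """Dynamic code: links the static material code to one production batch."""
    return BATCH_RULE.format(material=material, day=day or date.today(),
                             lot=next(_lot))

m = material_code("STEEL", "A36")        # static "one item" identity
b = batch_code(m, date(2026, 2, 4))      # dynamic batch bound to that identity
index = {b: m}                           # full-chain index: batch -> material
```

Because the batch code embeds the material code, a shop-floor scan of either code resolves to the same item, which is the bidirectional link the framework describes.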
- New
- Research Article
- 10.70382/bejsmsr.v10i9.018
- Feb 4, 2026
- Journal of Systematic and Modern Science Research
- Olakekan Akinbosoye Okewale + 7 more
QR codes became one of the most widely used technologies during the COVID-19 pandemic in government, business, and various organizations, with significant usage in financial systems for payments, authentication, and access to digital services. QR codes are primarily used as digital links to web applications or for contactless interactions. As the technology has advanced, QR technology has also acquired various risks, a notable instance being QR code phishing. QR code phishing attacks lead to significant and harmful consequences, including monetary loss and data breaches, affecting both individuals and organizations, particularly in banking, mobile payment systems, and online financial services. QR codes can be compromised and redirected to harmful websites, causing unsuspecting individuals to end up on deceptive financial platforms. Recently studied countermeasures reveal significant shortcomings, such as dependence on trusted third parties for verification, excessive time complexity, and the failure to guarantee confidentiality, integrity, authentication, and availability concurrently within one system. Therefore, a stronger countermeasure is necessary to effectively reduce QR code phishing attacks, especially in financial environments where security and trust in transactions are vital. Thus, this research focused on creating a blockchain-based framework designed to reduce QR phishing attacks. The application developed includes a QR code generator along with a blockchain creation feature. It safely stores created QR codes in Base64 format, along with related attributes such as URLs, owner details, comments, and hash values, by employing a proof-of-work system. Additional research is necessary on preserving the established blockchain within a distributed network, as this would improve real-time verification systems that can reduce QR code phishing threats in financial frameworks.
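The core storage idea, saving a QR record with its attributes and a hash sealed by proof-of-work so that later tampering is detectable, can be sketched in a few lines. This is a generic proof-of-work block, not the authors' implementation; the record fields, Base64 payload, and difficulty setting are illustrative assumptions.

```python
import base64
import hashlib
import json

def mine_block(payload, prev_hash, difficulty=3):
    """Proof-of-work: find a nonce so the block hash starts with `difficulty` zeros."""
    nonce = 0
    while True:
        body = json.dumps({"payload": payload, "prev": prev_hash,
                           "nonce": nonce}, sort_keys=True)
        h = hashlib.sha256(body.encode()).hexdigest()
        if h.startswith("0" * difficulty):
            return {"payload": payload, "prev": prev_hash,
                    "nonce": nonce, "hash": h}
        nonce += 1

# Hypothetical QR record: image bytes stored as Base64 plus its attributes.
qr_record = {
    "qr_base64": base64.b64encode(b"<png bytes of the QR image>").decode(),
    "url": "https://bank.example/pay",
    "owner": "Example Bank",
    "comment": "payment terminal 7",
}
genesis = mine_block(qr_record, prev_hash="0" * 64)

def verify(block):
    """A scanner recomputes the hash; any altered attribute (e.g. a swapped URL)
    breaks verification, which is how phishing substitutions are caught."""
    body = json.dumps({"payload": block["payload"], "prev": block["prev"],
                       "nonce": block["nonce"]}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest() == block["hash"]
```

Distributing this chain across a network, the open problem the abstract names, would let any scanner verify a QR code against consensus state rather than a single server.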
- New
- Research Article
- 10.1021/acs.jpca.5c07588
- Feb 3, 2026
- The journal of physical chemistry. A
- Kalman Szenes + 5 more
In this paper, an efficient implementation of the renormalized internally contracted multireference coupled cluster with singles and doubles (RIC-MRCCSD) method in the ORCA quantum chemistry program suite is reported. To this end, Evangelista's Wick&d equation generator was combined with ORCA's native AGE code generator to implement the many-body residuals required for the RIC-MRCCSD method. Substantial efficiency gains are realized by deriving a spin-free formulation instead of the previously reported spin-orbital version developed by some of us. Since AGE produces parallelized code, the resulting implementation can be run directly in parallel, with substantial speedups when executed on multiple cores. In terms of runtime, the cost of RIC-MRCCSD is shown to lie between single-reference RHF-CCSD and UHF-CCSD, even when active spaces as large as CAS(14,14) are considered. This achievement is largely due to the fact that no reduced density matrices or cumulants higher than three-body enter the formalism. The scalability of the method to large systems is further demonstrated by computing the ground state of a vitamin B12 model comprising a CAS(12,12) active space and 809 orbitals. In terms of accuracy, RIC-MRCCSD is carefully compared to second- and approximate fourth-order n-electron valence state perturbation theories (NEVPT2, NEVPT4(SD)), to the multireference zeroth-order coupled-electron pair approximation (CEPA(0)), as well as to the IC-MRCCSD method of Köhn. In contrast to RIC-MRCCSD, the IC-MRCCSD equations are entirely derived by AGE using the conventional projection-based approach, which, however, leads to much higher algorithmic complexity as well as the necessity to calculate up to five-body RDMs. Remaining challenges, such as the variation of the results with the flow, a free parameter that enters the RIC-MRCCSD theory, are discussed.
- New
- Research Article
- 10.1161/str.57.suppl_1.dp146
- Feb 1, 2026
- Stroke
- Edward Kim + 22 more
Introduction: Post-stroke motor rehabilitation typically involves in-person therapy sessions in which clinicians prescribe tailored exercises for patients. However, access to in-person therapy is often limited, particularly for individuals in underserved and rural areas. While digital rehabilitation tools exist to bridge this gap, they frequently lack sufficient personalization that considers the patient's home environment and individual needs. To address this issue, we evaluated the feasibility of a novel AI-enabled augmented reality (AR) system that translates natural language into software using large language model (LLM) code generation at the point of care, allowing therapists to 1) design personalized, home-based exercises and 2) monitor exercise completion by patients in detail. Methods: In a prospective, single-arm proof-of-concept study, 20 therapists conducted simulated remote therapy sessions with a standardized patient with right upper extremity weakness using the AI-enabled AR system. Therapists prescribed personalized exercises with voice recordings and manual typing, which the LLM translated into software in the Scenic programming language. Running on a commercial AR headset, the software provided instructions to the patient and independently monitored the completion of each exercise step, offering therapists information to guide future exercise prescriptions. Results: The system successfully delivered 99.8% (95% CI: 98.6-100%) of the 398 instructions prescribed without errors or hallucinations. The accuracy of monitoring exercise completion was 88.4% (95% CI: 84.9-91.9%) when compared to the gold-standard evaluation by therapists. Therapists reported excellent usability (mean Likert 5-point score: 4.5 ± 0.5), and 75% indicated they would like to use the technology in clinical practice. For 90% of the therapists, the system did not carry an added risk of injury compared to current usual care with paper worksheets.
Conclusions: Our AR system can enable therapists to remotely create and deliver personalized rehabilitation exercises for stroke and other neurological conditions while monitoring completion. To our knowledge, this is the first study evaluating LLMs for real-time code generation to support clinicians in prescribing interventions in rehabilitation. This approach has the potential to expand access to individualized stroke rehabilitation beyond traditional in-person care.
- New
- Research Article
- 10.1016/j.engappai.2025.113373
- Feb 1, 2026
- Engineering Applications of Artificial Intelligence
- Xiao-Guang Zhou + 1 more
DeepSeek-R1-assisted design and maintenance of concrete-filled steel tubular structures through automated modeling code generation
- New
- Research Article
- 10.2516/stet/2026003
- Jan 29, 2026
- Science and Technology for Energy Transition
- Mounir Bensaid + 3 more
This work presents a nonlinear control approach specifically tailored for a Multi-Drive Web Winding System (MDWWS), utilizing the Integral Backstepping Control (IBSC) technique. The proposed strategy is designed to improve the precision of both speed control and mechanical tension regulation across multiple coordinated drives. A detailed formulation of the control law is provided, grounded in the Backstepping framework and extended with integral action to enhance steady-state performance. The theoretical foundations of the IBSC method are thoroughly discussed, and its performance is benchmarked against the conventional Proportional-Integral (PI) controller. The comparative study focuses on evaluating the robustness and adaptability of each control method in the presence of system parameter variations and external disturbances. To validate the effectiveness of the proposed control strategy, a Processor-in-the-Loop (PIL) setup is implemented, integrating automatic code generation with hybrid simulation. This platform enables real-time execution of the control algorithm on the TMDSCNCD28379D DSP board while emulating the dynamic behavior of the web winding system in Simulink, thus providing a realistic and efficient environment for performance evaluation.
- New
- Research Article
- 10.1007/s12273-025-1385-7
- Jan 28, 2026
- Building Simulation
- Shuhao Li + 4 more
Large language models (LLMs) exhibit significant potential in automating data-driven building load forecasting (BLF) model development, substantially reducing reliance on human effort and domain expertise. However, direct application of LLMs faces challenges, including the large and indivisible nature of optimization problems, slow optimization, error-prone code generation, and underutilization of LLM reasoning capabilities. This study introduces AutoLFM, a novel multi-agent framework leveraging LLMs to automate the end-to-end BLF model development workflow. AutoLFM decomposes the complex modeling process using a two-stage optimization strategy, with specialized LLM agents (Retriever, Reasoner, Coder, and Validator) performing distinct tasks. Key mechanisms include dynamic knowledge retrieval for prompt enhancement, data-adaptive search space generation by the Reasoner, and verification-enhanced code generation between the Coder and Validator. Experimental evaluations on three real-world building datasets demonstrate that AutoLFM efficiently generates BLF models, achieving predictive accuracy comparable to or exceeding manually designed baselines and other LLM-based methods, with an average R² improvement of 12.3%. It shortens the traditional development cycle from weeks to hours while achieving a 100% code generation success rate. An ablation study confirms the contributions of two-stage optimization, data-adaptive search space, and validation-enhanced code generation to the framework's performance and reliability. AutoLFM highlights the potential of multi-agent LLM systems in automating complex time-series forecasting tasks, significantly reducing development time and dependence on specialized knowledge.
- New
- Research Article
- 10.3389/fmars.2025.1757394
- Jan 27, 2026
- Frontiers in Marine Science
- Lang Xu + 1 more
Global maritime transport carries nearly four-fifths of world merchandise trade and is a significant source of greenhouse gas (GHG) emissions. With the GHG reduction strategies from the International Maritime Organization (IMO), the EU’s inclusion of shipping in the Emissions Trading System and the introduction of fuel GHG-intensity standards, there is an urgent need for prediction frameworks that are more robust, transparent and adaptable to evolving policy landscapes. Drawing on a structured search of the Web of Science Core Collection for the period 2020–2024, this review synthesises 1,012 peer-reviewed studies on global shipping emissions, decarbonisation measures and AI-enabled modelling. It first compares conventional approaches—fuel-based top-down inventories, AIS-driven bottom-up models and statistical or machine learning techniques—highlighting their respective strengths and limitations in terms of spatial and temporal resolution, data requirements and policy relevance. It then examines the emerging capabilities of large language models (LLMs) in knowledge integration, code generation and tool orchestration, and proposes five LLM-enabled paradigms for shipping emissions prediction, including multi-source information extraction, model orchestration, scenario construction and intelligent compliance auditing. Key technical and governance challenges are discussed, such as data quality and confidentiality, physical consistency, explainability and the environmental footprint of AI. The study argues that coupling LLMs with physics-based and data-driven models can enhance the flexibility and policy relevance of shipping emissions prediction, while a clearly defined research agenda is needed to ensure their responsible and effective use in supporting the decarbonisation of maritime transport.
- New
- Research Article
- 10.1145/3731752
- Jan 20, 2026
- ACM Transactions on Software Engineering and Methodology
- Qingyuan Liang + 8 more
The eXtensible Markup Language (XML) is a file format widely used for data transmission in modern software development. In recent years, embedding SQL statements in XML files (i.e., XML-SQL) has become a popular way to develop applications with database access capability. Typically, XML-SQL code snippets exhibit similar functionalities and structures, leading to repetitive programming work. Therefore, leveraging pre-trained code models for automated code generation presents a promising way to alleviate duplicated effort and enhance the efficiency of developing XML-SQL code. However, XML-SQL code has strong domain-specific characteristics that general pre-trained code models typically struggle to fully harness, leading to limited overall performance. In this article, we aim to address the challenge of handling this domain-specific knowledge. First, we propose a code updating task and construct the corresponding TwinXSQL dataset to better evaluate a model's code generation performance in the XML-SQL domain. Then, we leverage a characteristic that XML-SQL shares with other programming languages (i.e., all programming languages impose grammar constraints on behavior) to design a bipartite-grammar-aware training framework (named BGA) for unsupervised pre-training, thereby improving the transfer of general-purpose code models to the XML-SQL domain. Specifically, we divide XML-SQL code into two types of grammatical components: structure components and value components. During pre-training, we undertake three tasks, each designed to learn the internal information of these grammatical components and the relationships between them, enabling the pre-training process to better incorporate previously unlearned domain-specific knowledge of XML-SQL code. Our experimental results show that our trained model XSQLT5-base (220M) improves accuracy by 13.8% compared to the similarly sized CodeT5-base (220M).
Additionally, our experiments reveal that ChatGPT, due to its inability to fully learn the XML-SQL domain knowledge, achieves a much lower generation accuracy even with few-shot samples compared to our XSQLT5-base (220M) model.
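To make the split into structure components and value components concrete, the sketch below parses a hypothetical MyBatis-style XML-SQL mapper and separates tags and attributes (structure) from the embedded SQL text (values). The mapper snippet and the `split_components` helper are illustrative assumptions, not the BGA framework itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML-SQL snippet in a MyBatis-style mapper format.
XML_SQL = """
<mapper namespace="UserMapper">
  <select id="findByAge" resultType="User">
    SELECT id, name FROM users
    <where>
      age &gt;= #{minAge}
    </where>
  </select>
</mapper>
"""

def split_components(xml_text):
    """Separate structure components (tags and attributes) from value
    components (SQL text), mirroring the bipartite-grammar view."""
    root = ET.fromstring(xml_text)
    structure, values = [], []
    for elem in root.iter():
        structure.append((elem.tag, dict(elem.attrib)))
        if elem.text and elem.text.strip():
            values.append(elem.text.strip())
    return structure, values

structure, values = split_components(XML_SQL)
```

The XML skeleton constrains where SQL may appear, while the SQL fragments carry the query semantics; the pre-training tasks described above learn each part and the relationship between them.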
- New
- Research Article
- 10.1140/epjqt/s40507-026-00470-6
- Jan 20, 2026
- EPJ Quantum Technology
- Chon-Fai Kam + 1 more
We propose a device-independent quantum GPS protocol that employs multipartite Bell nonlocality to self-test a $[[5,1,3]]$ five-qubit entangled state distributed to four satellites and one ground station, providing robust security against cyberattacks. Leveraging quantum rigidity, the protocol removes reliance on device assumptions, thereby bolstering GPS resilience. In the NISQ era, we compare superconducting (98.4% fidelity, 216 ns gate time) and trapped-ion (99.8% fidelity, 483 μs gate time) platforms for five-qubit code generation, and analyze photonic distribution over 20,200 km with $0.1\text{ dB}/\text{km}$ attenuation, underscoring the necessity of LEO constellations or quantum repeaters for a scalable quantum GPS protocol.
- New
- Research Article
- 10.1080/20964471.2026.2615511
- Jan 19, 2026
- Big Earth Data
- Qianqian Luo + 8 more
Large Language Models (LLMs) have demonstrated substantial progress in task automation and natural language understanding. However, without domain expertise in geographic information science (GIS), they continue to encounter limitations including reduced accuracy and unstable performance when processing complex spatial tasks. To address these challenges, we propose GeoJSON agents—a novel multi-agent LLM architecture specifically designed for geospatial analysis. This framework transforms natural language instructions into structured GeoJSON operations through two widely adopted LLM enhancement techniques: function calling and code generation. The architecture integrates three core components: task parsing, agent collaboration, and result integration. The planner agent systematically decomposes user-defined tasks into executable subtasks, while specialized worker agents perform spatial data processing and analysis either by invoking predefined function APIs or by dynamically generating and executing Python-based analytical code. The system produces reusable, standards-compliant GeoJSON outputs through iterative refinement. To systematically evaluate both approaches, we constructed a hierarchical benchmark comprising 70 tasks spanning basic, intermediate, and advanced complexity levels, conducting experiments with OpenAI’s GPT-4o as the core model. Results indicate that the code generation–based agent achieved 97.14% accuracy, while the function calling–based agent attained 85.71%—both significantly outperforming the best-performing general-purpose model (48.57%). Comparative analysis reveals that code generation offers superior flexibility for complex, open-ended tasks, whereas function calling provides enhanced execution stability for structured operations.
This study represents the first systematic integration of GeoJSON data with a multi-agent LLM framework and provides empirical evidence comparing two mainstream enhancement methodologies in geospatial contexts, offering new perspectives for improving GeoAI system performance and reducing barriers to GIS application.
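The code-generation path can be pictured as a worker agent emitting a small, self-contained operation over GeoJSON. The bounding-box filter below is a hypothetical example of such generated code; the feature data and function name are assumptions for illustration, not part of the GeoJSON agents framework.

```python
# Hypothetical GeoJSON input of the kind the worker agents operate on.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "A"},
         "geometry": {"type": "Point", "coordinates": [121.5, 31.2]}},
        {"type": "Feature", "properties": {"name": "B"},
         "geometry": {"type": "Point", "coordinates": [116.4, 39.9]}},
    ],
}

def filter_by_bbox(fc, min_lon, min_lat, max_lon, max_lat):
    """A worker-agent-style operation: keep point features inside a bounding
    box and return a standards-compliant FeatureCollection."""
    kept = []
    for f in fc["features"]:
        lon, lat = f["geometry"]["coordinates"]
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            kept.append(f)
    return {"type": "FeatureCollection", "features": kept}

result = filter_by_bbox(feature_collection, 120.0, 30.0, 123.0, 32.0)
```

Because the output is itself valid GeoJSON, it can be chained into the next subtask, which is what makes the iterative-refinement loop composable.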
- New
- Research Article
- 10.1177/16094069261425173
- Jan 19, 2026
- International Journal of Qualitative Methods
- Wen Xu
While the extant research has provided a recipe for researchers to undertake thematic analysis (TA) in a theoretically and methodologically sound way, there has not yet been sufficient research to map out TA in the age of generative artificial intelligence (Gen AI). Building on and refining my 2020 article Applying thematic analysis to education: A hybrid approach to interpreting data in practitioner research, published in the International Journal of Qualitative Methods, which provides an example of thorough, end-to-end manual TA in a practitioner inquiry, this paper presents a follow-up example of how humans and ChatGPT can work side by side in undertaking a non-positivist, Big Q reflexive TA. Particular attention is given to how ChatGPT can expedite data transcription, code generation, theme development, interpretation, and proofreading throughout the research process, while enabling human researchers to articulate their reflexivity, mitigate algorithmic bias, produce nuanced interpretations, and uphold research ethics. Following the six phases of reflexive TA outlined by Braun and Clarke, this paper opens up possibilities for AI-assisted TA research and invites reflection on what constitutes ‘good TA’ practices in working together with nonhuman entities in qualitative inquiry.
- Research Article
- 10.1038/s41598-025-34350-3
- Jan 14, 2026
- Scientific reports
- Hussein A Al-Hashimi
The increasing reliance on automatic code generation integrated with Generative AI technology has raised new challenges for cybersecurity defense against code injection, insecure code templates, and adversarial manipulation of AI models. These risks make developing advanced frameworks imperative to ensure secure, reliable, and privacy-preserving code generation processes. This paper presents a novel Hybrid Artificial Neural Network (ANN)-Interpretive Structural Modeling (ISM) framework to alleviate the cybersecurity risks associated with automatic code generation using Generative AI. The proposed framework integrates the predictive capability of ANN with the structured analysis of ISM for the identification, evaluation, and treatment of common vulnerabilities and risks in automatic code generation. We first conduct a multivocal literature review (MLR) to identify cybersecurity risks and generative AI practices for addressing these risks in automatic code generation. We then conduct a questionnaire survey to validate the identified risks and practices, followed by an expert panel review as part of the ANN-ISM process. The ANN model predicts potential security risks by learning from historical data and code generation patterns. ISM is used to (1) structure and visualize relations between identified risks and mitigation approaches and (2) offer a combined, multi-layered risk management methodology. We then perform an in-depth examination of the framework through a case study of an AI-based code generation company, and further assess its practicality and usefulness in real-world settings. The case study results show that the framework efficiently handles the primary cybersecurity challenges, such as injection attacks, code quality, backdoors, and lack of input validation. The analysis characterizes the maturity of several mitigation practices and identifies areas for improvement in integrating security with automatic code generation functionality. The framework enables advanced risk mitigation across multiple process areas, where techniques such as static code analysis, automated penetration testing, and adversarial training hold much promise. The hybrid ANN-ISM mechanism is a stable and flexible solution for cybersecurity risk reduction in automatic code generation environments. The coupling of ANN and ISM, providing predictive analysis and structured risk management respectively, contributes effectively to the security of AI-based code generation tools. More research is required to improve the framework's scalability, privacy preservation, and dynamic integration with cybersecurity threat intelligence.
- Research Article
- 10.1038/s41467-025-67922-y
- Jan 12, 2026
- Nature communications
- Mike A Merrill + 19 more
Deriving personalized insights from popular wearable trackers requires complex numerical reasoning that challenges standard LLMs, necessitating tool-based approaches like code generation. Large language model (LLM) agents present a promising yet largely untapped solution for this analysis at scale. We introduce the Personal Health Insights Agent (PHIA), a system leveraging multistep reasoning with code generation and information retrieval to analyze and interpret behavioral health data. To test its capabilities, we create and share two benchmark datasets with over 4000 health insights questions. A 650-hour human expert evaluation shows that PHIA significantly outperforms a strong code generation baseline, achieving 84% accuracy on objective, numerical questions and, for open-ended ones, earning 83% favorable ratings while being twice as likely to achieve the highest quality rating. This work can advance behavioral health by empowering individuals to understand their data, enabling a new era of accessible, personalized, and data-driven wellness for the wider population.
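The kind of numerical reasoning the agent delegates to generated code can be illustrated with a toy question ("Do I walk more on weekends?") over a hypothetical daily step log; this sketch is not PHIA's actual code, and the data and function name are invented for the example.

```python
from datetime import date
import statistics

# Hypothetical wearable log: daily step counts for two weeks of January 2026.
steps = {
    date(2026, 1, d): s
    for d, s in zip(range(1, 15),
                    [4200, 6100, 8000, 7500, 5200, 9800, 10100,
                     3900, 6600, 7200, 8100, 5000, 9400, 9900])
}

def weekend_vs_weekday(log):
    """The sort of snippet an agent might generate for a behavioral question:
    compare mean steps on weekend days (Sat/Sun) against weekdays."""
    weekend = [s for d, s in log.items() if d.weekday() >= 5]
    weekday = [s for d, s in log.items() if d.weekday() < 5]
    return statistics.mean(weekend), statistics.mean(weekday)

wk_end, wk_day = weekend_vs_weekday(steps)
```

Exact arithmetic like this is where generated code beats direct LLM answering: the model writes the program, and the runtime does the numbers.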
- Research Article
- 10.3390/mi17010084
- Jan 7, 2026
- Micromachines
- Yiwen Kang + 1 more
With the rapid advancement of Transformer-based large language models (LLMs), these models have found widespread applications in industrial domains such as code generation and non-functional requirement (NFR) classification in software engineering. However, recent research has primarily focused on optimizing linear matrix operations, while nonlinear operators remain relatively underexplored. This paper proposes hardware-efficient approximation and acceleration methods for the Softmax and RMSNorm operators to reduce resource cost and accelerate Transformer inference while maintaining model accuracy. For the Softmax operator, an additional range reduction based on the SafeSoftmax technique enables the adoption of a bipartite lookup table (LUT) approximation and acceleration. The bit-width configuration is optimized through Pareto frontier analysis to balance precision and hardware cost, and an error compensation mechanism is further applied to preserve numerical accuracy. The division is reformulated as a logarithmic subtraction implemented with a small LOD-driven lookup table, eliminating expensive dividers. For RMSNorm, LOD is further leveraged to decompose the reciprocal square root into mantissa and exponent parts, enabling parallel table lookup and a single multiplication. Based on these optimizations, an FPGA-based pipelined accelerator is implemented, achieving low operator-level latency and power consumption with significantly reduced hardware resource usage while preserving model accuracy.
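The SafeSoftmax range reduction and the log-domain reformulation of the division can be stated in plain floating point before any LUT quantization is applied. The reference versions below restate only the math from the abstract; they do not reproduce the paper's fixed-point bit widths, bipartite LUTs, or LOD hardware.

```python
import numpy as np

def safe_softmax(x):
    """SafeSoftmax: subtract the running max before exponentiation so every
    exponent is <= 0, bounding the input range a lookup table must cover."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max()              # range reduction: all values in (-inf, 0]
    e = np.exp(shifted)
    return e / e.sum()

def softmax_via_log_division(x):
    """The division reformulated as a subtraction in the log domain, the trick
    the accelerator implements with a small lookup table instead of a divider."""
    x = np.asarray(x, dtype=np.float64)
    shifted = x - x.max()
    log_denominator = np.log(np.exp(shifted).sum())
    return np.exp(shifted - log_denominator)
```

Both paths are algebraically identical; the hardware win comes from the second form needing only exp/log approximations and a subtractor rather than a full divider.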
- Research Article
- 10.1093/bioinformatics/btag015
- Jan 2, 2026
- Bioinformatics (Oxford, England)
- Cameron S Movassaghi + 2 more
Scientific software packages impose persistent maintenance costs due to dependency churn, version incompatibilities, and bug triage, even when the underlying algorithms are stable and well described. At the same time, peer-reviewed publications already function as the canonical record of many computational methods, yet translating narrative method descriptions into usable code remains labor-intensive and error-prone. Recent advances in large language models (LLMs) raise the question of whether published articles alone can serve as sufficient specifications for on-demand code generation, potentially reducing reliance on continuously maintained libraries. We systematically evaluated state-of-the-art LLMs by tasking them with implementing core algorithms using only the original scientific publications as input. Across a diverse benchmark including random forests, batch correction methods, gene regulatory network inference, and gene set enrichment analysis, we show that modern LLMs can frequently reproduce package-level functionality with performance indistinguishable from established libraries. Failures and discrepancies primarily arose when manuscripts underspecified implementation details or data structures, rather than from limitations in model reasoning. These results demonstrate that literature-driven code generation is already feasible for many well-specified algorithms, while also exposing where current publication standards hinder reproducibility. All prompts, generated code, evaluation scripts, and benchmark datasets are publicly available at https://github.com/xomicsdatascience/articles-to-code.
- Research Article
- 10.18178/ijeetc.15.1.19-28
- Jan 1, 2026
- International Journal of Electrical and Electronic Engineering & Telecommunications
- Yousef Alraba’Nah + 3 more
As generative Artificial Intelligence (AI) models become increasingly integrated into software development workflows, understanding their efficiency and code quality is critical. This study offers a comprehensive comparison of three leading AI models—ChatGPT GPT-4-turbo, Claude Sonnet, and DeepSeek-V3—for automated code generation, focusing specifically on sorting algorithms. The models are evaluated across multiple metrics including execution time, memory usage, peak memory consumption, logical and physical file sizes, and code readability. Python implementations of Insertion Sort, Merge Sort, Quick Sort, and Heap Sort are generated by each model and benchmarked in a consistent Linux Docker environment. Results reveal that ChatGPT leads in overall efficiency, with the fastest average execution time, the lowest peak memory usage, and the highest readability scores. DeepSeek demonstrated competitive performance, especially in producing readable code, while Claude showed higher memory consumption and lower readability. This analysis provides practical insight into the trade-offs between code quality and system performance in AI-generated programming, offering valuable guidance for researchers and developers alike.
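A minimal version of the benchmarking loop, measuring execution time and peak memory for one generated sort, might look like the sketch below. The insertion sort stands in for model-generated code, and the data size is illustrative; the paper's Docker environment, file-size measurements, and readability scoring are not reproduced here.

```python
import random
import time
import tracemalloc

def insertion_sort(a):
    """Stand-in for a model-generated sorting implementation."""
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def benchmark(sort_fn, data):
    """Measure wall-clock time and peak allocated memory for one sort run,
    mirroring two of the metrics compared across the three models."""
    tracemalloc.start()
    t0 = time.perf_counter()
    out = sort_fn(data)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return out, elapsed, peak

random.seed(0)
data = [random.randint(0, 10_000) for _ in range(2_000)]
sorted_out, secs, peak_bytes = benchmark(insertion_sort, data)
```

Running every model's output through one fixed harness like this is what makes the cross-model timing and memory numbers comparable.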
- Research Article
- 10.1016/j.neunet.2026.108606
- Jan 1, 2026
- Neural networks : the official journal of the International Neural Network Society
- Shanzhi Gu + 8 more
Mitigating sensitive information leakage in LLMs4Code through machine unlearning.