Articles published on Instruction Set
- Research Article
- 10.1145/3774419
- Nov 6, 2025
- ACM Transactions on Architecture and Code Optimization
- Naorin Hossain + 1 more
Ensuring correct execution of programs running on today's parallel systems becomes difficult when memory is shared across several processing units. Memory consistency models (MCMs) were defined to provide a contract between different levels of the hardware-software stack to specify shared memory access orderings for correct implementations. However, instruction set architecture (ISA) MCMs traditionally only reason about the program-visible impacts of shared memory accesses for user-facing program instructions. In virtual memory systems, though, there are additional hardware- and operating system (OS)-level shared memory accesses that occur to facilitate address translation and may impact program execution. Memory transistency models (MTMs) were thus coined to define a superset of MCMs that additionally accounts for underlying virtual memory operations. However, MTM implementations are complex, as they are managed across the hardware and OS, making them difficult to specify. In such cases, empirical testing is the usual approach for effective validation of a specification. However, empirical MTM testing is challenging due to the complex ordering relationships between hardware, OS, and user-level operations, as well as the inability to explicitly run and control virtual memory operations from user-level programs. While many MCM testing tools exist and have been used to uncover system bugs, there is no existing work on developing systematic MTM testing techniques, nor has there been any analysis of how effective such techniques are. In this work, we introduce TEMpesT, the first full-system framework that uses ISA-agnostic techniques for performing fuzz testing of MTMs across both the hardware and OS levels. TEMpesT uses enhanced litmus tests (ELTs) with novel targeted user-level MTM fuzzing techniques to induce virtual memory operations and coax out corner-case behaviors. We used TEMpesT to validate the formal MTM specification x86t_elt on an Intel x86 system running Linux. Our results show TEMpesT's techniques are able to induce high outcome variety, with 94% of all possible outcomes observed within just 10,000 iterations of each ELT synthesized for x86t_elt, outperforming prior methodology by 5.8×. This paper also introduces MTM mutation tests that we used to evaluate TEMpesT's fuzzing techniques and demonstrate effective MTM bug detection.
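To make the outcome-coverage metric concrete, here is a minimal harness sketch (not TEMpesT itself): run a litmus test many times, collect the distinct outcomes observed, and report coverage against the set of outcomes a formal model permits. The test body, allowed-outcome set, and skewed sampling below are illustrative assumptions standing in for real hardware behavior.

```python
"""Minimal sketch of outcome-coverage measurement for litmus-test
fuzzing. Everything here is hypothetical: the allowed-outcome set and
the skewed sampler stand in for real ELT executions on hardware."""
import random

# Hypothetical allowed outcomes for a 2-thread message-passing test,
# as a formal model such as x86t_elt might enumerate them.
ALLOWED = {(0, 0), (1, 0), (1, 1)}  # (r0, r1) register pairs permitted

def run_litmus_once(rng):
    # Stand-in for executing an enhanced litmus test (ELT); the skewed
    # weights mimic corner-case outcomes that appear rarely.
    return rng.choices(sorted(ALLOWED), weights=[90, 9, 1])[0]

def coverage(iterations=10_000, seed=0):
    rng = random.Random(seed)
    seen = {run_litmus_once(rng) for _ in range(iterations)}
    return len(seen & ALLOWED) / len(ALLOWED)

if __name__ == "__main__":
    print(f"outcome coverage: {coverage():.0%}")
```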
- Research Article
- 10.1088/1361-6404/ae126d
- Nov 3, 2025
- European Journal of Physics
- L G Vieira
This article presents a practical method for measuring the spatial irradiance distribution of the infrared beam emitted by a globar in a Fourier Transform Infrared (FTIR) spectrometer. The approach, intended for students with access to standard FTIR instrumentation, provides insights into the inhomogeneity of the infrared beam profile, which can be well approximated by a Gaussian distribution. By recording just a few spectra using different apertures and applying a numerical fitting, the method enables the determination of the irradiance distribution width. Beyond offering a deeper understanding of beam characteristics in FTIR spectroscopy, this technique serves as a useful diagnostic tool for monitoring the performance and long-term stability of the spectrometer source. Its simplicity and low experimental requirements make it particularly well suited for instructional settings.
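As an illustration of the fitting step, the sketch below recovers the 1/e² beam radius w from aperture-transmitted power, using the closed form P(r) = P0(1 − exp(−2r²/w²)) for a centered Gaussian beam passing a circular aperture of radius r. The measurement values are made-up example data, not from the paper.

```python
"""Fit a Gaussian beam radius from a few aperture measurements.
The data points below are invented for illustration."""
import numpy as np
from scipy.optimize import curve_fit

def encircled_power(r, p_tot, w):
    # Power of a centered Gaussian beam passing an aperture of radius r.
    return p_tot * (1.0 - np.exp(-2.0 * r**2 / w**2))

# Hypothetical data: aperture radius (mm) vs. integrated signal (a.u.).
r_mm = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
p_au = np.array([0.11, 0.39, 0.86, 1.00, 1.01])

(p_tot, w), _ = curve_fit(encircled_power, r_mm, p_au, p0=[1.0, 2.0])
print(f"fitted 1/e^2 beam radius w = {w:.2f} mm")
```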
- Research Article
- 10.63385/ipt.v1i3.114
- Nov 2, 2025
- Innovations in Pedagogy and Technology
- Abraham Abby Sen + 4 more
As Artificial Intelligence (AI) reshapes educational assessment practices, there is a growing need to examine existing frameworks through the lens of Knowledge Management (KM). While models such as TALiP, Assessment for Learning (AfL), Popham's Model, and Stiggins' Five Pillars offer important foundations for assessment literacy, they lack structured mechanisms for systematic knowledge generation, transfer, and alignment with AI-generated insights. This study introduces a novel contribution: a seven-phase KM-AI integration framework designed to support responsible and pedagogically aligned AI adoption in educational assessment. Unlike existing approaches, this framework embeds KM principles into the full lifecycle of AI-supported assessment: capturing tacit expertise, contextualizing algorithmic outputs, and enabling iterative learning across instructional settings. The framework is grounded in theoretical analysis and refined through a hypothetical use case and a documented institutional deployment of an AI-powered teaching assistant. Together, these cases illustrate how the framework can enhance teacher agency, support ethical AI use, and improve assessment coherence even in low-resource environments. The outcome is a practical roadmap for educators, policymakers, and developers that ensures AI tools strengthen rather than displace human-centered assessment practices. This study advances both theory and practice by providing an actionable, scalable model for KM-AI alignment in the era of digital transformation.
- Research Article
- 10.11591/ijres.v14.i3.pp843-854
- Nov 1, 2025
- International Journal of Reconfigurable and Embedded Systems (IJRES)
- Ashuthosh Moolemajalu Ravikumar + 3 more
Graph algorithms are essential in domains like social network analysis, web search, and bioinformatics. Their execution on modern hardware is vital due to the growing size and complexity of graphs. Traditional multi-core systems struggle with the irregular memory access patterns of graph workloads. RISC-V (reduced instruction set computer, fifth generation) many-core processors offer a promising alternative with their customizable, open-source architecture suited to optimization. This work focuses on parallelizing graph algorithms such as breadth-first search (BFS) and PageRank (PR) on RISC-V many-core systems. We evaluated performance based on graph structure and processor architecture, and developed an analytical model to predict execution time. The model incorporates the unique characteristics of the RISC-V architecture and the types and numbers of instructions executed by multiple cores, with a maximum prediction error of 11%. Our experiments show speedups of up to 11.55× for BFS and 7.56× for PR using 16 and 8 cores, respectively, over single-core performance. Comparisons with existing graph processing frameworks demonstrate that RISC-V systems can deliver up to 20× better energy efficiency on real-world graphs from the Network Repository.
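A hedged sketch of the kind of instruction-mix execution-time model the abstract describes: per-class instruction counts weighted by assumed cycles-per-instruction, divided by clock frequency, with an Amdahl-style scaling term for multiple cores. All class names, CPI values, counts, and the parallel fraction are hypothetical placeholders, not the paper's calibrated model.

```python
"""Toy analytical execution-time model for a many-core RISC-V system.
All numbers below are illustrative assumptions."""

# Assumed cycles-per-instruction for a simple in-order RISC-V core.
CPI = {"alu": 1.0, "load": 2.5, "store": 1.5, "branch": 1.8}

def predict_seconds(counts, freq_hz, cores, parallel_fraction=0.95):
    """Amdahl-style estimate: serial part plus parallel part / cores."""
    cycles = sum(CPI[k] * n for k, n in counts.items())
    t1 = cycles / freq_hz                       # single-core time
    return t1 * ((1 - parallel_fraction) + parallel_fraction / cores)

# Hypothetical dynamic instruction counts for a BFS-like workload.
counts = {"alu": 4.0e8, "load": 1.5e8, "store": 0.5e8, "branch": 0.8e8}
for cores in (1, 8, 16):
    print(cores, "cores:", round(predict_seconds(counts, 1.0e9, cores), 3), "s")
```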
- Research Article
- 10.11591/ijres.v14.i3.pp705-716
- Nov 1, 2025
- International Journal of Reconfigurable and Embedded Systems (IJRES)
- B Muthu Nisha + 1 more
The vision of sustainable development goal 9 (SDG 9) is realized through the integration of innovative technologies in cyber-physical systems (CPS). This work focuses on a smart network meter (SNM) application designed to manage the extensive big data analytics required for processing and analyzing vast amounts of aggregated data in a short period. To address these demands, an advanced explicitly parallel instruction computing (AEPIC) approach is employed, leveraging a multi-core hardware security module (HSM) built on the elliptic curve cryptography (ECC) algorithm. Implementing the algorithm on various field programmable gate arrays (FPGAs) ensures adaptability to different hardware configurations, delivering scalable and optimized performance for big data aggregation in SNM applications. In design analysis, the proposed module performs well: the Virtex-7 FPGA proves well suited to big data analytics in smart network applications, with dynamic power accounting for 55% of total power and an on-chip power of 0.542 W.
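For context on the cryptographic kernel such an HSM accelerates, here is a self-contained sketch of elliptic curve scalar multiplication (double-and-add) on a short Weierstrass curve y² = x³ + ax + b over GF(p). The curve parameters are deliberately tiny and insecure; the paper's module implements a production-strength ECC design in FPGA hardware.

```python
"""Double-and-add ECC scalar multiplication on a toy curve.
NOT cryptographically secure; for illustration only."""

P_MOD, A, B = 97, 2, 3          # toy curve y^2 = x^3 + 2x + 3 mod 97
O = None                        # point at infinity

def add(p, q):
    if p is O: return q
    if q is O: return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O                # p + (-p) = O
    if p == q:                  # point doubling
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD
    else:                       # point addition
        lam = (y2 - y1) * pow(x2 - x1, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def scalar_mul(k, p):
    # Left-to-right double-and-add over the bits of k.
    acc = O
    for bit in bin(k)[2:]:
        acc = add(acc, acc)
        if bit == "1":
            acc = add(acc, p)
    return acc

G = (3, 6)                      # a point on the toy curve: 6^2 = 36 = 3^3+2*3+3 mod 97
print(scalar_mul(5, G))
```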
- Research Article
- 10.1109/tip.2025.3625764
- Oct 31, 2025
- IEEE Transactions on Image Processing
- Yunjian Feng + 2 more
Weather image translation technologies aim to convert sunny images into various weather scenes, addressing the costly acquisition of highly demanded, diverse weather samples. However, existing weather translation methods based on generative adversarial networks (GANs) have limited generalization capability, resulting in translated images that lack authenticity and diversity. In contrast, emerging image generation technologies based on diffusion models have greatly surpassed GAN-based ones in performance, becoming the dominant paradigm in various visual tasks. This work pioneers the application of diffusion models to weather translation and presents a novel Instruction-driven Multi-Weather Translation (InstructWT) method. InstructWT is built on the large image editing model InstructPix2Pix and leverages the latter's zero-shot generalization capabilities. We develop a user-friendly translation instruction set through prompt engineering and introduce a weather intensity factor for precise control of weather effects, thereby enhancing the authenticity and diversity of the translated weather images. A weather correlation-based blended editing technique is employed to maintain the layout and structure of the original image content. Additionally, a physical rendering approach for rain and snow is incorporated to further improve the translations' realism. Comparative experiments on a public dataset, Cityscapes, demonstrate that InstructWT outperforms existing methods in terms of authenticity and fidelity. Specifically, InstructWT achieves Contrastive Language-Image Pre-Training (CLIP) image embedding cosine similarity and directional CLIP similarity scores of 0.8302 and 0.1598, respectively. Furthermore, several semantic segmentation algorithms fine-tuned on the multi-weather scene dataset augmented by InstructWT show significant improvement in segmentation on all complex weather scenarios.
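To make the underlying editing mechanism concrete, below is a minimal sketch using the public InstructPix2Pix checkpoint via the diffusers library, which InstructWT builds upon. The paper's weather intensity factor, correlation-based blended editing, and physical rain/snow rendering are not reproduced; folding intensity into the prompt wording is our own simplification, and the file names are placeholders.

```python
"""Instruction-driven weather editing with the public InstructPix2Pix
checkpoint. The intensity-to-prompt mapping is our simplification."""
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

image = Image.open("sunny_street.png").convert("RGB")  # placeholder input
intensity = 0.8  # hypothetical 0..1 weather-intensity knob
prompt = f"make it a {'heavy' if intensity > 0.5 else 'light'} snowstorm"

out = pipe(
    prompt,
    image=image,
    num_inference_steps=20,
    guidance_scale=7.5,        # strength of the text instruction
    image_guidance_scale=1.5,  # preserve original layout/structure
).images[0]
out.save("snowy_street.png")
```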
- Research Article
- 10.3390/app152111634
- Oct 31, 2025
- Applied Sciences
- Zhiwei Jin + 3 more
The widespread adoption of fifth-generation Reduced Instruction Set Computing (RISC-V) processors in embedded systems has driven advancements in domestic processor design. However, research on processor performance optimization predominantly focuses on two- to three-stage pipeline architectures, with relatively few studies addressing complex five-stage pipeline processors. This study addresses that gap by analyzing optimization strategies for a five-stage pipeline processor architecture. Key areas examined include RISC-V jump instruction branch prediction (speed optimization), memory structure (memory access and resource optimization), and data-correlation-based division operations (fetch optimization). We benchmarked the processor core with CoreMark on a Field Programmable Gate Array (FPGA), analyzing the impact of optimizations such as branch prediction and cache on processor performance. The final processor achieved a CoreMark score of 2.92 CoreMark/MHz, outperforming most open-source processors and validating the effectiveness of the optimization strategies.
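As a point of reference for the branch prediction optimization mentioned above, here is a behavioral model of a classic table of 2-bit saturating counters. This is a Python sketch for intuition only, not the paper's RTL; the table size and PC indexing are arbitrary choices.

```python
"""Behavioral model of a 2-bit saturating-counter branch predictor.
Sizes and indexing are illustrative assumptions."""

class TwoBitPredictor:
    def __init__(self, entries=64):
        self.table = [1] * entries  # counters 0..3; start weakly not-taken

    def _idx(self, pc):
        return (pc >> 2) % len(self.table)  # drop byte offset, hash by PC

    def predict(self, pc):
        return self.table[self._idx(pc)] >= 2  # True = predict taken

    def update(self, pc, taken):
        i = self._idx(pc)
        self.table[i] = min(3, self.table[i] + 1) if taken else max(0, self.table[i] - 1)

# A loop branch (taken 9 times, then not taken) is mostly predicted well.
bp, hits = TwoBitPredictor(), 0
for trip in range(10):
    taken = trip < 9
    hits += bp.predict(0x80000010) == taken
    bp.update(0x80000010, taken)
print(f"accuracy: {hits}/10")
```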
- Research Article
- 10.3390/electronics14214171
- Oct 25, 2025
- Electronics
- Marouene Boubakri + 1 more
RISC-V has emerged as a compelling alternative to proprietary instruction set architectures, distinguished by its openness, extensibility, and modularity. As the ecosystem matures, attention has turned to building confidential computing foundations, notably Trusted Execution Environments (TEEs) and secure enclaves, to support sensitive workloads. These efforts explore a variety of design directions, yet reveal important trade-offs. Some approaches achieve strong isolation guarantees, but fall short in scalability or broad adoption. Others introduce defenses, such as memory protection or side-channel resistance, although often with significant performance costs that limit deployment in constrained systems. Lightweight enclaves address embedded contexts, but lack the advanced security features demanded by complex applications. In addition, early-stage development, complex programming models, and limited real-world validation hinder their usability. This survey reviews the current landscape of RISC-V TEEs and secure enclaves, analyzing their architectural principles, strengths, and weaknesses. To the best of our knowledge, this is the first work to present such a consolidated view. Finally, we highlight open challenges and research opportunities, aiming toward establishing a cohesive and trustworthy RISC-V trusted computing ecosystem.
- Research Article
- 10.59319/arete.v3i2.990
- Oct 23, 2025
- Αρετή (Arete): Journal of Excellence in Global Leadership
- Detra Lynn Mills + 3 more
Background: This case study explores power dynamics through the story of Lucia, a first-year law student and victim of technology-facilitated sexual harassment by a professor at her university. Readers will analyze the issues and conditions facing Lucia, consider potential courses of action, and evaluate actionable solutions for this real-world scenario. Objectives: Readers will explore the complexities of Role Theory, inequities inherent in professor and student relationships, conflicts of interest in institutional settings, and potential pitfalls of digital interconnectedness while developing a broader understanding of appropriate ethical boundaries. Learning Outcomes: As a result of heightened awareness, readers will develop an awareness and ability to self-advocate for appropriate relationships within the academy, to recognize signs of sexual harassment, and to exercise greater autonomy over their digital existence. Use: This case study is intended for instructional settings in university gender studies, communications, and ethics courses as well as in academic and corporate human resources, board and governance, and business management settings. Some content may not be suitable for minors. Teaching Notes: Teaching notes and materials will be made available upon request of a verified instructor or educator. Limitations: Names and locations were anonymized to protect the privacy and integrity of those involved, and some narrative was editorialized for purposes of clarity. Any dialogue presented in this case study is representative of conversations and data that is either protected or no longer available through public channels.
- Research Article
- 10.59319/arete.v3i2.1001
- Oct 23, 2025
- Αρετή (Arete): Journal of Excellence in Global Leadership
- Meredith Williams + 3 more
Background: The principles of Universal Design (UD) have been adopted and adapted in educational settings using various frameworks over the years, including Universal Design for Learning, Universal Design for Instruction, Quality Matters, Universal Instructional Design, and Integrated Multicultural Design. Each model has nuanced differences while complementing the others in principle and purpose for continuous improvement in collegial environments. This article examines and compares the existing literature on universal design in higher education settings. Objectives: This article analyzes and synthesizes multiple universal design models used in education, identifies common themes, and assesses their relevance to the field of higher education. It examines their application in diverse instructional settings such as online classrooms, graduate programs, and globally diverse cultures. Approach: The review is guided by Universal Design as its theoretical framework. Peer-reviewed articles, scholarly works, and professional resources were identified using targeted keywords, including “Universal Design,” “UD Models,” and “Universal Design in Education,” and examined through thematic analysis. Results: Results of the research contrast elements of multiple models in the context of higher education and provide insight for future research globally. Conclusions: Universal Design principles continue to evolve as viable frameworks for improving student outcomes in higher education. The most prominent models share similar characteristics and continue to show promise in helping all learners in various ways.
- Research Article
- 10.54097/wkc5bp95
- Oct 19, 2025
- Journal of Education and Educational Research
- Bin Zhou
This article focuses on the application of AIGC technology in digital interior design instruction, exploring its innovative value and practical approaches. AIGC, with its powerful multimodal content generation capabilities, has demonstrated potential across various creative industries. Meanwhile, the interior design industry is experiencing an accelerated digital transformation, necessitating urgent reforms in instructional settings. Through case analysis and practical validation, this study explores specific methods for AIGC-enabled instruction, aiming to provide new insights for instructional reform, promote adaptation to industry needs, and cultivate more innovative, practice-oriented talent.
- Research Article
- 10.1021/acs.jpca.5c04136
- Oct 13, 2025
- The Journal of Physical Chemistry A
- Andrey Asadchev + 1 more
We report an implementation of the McMurchie–Davidson evaluation scheme for 1- and 2-particle Gaussian AO integrals designed for processors with Single Instruction Multiple Data (SIMD) instruction sets. Like in our recent MD implementation for graphical processing units (GPUs) [Asadchev, A.; Valeev, E. F. J. Chem. Phys. 2024, 160, 244109], variable-sized batches of shellsets of integrals are evaluated at a time. By optimizing for floating-point instruction throughput rather than minimizing the number of operations, this approach achieves up to 50% of the theoretical hardware peak FP64 performance on many common SIMD-equipped platforms (AVX2, AVX512, NEON), which translates to speedups of up to 30× over the state-of-the-art one-shellset-at-a-time implementation of Obara–Saika-type schemes in Libint for a variety of primitive and contracted integrals. As in our previous work, we rely on standard C++ programming language features, such as the std::simd standard library feature to be included in the 2026 ISO C++ standard, without any explicit code generation, to keep the code base small and portable. The implementation is part of the open-source LibintX library, freely available at https://github.com/ValeevGroup/libintx.
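A rough numpy analogue of the batching idea: evaluating a large batch of primitive integrals with vectorized arithmetic, the same throughput-over-operation-count trade the abstract describes for SIMD units, shown for the simplest case of unnormalized s-type Gaussian overlap integrals. The actual McMurchie–Davidson recurrences in LibintX are far more involved.

```python
"""Vectorized batch evaluation of unnormalized s-type Gaussian overlap
integrals, S = (pi/(a+b))^(3/2) * exp(-(a*b/(a+b)) * |A-B|^2).
Batch sizes and exponent ranges are illustrative."""
import numpy as np

def overlap_batch(alphas, betas, AB2):
    # alphas, betas, AB2: 1-D arrays over a batch of primitive pairs.
    p = alphas + betas
    return (np.pi / p) ** 1.5 * np.exp(-(alphas * betas / p) * AB2)

rng = np.random.default_rng(0)
n = 1_000_000                           # one large batch, SIMD-friendly
a = rng.uniform(0.1, 10.0, n)           # primitive exponents
b = rng.uniform(0.1, 10.0, n)
ab2 = rng.uniform(0.0, 4.0, n)          # squared center distances
print(overlap_batch(a, b, ab2)[:3])
```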
- Research Article
- 10.38088/jise.1712080
- Oct 10, 2025
- Journal of Innovative Science and Engineering (JISE)
- Latif Akçay + 1 more
Digital signal processing applications are becoming increasingly important because modern systems work with much larger amounts of data than before. The Discrete Cosine Transform (DCT), used in almost all multimedia compression methods, creates a significant computational load especially in resource-constrained embedded systems. This study proposes four custom operations compatible with Transport-Triggered Architecture (TTA). To enhance computational efficiency and avoid floating-point overhead, fixed-point arithmetic is used. To analyse the effect of the proposed operations, different Application-Specific Instruction Set Processor (ASIP) configurations were created on a general-purpose processor architecture. Performance analyses show that speedups between 2x and 3.5x are achieved. In addition, the developed processor models have been implemented in hardware. FPGA synthesis results indicate a reasonable increase in chip area, showing that the proposed solutions could be an efficient alternative, particularly for limited-resource embedded systems.
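To illustrate the fixed-point arithmetic choice described above, here is a sketch of an 8-point DCT-II computed with integer arithmetic by pre-scaling the cosine basis to Q1.14 fixed point. The word length and rounding are illustrative choices, not the paper's TTA custom-operation design.

```python
"""8-point DCT-II in fixed-point integer arithmetic (Q1.14 basis)."""
import math

N, FRAC = 8, 14                       # Q1.14 coefficients
SCALE = 1 << FRAC

# Integer orthonormal cosine basis:
# C[k][n] ~ sqrt((1 if k==0 else 2)/N) * cos(pi*(2n+1)*k/(2N)) * 2^14
COEF = [[round(SCALE * math.sqrt((1 if k == 0 else 2) / N)
               * math.cos(math.pi * (2 * n + 1) * k / (2 * N)))
         for n in range(N)] for k in range(N)]

def dct8_fixed(x):
    """x: 8 integer samples -> 8 integer DCT coefficients (rounded)."""
    return [(sum(COEF[k][n] * x[n] for n in range(N)) + SCALE // 2) >> FRAC
            for k in range(N)]

print(dct8_fixed([64, 64, 64, 64, 0, 0, 0, 0]))
```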
- Research Article
- 10.1145/3763162
- Oct 9, 2025
- Proceedings of the ACM on Programming Languages
- Sirui Lu + 1 more
Modern optimizing compilers generate efficient code but rarely achieve theoretical optimality, often necessitating manual fine-tuning. This is especially the case for processors with vector instructions, which can grow the instruction set by an order of magnitude. Super-optimizers can synthesize optimal code, but they face a fundamental scalability constraint: as the size of the instruction set increases, the length of the longest synthesizable program decreases rapidly. To help super-optimizers deal with large instruction sets, we introduce HieraSynth, a parallel framework for super-optimization that decomposes the problem by hierarchically partitioning the space of candidate programs, effectively decreasing the instruction set size. It also prunes search branches when the solver proves unrealizability, and explores independent subspaces in parallel, achieving near-linear speedup. HieraSynth is sufficiently efficient to run to completeness even on many hard problems, which means that it exhaustively explores the program space. This ensures that the synthesized program is optimal according to a cost model. We implement HieraSynth as a library and demonstrate its effectiveness with a RISC-V Vector super-optimizer capable of handling instruction sets with up to 700 instructions while synthesizing 7–8-instruction programs. This is a significant advancement over previous approaches that were limited to 1–3 instructions with similar instruction set sizes. Specifically, HieraSynth can handle instruction sets up to 10.66× larger for a given program size, or synthesize up to 4.75× larger programs for a fixed instruction set. Evaluations show that HieraSynth can synthesize code surpassing human-expert optimizations and significantly reduce synthesis time, making super-optimization more practical for modern vector architectures.
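A toy shortest-first enumerative synthesizer makes the search-space framing concrete: programs over a four-instruction ISA are enumerated by increasing length, so the first match is optimal under a unit-cost model. HieraSynth additionally partitions this space hierarchically, prunes branches a solver proves unrealizable, and explores subspaces in parallel; none of that machinery appears in this sketch, and the toy ISA and spec are invented for illustration.

```python
"""Minimal shortest-first enumerative super-optimizer over a toy ISA.
The ISA, spec, and tests are illustrative assumptions."""
from itertools import product

OPS = {                      # toy single-register ISA
    "inc":  lambda x: x + 1,
    "dec":  lambda x: x - 1,
    "dbl":  lambda x: 2 * x,
    "neg":  lambda x: -x,
}

SPEC = lambda x: 2 * x + 2   # target function to synthesize
TESTS = [0, 1, 5, -3]        # input examples standing in for a solver

def run(prog, x):
    for op in prog:
        x = OPS[op](x)
    return x

def synthesize(max_len=3):
    for length in range(1, max_len + 1):   # shortest-first => optimal
        for prog in product(OPS, repeat=length):
            if all(run(prog, x) == SPEC(x) for x in TESTS):
                return prog
    return None

print(synthesize())   # ('inc', 'dbl'): (x+1)*2 == 2x+2
```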
- Research Article
- 10.1177/00420859251382191
- Oct 9, 2025
- Urban Education
- Tasha Austin + 4 more
As an exploration of potential support structures for Black women teachers, this study elicits the wisdom of Black women as othermothers who serve in an array of urban instructional settings. Using a Sista Circle Methodology as framed through Black Girl Cartographies and Radical Mothering, we found that through the dual healing processes of (re)membering and (re)fusal, these women became one another's homeplaces. Implications include a need to activate potentials of carework within communities of Black women teachers via unstructured opportunities to gather or to convene in culturally conducive spaces free from institutionally driven norms and expectations.
- Research Article
- 10.1145/3770756
- Oct 7, 2025
- ACM Transactions on Reconfigurable Technology and Systems
- Md Arafat Kabir + 6 more
The matrix operations that underpin today's deep learning models are routinely implemented in SIMD domain-specific accelerators [1–19]. SIMD accelerators, including GPUs and array processors, can effectively leverage parallelism in models that are compute-bound, but their effectiveness can be diminished for models that are memory-bound. Processing-in-Memory (PIM) architectures are being explored to provide better energy efficiency and scalable performance for these memory-bound models [20–33]. Modern Field Programmable Gate Arrays (FPGAs) feature hundreds of megabits of SRAM distributed across the device as disaggregated memory resources. This makes FPGAs ideal programmable platforms for developing custom processor-in/near-memory accelerators. Several PIM array-based accelerator designs [24–31] have been proposed to leverage this substantial internal bandwidth. However, results reported to date show FPGA-based PIM architectures operating at system clock frequencies well below a chip's BRAM Fmax, and compute densities that do not scale linearly with BRAM densities. These results suggest that such FPGA PIM architectures will never be competitive with their custom Application-Specific Integrated Circuit (ASIC) counterparts. In this paper, we introduce DA-VinCi, a Deep-learning Accelerator oVerlay using in-memory Computing. DA-VinCi is the first scalable FPGA-based PIM deep-learning accelerator overlay capable of clocking at the maximum frequency of a device's BRAM. Further, the architecture of DA-VinCi allows the number of compute units to scale linearly up to the maximum capacity of a device's BRAM, at the maximum clock frequency of the BRAM. The DA-VinCi overlay has a programmable Instruction Set Architecture (ISA) that allows the same synthesized design to provide low-latency inferencing for a range of memory-bound deep-learning models, including MLP, RNN, LSTM, and GRU networks. The scalability and high clock frequency of DA-VinCi are achieved through a new PIM tile architecture and a highly scalable system-level framework. We present results showing DA-VinCi linearly scaling the number of PEs to 100% of the BRAM capacity (over 60K PEs) on an Alveo U55 clocking at 737 MHz, the chip's BRAM Fmax. We provide comparative studies of inference latency across multiple deep-learning applications showing that DA-VinCi achieves up to a 201× improvement over a state-of-the-art PIM overlay accelerator, up to 87× over existing PIM-based FPGA accelerators, and up to 57× over custom deep-learning accelerators on FPGAs.
- Research Article
- 10.1371/journal.pone.0333037
- Oct 6, 2025
- PLOS One
- Zhijie Yang + 6 more
The core challenge of Knowledge Base Question Answering (KBQA), as a bridge between natural language and structured knowledge, is to accurately map complex semantic queries into Graph Query Language (GQL). Compared with the traditional Text-to-SQL task, KBQA faces a dual challenge: the structural differences between GQL and SQL, and the lack of high-order subgraph information in multi-hop inference over knowledge graphs. While existing approaches such as ChatKBQA have made progress, limited subgraph scalability severely constrains multi-hop query performance. To this end, this study proposes the Knowledge Graph Multi-hop Perceptron (KGMP), a retrieval-generation framework fine-tuned from open-source large language models, whose innovation is threefold: 1. Dynamic graph traversal mechanism: through an iterative subgraph expansion strategy, KGMP dynamically traverses question-oriented graphs with progressive reasoning. 2. Structured interaction protocol: based on SPARQL syntax, KGMP designs a lightweight interaction instruction set to build an efficient communication interface between the LLM and the knowledge graph. 3. Graph structure optimization technique: KGMP develops subgraph reordering algorithms and pruning strategies based on a reranker model to ensure that the subgraphs fed to the LLM are both compact and semantically complete. By integrating KGMP as a retrieval module into the ChatKBQA framework and providing it with optimized multi-hop subgraph input, experiments show performance improvements of 6.2% and 5.3% on the WebQSP and CWQ datasets, respectively. This study provides a new technical paradigm for deep collaboration between LLMs and knowledge graphs.
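A minimal sketch of the iterative subgraph-expansion idea in point 1: starting from seed entities, the question-oriented subgraph grows one hop per round, keeping only edges a relevance scorer approves. KGMP's real loop is driven by a fine-tuned LLM issuing SPARQL-style instructions and a reranker model; both are mocked here, and the toy knowledge graph is invented.

```python
"""Iterative one-hop subgraph expansion over a toy knowledge graph.
The graph, seeds, and relevance scorer are illustrative stand-ins."""

KG = {  # head -> [(relation, tail), ...]
    "Obama":    [("born_in", "Honolulu"), ("spouse", "Michelle")],
    "Honolulu": [("located_in", "Hawaii")],
    "Hawaii":   [("part_of", "USA")],
}

def relevant(question, relation):
    # Stand-in for the reranker: keep relations sharing a word stem.
    return any(w in relation for w in question.lower().split())

def expand(question, seeds, hops=2):
    subgraph, frontier = set(), set(seeds)
    for _ in range(hops):
        nxt = set()
        for head in frontier:
            for rel, tail in KG.get(head, []):
                if relevant(question, rel):
                    subgraph.add((head, rel, tail))
                    nxt.add(tail)
        frontier = nxt          # only newly reached entities expand next
    return subgraph

print(expand("where was Obama born and located", ["Obama"]))
```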
- Research Article
- 10.52783/cana.v32.6104
- Oct 4, 2025
- Communications on Applied Nonlinear Analysis
- Rajender Kumar, Sneha Bhattacharya
In an era where computational demands continually escalate, the quest for more efficient and powerful processors persists. The computer engineering and VLSI design industries face trade-offs between the cost and performance of components in the implementation domain. Reduced Instruction Set Computer (RISC) architecture focuses mainly on scaling down the complexity and the number of instructions in the microprocessor. RISC-V is an open-source instruction set architecture (ISA) designed to be simple, modular, and customizable. An important feature of RISC is its load-store architecture: only load and store instructions access memory, while all other operations work on registers. Building on this, an optimized 32-bit microprocessor has been designed in Verilog and simulated and synthesized in Xilinx Vivado. Verilog enables us to describe the behavior and structure of the processor at register-transfer level. Overall, RISC-V's combination of simplicity, openness, and flexibility positions it as a promising ISA for a wide range of applications, from low-power IoT devices to high-performance computing systems.
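To illustrate the load-store property highlighted above, here is a tiny behavioral interpreter for a handful of RV32I-style instructions in which only LW/SW touch memory while ALU operations work register-to-register. This models instruction semantics only; the paper's contribution is a synthesizable Verilog pipeline, which Python cannot capture.

```python
"""Behavioral interpreter for a tiny RV32I-style load-store subset.
Instruction encoding as tuples is an illustrative simplification."""

def execute(program, regs=None, mem=None):
    regs = regs or [0] * 32
    mem = mem or {}
    for op, *args in program:
        if op == "addi":  # rd = rs1 + imm (register-to-register ALU op)
            rd, rs1, imm = args; regs[rd] = (regs[rs1] + imm) & 0xFFFFFFFF
        elif op == "add":
            rd, rs1, rs2 = args; regs[rd] = (regs[rs1] + regs[rs2]) & 0xFFFFFFFF
        elif op == "lw":  # only loads/stores touch memory
            rd, rs1, off = args; regs[rd] = mem.get(regs[rs1] + off, 0)
        elif op == "sw":
            rs2, rs1, off = args; mem[regs[rs1] + off] = regs[rs2]
        regs[0] = 0       # x0 is hard-wired to zero
    return regs, mem

# x1 = 5; x2 = 7; x3 = x1 + x2; store x3 at mem[0x100]
_, mem = execute([
    ("addi", 1, 0, 5),
    ("addi", 2, 0, 7),
    ("add",  3, 1, 2),
    ("addi", 4, 0, 0x100),
    ("sw",   3, 4, 0),
])
print(mem)  # {256: 12}
```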
- Research Article
- 10.3389/fpsyg.2025.1663428
- Oct 2, 2025
- Frontiers in Psychology
- Ki Hong Kwon + 1 more
This study investigates the structural relationships among golf instructors’ human service quality, customers’ emotional responses, customer satisfaction, and learning transfer in the context of golf lesson participants in South Korea. The research focuses on adult golfers who received instruction within the past 2 years at outdoor golf practice ranges. Data were collected from 376 valid responses using a structured questionnaire and analyzed using structural equation modeling (SEM). The results reveal that the human service quality of golf instructors has a significant positive effect on customers’ positive emotional responses and a significant negative effect on their negative emotional responses. Furthermore, both the instructors’ service quality and customers’ positive emotional responses significantly contribute to higher customer satisfaction. Conversely, negative emotional responses were found to decrease satisfaction. Regarding learning transfer, customer satisfaction positively influences the extent to which golf lesson content is effectively applied, while negative emotional responses negatively affect this process. Mediation analysis further indicates that the impact of human service quality on learning transfer is significantly mediated by both customer satisfaction and emotional responses. In particular, positive emotional responses enhance learning transfer through increased satisfaction, suggesting a dual pathway of influence. These findings underscore the importance of emotional experience and perceived service quality in sports instruction settings. Golf instructors should prioritize strategies that foster positive emotional experiences and satisfaction to optimize learning outcomes and promote the transfer of skills from training to actual performance.