Articles published on Reinforcement Learning
48768 Search results
- New
- Research Article
- 10.1017/s0033291725102729
- Dec 3, 2025
- Psychological medicine
- Christine Roberts + 4 more
Alterations in reward responsiveness represent a key mechanism implicated in youth depression risk. However, not all youth with these alterations develop depression, suggesting the presence of factors that may moderate risk patterns. As socioeconomic disadvantage is also related to youth depression risk, particularly for youth exhibiting altered reward function, this study examined whether indices of family- and neighborhood-level disadvantage interacted with electrocortical reward responsivity to predict depression symptom trajectories across childhood and adolescence. Participants included 76 youth (ages 9-16 years) at low and high risk for depression based on maternal history of depression. At baseline, youth completed a monetary reward-guessing task while electroencephalography was recorded to measure the reward positivity (RewP), an event-related potential indexing reward responsiveness. Family and neighborhood disadvantage were assessed using the income-to-needs (ITN) ratio and Area Deprivation Index (ADI), respectively. Self-reported and clinician-rated depression symptoms were assessed across a multiwave, 18-month follow-up. RewP interacted with family- and neighborhood-level disadvantage to predict self-reported depression symptom trajectories. Specifically, blunted RewP predicted self-reported depression symptom increases for youth with a lower ITN ratio and higher ADI score. A blunted RewP also predicted clinician-rated depression symptom increases for youth living in neighborhoods with higher ADI scores. Findings suggest that reduced reward responsiveness is a mechanism implicated in future depression risk among youth, specifically in the context of family- and neighborhood-level socioeconomic disadvantage. Interventions that enhance reward response among youth exposed to higher levels of socioeconomic disadvantage may be particularly effective in preventing depression emergence.
- New
- Research Article
- 10.55041/ijsrem54782
- Dec 2, 2025
- International Journal of Scientific Research in Engineering and Management
- Rangasamy S + 3 more
Adaptive, data-driven pricing is a necessity for online merchants facing volatile demand and intense competition. This research proposes an integrated predictive model that uses historical transaction logs, click-stream traces, and exogenous market indicators to infer the best price points at the SKU, segment, and session levels. The pipeline begins with Recency-Frequency-Monetary (RFM) features and K-Means/hierarchical clustering to obtain behaviorally meaningful customer segments; elastic-net regression then predicts short-term price elasticity for each segment; lastly, a reinforcement-learning layer adjusts prices in near-real-time to maximize expected contribution margin subject to inventory and competitor-price constraints. We test the framework on 2.8 million orders for a mid-size fashion e-retailer from January 2023 to December 2024. Our system compares favorably with the company's rule-based approach: it increases gross profit by 7.9%, increases conversion among high-lifetime-value customers by 5.4%, and reduces markdown expenditure by 11.2%. Robustness checks under severe demand shocks (e.g., flash sales and influencer-driven traffic spikes) verify steady performance. The contribution is two-fold: (i) methodological, integrating segmentation, econometric elasticity, and machine-learning control within one loop; (ii) managerial, showing how granular behavioral data can translate to defensible margin gains with customer goodwill intact. Ethical and regulatory aspects of personalized prices are also examined.
Keywords: Dynamic pricing, E-commerce analytics, Customer segmentation, Recency-Frequency-Monetary (RFM), Price elasticity, Elastic-net regression, Reinforcement learning, Predictive modelling, Revenue management, Behavioral data, Machine-learning control, Personalized pricing.
- New
- Research Article
- 10.1080/00207543.2025.2582768
- Dec 2, 2025
- International Journal of Production Research
- Wenbin Xiang + 3 more
Scheduling wafer batch processing machines is critical to the efficiency of semiconductor manufacturing, where highly dynamic task arrivals, complex constraints, and reentrant processing pose significant challenges. To tackle these challenges, this study introduces a novel multi-agent collaborative reinforcement learning (RL) framework enhanced by a lightweight large language model (LLM). The proposed framework incorporates two dedicated agents, a batch formation agent and a batch assignment agent, specifically designed to optimise scheduling decisions in dynamic and constraint-rich production environments through collaborative interaction. A lightweight LLM is integrated as an auxiliary module to provide semantic action guidance through a two-stage fine-tuning process that combines expert knowledge and RL experience, enabling the agents to generate more effective and context-aware policies. Furthermore, a Transformer-based architecture is employed to fuse dynamic information across agents, facilitating coordination and joint decision-making. Experimental results demonstrate that the proposed framework significantly improves scheduling performance, reducing average task flow time by over 20% on benchmark cases and by more than 25% compared to rule-based and heuristic methods in real-world scenarios, while also enhancing equipment utilisation.
- New
- Research Article
- 10.1145/3777367
- Dec 2, 2025
- ACM Computing Surveys
- Shahmir Khan Mohammed + 4 more
The Digital Twins (DT) paradigm has emerged as a powerful tool for simulating and analyzing complex systems in various domains. A DT is a virtual representation of a real-world object(s) whose goal is to accurately emulate real systems, optimize processes, minimize synchronization delays, cut down on overhead, and automate decision-making. DT technology is advancing faster than expected with progress in Artificial Intelligence (AI), Internet of Things (IoT), Distributed Computing, and 5/6G. Although highly beneficial, DT technology still faces five issues: (1) limited adaptability, (2) incomplete model representation, (3) suboptimal decision making, (4) limited generalization, and (5) scalability and computational efficiency. Reinforcement Learning (RL) offers unsupervised decision-making and intelligence, which can be immensely beneficial in addressing the current challenges faced by DT. This study offers a thorough analysis of the DT paradigm from the standpoint of RL. The survey compares and contrasts existing reinforcement learning-based Digital Twin frameworks, assessing their advantages and disadvantages. Moreover, approaches highlighting the trade-offs between simulation fidelity and computing complexity are also discussed. Additionally, a thorough understanding of the Digital Twins paradigm from a reinforcement learning perspective is presented as a helpful resource for academics and industry professionals in the field. Finally, future research directions in this developing field at the nexus of digital modeling, simulation, and artificial intelligence are discussed.
- New
- Research Article
- 10.3390/jtaer20040337
- Dec 2, 2025
- Journal of Theoretical and Applied Electronic Commerce Research
- Xinmin Wang + 4 more
Big data-driven discriminatory pricing not only creates opportunities to boost hotel profits but also amplifies consumers’ negative perceptions of price fairness. Developing a dynamic discriminatory pricing model with fairness constraints helps hotel room managers formulate optimal pricing strategies. This paper proposes a dynamic discriminatory pricing model with fairness constraints that unifies four pricing models: fixed pricing, dynamic pricing, discriminatory pricing, and dynamic discriminatory pricing. It further proposes a two-stage deep reinforcement learning algorithm to efficiently solve the model and generate optimal pricing strategies. Finally, a case study is conducted to validate the proposed model and algorithm. The results show that the two-stage deep reinforcement learning algorithm can instantaneously derive optimal pricing schemes that satisfy both group and temporal fairness constraints, following a reasonably time-efficient training process. By adjusting the fairness parameters, our model can be transformed into the four types of pricing models, and the performance of the algorithm is validated for the commonly used dynamic pricing and dynamic discriminatory pricing models. Compared to traditional nonlinear programming solution algorithms, this algorithm generates optimal daily prices based on real-time market changes, making it more practically applicable.
- New
- Research Article
- 10.1016/j.neunet.2025.107905
- Dec 1, 2025
- Neural networks : the official journal of the International Neural Network Society
- Yifan Li + 5 more
Reinforcement learning with temporal and variable dependency-aware transformer for stock trading optimization.
- New
- Research Article
- 10.1016/j.aap.2025.108270
- Dec 1, 2025
- Accident; analysis and prevention
- Cheng Wang + 3 more
HAD-Gen: Human-like and diverse driving behavior modeling for controllable scenario generation.
- New
- Research Article
- 10.1016/j.mex.2025.103472
- Dec 1, 2025
- MethodsX
- C N Vanitha + 4 more
Proximal Policy Optimization-based Task Offloading Framework for Smart Disaster Monitoring using UAV-assisted WSNs.
- New
- Research Article
- 10.1016/j.neunet.2025.107870
- Dec 1, 2025
- Neural networks : the official journal of the International Neural Network Society
- Jiazhou Jiang + 1 more
Experimental data-efficient reinforcement learning with an ensemble of surrogate models.
- New
- Research Article
- 10.1016/j.renene.2025.123678
- Dec 1, 2025
- Renewable Energy
- Syed Muhammad Ahsan + 2 more
Multi-agent systems in networked microgrids: Reinforcement learning and strategic pricing mechanisms
- New
- Research Article
- 10.1016/j.engappai.2025.112754
- Dec 1, 2025
- Engineering Applications of Artificial Intelligence
- Yunteng Niu + 4 more
Multi-scale target detection of metal surface defects in additive manufacturing based on reinforcement learning
- New
- Research Article
- 10.1016/j.engappai.2025.112468
- Dec 1, 2025
- Engineering Applications of Artificial Intelligence
- Meng Zhang + 5 more
Efficient active flow control strategy for confined square cylinder wake using deep learning-based surrogate model and reinforcement learning
- New
- Research Article
- 10.1016/j.ins.2025.122523
- Dec 1, 2025
- Information Sciences
- Yao Dong + 4 more
Power load forecasting using deep learning and reinforcement learning
- New
- Research Article
- 10.1016/j.conengprac.2025.106536
- Dec 1, 2025
- Control Engineering Practice
- Wenhao Feng + 5 more
Multiple gait locomotion generation for quadruped robots based on trajectory planning and reinforcement learning
- New
- Research Article
- 10.1016/j.conengprac.2025.106538
- Dec 1, 2025
- Control Engineering Practice
- Shuo Xue + 7 more
Learning robust quadrupedal locomotion under disturbances via reinforcement learning with an autonomous evolutionary mechanism
- New
- Research Article
- 10.1016/j.physa.2025.131009
- Dec 1, 2025
- Physica A: Statistical Mechanics and its Applications
- Wang-Han Gong + 2 more
Human-driving like lane-changing behavior of autonomous vehicles based on asymmetric risk field and reinforcement learning
- New
- Research Article
- 10.1016/j.suscom.2025.101224
- Dec 1, 2025
- Sustainable Computing: Informatics and Systems
- Jiaying Wang + 4 more
Multi-objective energy-efficient power system scheduling using Stochastic State Space Model and reinforcement learning
- New
- Research Article
- 10.1016/j.array.2025.100551
- Dec 1, 2025
- Array
- Ali Hashemian + 1 more
Target tracking in Internet of Things using reinforcement learning
- New
- Research Article
- 10.1016/j.engappai.2025.112552
- Dec 1, 2025
- Engineering Applications of Artificial Intelligence
- Chen-Dong Zeng + 3 more
Trajectory planning of redundant parallel mechanism considering motion accuracy based on reinforcement learning
- New
- Research Article
- 10.1016/j.chaos.2025.117418
- Dec 1, 2025
- Chaos, Solitons & Fractals
- Yijie Huang
The evolution of cooperation in multi-games with reinforcement learning