Toward a Failure Analysis Chatbot with Retrieval-Augmented Generation
Abstract This article examines naive, advanced, and modular retrieval-augmented generation (RAG) architectures for a failure analysis (FA) chatbot implementation. The study identifies RAG as a suitable adaptation technique that meets the specified resource constraints in a cost-effective manner. Even advanced, modular RAG pipelines still hallucinate, suggesting that integrating additional adaptation techniques, such as reinforcement learning or targeted fine-tuning, may be necessary for optimal performance.
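The naive RAG pattern the abstract refers to can be sketched in a few lines: retrieve the documents most similar to the query, then assemble a grounded prompt for the generator. The bag-of-words similarity and the prompt template below are illustrative placeholders, not the article's implementation; a production FA chatbot would use dense embeddings and an actual LLM call.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a dense encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(c * b[t] for t, c in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    """Naive RAG retrieval: rank the corpus by similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs, k=2):
    """Assemble the grounded prompt handed to the generator model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\nQuestion: {query}"
```

Advanced and modular variants keep this skeleton but add stages (query rewriting, reranking, self-checking), which is where the abstract's hallucination caveat enters: retrieval quality bounds, but does not guarantee, answer faithfulness.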
- Research Article
- 10.36765/jp3m.v7i2.734
- Apr 25, 2025
- Jurnal Pemikiran dan Penelitian Pendidikan Matematika (JP3M)
Abstract: This study aims to conduct an in-depth review of the implementation and concepts of Reinforcement Learning (RL) in automated systems through a Systematic Literature Review approach. The research utilizes literature sources from databases such as Scopus, DOAJ, and Google Scholar spanning the years 2014 to 2024. The review highlights RL applications in various domains of automated systems including robotics, autonomous vehicles, traffic management, aerospace, energy management, and healthcare.
The findings reveal that RL significantly contributes to enhancing efficiency, adaptability, and intelligence in automated systems. However, RL implementation faces challenges such as poor data efficiency, high computational costs, and dependence on adequate technological infrastructure. Various solutions have been proposed, such as hardware optimization, data-efficient methods, and the integration of additional structural information, to address these challenges. Nevertheless, further research is needed to develop more efficient and adaptive techniques in data utilization and the integration of RL with broader automation infrastructure. This study identifies gaps in the literature and formulates urgent research topics to explore innovative solutions for expanding RL applications in the future.
- Book Chapter
- 10.55432/978-1-6692-0007-9_1
- Aug 16, 2024
Job scheduling for high performance computing systems involves building a policy to optimize for a particular metric, such as minimizing job wait time or maximizing system utilization. Different administrators may value one metric over another, and the desired policy may change over time. Tuning a scheduling application to optimize for a particular metric is challenging, time consuming, and error prone. However, reinforcement learning can quickly learn different scheduling policies dynamically from log data and effectively apply those policies to other workloads. This research demonstrates that a reinforcement learning agent trained using the proximal policy optimization algorithm performs 18.44% better than algorithmic scheduling baselines for one metric and has comparable performance for another. Reinforcement learning can learn scheduling policies which optimize for multiple different metrics and can select not only which job in the queue to schedule next, but also the machine on which to run it. The agent considers jobs with three resource constraints (CPU, GPU, and memory) while respecting individual machine resource constraints.
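A core piece of the scheduler described above is restricting the agent to feasible (job, machine) pairs under the three resource constraints. The helper below is a minimal illustration of such action masking, not the paper's implementation; job and machine tuples are assumed to be (CPU, GPU, memory).

```python
def feasible_actions(jobs, machines):
    """Mask of schedulable (job, machine) pairs: a pair is valid only if the
    machine's free CPU, GPU, and memory all cover the job's demands. An RL
    scheduler would restrict its policy's action space to these pairs."""
    mask = []
    for j, (jc, jg, jm) in enumerate(jobs):
        for m, (mc, mg, mm) in enumerate(machines):
            if jc <= mc and jg <= mg and jm <= mm:
                mask.append((j, m))
    return mask
```

The joint action (which job, which machine) mentioned in the abstract corresponds to picking one element of this mask; PPO then learns a preference over the masked actions from logged workloads.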
- Research Article
- 10.1109/jiot.2020.2992509
- May 8, 2020
- IEEE Internet of Things Journal
We propose a general sensor selection (SS) methodology for ocean-of-things (OoT) where a sensing network performs multiobject tracking (MOT) under resource constraints. SS methods address the combinatorial problem of determining the best subset of sensors that maximizes a suitable reward function for a fixed cardinality. The novelty of this article is twofold. First, we propose a tractable information-theoretic reward function for MOT-OoT with an unknown and time-varying number of objects such as ocean vessels. A tractable reward function is essential in order to rapidly evaluate a sensor subset, which is crucial in the high-dimensional problems encountered in OoT. Second, we propose a general cross-entropy SS (CE-SS) methodology that efficiently estimates the probabilities of sensor activations and determines the optimal sensor subset according to the proposed reward function and under the imposed cardinality constraint. The CE-SS algorithm avoids exhaustive searching over the space of all sensor subsets, which is intractable for most OoT applications. The CE-SS methodology, coupled with the proposed reward function, is capable of selecting sensors that lead to more accurate estimates than random selection for both the number of vessels and their trajectories. We demonstrate the effectiveness of our method via numerical simulation in several scenarios, including multivessel tracking for OoT with an emulated network of acoustic sensors deployed off the coast of Italy.
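The cross-entropy update at the heart of a CE-SS-style method can be sketched as follows: sample fixed-cardinality subsets from per-sensor activation probabilities, keep the elite fraction by reward, and move the probabilities toward the elite activation frequencies. The toy additive reward and the hyperparameters below are illustrative assumptions, not the paper's tracking-based information-theoretic reward.

```python
import numpy as np

def ce_sensor_select(reward_fn, n_sensors, k, n_samples=200, n_elite=20,
                     iters=30, alpha=0.7, seed=0):
    """Cross-entropy subset selection: estimate per-sensor activation
    probabilities and return the best cardinality-k subset seen."""
    rng = np.random.default_rng(seed)
    p = np.full(n_sensors, k / n_sensors)   # initial activation probabilities
    best_subset, best_reward = None, -np.inf
    for _ in range(iters):
        subsets, rewards = [], []
        for _ in range(n_samples):
            # sample a cardinality-k subset weighted by current probabilities
            idx = rng.choice(n_sensors, size=k, replace=False, p=p / p.sum())
            r = reward_fn(idx)
            subsets.append(idx)
            rewards.append(r)
            if r > best_reward:
                best_reward, best_subset = r, idx
        # activation frequencies among the elite (highest-reward) samples
        elite = np.argsort(rewards)[-n_elite:]
        freq = np.zeros(n_sensors)
        for e in elite:
            freq[subsets[e]] += 1.0
        # smoothed update of the sampling distribution toward the elites
        p = alpha * (freq / n_elite) + (1 - alpha) * p
        p = np.clip(p, 1e-6, None)
    return best_subset, best_reward
```

Because each iteration only evaluates `n_samples` subsets, the search cost is independent of the combinatorial number of subsets, which is the tractability argument the abstract makes.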
- Research Article
- 10.55041/ijsrem45372
- Apr 22, 2025
- INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT
The customer support process has seen a transformational shift in recent years. With the rapid proliferation of Artificial Intelligence (AI), human-driven customer support has leapfrogged into automated customer service and support. The field of AI has seen a phenomenal rise in the development of agents' capabilities, from autonomous decision-making and problem-solving to adaptation in changing environments. Agents' proficiency in learning from experience and in adapting greatly influences their effectiveness: together, these abilities drive improvement over time and ensure responsiveness to dynamic customer needs and conditions. Adaptation and learning also allow strategies to be revised in light of new data. This paper explores the foundations and techniques of adaptation and learning in AI agents, focusing on reinforcement learning, supervised learning, unsupervised learning, and their combination. It also addresses challenges such as overfitting, exploration-exploitation tradeoffs, and computational cost. The role of these processes in AI applications, including robotics, natural language processing, and autonomous systems, helps industry reduce dependency on manual effort and improve efficiency to drive growth. Keywords: Adaptation, Learning, AI agents, Reinforcement learning, Supervised learning, Autonomous systems.
- Research Article
- 10.1016/j.engappai.2023.106593
- Jun 21, 2023
- Engineering Applications of Artificial Intelligence
Active control of flexible rotors using deep reinforcement learning with application of multi-actor-critic deep deterministic policy gradient
- Conference Article
- 10.1109/ecce.2017.8096565
- Oct 1, 2017
This paper introduces a unified equivalent circuit model for modular high-voltage (HV) power generation architectures. The HV generation architectures are first introduced with attention to the modularity of key HV components such as transformers and rectifier circuits. An equivalent resistor-and-capacitor circuit network is adopted to model the HV transformer and the multi-stage voltage multiplier circuit, simplifying the analysis, design, and optimization of HV generation architectures. The expressions for the equivalent resistor and capacitor network in modular HV generation architectures are derived. Based on the proposed equivalent circuit model, a 500 W, 20 kV-output HV generator prototype with a 400 kHz switching frequency is built on a modular HV architecture to validate the model. Experimental results from the HV generator prototype are presented.
- Discussion
- 10.1016/j.tics.2010.05.008
- Jun 17, 2010
- Trends in Cognitive Sciences
Cognitive Science should be unified: comment on Griffiths et al. and McClelland et al.
- Research Article
- 10.1109/access.2023.3245055
- Jan 1, 2023
- IEEE Access
Situation-awareness-based decision-making (SABDM) models constructed using cognitive maps and goal-directed task analysis techniques have been successfully used in decision support systems in safety-critical and mission-critical environments such as air traffic control and electrical energy distribution. Reinforcement learning (RL) and other machine learning techniques are used to automate the adjustment of situational-awareness mental model parameters, reducing expert work on initial configuration and long-term maintenance without affecting the mental model's structure, and preserving the SABDM model's cognitive and explainability characteristics. Real-world models must evolve to cope with changes in environmental conditions. This study evaluates the application of reinforcement learning as an online adaptive technique to adjust the situational-awareness mental model's parameters under evolving conditions. We conducted evaluation experiments using real-world public datasets to compare the performance of the SABDM model with the reinforcement learning adaptation technique (SABDM/RL) against other adaptive machine learning methods under distinct concept-drift conditions. We measured the techniques' overall and dynamic performance to understand how well they adapt to evolving environmental conditions. The experiments show that SABDM/RL, supported by concept-drift detection techniques, performs similarly to modern online adaptive machine learning classification methods while maintaining the mental-model strength of situation-awareness-based systems.
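As a minimal illustration of the concept-drift detection that supports adaptive methods like those compared above, the monitor below flags drift when a sliding window's error rate exceeds a frozen reference rate by a margin. This is a toy detector for intuition only, with assumed window and margin parameters; the study evaluates established drift-detection techniques on real datasets.

```python
from collections import deque

class WindowDriftDetector:
    """Minimal concept-drift monitor: compare the error rate of a recent
    sliding window against a reference rate frozen after warm-up, and
    signal drift once the recent rate exceeds reference + margin."""

    def __init__(self, window=30, margin=0.2):
        self.window = deque(maxlen=window)
        self.ref_rate = None
        self.margin = margin

    def update(self, error):
        """Feed one prediction outcome (truthy = error); return drift flag."""
        self.window.append(1 if error else 0)
        if len(self.window) < self.window.maxlen:
            return False            # still warming up
        rate = sum(self.window) / len(self.window)
        if self.ref_rate is None:
            self.ref_rate = rate    # freeze the reference after warm-up
            return False
        return rate > self.ref_rate + self.margin
```

When the flag fires, an online adaptive system would trigger retraining or, in the SABDM/RL setting, an RL-driven re-tuning of the mental-model parameters.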
- Research Article
- 10.1371/journal.pcbi.1012554
- Oct 28, 2024
- PLoS computational biology
Synaptic plasticity enables animals to adapt to their environment, but memory formation can require a substantial amount of metabolic energy, potentially impairing survival. Hence, a neuro-economic dilemma arises as to whether learning is a profitable investment, and the brain must therefore judiciously regulate learning. Indeed, experiments have shown that during starvation, Drosophila suppress the formation of energy-intensive aversive memories. Here we include energy considerations in a reinforcement learning framework. Simulated flies learned to avoid noxious stimuli through synaptic plasticity in either the energy-expensive long-term memory (LTM) pathway or the decaying anesthesia-resistant memory (ARM) pathway. The objective of the flies is to maximize their lifespan, which is calculated with a hazard function. We find that strategies that switch between the LTM and ARM pathways, based on energy reserve and reward prediction error, prolong lifespan. Our study highlights the significance of energy regulation of memory pathways and dopaminergic control for adaptive learning and survival. It might also benefit engineering applications of reinforcement learning under resource constraints.
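The lifespan objective mentioned above rests on a standard survival identity: with a per-step hazard h_t, the probability of surviving T steps is the product of (1 − h_t). The sketch below shows only this bookkeeping, with a purely illustrative energy-dependent hazard term; the paper's actual hazard model and its LTM/ARM switching policy are richer.

```python
def survival_prob(hazards):
    """Discrete-time survival: P(alive after T steps) = prod_t (1 - h_t)."""
    p = 1.0
    for h in hazards:
        p *= 1.0 - h
    return p

def energy_hazard(reserve, base=0.01, starve=0.2):
    """Illustrative hazard: baseline mortality plus a starvation term that
    grows as the energy reserve (normalized to [0, 1]) is depleted."""
    return min(1.0, base + starve * (1.0 - reserve))
```

Under such a model, choosing the energy-expensive LTM pathway depletes the reserve and raises future hazard, which is exactly the tradeoff that makes switching to the cheaper ARM pathway profitable when reserves run low.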
- Conference Article
- 10.1109/candarw51189.2020.00063
- Nov 1, 2020
A connected space comprises embedded systems that are attached to the physical space and connected to cloud systems through the Internet. Using the connected space, various services can be provided continuously. These services can be dynamic and flexible based on user requirements and usage environments. The systems need to adapt to various changes in the needs of users, service providers, and environments. However, the embedded systems that implement elements of the connected space have resource constraints and are difficult to update, which is a significant challenge to providing dynamic and flexible services continuously. To tackle this challenge, this study considers reinforcement learning technologies for autonomous adaptive embedded systems aimed at sustainable usage. We discuss the requirements of embedded systems and the rationale for the selected reinforcement learning method.
- Conference Article
- 10.1109/icin56760.2023.10073489
- Mar 6, 2023
Network Slicing (NS) is a key enabler of the 5G network ecosystem due to its potential to provide distinct services over the same physical infrastructure. However, optimally orchestrating resources for heterogeneous demands is crucial when dealing with resource constraints and Quality-of-Service (QoS) requirements. We consider a radio access network scenario providing NS over multiple base stations (BS) with limited resources, and we design an efficient resource orchestration technique, based on reinforcement learning, which optimizes resource utilization among different services while satisfying the constraints and complying with Service Level Agreement (SLA) and QoS requirements. The proposed technique uses the Trust Region Method to formulate the orchestration objective function and satisfy the constraints, and is then optimized via Kronecker-Factored Approximate Curvature (K-FAC). Extensive simulations demonstrate that the proposed technique outperforms other Reinforcement Learning (RL) algorithms, reaching 99% QoS and SLA satisfaction while respecting bandwidth constraints.
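As a point of reference for what an RL orchestrator must beat, a naive baseline simply scales slice demands down to fit the available bandwidth. The function below is only that hypothetical baseline, not the paper's method; it ignores per-slice SLA priorities, which is precisely what the trust-region RL approach is designed to handle.

```python
def orchestrate(demands, capacity):
    """Naive bandwidth split among slices: proportional to demand,
    never exceeding demand, total never exceeding capacity."""
    total = sum(demands)
    if total <= capacity:
        return list(demands)            # every slice fully served
    scale = capacity / total            # uniform down-scaling factor
    return [d * scale for d in demands]
```

An RL policy improves on this by learning non-uniform allocations, e.g. protecting slices whose SLAs penalize shortfalls heavily while throttling elastic best-effort slices.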
- Book Chapter
- 10.1017/9781108943321.013
- Jan 31, 2023
In this chapter, we study the Age of Information (AoI) when the status updates of the underlying process of interest can be sampled at any time by the source node and are transmitted over an error-prone wireless channel. We assume the availability of perfect feedback that informs the transmitter about the success or failure of transmitted status updates, and consider various retransmission strategies. More specifically, we study the scheduling of sampling and transmission of status updates in order to minimize the long-term average AoI at the destination under resource constraints. We assume that the underlying statistics of the system are not known and hence propose average-cost reinforcement learning algorithms for practical applications. Extensions of the results to a multiuser setting with multiple receivers and to an energy-harvesting source node are also presented; different reinforcement learning methods, including the deep Q-network (DQN), are exploited and their performance is demonstrated.
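The sampling/transmission tradeoff described above can be made concrete with a simple threshold policy: sample and transmit only once the destination's age reaches a threshold, trading average AoI against transmission (resource) cost. The simulator below is a hedged illustration with an assumed i.i.d. success channel, not the chapter's learning algorithm; the age resets to one on a successful delivery and grows by one per slot otherwise.

```python
import random

def simulate_threshold_policy(threshold, p_success=0.8, horizon=10000, seed=1):
    """Simulate AoI under a transmit-when-age>=threshold policy.
    Returns (time-average AoI, fraction of slots used for transmission)."""
    rng = random.Random(seed)
    age, total_age, n_tx = 1, 0, 0
    for _ in range(horizon):
        if age >= threshold:
            n_tx += 1                       # spend one transmission
            if rng.random() < p_success:
                age = 1                     # fresh update delivered
            else:
                age += 1                    # transmission failed
        else:
            age += 1                        # stay idle, age grows
        total_age += age
    return total_age / horizon, n_tx / horizon
```

Raising the threshold lowers the transmission rate at the cost of a higher average age; an average-cost RL agent effectively learns where on this curve to operate when the channel statistics are unknown.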
- Research Article
- 10.1007/s10586-015-0484-2
- Sep 11, 2015
- Cluster Computing
Task scheduling is a necessary prerequisite for performance optimization and resource management in cloud computing systems. Focusing on accurately scaled cloud computing environments and efficient task scheduling under resource constraints, we introduce a fine-grained cloud computing system model and an optimized task scheduling scheme in this paper. The system model comprises clearly defined, separate submodels, including a task schedule submodel, a task execute submodel, and a task transmission submodel, so that they can be accurately analyzed in the order in which user requests are processed. Moreover, the submodels are scalable enough to capture the flexibility of the cloud computing paradigm. By iteratively analyzing the submodels until the results reach sufficient accuracy, we design a novel task scheduling scheme based on reinforcement learning and queuing theory to optimize task scheduling under the resource constraints, and state aggregation techniques are employed to accelerate learning. Our results, on the one hand, demonstrate the efficiency of the task scheduling scheme and, on the other hand, reveal the relationship among the arrival rate, service rate, number of VMs, and buffer size.
- Research Article
- 10.1287/mnsc.2020.03850
- Mar 3, 2025
- Management Science
We consider a general online stochastic optimization problem with multiple resource constraints over a horizon of finite time periods. In each time period, a reward function and multiple cost functions are revealed, and the decision maker needs to specify an action from a convex and compact action set to collect the reward and consume the resources. Each cost function corresponds to the consumption of one resource. The reward function and the cost functions of each time period are drawn from an unknown distribution, which is nonstationary across time. The objective of the decision maker is to maximize the cumulative reward subject to the resource constraints. This formulation captures a wide range of applications including online linear programming and network revenue management, among others. In this paper, we consider two settings: (i) a data-driven setting where the true distribution is unknown but a prior estimate (possibly inaccurate) is available and (ii) an uninformative setting where the true distribution is completely unknown. We propose a unified Wasserstein distance–based measure to quantify the inaccuracy of the prior estimate in setting (i) and the nonstationarity of the environment in setting (ii). We show that the proposed measure leads to a necessary and sufficient condition for the attainability of a sublinear regret in both settings. For setting (i), we propose an informative gradient descent algorithm. The algorithm takes a primal-dual perspective, and it integrates the prior information of the underlying distributions into an online gradient descent procedure in the dual space. The algorithm also naturally extends to the uninformative setting (ii). Under both settings, we show the corresponding algorithm achieves a regret of optimal order. We illustrate the algorithm’s performance through numerical experiments. This paper was accepted by Chung Piaw Teo, optimization. 
Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2020.03850 .
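The primal-dual idea behind the gradient descent algorithms described above can be illustrated in a deliberately simplified form with one resource and a binary accept/reject action: maintain a dual price, accept a period's opportunity when its reward net of priced cost is positive, and update the price by online gradient descent on consumption minus the per-period endowment. This sketch uses no prior estimate (the uninformative setting) and is an assumption-laden toy, not the paper's algorithm, which handles general convex action sets, multiple resources, and prior information in the dual space.

```python
def online_primal_dual(rewards, costs, budget, eta=0.1):
    """One-resource online allocation via a dual price lam:
    accept iff reward - lam * cost > 0, then take a gradient step
    on the dual using (consumption - per-period endowment)."""
    rho = budget / len(rewards)         # per-period resource endowment
    lam = 0.0                           # dual variable (shadow price)
    total_reward, spent, decisions = 0.0, 0.0, []
    for r, c in zip(rewards, costs):
        accept = (r - lam * c > 0) and (spent + c <= budget)
        if accept:
            total_reward += r
            spent += c
        g = (c if accept else 0.0) - rho    # dual subgradient
        lam = max(0.0, lam + eta * g)       # projected ascent on the price
        decisions.append(accept)
    return total_reward, spent, decisions
```

When early consumption runs ahead of the endowment, the price rises and the policy becomes more selective; the paper's Wasserstein-based measure then quantifies how much distributional nonstationarity such a scheme can absorb while keeping regret sublinear.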
- Research Article
- 10.1002/jcc.26984
- Aug 24, 2022
- Journal of Computational Chemistry
Conformer‐RL is an open‐source Python package for applying deep reinforcement learning (RL) to the task of generating a diverse set of low‐energy conformations for a single molecule. The library features a simple interface to train a deep RL conformer generation model on any covalently bonded molecule or polymer, including most drug‐like molecules. Under the hood, it implements state‐of‐the‐art RL algorithms and graph neural network architectures tuned specifically for molecular structures. Conformer‐RL is also a platform for researching new algorithms and neural network architectures for conformer generation, as the library contains modular class interfaces for RL environments and agents, allowing users to easily swap components with their own implementations. Additionally, it comes with tools to visualize and save generated conformers for further analysis. Conformer‐RL is well‐tested and thoroughly documented with tutorials for each of the functionalities mentioned above, and is available on PyPi and Github: https://github.com/ZimmermanGroup/conformer-rl.