Utilising fuzzy reinforcement learning for prediction in volatile stock markets
- Conference Article
23
- 10.1109/icmlc.2005.1527071
- Jan 1, 2005
This paper focuses on intelligent mobile robot navigation in unknown and changing environments. A fuzzy logic controller (FLC) is applied to the reactive robot control system. Because sufficient expert knowledge may not be available, the fuzzy inference system (FIS) and reinforcement learning (RL) are integrated. The consequents of the fuzzy rules are refined through Q(λ)-learning. Fuzzy reinforcement learning is then employed to design the controller of the robot system. A switching behavior-based FLC scheme is presented, which includes obstacle-avoidance and wall-following behaviors. This scheme can effectively solve the navigation problem in complicated environments that contain concave obstacles. Experimental results indicate the efficiency and effectiveness of the proposed approach. Furthermore, the FLC learned by RL is robust and adaptable, and can be applied to different environments.
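The idea of refining fuzzy rule consequents with Q(λ)-learning can be sketched as follows. This is a minimal illustration, not the paper's controller: the single normalised distance input, the three triangular fuzzy sets, the action names, and the constants `alpha`, `gamma`, and `lam` are all assumptions, and the trace reset on exploratory actions used in Watkins-style Q(λ) is omitted for brevity.

```python
# Hypothetical sketch: Q(lambda) refinement of fuzzy rule consequents.
# Membership shapes, action names, and constants are illustrative assumptions.

def triangular(x, left, peak, right):
    """Triangular membership degree of x."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Three fuzzy sets (near, medium, far) over a normalised distance in [0, 1].
SETS = [(-0.5, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.5)]
ACTIONS = ["turn_left", "go_straight", "turn_right"]

def firing_strengths(distance):
    """Normalised firing strength of each rule; the input is clamped to [0, 1]."""
    distance = min(1.0, max(0.0, distance))
    raw = [triangular(distance, *s) for s in SETS]
    total = sum(raw)
    return [r / total for r in raw]

def q_value(q, phi, chosen):
    """Q(s, a) as the strength-weighted sum of the chosen rule consequents."""
    return sum(p * q[i][chosen[i]] for i, p in enumerate(phi))

def q_lambda_step(q, traces, phi, chosen, reward, phi_next,
                  alpha=0.1, gamma=0.9, lam=0.8):
    """One update of the consequent table q[rule][action] with eligibility traces."""
    # Greedy value of the next state: best consequent of each rule, weighted.
    v_next = sum(p * max(q[i]) for i, p in enumerate(phi_next))
    td_error = reward + gamma * v_next - q_value(q, phi, chosen)
    for i in range(len(q)):
        for a in range(len(q[i])):
            traces[i][a] *= gamma * lam       # decay every trace
        traces[i][chosen[i]] += phi[i]        # credit the rule/action that fired
        for a in range(len(q[i])):
            q[i][a] += alpha * td_error * traces[i][a]
    return td_error
```

Here `chosen[i]` is the index of the action selected in rule `i`, so only rules that actually fired (non-zero strength) receive credit for the observed reward.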
- Research Article
27
- 10.1007/s13369-019-04126-9
- Sep 19, 2019
- Arabian Journal for Science and Engineering
This paper aims to introduce nonlinear optimization into the fuzzy reinforcement learning (RL) approach through genetic algorithm (GA)-based minimization. In conventional fuzzy RL, an agent attempts to find the optimal action at each stage by choosing the action with the lowest Q value, i.e., the greedy action. However, the Q function is unknown, and attempting to find the minimum of such a function from a limited set of values is, in our view, inaccurate and insufficient. A more rigorous approach is to employ a nonlinear optimization procedure to find the minimum of the Q function. We propose to employ a genetic algorithm to find the optimal action value in each iteration of the algorithm rather than the plain algebraic minimum. For guaranteed stability of the designed controller, we use Lyapunov theory-based fuzzy RL control with a GA optimizer. We validate the performance of our controller on three benchmark nonlinear control problems: (1) inverted pendulum swing-up, (2) cart-pole balance, and (3) rotational/translational proof-mass actuator system. We carry out a comparative evaluation of our controller against (1) hybrid Lyapunov fuzzy RL control and (2) fuzzy Q-learning control. Results show that our proposed GA-optimized fuzzy Lyapunov RL controller achieves a high success rate with stable and superior tracking performance.
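The contrast between taking the minimum over a fixed action set and searching the action range with a GA can be sketched as below. This is a toy illustration under stated assumptions: `q_estimate` is a made-up quadratic stand-in for a learned Q (cost) function, and the GA (truncation selection, blend crossover, Gaussian mutation) is a generic real-coded variant, not the optimizer used in the paper.

```python
import random

def q_estimate(action):
    """Hypothetical stand-in for a learned Q (cost) function of a scalar action."""
    return (action - 0.37) ** 2 + 0.1

def discrete_minimum(candidates):
    """Conventional choice: the lowest-Q action among a fixed candidate set."""
    return min(candidates, key=q_estimate)

def ga_minimum(low, high, pop_size=30, generations=40, seed=0):
    """Minimal real-coded GA searching [low, high] for the lowest-Q action."""
    rng = random.Random(seed)
    pop = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=q_estimate)
        parents = pop[: pop_size // 2]            # elitism: keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            child = w * a + (1 - w) * b           # blend crossover
            child += rng.gauss(0.0, 0.05 * (high - low))  # Gaussian mutation
            children.append(min(high, max(low, child)))   # stay inside bounds
        pop = parents + children
    return min(pop, key=q_estimate)
```

With candidates `[-1.0, -0.5, 0.0, 0.5, 1.0]` the discrete choice can get no closer to the assumed minimiser 0.37 than 0.5, whereas the GA is free to sample actions between the grid points.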
- Book Chapter
- 10.1007/978-3-319-07617-1_6
- Jan 1, 2014
In this paper, we introduce a new fuzzy reinforcement learning method for quality of service (QoS) provisioning in cognitive transmission over cognitive radio networks. Cognitive transmission under QoS constraints is treated here as sending data at two different average power levels depending on the activity of the primary (licensed) users, which is determined by the secondary (unlicensed) users. For this transmission, the model is defined as a state-transition model. The maximum throughput under these statistical QoS constraints is determined using fuzzy QoS reinforcement learning techniques. The effectiveness of the proposed method is assessed by comparison with the numerical method based on the effective capacity of the cognitive radio channel under various QoS constraints. The hybrid AI method is shown to outperform the classical numerical method across situations with different QoS limitations.
- Research Article
10
- 10.3390/info10110341
- Nov 2, 2019
- Information
Multi-robot confrontation on physics-based simulators is a complex and time-consuming task, but simulators are required to evaluate the performance of advanced algorithms. Recently, a few advanced algorithms have been able to produce considerably complex levels in the context of the robot confrontation system when the agents are facing multiple opponents. Meanwhile, current confrontation decision-making systems suffer from difficulties in optimization and generalization. In this paper, fuzzy reinforcement learning (RL) and curriculum transfer learning are applied to micromanagement for the robot confrontation system. Firstly, an improved Q-learning in the semi-Markov decision-making process is designed to train the agent, and an efficient RL model is defined to avoid the curse of dimensionality. Secondly, a multi-agent RL algorithm with parameter sharing is proposed to train the agents. We use a neural network with adaptive momentum acceleration as a function approximator to estimate the state-action function. Then, fuzzy logic is used to regulate the learning rate of RL. Thirdly, a curriculum transfer learning method is used to extend the RL model to more difficult scenarios, which ensures the generalization of the decision-making system. The experimental results show that the proposed method is effective.
- Research Article
40
- 10.1007/s00521-017-3106-5
- Aug 8, 2017
- Neural Computing and Applications
The unit commitment problem (UCP) aims at optimizing generation cost for meeting a given load demand under several operational constraints. We propose a fuzzy reinforcement learning (RL) approach for an efficient and reliable solution to the unit commitment problem. In particular, we cast UCP as a multiagent fuzzy reinforcement learning task wherein individual generators act as players for optimizing the cost to meet a given load over a twenty-four-hour period. The unit commitment task is fuzzified, and the optimal unit commitment solution is generated by employing RL on this fuzzy multigenerator setup. Our proposed multiagent RL framework does not assume any a priori task or system knowledge, and the generators gradually learn to produce the optimal output solely based on their collective generation. We look at the UCP as a sequential decision-making task with reward/penalty to reduce the collective generation cost of generators. To the best of our knowledge, ours is the first attempt at solving UCP by employing fuzzy reinforcement learning. We test our approach on a ten-generating-unit system with several equality and inequality constraints. Simulation results and comparisons against several recent UCP solution methods demonstrate the superiority and viability of our proposed multiagent fuzzy reinforcement learning technique.
- Research Article
4
- 10.1109/tiv.2024.3429500
- Mar 1, 2025
- IEEE Transactions on Intelligent Vehicles
Collision avoidance is a key technology for autonomous underwater vehicles (AUVs) to achieve tasks such as path planning, target searching, and map construction. The performance of the algorithm directly affects the safety of the AUV and the success of collision avoidance. To improve the algorithm's generalization and adaptability, a fuzzy reinforcement learning collision avoidance strategy is presented in this study. Firstly, reinforcement learning learns without explicit modeling and converges quickly, making it suitable for continuous online learning tasks. Therefore, this paper adopts a multi-step temporal difference reinforcement learning approach to control the collision avoidance of the AUV. Then, fuzzy theory is integrated into reinforcement learning to address generalization issues in the state space. This integration allows the acquisition of continuous state inputs and action outputs, enhancing motion state recognition and improving operational smoothness. Finally, online training of the fuzzy reinforcement learning method is enhanced through adaptive strategy adjustments, which improves the algorithm's online optimization efficiency. The proposed AUV collision avoidance strategy is validated through simulations and experiments in a three-dimensional underwater environment. The results demonstrate that this strategy can safely and efficiently guide the AUV, showcasing strong generalization capabilities.
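The multi-step temporal difference target mentioned above can be illustrated with a generic n-step return, G = r_1 + γ r_2 + ... + γ^(n-1) r_n + γ^n V(s_n). The sketch below is the textbook form only; the function names and the tabular update are assumptions and do not reproduce the fuzzy state representation of the paper.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n observed rewards plus the discounted value
    estimate of the state reached after those n steps."""
    g = bootstrap_value
    for r in reversed(rewards):   # fold from the last step back to the first
        g = r + gamma * g
    return g

def td_update(value, target, alpha):
    """Move the current value estimate a fraction alpha towards the target."""
    return value + alpha * (target - value)
```

For example, `n_step_return([1.0, 0.0, -1.0], 2.0, 0.5)` evaluates 1 + 0.5·0 + 0.25·(−1) + 0.125·2, and the result can then be passed as `target` to `td_update`.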
- Research Article
1
- 10.1007/s11802-002-0038-0
- Apr 1, 2002
- Journal of Ocean University of Qingdao
Robot learning in unstructured environments has proved to be an extremely challenging problem, mainly because of the many uncertainties always present in the real world. Human beings, on the other hand, seem to cope very well with uncertain and unpredictable environments, often relying on perception-based information. Furthermore, human beings can also utilize perceptions to guide their learning on those parts of the perception-action space that are actually relevant to the task. Therefore, we conduct research aimed at improving robot learning through the incorporation of both perception-based and measurement-based information. For this reason, a fuzzy reinforcement learning (FRL) agent is proposed in this paper. Based on a neural-fuzzy architecture, different kinds of information can be incorporated into the FRL agent to initialise its action network, critic network and evaluation feedback module so as to accelerate its learning. By making use of the global optimisation capability of GAs (genetic algorithms), a GA-based FRL (GAFRL) agent is presented to solve the local minima problem in traditional actor-critic reinforcement learning. On the other hand, with the prediction capability of the critic network, GAs can perform a more effective global search. Different GAFRL agents are constructed and verified by using the simulation model of a physical biped robot. The simulation analysis shows that the biped learning rate for dynamic balance can be improved by incorporating perception-based information on biped balancing and walking evaluation. The biped robot can find application in ocean exploration, detection or sea rescue activity, as well as military maritime activity.
- Book Chapter
- 10.1007/978-981-19-6613-2_706
- Jan 1, 2023
Current research on unguided bomb targeting and launching by unmanned aerial vehicles (UAVs) is scarce. In this research, referring to the basic steps of traditional bomb launching, a basic method of unguided bomb targeting and delivery based on the bombing circle is proposed for UAVs. The model of the aiming error is given, and the algorithm flow of aiming and launching bombs is suggested. To address the difficulty of deciding the turning angular velocity, an autonomous learning decision-making algorithm based on fuzzy reinforcement learning (RL) is designed, and the fuzzy method and algorithm flow are studied. Simulations are carried out to demonstrate the effectiveness of the method in this paper.
Keywords: Unmanned aerial vehicles (UAVs); Autonomous aiming; Bomb launching; Bombing circle; Reinforcement learning (RL)
- Conference Article
89
- 10.1109/fuzzy.1994.343737
- Jan 1, 1994
Fuzzy reinforcement learning (FRL) involves "jump starting" reinforcement learning with fuzzy logic rules. By using FRL, prior domain knowledge, which may be very approximate and imprecise, can be expressed in terms of fuzzy rules and refined later through the learning process. In this paper, we develop a new algorithm called fuzzy Q-learning (or FQ-learning) which extends Watkins' Q-learning method. It can be used for decision processes in which the goals and/or the constraints, but not necessarily the system under control, are fuzzy in nature. An example of a fuzzy constraint is: "the weight of object A must not be substantially heavier than w" where w is a specified weight. Similarly, an example of a fuzzy goal is: "the robot must be in the vicinity of door k". We show that FQ-learning provides an alternative solution to this problem which is simpler than Bellman and Zadeh's fuzzy dynamic programming approach. We apply the algorithm to a multistage decision making problem and a navigation task.
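The two examples of a fuzzy constraint and a fuzzy goal can be made concrete with membership functions. The sketch below is only an illustration: the linear shapes, the `tolerance` and `radius` parameters, and all function names are assumptions, and the final function shows the min-combination of goal and constraint used in Bellman and Zadeh's fuzzy decision framework, not the FQ-learning update itself.

```python
def not_substantially_heavier(weight, w, tolerance):
    """Degree to which `weight` is not substantially heavier than w.
    Assumed shape: 1 up to w, falling linearly to 0 at w + tolerance."""
    if weight <= w:
        return 1.0
    if weight >= w + tolerance:
        return 0.0
    return (w + tolerance - weight) / tolerance

def in_vicinity(distance, radius):
    """Degree to which a robot at `distance` from the door is in its vicinity.
    Assumed shape: 1 at the door, falling linearly to 0 at `radius`."""
    return max(0.0, 1.0 - distance / radius)

def fuzzy_decision(goal_degree, constraint_degree):
    """Bellman-Zadeh style combination: a decision is only as good as the
    least satisfied of its goal and its constraint."""
    return min(goal_degree, constraint_degree)
```

For instance, an object weighing 11 against w = 10 with an assumed tolerance of 4 satisfies the constraint to degree 0.75, and a robot 1 unit from the door with an assumed radius of 2 satisfies the goal to degree 0.5, giving an overall degree of 0.5.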
- Conference Article
5
- 10.1109/iccae.2010.5451248
- Feb 1, 2010
Signalized intersection systems often exhibit severely nonlinear and time-varying characteristics due to random fluctuations in traffic demand or special events, and therefore cannot be adequately controlled by traditional methods. Traditional reinforcement learning is extended to the fuzzy case by defining a fuzzy reinforcement function over the fuzzy state. A stochastic control scheme based on fuzzy reinforcement learning (FRL) is introduced into traffic signal control systems because of its strong adaptability. The FRL-based adaptive controller can produce an appropriate control policy to prevent the traffic network from becoming over-congested. The traditional intersection traffic model is extended to a new model which takes real aspects of traffic conditions into account, such as the turning fraction and the lane scheme. The model is tested on a typical four-legged signalized intersection and compared with both pre-timed control and a fully actuated controller (FAC). Analyses of simulation results using this approach show significant improvement over traditional control, especially for over-saturated traffic demand and special events such as incidents and blockages. Using the FRL model, the total mean delay of each vehicle is reduced by 25.7% under heavy demand compared with the FAC scheme.
- Conference Article
2
- 10.1109/fuzzy.2011.6007675
- Jun 1, 2011
Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) offer a powerful platform for optimizing sequential decision making in partially observable stochastic environments. However, finding optimal solutions for Dec-POMDPs is known to be intractable, necessitating approximate/suboptimal approaches. To address this problem, this work proposes a novel fuzzy reinforcement learning (RL) based game theoretic controller for Dec-POMDPs. The proposed controller implements fuzzy RL on Dec-POMDPs, which are modeled as a sequence of Bayesian games (BG). The main contributions of the work are the introduction of a game based RL paradigm in a Dec-POMDP setting, and the use of fuzzy inference systems to effectively generalize the underlying belief space. We apply the proposed technique to two benchmark problems and compare results against a state-of-the-art Dec-POMDP control approach. The results validate the feasibility and effectiveness of using game theoretic RL based fuzzy control for addressing the intractability of Dec-POMDPs, thus opening up a new research direction.
- Conference Article
5
- 10.1109/cimca.2006.192
- Jan 1, 2006
In this paper, self-learning approaches are applied to an obstacle avoidance task of a mobile robot. Compared with conventional reinforcement learning (RL) and fuzzy RL (FRL), a novel approach termed dynamic self-generated fuzzy Q-learning (DSGFQL) and its extended version, enhanced dynamic self-generated fuzzy Q-learning (EDSGFQL), are proposed. Both methods are capable of generating a fuzzy inference system (FIS) without any a priori knowledge. In the DSGFQL approach, the structure and preconditioning parts of an FIS are generated according to the input space partition and the reinforcement of the system. An extended self-organizing map (SOM) algorithm is combined with the DSGFQL approach so that the EDSGFQL algorithm can update the centers of membership functions (MFs). In both the DSGFQL and EDSGFQL approaches, the consequent parts of the FIS are updated by fuzzy Q-learning, which is a widely used RL method. As a consequence, the proposed DSGFQL and EDSGFQL methodologies can automatically create, delete and adjust fuzzy rules without any a priori knowledge or supervision. Simulation studies on an obstacle avoidance task by a mobile robot show that the proposed DSGFQL and EDSGFQL approaches are superior to current RL methods.
- Book Chapter
16
- 10.1007/978-3-540-87732-5_44
- Sep 24, 2008
This paper focuses on autonomous mobile robot navigation in unknown and changing environments. Reinforcement learning (RL) is applied to learn the behaviors of a reactive robot. A T-S fuzzy neural network and RL are integrated: the T-S network implements the mapping from the state space to the Q values corresponding to the action space of RL. The problem of continuous, infinite states and actions in RL can be solved through the function approximation of the proposed method. Finally, the method is applied to learn behaviors for the reactive robot. The experiment shows that the algorithm can effectively solve the problem of navigation in a complicated unknown environment.
Keywords: Reinforcement learning; Robot navigation; T-S fuzzy neural network; Q-learning
- Research Article
- 10.1088/1742-6596/2258/1/012047
- Apr 1, 2022
- Journal of Physics: Conference Series
To reduce the fuel consumption of hybrid electric vehicles, an online energy distribution strategy based on fuzzy reinforcement learning is proposed. A vehicle dynamics model and key component models are built to give a simulation model of the hybrid power system. Through modelling, the energy distribution problem of the hybrid electric vehicle is cast as a constrained optimization problem for a stochastic dynamic system, and a solution based on fuzzy reinforcement learning is put forward. Fuzzy reinforcement learning is used to optimize the fuzzy inference system in real time, so that the fuzzy inference system adapts to the driving cycle and optimal control can be achieved under any driving cycle. In simulation, fuel consumption is 3.89 L/km under rule-based control and 3.45 L/km under fuzzy reinforcement learning control, a saving of 11.31% relative to rule-based control. These data support the validity of fuzzy reinforcement learning for hybrid electric vehicle energy management.
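The reported saving follows directly from the two consumption figures, as the short check below shows. The function name is an assumption introduced for illustration; the two input values are taken from the abstract as stated.

```python
def relative_saving_percent(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline * 100.0

# Figures from the abstract: rule-based control vs fuzzy RL control.
saving = relative_saving_percent(3.89, 3.45)
print(f"{saving:.2f}%")  # (3.89 - 3.45) / 3.89 = 0.44 / 3.89
```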
- Book Chapter
4
- 10.1007/978-3-319-25017-5_28
- Oct 18, 2015
This paper proposes a new reinforcement learning (RL) algorithm for the formation of agents in regular geometric forms. Because of the curse of dimensionality, RL algorithms applied to formation problems do not perform well. Moreover, the large state space in the formation problem leads to long learning times. Here, a multi-agent fuzzy reinforcement learning algorithm is presented that extends fuzzy actor-critic reinforcement learning to a multi-agent environment. The final action for each agent is generated by a zero-order T-S fuzzy system. In conventional fuzzy actor-critic RL, there are several candidate actions for the consequent of each fuzzy rule, and the aim of learning is to find the best action among these discrete candidates. Here, using the proposed linear interpolation, a continuous action selection for determining the best action for each fuzzy rule is presented. The simulation results show that the proposed method can improve the learning speed and action quality.
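The two ingredients named above, a zero-order T-S fuzzy output and a continuous per-rule action obtained by linear interpolation, can be sketched generically. The abstract does not specify the interpolation rule, so the parameter `t` and the choice to interpolate between two neighbouring candidates are assumptions made purely for illustration.

```python
def lerp_action(a_low, a_high, t):
    """Linear interpolation between two candidate actions, with t in [0, 1].
    Hypothetical: t stands in for whatever quantity the learner adapts."""
    return (1.0 - t) * a_low + t * a_high

def ts_zero_order(strengths, rule_actions):
    """Zero-order T-S output: firing-strength-weighted average of the
    constant consequent (action) of each rule."""
    total = sum(strengths)
    return sum(s * a for s, a in zip(strengths, rule_actions)) / total
```

For example, two rules firing with strengths 0.25 and 0.75 and consequents −1.0 and 1.0 yield the blended action 0.5, and each consequent may itself be any value between its candidates rather than one of a few discrete choices.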