ABSTRACT State-of-the-art deep reinforcement learning (DRL) methods, including Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC), demonstrate significant capability in solving the optimal static state feedback control (SSFC) problem. This problem can be modeled as a fully observed Markov decision process (MDP). However, the optimal static output feedback control (SOFC) problem with measurement noise is a typical partially observable MDP (POMDP), which is difficult to solve, especially in continuous, high-dimensional state-action-observation spaces. This paper proposes a two-stage framework to address this challenge. In the laboratory stage, both the states and the noisy outputs are observable; the SOFC policy is converted into a constrained stochastic SSFC policy whose probability density function is generally not analytical. To this end, a density-estimation-based SAC algorithm is proposed to explore the optimal SOFC policy by learning the optimal constrained stochastic SSFC policy. Consequently, in the real-world stage, only the noisy outputs and the learned SOFC policy are required to solve the optimal SOFC problem. Numerical simulations and corresponding experiments with robotic arms illustrate the effectiveness of our method. The code is available at https://github.com/RanKyoto/DE-SAC.
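To make the core idea concrete, the following is a minimal sketch, not the paper's implementation: applying an output-feedback policy u = pi(y) to the noisy output y = Cx + v induces a stochastic state-feedback policy pi(u | x) whose density is generally not analytical, so the snippet approximates its log-density with a Monte Carlo kernel density estimate, the kind of quantity a SAC-style entropy term would need. The output matrix C, noise level, tanh policy, and KDE bandwidth are all illustrative assumptions.

```python
import numpy as np

# Hypothetical setup: noisy output y = C x + v with Gaussian measurement noise v.
# A deterministic output-feedback policy u = pi(y), viewed as a function of the
# state x, becomes a stochastic state-feedback policy pi(u | x) whose density is
# generally not analytical; it is approximated here by a Gaussian kernel density
# estimate (KDE) over Monte Carlo samples of the measurement noise.

rng = np.random.default_rng(0)

n_x, n_y, n_u = 2, 1, 1
C = np.array([[1.0, 0.5]])   # assumed output matrix
noise_std = 0.1              # assumed measurement-noise standard deviation


def output_feedback_policy(y):
    """Illustrative nonlinear output-feedback policy u = pi(y)."""
    return np.tanh(-1.5 * y)


def sample_induced_action(x):
    """Sample u ~ pi(u | x) by passing the state through the noisy output channel."""
    v = rng.normal(0.0, noise_std, size=n_y)
    return output_feedback_policy(C @ x + v)


def log_density_estimate(u, x, n_samples=256, bandwidth=0.05):
    """Gaussian-KDE estimate of log pi(u | x) from output-noise samples."""
    samples = np.stack([sample_induced_action(x) for _ in range(n_samples)])  # (N, n_u)
    diffs = (u - samples) / bandwidth
    log_kernels = -0.5 * np.sum(diffs**2, axis=1) \
                  - n_u * np.log(bandwidth * np.sqrt(2.0 * np.pi))
    # numerically stable log-mean-exp of the kernel values
    m = log_kernels.max()
    return np.log(np.mean(np.exp(log_kernels - m))) + m


if __name__ == "__main__":
    x = np.array([0.3, -0.2])
    u = sample_induced_action(x)
    print("action:", u, "log-density estimate:", log_density_estimate(u, x))
```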