Perfect Foresight Research Articles

The derivation of an efficient operational strategy for storing intermittent renewable energies using a hybrid battery-hydrogen energy storage system is a difficult task. One approach for deriving an efficient operational strategy is using mathematical optimization in the context of model predictive control. However, mathematical optimization derives an operational strategy based on a non-exact mathematical system representation for a specified prediction horizon to optimize a specified target. Thus, the resulting operational strategies can vary depending on the optimization settings.This work focuses on evaluating potential improvements in the operational strategy for a hybrid battery-hydrogen energy storage system using mathematical optimization. To investigate the operation, a simulation model of a hybrid energy storage system and a tailor-made mixed integer linear programming optimization model of this specific system are utilized in the context of a model predictive control framework. The resulting operational strategies for different settings of the model predictive control framework are compared to a rule-based controller to show the potential benefits of model predictive control compared to a conventional approach. Furthermore, an in-depth analysis of different factors that impact the effectiveness of the model predictive controller is done. Therefore, a sensitivity analysis of the effect of different electricity demands and resource sizes on the performance relative to a rule-based controller is conducted. The model predictive controller reduced the energy consumption by at least 3.9 % and up to 17.9% compared to a rule-based controller. Finally, Pareto fronts for multi-objective optimizations with different prediction and control horizons are derived and compared to the results of a rule-based controller. A cost reduction of up to 47 % is achieved by a model predictive controller with a prediction horizon of 7 days and perfect foresight.

Read full abstract

Introduction The inventory management of platelets is complicated by their short shelf-life. In many hospitals, ordering policies are set by staff based on their experience rather than through forecasting or mathematical modelling. A data-driven approach may help to reduce wastage while ensuring the right unit is available for the right patient at the right time. Finding optimal policies for managing perishable inventory is known to be computationally challenging due to the large number of “observation states” required to represent the age profile of the stock. Reinforcement learning (RL) is a subfield of machine learning in which agents learn how to solve a sequential decision-making task through interaction with an environment. Deep reinforcement learning (DRL) uses deep neural networks to efficiently learn a policy (a mapping between an observed state and an action) for problems with many observation states. We demonstrate, with both simulated and real-life demand data, that DRL can be used to learn effective platelet replenishment policies for a hospital blood bank. Methods We implemented the platelet replenishment scenario from a recent study (Rajendran & Srinivas, 2020) as an RL environment using the OpenAI gym Python package. Daily demand is stochastic, sampled from day-of-the-week specific Poisson distributions. The reward is the negative cost incurred, comprised of fixed and variable order costs, holding costs, shortage costs and wastage costs. We reimplemented the four heuristic replenishment policies described in that study, with policy parameters fit using stochastic mixed integer linear programming (SMILP), where the ordering decision is based on the total number of units in stock and the order quantity is either fixed or the difference between current stock and a target stock level. We trained DRL policies using two popular methods: Deep Q-Networks (DQN) and Proximal Policy Optimization (PPO) using the RL environment. We compared the performance of these six policies, in addition to the optimal policy found using value iteration (VI) and a policy with perfect foresight, on 1,000 randomly generated evaluation episodes each 365 days long. We repeated the analysis using real demand data obtained from the blood transfusion laboratory at University College London Hospital, a large tertiary level care hospital in the United Kingdom, fitting the policies using daily demand data from 2015 and 2016, and evaluating their performance on data from 2017. Results & Discussion The DRL policies incurred consistently lower mean daily costs on the simulated evaluation episodes than the four policies fit using SMILP. PPO incurred a lower mean daily cost than DQN in 96% of the evaluation episodes and performed near optimally - its mean daily cost was only 0.3% higher than that of the VI policy and 8% higher than the policy with perfect foresight. The best performing heuristic policy fit using SMILP, (s, S), incurred a mean daily cost 1.2% higher than the VI policy. The VI policy could be represented as an (s, S) heuristic policy, with different parameters to those found using SMILP. Therefore, in this case, the advantage of DRL over SMILP appears to be the fact it can efficiently learn from many more example sequences of demand, rather than its ability to represent more complex functions. PPO incurred the lowest mean daily cost on the real demand data from 2017 of the six SMILP and DRL policies, 10% higher than a policy with perfect foresight. Holding costs were the main difference between PPO and the policy with perfect foresight when using both simulated and real demand data. In both experiments PPO achieved mean wastage of 0% and suffered a shortage, which would have required placing an additional rush order, on 3.2% of days with simulated demand data and 2.7% of days with real demand data. Conclusion DRL can be used to learn near-optimal policies for a simplified platelet replenishment task, consistently outperforming a previously reported approach. This suggests it may be a viable method for finding policies that can be applied in practice to improve the management of platelet inventory and reduce wastage. In future work, the ability of DRL to learn how to act in large observation space states will enable consideration of additional aspects of the real problem, such as the fact that not all units arrive fresh and not all requested units are transfused, where existing methods become computationally infeasible or impractical.

Read full abstract

Perfect Foresight Research Articles

Related Topics

Articles published on Perfect Foresight

Model Predictive Control of a Stand-Alone Hybrid Battery-Hydrogen Energy System: A Case Study of the PHOEBUS Energy System

Iterative coupling of a fundamental electricity market model and an agent-based simulation model to reduce the efficiency gap

Performance-Based Pay and Limited Information Access. An Agent-Based Model of the Hidden Action Problem

Renewable-battery hybrid power plants in congested electricity markets: Implications for plant configuration

General equilibrium and dynamic inconsistency

Strategic complementarities, coordination failures, and macroeconomic fluctuations: from multiplicity of equilibria to disequilibrium dynamics

Model coupling and comparison on optimal load shifting of battery electric vehicles and heat pumps focusing on generation adequacy

Sequential trading with coarse contingencies

Balancing Data Acquisition Benefits and Ordering Costs for Predictive Supplier Selection and Order Allocation

Stability of price and quantity to a long-run equilibrium: a dynamic Leontief model with bounded rationality

Closed loop model predictive control of a hybrid battery-hydrogen energy storage system using mixed-integer linear programming

Transferable Energy Storage Bidder

Quantifying the effects of expectation variability on economic dynamics

A Real-Time Redispatch Method to Evaluate the Contribution of Storage to Capacity Adequacy

Strategic Implicit Balancing With Energy Storage Systems via Stochastic Model Predictive Control

Existence and Uniqueness of Solutions to Dynamic Models with Occasionally Binding Constraints

Deep Reinforcement Learning for Managing Platelets in a Hospital Blood Bank

Impact of consumer foresight on efficient overselling

Flexible time horizon planning, learning and aggregate fluctuations

The stochastic balance equation for the American option value function and its gradient

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Perfect Foresight Research Articles

Related Topics

Articles published on Perfect Foresight

Model Predictive Control of a Stand-Alone Hybrid Battery-Hydrogen Energy System: A Case Study of the PHOEBUS Energy System

Iterative coupling of a fundamental electricity market model and an agent-based simulation model to reduce the efficiency gap

Performance-Based Pay and Limited Information Access. An Agent-Based Model of the Hidden Action Problem

Renewable-battery hybrid power plants in congested electricity markets: Implications for plant configuration

General equilibrium and dynamic inconsistency

Strategic complementarities, coordination failures, and macroeconomic fluctuations: from multiplicity of equilibria to disequilibrium dynamics

Model coupling and comparison on optimal load shifting of battery electric vehicles and heat pumps focusing on generation adequacy

Sequential trading with coarse contingencies

Balancing Data Acquisition Benefits and Ordering Costs for Predictive Supplier Selection and Order Allocation

Stability of price and quantity to a long-run equilibrium: a dynamic Leontief model with bounded rationality

Closed loop model predictive control of a hybrid battery-hydrogen energy storage system using mixed-integer linear programming

Transferable Energy Storage Bidder

Quantifying the effects of expectation variability on economic dynamics

A Real-Time Redispatch Method to Evaluate the Contribution of Storage to Capacity Adequacy

Strategic Implicit Balancing With Energy Storage Systems via Stochastic Model Predictive Control

Existence and Uniqueness of Solutions to Dynamic Models with Occasionally Binding Constraints

Deep Reinforcement Learning for Managing Platelets in a Hospital Blood Bank

Impact of consumer foresight on efficient overselling

Flexible time horizon planning, learning and aggregate fluctuations

The stochastic balance equation for the American option value function and its gradient