Abstract

We consider the dynamic and stochastic resource-constrained multi-project scheduling problem, which allows for the random arrival of projects and stochastic task durations. Completing projects generates rewards, which are reduced by a tardiness cost in the case of late completion. Multiple resource types are available, and projects consume different amounts of these resources while being processed. The problem is modelled as an infinite-horizon discrete-time Markov decision process, with the objective of maximising the expected discounted long-run profit. We use an approximate dynamic programming (ADP) algorithm with a linear approximation model that can be used for online decision making. Our approximation model uses project elements that are easily accessible to a decision-maker, with the model coefficients obtained offline via a combination of Monte Carlo simulation and least squares estimation. Our numerical study shows that ADP often outperforms the optimal reactive baseline algorithm (ORBA) by a statistically significant margin. In experiments on smaller problems, however, both typically perform suboptimally compared to the optimal scheduler obtained by stochastic dynamic programming. ADP nevertheless has an advantage over ORBA and dynamic programming in that it can be applied to larger problems. We also show that ADP generally produces statistically significantly higher profits than algorithms commonly used in practice, such as a rule-based algorithm and a reactive genetic algorithm.
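To illustrate the offline coefficient-estimation step mentioned above, the following is a minimal Python sketch of fitting a linear value-function approximation by least squares on Monte Carlo samples. The feature matrix, discount factor, and sampled returns below are hypothetical placeholders, not the paper's actual project features or simulation model.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.95       # assumed discount factor (placeholder)
n_samples = 5000   # number of simulated states
n_features = 4     # e.g. projects in system, resource utilisation, slack, tardiness risk

# Hypothetical stand-in: in the paper, features come from project elements
# observable to the decision-maker; here we draw random placeholders.
Phi = rng.random((n_samples, n_features))

# Hypothetical stand-in for Monte Carlo estimates of the discounted
# long-run profit observed when starting from each sampled state.
returns = Phi @ np.array([2.0, -1.0, 0.5, -0.3]) + rng.normal(0.0, 0.1, n_samples)

# Least-squares estimate of the linear value-function coefficients.
theta, *_ = np.linalg.lstsq(Phi, returns, rcond=None)

def approx_value(features: np.ndarray) -> float:
    """Linear value-function approximation: V(s) is approximated by theta . phi(s)."""
    return float(features @ theta)

print("fitted coefficients:", theta)
```

Once the coefficients are fitted offline, online decision making would only require evaluating this cheap linear function for the states reachable under each candidate scheduling action, which is what makes the approach scale to problems beyond the reach of exact dynamic programming.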
