With the advent of the socio-technical manufacturing paradigm, the way rescheduling decisions are made on the shop floor has changed radically in order to guarantee highly efficient production under increasingly dynamic conditions. Coping with uncertain production environments requires a drastic increase in the degree, and a broadening of the types, of automation used on the shop floor to handle unforeseen events and unplanned disturbances. In this work, the on-line rescheduling task is modelled as a closed-loop control problem in which an artificial autonomous agent implements a control policy generated off-line: a schedule simulator is used to learn schedule repair policies directly from high-dimensional sensory inputs. The rescheduling control policy is stored in a deep neural network, which selects repair actions that drive the schedule towards a small set of repaired goal states. The rescheduling agent is trained using Proximal Policy Optimisation on a wide variety of simulated transitions between schedule states, taking colour-rich Gantt chart images and negligible prior knowledge as inputs. An industrial example is discussed to show that the proposed approach enables end-to-end deep learning of successful rescheduling policies, encoding task-specific control knowledge that can be understood by human experts.
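As a rough illustration of the training signal mentioned above, Proximal Policy Optimisation is commonly implemented with a clipped surrogate objective that limits how far each update can move the policy away from the one that collected the data. The sketch below is not taken from this work: the function name `ppo_clipped_objective`, its parameter names, and the default clipping range of 0.2 are illustrative assumptions, and the inputs stand in for log-probabilities of repair actions and advantage estimates that a schedule simulator would provide.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_log_probs: Sequence[float],
    old_log_probs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed default, not a value from the source
) -> float:
    """Mean PPO clipped surrogate objective (a quantity to be maximised)."""
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not new_log_probs:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that generated the simulated transition.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(new_log_probs)
```

When the updated policy assigns a repair action twice its previous probability and the advantage is positive, the clipped term caps the contribution at `1 + clip_eps` times the advantage, which is what discourages overly large policy updates between batches of simulated schedule transitions.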