With the growing need for flexible manufacturing systems, machine flexibility has become one of the most influential factors. An important aspect of machine flexibility is the ability to change an individual machine’s capacity (or cycle time) to improve overall system efficiency. In this paper, a novel control method is proposed for multi-stage production systems that dynamically adjusts individual machines’ cycle times. The proposed method integrates a distributed feedback control scheme with a Reinforcement Learning (RL) control scheme based on an extended actor-critic algorithm. The feedback control determines whether a machine is turned on or off using real-time system status, while the RL control scheme decides how to increase or decrease a machine’s cycle time when the machine is on. The extended actor-critic algorithm adds an auxiliary model-based path to standard model-free RL to enhance learning performance. To demonstrate the effectiveness of the proposed method, numerical case studies are performed; they show improvements in overall profit and energy savings compared with other methods.
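
The two-layer decision described above can be illustrated with a minimal Python sketch for a single machine. It is an illustration under stated assumptions, not the paper's implementation. The buffer-based on/off rule (stop when starved or blocked), the three-valued action set, the cycle-time bounds, and all names (`MachineState`, `feedback_on_off`, `control_step`, `placeholder_policy`) are hypothetical. The trained actor of the extended actor-critic algorithm is replaced by a simple stand-in function.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class MachineState:
    """Assumed real-time status signals for one machine (hypothetical)."""
    upstream_buffer: int      # parts waiting before the machine
    downstream_buffer: int    # parts waiting after the machine
    downstream_capacity: int  # maximum size of the downstream buffer
    cycle_time: float         # current cycle time, in seconds


def feedback_on_off(state: MachineState) -> bool:
    """Assumed feedback rule: run only if neither starved nor blocked."""
    starved = state.upstream_buffer == 0
    blocked = state.downstream_buffer >= state.downstream_capacity
    return not (starved or blocked)


def placeholder_policy(state: MachineState) -> int:
    """Stand-in for a trained RL actor (not the paper's algorithm).

    Returns -1 (shorten cycle time), 0 (hold), or +1 (lengthen).
    """
    return -1 if state.upstream_buffer > state.downstream_buffer else 0


def control_step(
    state: MachineState,
    policy: Callable[[MachineState], int],
    step: float = 1.0,     # assumed adjustment size, in seconds
    min_ct: float = 5.0,   # assumed lower bound on cycle time
    max_ct: float = 60.0,  # assumed upper bound on cycle time
) -> Tuple[bool, float]:
    """Return (is_on, new_cycle_time) for one control interval."""
    # Layer 1: feedback control decides whether the machine runs at all.
    if not feedback_on_off(state):
        return False, state.cycle_time
    # Layer 2: the RL policy adjusts the cycle time only while the machine is on.
    action = policy(state)
    new_ct = min(max_ct, max(min_ct, state.cycle_time + action * step))
    return True, new_ct
```

For example, `control_step(MachineState(5, 2, 10, 20.0), placeholder_policy)` keeps the machine on and shortens its cycle time by one step, whereas a machine with an empty upstream buffer is switched off and its cycle time is left unchanged. Passing the policy as a callable keeps the on/off logic separate from whatever learned actor is eventually plugged in.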