This paper studies a multi-objective scheduling problem in the continuous annealing operation of the steel industry, in which the production batch sizes for coils and their schedules are determined simultaneously so as to minimize setup, earliness, and tardiness costs. A large batch reduces setup costs but increases earliness and tardiness costs. Because these objectives conflict, setup cost, earliness, and tardiness are formulated as separate objectives. We formulate a multi-objective optimization model for the problem and develop an adaptive multi-objective differential evolution algorithm based on deep reinforcement learning (AMODE-DRL) to obtain Pareto solutions effectively. In the proposed AMODE-DRL, DRL is integrated into the multi-objective differential evolution (MODE) algorithm as a controller that adaptively selects mutation operators and parameters according to different search domains. Computational results on randomly generated instances and practical problem instances show that DRL can effectively guide MODE in selecting mutation operators and parameters, and that the proposed algorithm obtains better solutions than other powerful multi-objective evolutionary algorithms and adaptive MODE algorithms.

<italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Note to Practitioners</italic>: Setup usually decreases production capacity and increases production costs. Earliness and tardiness costs are equally crucial. Setup, tardiness, and earliness costs correspond to production, customer demand, and inventory, respectively, which are common objectives in production systems. However, the three objectives generally conflict, and it is very hard for practitioners to make appropriate decisions from manual experience alone. Multi-objective optimization methods can provide a set of alternative schedules, from which practitioners may choose the one best suited to current working conditions.
Accordingly, the proposed model and algorithm can be extended to other fields with problems of similar characteristics.
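
To illustrate what "Pareto solutions" means for the three objectives named above, the following is a minimal sketch of a non-dominated filter. It is not the AMODE-DRL algorithm. The names `Objectives`, `dominates`, and `pareto_front` are hypothetical, and the candidate values are made up purely for illustration. Each schedule is assumed to be summarized by a tuple (setup cost, earliness, tardiness), all to be minimized.

```python
from typing import List, Tuple

# Hypothetical objective vector: (setup_cost, earliness, tardiness), all minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate schedules (illustrative values only, not results from the paper).
candidates = [
    (10.0, 8.0, 6.0),  # large batches: low setup, higher earliness/tardiness
    (25.0, 3.0, 2.0),  # medium batches
    (12.0, 9.0, 7.0),  # worse than the first candidate in all three objectives
    (40.0, 1.0, 1.0),  # small batches: high setup, low earliness/tardiness
]
front = pareto_front(candidates)
```

Here `front` keeps the first, second, and fourth candidates. None of them is better than another in all three objectives, so the choice among them is left to the practitioner.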