Discrete event systems (DESs) are powerful abstract representations of large human-made physical systems in a wide variety of industries. Safety control of DESs has been studied extensively on the basis of the systems' logical specifications. However, when a DES operates in an uncertain environment that introduces implicit specifications, the classical supervisory control approach may not achieve the desired performance. In this research, we therefore propose a new approach to the optimal control of DESs in uncertain environments, based on supervisory control theory (SCT) and reinforcement learning (RL). First, we use SCT to gather deliberative planning algorithms with the aim of safe control. Then, we convert the supervised system into a Markov decision process (MDP) simulation environment suitable for training optimal control algorithms. Furthermore, an SCT-based RL algorithm is designed to maximize the performance of the system based on the probabilistic attributes of the state transitions. Finally, a case study on the autonomous navigation task of a delivery robot corroborates the proposed method through multiple simulation experiments. The results show that the proposed approach achieves an 8.27% performance improvement over the non-intelligent methods. This research will contribute to further study of the optimal control of human-made physical systems in a wide variety of industries.
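The pipeline summarized above (a supervisor restricts the plant to safe behavior, then a learner optimizes over what remains) can be illustrated with a minimal sketch. It is not the algorithm proposed in this work: the toy plant, the event names (`forward`, `wait`, `shortcut`), the rewards, and the hyperparameters are all hypothetical, every event is assumed controllable so that a one-step safety filter stands in for full supervisor synthesis, and plain tabular Q-learning stands in for the SCT-based RL algorithm.

```python
import random

# Hypothetical toy plant (an assumption, not the paper's case study).
# (state, event) -> list of (probability, next_state, reward)
TRANSITIONS = {
    (0, "forward"): [(0.9, 1, -1.0), (0.1, 0, -1.0)],
    (0, "wait"): [(1.0, 0, -1.0)],
    (1, "forward"): [(0.9, 2, -1.0), (0.1, 1, -1.0)],
    (1, "shortcut"): [(1.0, "bad", 0.0)],  # leads to an unsafe state
    (2, "forward"): [(0.9, 3, 10.0), (0.1, 2, -1.0)],
}
UNSAFE = {"bad"}
GOAL = 3


def enabled_events(state):
    """Stand-in supervisor: enable an event only if no outcome is unsafe.

    All events are assumed controllable, so a one-step filter is enough here.
    """
    return [
        event
        for (s, event), outcomes in TRANSITIONS.items()
        if s == state and all(nxt not in UNSAFE for _, nxt, _ in outcomes)
    ]


def step(state, event, rng):
    """Sample one probabilistic transition of the supervised plant."""
    draw, cumulative = rng.random(), 0.0
    for prob, nxt, reward in TRANSITIONS[(state, event)]:
        cumulative += prob
        if draw < cumulative:
            return nxt, reward
    return nxt, reward  # guard against floating-point rounding


def train(episodes=2000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning restricted to supervisor-enabled events."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if state == GOAL:
                break
            events = enabled_events(state)
            if rng.random() < epsilon:
                event = rng.choice(events)  # explore among safe events only
            else:
                event = max(events, key=lambda e: q.get((state, e), 0.0))
            nxt, reward = step(state, event, rng)
            future = 0.0 if nxt == GOAL else max(
                q.get((nxt, e), 0.0) for e in enabled_events(nxt)
            )
            old = q.get((state, event), 0.0)
            q[(state, event)] = old + alpha * (reward + gamma * future - old)
            state = nxt
    return q


if __name__ == "__main__":
    q_table = train()
    for key in sorted(q_table, key=str):
        print(key, round(q_table[key], 2))
```

Because exploration and the greedy choice both draw only from `enabled_events`, the learner never tries the unsafe `shortcut` event, even during training. In this sketch the safety guarantee comes from the supervisor and the performance gain from learning on the transition probabilities.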