For bilateral teleoperators with unknown dynamics, simultaneously achieving synchronization between the two sides and optimal control performance under time-varying delays and external disturbances is a challenging task. To address this challenge, we first propose two control approaches, an On-Policy and an Off-Policy strategy, after employing a sliding variable to reduce the order of the dynamic model. Because the synchronization problem and the optimal control objective are unified in a single formulation, a discount factor must be introduced to guarantee the boundedness of the infinite-horizon cost function. To handle the dynamic uncertainties and the resulting exponentially discounted cost function, Reinforcement Learning control schemes built on data-collection techniques are developed for the model-free On-Policy and Off-Policy strategies, thereby guaranteeing optimality. A Robust Integral of the Sign of the Error (RISE) term is then integrated into both proposed optimal control frameworks to improve the synchronization performance and the estimation of dynamic uncertainties. Moreover, the convergence of the control policies and the synchronization behavior of the two proposed frameworks are rigorously analyzed via Lyapunov stability theory. Finally, simulation results and comparisons with an existing controller demonstrate the effectiveness of the two proposed control frameworks.
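For illustration only, a discounted infinite-horizon cost of the kind referred to above, defined on a sliding variable $s$ and control input $u$, could take the following typical form; the weighting matrices $Q$, $R$ and the discount rate $\gamma$ are illustrative assumptions and need not match the exact formulation adopted in the paper:
\begin{equation*}
J\bigl(s(t)\bigr) \;=\; \int_{t}^{\infty} e^{-\gamma(\tau - t)}
\Bigl[\, s(\tau)^{\top} Q\, s(\tau) \;+\; u(\tau)^{\top} R\, u(\tau) \,\Bigr]\, d\tau,
\qquad \gamma > 0,\;\; Q \succ 0,\;\; R \succ 0.
\end{equation*}
In such a formulation, the exponential factor $e^{-\gamma(\tau - t)}$ is what keeps the infinite-horizon cost bounded even when the sliding variable does not decay exactly to zero under persistent time-varying delays and disturbances.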