This paper presents a conceptual framework for solving the consensus problem in multi-agent systems with unknown nonlinear dynamics using a reinforcement learning (RL)-based nearly optimal sliding mode controller (SMC). The agents’ dynamics are assumed to be subject to uncertainty and mismatched disturbances. An adaptive fixed-time estimator is introduced to estimate each agent’s uncertain dynamics and disturbances within a fixed time. Two control strategies are proposed. In the first, the controller combines an adaptive SMC with an optimal controller. The adaptation law in the SMC estimates the bound of the fixed-time estimator error before its convergence, ultimately achieving asymptotic convergence to the sliding surface and rendering each agent’s dynamics linear. This allows the resulting linear consensus problem to be solved with an RL-based adaptive optimal controller using an on-policy critic–actor method. The second strategy strengthens the adaptive SMC into a fixed-time controller, reducing the time needed to reach the sliding surface regardless of initial conditions; consequently, the consensus error converges to zero faster. The effectiveness of both strategies is validated through numerical experiments on two real-world system models, and the results are consistent with the theoretical proofs.