The maximum k-plex problem is a computationally challenging problem that emerged from graph-theoretic studies of social networks in the context of community detection. It relaxes the classical maximum clique problem in order to identify cohesive structures in a graph that are not necessarily fully connected: the aim is to find a largest set of vertices in which every vertex is non-adjacent to at most k vertices of the set (counting the vertex itself, so that k = 1 recovers a clique). This paper introduces an effective hybrid local search algorithm for the maximum k-plex problem that combines stochastic local search with a reinforcement learning strategy. The proposed approach has several distinguishing features: a new coarse-to-fine strategy to balance the search process, a distance-and-quality reward for actions, and a parameter control mechanism based on reinforcement learning. We conduct a computational analysis of the key components of the algorithm and assess their impact on overall performance. Extensive experiments on the maximum k-plex problem (k = 2, 3, 4, 5) using 80 benchmark instances from the second DIMACS Challenge demonstrate that the proposed approach competes favourably with state-of-the-art algorithms from the literature.
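To make the k-plex condition stated above concrete, the following minimal sketch checks whether a vertex set is a k-plex. It assumes the standard degree-based form of the definition, in which every vertex of a set S has at least |S| - k neighbours inside S. The function name `is_k_plex`, the adjacency-dictionary representation, and the example graph are illustrative assumptions; this is only a feasibility check, not the hybrid local search algorithm proposed in the paper.

```python
from typing import Dict, Iterable, Set

# Assumed representation: an undirected graph as {vertex: set of neighbours}.
Graph = Dict[int, Set[int]]


def is_k_plex(adj: Graph, subset: Iterable[int], k: int) -> bool:
    """Return True if `subset` induces a k-plex in the graph `adj`.

    Assumes the standard definition: every vertex of S must have at least
    |S| - k neighbours inside S, i.e. it misses at most k vertices of S
    when the vertex itself is counted.
    """
    s = set(subset)
    return all(len(adj[v] & s) >= len(s) - k for v in s)


# Hypothetical example graph: the 4-cycle 0-1-2-3-0.
cycle4: Graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}

if __name__ == "__main__":
    # Each vertex has 2 neighbours in the cycle, so the whole cycle
    # satisfies the condition for k = 2 but not for k = 1 (a clique).
    print(is_k_plex(cycle4, {0, 1, 2, 3}, 1))
    print(is_k_plex(cycle4, {0, 1, 2, 3}, 2))
```

Under this definition, a search procedure only needs to maintain the number of neighbours each vertex has inside the current set in order to test whether adding or swapping a vertex keeps the set feasible.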