For many applications, local time stepping is a worthwhile alternative to the well-established global time step control. Local time stepping can resolve localized features of the solution in high detail at a strongly reduced computational cost compared to global time step control. However, local time stepping is not straightforward to apply in the context of fully implicit time discretizations.

Here, we present a method for the efficient parallel adaptive solution of (non-linear) partial differential equations, in particular reaction–diffusion equations, using spatially adapted time step sizes within a fully implicit solution strategy. Our method combines a discontinuous Galerkin discretization in time with a full space-time approach and is designed from scratch for efficient parallel computation. We employ shallow tree-based mesh data structures to ensure a low memory footprint of the adaptive meshes. Solving the time-dependent partial differential equation on a (d+1)-dimensional non-conforming mesh yields space-time adaptivity naturally. In combination with the discontinuous Galerkin method in time, the size of the arising systems can be precisely controlled.

We additionally introduce and discuss a stabilization scheme for space-time mortar element methods that also has a highly positive impact on the efficiency of preconditioning techniques for the arising systems of equations. We present results from extensive numerical experiments that address convergence and efficiency, linear and non-linear solver performance, parallel scalability up to 2048 cores, and accuracy for the linear heat equation and a real-world non-linear reaction–diffusion equation from the field of computational electrocardiology.
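
To make the role of a discontinuous Galerkin discretization in time more concrete, the following is a minimal sketch and not the method described here. It assumes the lowest-order case, DG(0), where the solution is piecewise constant on each time slab and the scheme coincides with the fully implicit backward Euler method. It further assumes the 1D linear heat equation u_t = u_xx on (0, 1) with zero Dirichlet data, second-order finite differences in space, and time slabs whose length varies globally rather than locally in space. All function names (`solve_tridiagonal`, `dg0_heat_step`, `run`) and all parameter values are hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def dg0_heat_step(u, h, dt):
    """Advance one DG(0)-in-time slab of length dt.

    With piecewise-constant trial and test functions in time, the slab
    problem reduces to (I - dt * L) u_new = u_old, where L is the
    second-difference operator with zero Dirichlet boundary values.
    """
    n = len(u)
    r = dt / (h * h)
    lower = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * n
    return solve_tridiagonal(lower, diag, upper, u)


def run(n_interior=31, slabs=(0.001, 0.001, 0.002, 0.004, 0.008)):
    """Integrate u(x, 0) = sin(pi x) over time slabs of growing length."""
    h = 1.0 / (n_interior + 1)
    u = [math.sin(math.pi * (i + 1) * h) for i in range(n_interior)]
    for dt in slabs:
        u = dg0_heat_step(u, h, dt)
    return u, h


if __name__ == "__main__":
    u, h = run()
    print("midpoint value after all slabs:", u[len(u) // 2])
```

Each slab requires one implicit solve whose size is set by the spatial unknowns in that slab. This is the sense in which a slab-wise formulation bounds the size of the systems in this simplified setting. Spatially varying step sizes would additionally require coupling across non-matching slab interfaces, which this sketch does not attempt.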