Context. The calculation of the emerging radiation from a model atmosphere requires knowledge of the emissivity and absorption coefficients, which are proportional to the atomic level population densities of the levels involved in each transition. Due to the intricate interdependence of the radiation field and the physical state of the atoms, iterative methods are required in order to calculate the atomic level population densities. A variety of different methods have been proposed to solve this problem, which is known as the nonlocal thermodynamical equilibrium (NLTE) problem. Aims. Our goal is to develop an efficient and rapidly converging method to solve the NLTE problem under the assumption of statistical equilibrium. In particular, we explore whether the Jacobian-Free Newton-Krylov (JFNK) method can be used. This method does not require an explicit construction of the Jacobian matrix because it estimates the new correction with the Krylov-subspace method. Methods. We implemented an NLTE radiative transfer code with overlapping bound-bound and bound-free transitions. This solved the statistical equilibrium equations using a JFNK method, assuming a depth-stratified plane-parallel atmosphere. As a reference, we also implemented the Rybicki & Hummer (1992) method based on linearization and operator splitting. Results. Our tests with the Fontenla, Avrett and Loeser C model atmosphere (FAL-C) and two different six-level Ca II and H I atoms show that the JFNK method can converge faster than our reference case by up to a factor 2. This number is evaluated in terms of the total number of evaluations of the formal solution of the radiative transfer equation for all frequencies and directions. This method can also reach a lower residual error compared to the reference case. Conclusions. The JFNK method we developed poses a new alternative to solving the NLTE problem. Because it is not based on operator splitting with a local approximate operator, it can improve the convergence of the NLTE problem in highly scattering cases. One major advantage of this method is that it is expected to allow for a direct implementation of more complex problems, such as overlapping transitions from different active atoms, charge conservation, or a more efficient treatment of partial redistribution, without having to explicitly linearize the equations.