This paper presents a Q-learning-based target selection algorithm for autonomous spacecraft navigation using bearing observations of known visible targets. In the navigation system considered, the position and velocity of the spacecraft are estimated by an extended Kalman filter (EKF) from measurements of inter-satellite line-of-sight (LOS) vectors obtained with an onboard star camera. The paper focuses on adaptively selecting an appropriate target for the star camera at each observation period so that the performance of the EKF is enhanced. To this end, a Q-function is designed to select a suitable observation region, and a U-function is introduced to rank the targets within the selected region. Both the Q-function and the U-function are constructed from the sequence of innovations produced by the EKF. The effectiveness of the algorithm is illustrated through numerical simulations, which show that it outperforms the traditional target selection strategy based on the Cramér-Rao bound (CRB) when prior knowledge of the target location is inaccurate.
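As a rough illustration of the two-level selection described above, the following Python sketch shows one way a region-level Q-value and a target-level U-value could be combined. It is a simplified sketch under stated assumptions, not the algorithm of this paper: the class `RegionSelector`, the functions `nis` and `rank_targets`, the epsilon-greedy exploration, the single-state update with no discounted next-state term, and the use of the normalized innovation squared as an innovation-based statistic are all hypothetical choices made for illustration. How the reward is actually derived from the innovations is not specified here, so the caller supplies it.

```python
import numpy as np


def nis(innovation, S):
    """Normalized innovation squared, nu^T S^{-1} nu.

    Assumption: used only as one example of an innovation-based statistic;
    the abstract does not state which statistic the method uses.
    """
    nu = np.asarray(innovation, dtype=float)
    return float(nu @ np.linalg.solve(np.asarray(S, dtype=float), nu))


class RegionSelector:
    """Hypothetical single-state Q-value table over observation regions."""

    def __init__(self, n_regions, alpha=0.1, epsilon=0.1, seed=0):
        self.q = np.zeros(n_regions)   # one Q-value per region
        self.alpha = alpha             # learning rate
        self.epsilon = epsilon         # exploration probability
        self.rng = np.random.default_rng(seed)

    def select(self):
        """Epsilon-greedy choice of an observation region (assumed policy)."""
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(len(self.q)))
        return int(np.argmax(self.q))

    def update(self, region, reward):
        """Move Q(region) toward the observed reward.

        Simplification: no next-state term, so this is a bandit-style update
        rather than the full Q-learning recursion.
        """
        self.q[region] += self.alpha * (reward - self.q[region])


def rank_targets(u_values):
    """Return target ids sorted by descending U-value (hypothetical dict input)."""
    return sorted(u_values, key=u_values.get, reverse=True)
```

A typical loop would call `select()` to pick a region, use `rank_targets` to choose the highest-ranked target in that region, run one EKF update with the resulting LOS measurement, and then pass a reward computed from the innovation to `update()`.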