Computerized adaptive testing (CAT) aims to present items that statistically optimize the assessment process by taking into account the examinee's responses and estimated trait levels. Recent developments in reinforcement learning and deep neural networks give CAT the potential to select items using information from all items remaining in the test, rather than focusing only on the next few items to be selected. In this study, we reformulate CAT under the reinforcement learning framework and propose a new item selection strategy based on the deep Q-network (DQN) method. Through simulation and empirical studies, we demonstrate how to monitor the training process to obtain optimal Q-networks, and we compare the accuracy of the DQN-based item selection strategy with that of five traditional strategies (maximum Fisher information, Fisher information weighted by likelihood, Kullback–Leibler information weighted by likelihood, maximum posterior weighted information, and maximum expected information) on both simulated and real item banks and responses. We further investigate how the sample size and the trait-level distribution of the examinees used in training affect DQN performance. The results show that DQN achieves lower root mean square error (RMSE) and mean absolute error (MAE) values than the traditional strategies with both simulated and real item banks and responses under most conditions. We conclude with suggestions for the use of DQN-based strategies and provide the corresponding code.
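The abstract does not specify the network architecture or state representation, so the following is only a minimal sketch of what DQN-based item selection could look like. All names and design choices here are illustrative assumptions, not the authors' implementation: the state is taken to be the current trait estimate plus a mask of already-administered items, the Q-network is a small two-layer feed-forward network, and item selection is a greedy argmax over the Q-values of unadministered items. The reward signal (e.g., reduction in trait-estimation error) and the full training loop (replay buffer, target network) are omitted.

```python
# Hypothetical sketch of DQN-based item selection for CAT.
# QNetwork, select_item, and the state encoding are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

N_ITEMS = 100  # size of an assumed item bank


class QNetwork(nn.Module):
    """Maps a CAT state to one Q-value per candidate item."""

    def __init__(self, n_items: int):
        super().__init__()
        # Assumed state: current trait estimate (1) + administered-item mask (n_items)
        self.net = nn.Sequential(
            nn.Linear(1 + n_items, 128),
            nn.ReLU(),
            nn.Linear(128, n_items),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def select_item(qnet: QNetwork, theta_hat: float, administered: np.ndarray) -> int:
    """Greedy action: the highest-Q item among those not yet administered."""
    state = torch.cat([
        torch.tensor([theta_hat], dtype=torch.float32),
        torch.tensor(administered, dtype=torch.float32),
    ]).unsqueeze(0)
    with torch.no_grad():
        q = qnet(state).squeeze(0)
    # Mask out items already given so they cannot be selected again.
    q[torch.tensor(administered, dtype=torch.bool)] = float("-inf")
    return int(q.argmax())


# Example: pick the first item for a fresh examinee with theta_hat = 0.0
qnet = QNetwork(N_ITEMS)
mask = np.zeros(N_ITEMS)  # no items administered yet
print(select_item(qnet, 0.0, mask))
```

Framing item selection as an action over the whole bank, with the remaining test as the planning horizon, is what lets a DQN trade off immediate information gain against later selections, in contrast to the one-step lookahead of the five traditional strategies.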