Cooperative localization is a promising solution to the problem of high-accuracy vehicular localization. Despite its potential, exhaustive measurement and information exchange between all adjacent vehicles is expensive and impractical for applications with limited resources, and greedy policies or hand-engineered heuristics may not meet the requirements of complex use cases. In this paper, we formulate a scheduling problem whose goal is to bring the localization accuracy of every vehicle, measured through the Cramér-Rao lower bound, to a given threshold using the minimum number of measurements. The problem is cast as a partially observable Markov decision process and solved with decentralized scheduling algorithms based on deep reinforcement learning. These algorithms allow vehicles to optimize the schedule (i.e., the instants at which to execute measurement and information exchange with each adjacent vehicle) in a distributed manner, without a central controlling unit. Simulation results show that the proposed algorithms have a significant advantage over random and greedy policies, both in the number of measurements required to localize all nodes and in the localization precision achievable with a limited number of measurements.
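To make the accuracy criterion and the greedy baseline concrete, the following is a minimal sketch rather than the method of this paper. It assumes a single vehicle in 2-D, range-only measurements with independent Gaussian noise of standard deviation `sigma`, and neighbors whose positions are perfectly known. The function names `range_fim`, `position_crlb`, and `greedy_schedule` are hypothetical and introduced only for illustration.

```python
import math

def range_fim(target, neighbors, sigma=1.0):
    """Entries (Jxx, Jxy, Jyy) of the 2x2 Fisher information matrix for a
    2-D position, given range measurements to neighbors at known positions
    (an illustrative simplification)."""
    jxx = jxy = jyy = 0.0
    for nx, ny in neighbors:
        dx, dy = target[0] - nx, target[1] - ny
        d = math.hypot(dx, dy)
        ux, uy = dx / d, dy / d          # unit vector from neighbor to target
        jxx += ux * ux / sigma ** 2
        jxy += ux * uy / sigma ** 2
        jyy += uy * uy / sigma ** 2
    return jxx, jxy, jyy

def position_crlb(fim):
    """Trace of the inverse FIM: a lower bound on the mean squared position
    error. Returns infinity when the FIM is singular (position unobservable)."""
    jxx, jxy, jyy = fim
    det = jxx * jyy - jxy * jxy
    if det <= 1e-12:
        return math.inf
    return (jxx + jyy) / det             # trace(J^-1) for a 2x2 matrix

def greedy_schedule(target, candidates, threshold, sigma=1.0):
    """Greedy baseline: repeatedly measure the neighbor that yields the lowest
    bound, until the bound reaches the threshold or candidates run out.
    Ties (e.g. all bounds infinite) are resolved by candidate order."""
    chosen, remaining = [], list(candidates)
    while remaining:
        if position_crlb(range_fim(target, chosen, sigma)) <= threshold:
            break
        best = min(
            remaining,
            key=lambda c: position_crlb(range_fim(target, chosen + [c], sigma)),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen, position_crlb(range_fim(target, chosen, sigma))

if __name__ == "__main__":
    neighbors = [(10.0, 0.0), (20.0, 0.0), (0.0, 10.0)]
    picked, bound = greedy_schedule((0.0, 0.0), neighbors, threshold=2.5)
    print(picked, bound)
```

In this toy geometry the two collinear neighbors add no information in the perpendicular direction, so the greedy rule skips the second of them and selects the neighbor on the other axis. A learned policy, as proposed here, would additionally have to account for partial observability and for the coupling between vehicles, which this single-vehicle sketch ignores.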