With the rapid development of intelligent transportation systems (ITS), routing in vehicular ad hoc networks (VANETs) has become a popular research topic. The high mobility of vehicles in urban streets poses serious challenges to routing protocols and has a significant impact on network performance. Topology-based routing is poorly suited to highly dynamic VANETs, so location-based geographic routing, which scales well and avoids the route construction and maintenance phases, is widely employed instead. However, the working environment of VANETs is complex and interference-prone: the channel contention caused by high vehicle density, coupled with the effects of urban structures, significantly increases the difficulty of designing high-quality communication protocols. Considering these characteristics, this paper proposes a novel environment-aware adaptive reinforcement routing (EARR) protocol aimed at establishing reliable connections between source and destination nodes. The protocol uses periodic beacons to perceive and explore the surrounding environment, thereby constructing a local topology. It applies reinforcement learning to route selection, adaptively adjusting the Q table according to multiple metrics carried in beacons, such as vehicle speed, available bandwidth, and received signal strength. This assists the selection of relay vehicles and alleviates the challenges posed by high dynamics, shadow fading, and limited bandwidth in VANETs. The combination of reinforcement learning and beacons accelerates the establishment of end-to-end routes, guiding each vehicle to choose the best available next hop so that near-optimal routes are formed throughout the communication process.
The adaptive adjustment feature of the protocol enables it to address sudden link interruptions, thereby enhancing communication reliability. In experiments, the EARR protocol demonstrates significant improvements across various performance metrics compared to existing routing protocols. Throughout the simulations, EARR maintains a consistently higher packet-delivery rate and throughput than the other protocols and performs stably across various scenarios. Finally, the proposed protocol shows relatively consistent standardized latency and low overhead in all experiments.
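
To make the beacon-driven Q-table adjustment described above more concrete, the following is a minimal illustrative sketch, not the EARR implementation itself. It assumes a standard Q-learning update and an equally weighted reward built from the three metrics named in the abstract. All names (`Beacon`, `beacon_reward`, `QRouter`), the normalization bounds, the weights, and the learning parameters are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Beacon:
    """Hypothetical beacon contents; field names are assumptions."""
    neighbor_id: str
    relative_speed: float       # m/s, relative to the receiving vehicle
    available_bandwidth: float  # free share of channel capacity, 0..1
    rssi_dbm: float             # received signal strength in dBm


def beacon_reward(b, max_speed=30.0, rssi_min=-95.0, rssi_max=-40.0,
                  weights=(1 / 3, 1 / 3, 1 / 3)):
    """Map beacon metrics to a reward in [0, 1] (assumed weighting)."""
    # Lower relative speed suggests a longer-lived link.
    mobility = 1.0 - min(abs(b.relative_speed), max_speed) / max_speed
    bandwidth = min(max(b.available_bandwidth, 0.0), 1.0)
    # Linearly rescale RSSI into [0, 1] and clamp.
    signal = (b.rssi_dbm - rssi_min) / (rssi_max - rssi_min)
    signal = min(max(signal, 0.0), 1.0)
    w_m, w_b, w_s = weights
    return w_m * mobility + w_b * bandwidth + w_s * signal


class QRouter:
    """Per-vehicle Q table keyed by (destination, neighbor)."""

    def __init__(self, alpha=0.5, gamma=0.8):
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = {}

    def on_beacon(self, destination, beacon, neighbor_best_q):
        """Standard Q-learning update triggered by a received beacon.

        neighbor_best_q is the neighbor's own best Q value toward the
        destination, assumed here to be advertised in its beacon.
        """
        key = (destination, beacon.neighbor_id)
        old = self.q.get(key, 0.0)
        target = beacon_reward(beacon) + self.gamma * neighbor_best_q
        self.q[key] = old + self.alpha * (target - old)

    def best_q(self, destination):
        """Best known Q value toward a destination (0.0 if none)."""
        values = [v for (d, _), v in self.q.items() if d == destination]
        return max(values, default=0.0)

    def next_hop(self, destination):
        """Greedy relay choice: the neighbor with the highest Q value."""
        candidates = {n: v for (d, n), v in self.q.items()
                      if d == destination}
        if not candidates:
            return None
        return max(candidates, key=candidates.get)


router = QRouter()
good = Beacon("good", relative_speed=0.0, available_bandwidth=1.0,
              rssi_dbm=-40.0)
poor = Beacon("poor", relative_speed=30.0, available_bandwidth=0.0,
              rssi_dbm=-95.0)
router.on_beacon("dest", good, neighbor_best_q=0.0)
router.on_beacon("dest", poor, neighbor_best_q=0.0)
chosen = router.next_hop("dest")
```

In this sketch each received beacon immediately refreshes the corresponding Q entry, so a neighbor whose link degrades (higher relative speed, lower bandwidth, or weaker signal) loses value over successive beacons and stops being selected as the relay.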