Owing to the physical characteristics of the acoustic channel, underwater acoustic communication networks (UACNs) are highly susceptible to multipath and Doppler effects, so channel quality is a useful measure of the reliability of underwater communication links. A cross-layer routing protocol based on channel quality (CLCQ) is proposed to improve overall network performance and resource utilization. First, the BELLHOP ray model is used with winter sound speed profile data from a specific sea area to compute the channel impulse response. Then, the channel impulse response is integrated into the communication system, and the channel quality between nodes is evaluated in terms of the bit error rate (BER). Finally, during next-hop selection, a reinforcement learning algorithm enables cross-layer interaction within the protocol stack: the optimal relay node is determined by the channel quality index (BER) from the physical layer, the buffer state from the data link layer, and the residual energy of the node. To speed up convergence of the algorithm, a forwarding candidate set selection method is proposed that takes into account node depth, residual energy, and buffer state. Simulation results show that the packet delivery rate (PDR) of CLCQ is significantly higher than that of Q-Learning-Based Energy-Efficient and Lifetime-Extended Adaptive Routing (QELAR) and Geographic and Opportunistic Routing (GEDAR).
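The next-hop selection summarized above can be illustrated with a minimal Python sketch. It is not the CLCQ implementation: the reward form, the weights, the thresholds, the learning parameters, and all names (`Neighbor`, `candidate_set`, `reward`, `q_update`, `select_next_hop`) are hypothetical, and it assumes a surface sink, so that shallower neighbors count as forward progress. It only shows how a candidate set filtered by depth, residual energy, and buffer state might feed a standard Q-learning update whose reward combines BER, buffer occupancy, and residual energy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    node_id: int
    depth_m: float      # depth below the surface, in metres
    energy_frac: float  # residual energy / initial energy, in [0, 1]
    buffer_frac: float  # occupied buffer / buffer capacity, in [0, 1]
    ber: float          # estimated bit error rate of the link to this neighbor


def candidate_set(own_depth_m, neighbors, min_energy=0.1, max_buffer=0.9):
    """Keep neighbors that are shallower, still have energy, and have buffer space.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return [n for n in neighbors
            if n.depth_m < own_depth_m
            and n.energy_frac > min_energy
            and n.buffer_frac < max_buffer]


def reward(n, w_ber=0.5, w_buf=0.25, w_energy=0.25, ber_cap=0.1):
    """Hypothetical cross-layer reward in [0, 1]; higher is better."""
    link_quality = 1.0 - min(n.ber, ber_cap) / ber_cap  # physical layer
    free_buffer = 1.0 - n.buffer_frac                   # data link layer
    return w_ber * link_quality + w_buf * free_buffer + w_energy * n.energy_frac


def q_update(q_old, r, best_next_q, alpha=0.5, gamma=0.8):
    """Standard Q-learning update for one (node, next hop) pair."""
    return (1.0 - alpha) * q_old + alpha * (r + gamma * best_next_q)


def select_next_hop(own_depth_m, neighbors, q_table):
    """Greedy choice over the candidate set; None if no candidate qualifies."""
    cands = candidate_set(own_depth_m, neighbors)
    if not cands:
        return None
    return max(cands, key=lambda n: q_table.get(n.node_id, 0.0))


# Usage: a sender at 100 m depth with four neighbors.
neighbors = [
    Neighbor(1, 80.0, 0.9, 0.2, 1e-3),
    Neighbor(2, 60.0, 0.05, 0.1, 1e-4),  # filtered out: residual energy too low
    Neighbor(3, 120.0, 0.9, 0.1, 1e-4),  # filtered out: deeper than the sender
    Neighbor(4, 70.0, 0.8, 0.5, 5e-2),
]
q_table = {}
for n in candidate_set(100.0, neighbors):
    # One learning step per candidate, with no downstream estimate yet.
    q_table[n.node_id] = q_update(0.0, reward(n), best_next_q=0.0)
best = select_next_hop(100.0, neighbors, q_table)
```

Restricting the update to the candidate set shrinks the action space that the learner has to explore, which is the intuition behind using such a filter to speed up convergence.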