With the development of adaptive underwater acoustic (UWA) communication systems, energy-efficient transmission has become a critical topic in UWA communications. Due to the unique characteristics of the underwater environment, the transmitter node almost always has outdated channel state information (CSI), which results in low energy efficiency. In this paper, we take full advantage of bidirectional links and propose an adaptive modulation and coding (AMC) scheme that aims to maximize the long-term energy efficiency of a single link by jointly scheduling the coding rate, modulation order, and transmission power. Considering the complexity of UWA channels, we propose a bit error ratio (BER) estimation method based on a deep neural network (DNN). The proposed network performs channel estimation, feature extraction, and BER estimation using a fixed pilot of the feedback link. Then, we design a channel classification method based on the estimated BERs of each modulation and coding scheme (MCS) and further model the UWA channel as a finite-state Markov chain (FSMC) with unknown transition probabilities. We thus formulate the AMC problem as a Markov decision process (MDP) and solve it within a reinforcement learning framework. Because the number of state-action pairs is large, we propose a scheme based on a double deep Q-network (DDQN). Simulation results demonstrate that the proposed AMC scheme outperforms a fixed MCS with perfect CSI and achieves near-optimal energy efficiency.
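
To illustrate the learning step that a DDQN-based scheme relies on, the following minimal sketch computes the standard double Q-learning target for a single transition. It is not the paper's implementation: the function name `double_dqn_target`, the discount factor, and the example Q-values are hypothetical, and the reward is only assumed to be some energy-efficiency measure. Each action index is assumed to stand for one joint choice of coding rate, modulation order, and transmission power.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Double Q-learning target for one transition (illustrative sketch).

    reward        -- assumed to be an energy-efficiency measure (hypothetical)
    next_q_online -- online-network Q-values for every action in the next state
    next_q_target -- target-network Q-values for every action in the next state
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Select the next action with the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... but evaluate that action with the target network.
    return reward + gamma * next_q_target[best_action]


# Hypothetical example with three joint (coding rate, modulation, power) actions.
target = double_dqn_target(1.0, [0.2, 0.9, 0.5], [1.0, 2.0, 3.0], gamma=0.5)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes this target from the plain DQN target, which would take the maximum over the target network's values directly and tends to overestimate them.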