Scene classification of remote sensing (RS) images is challenging because well-labeled data are scarce. Deep transfer learning (DTL) has recently been applied to this task, but intraclass variations and interclass similarities remain obstacles. To address them, this letter presents a novel dropout-based adversarial training network (DATN) for RS scene classification. First, a dropout-based label classifier (DLC) module is designed to reduce the selection of ambiguous features near class boundaries. Second, a dropout-based domain discriminator (DDD) module is constructed to capture the multimodal structures of RS images and thereby achieve fine-grained alignment between cross-domain distributions. Third, a joint distribution of features and labels is built to further enhance performance. Experiments on seven public RS datasets show that our model outperforms several state-of-the-art (SOTA) methods under different conditions. The code of our method is publicly available at https://github.com/WangXin81/DATN-Submitted-to-IEEE-GRSL.