We propose a deep learning (DL) model for classifying targets and non-targets from a small amount of active sonar data. The proposed model uses two hand-crafted features, the short-time Fourier transform (STFT) and the constant-Q transform (CQT), extracted from the same raw active sonar data, so that they complement each other and improve the generalization of DL when data are insufficient. The attention-based complementary learning module in the proposed model reinforces one feature by referring to the other. The comprehensive feature produced by the complementary learning module passes through a shallow CNN, which classifies targets and non-targets. To verify the performance of the proposed model, we compared it with prevalent deep learning models, including ResNet and ViT, in terms of generalization performance and learning stability on two real-ocean datasets. Despite having far fewer parameters, the proposed model achieved generalization performance similar to or better than that of the existing deep learning models, depending on the training dataset, and it showed the best learning stability on both datasets.
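
As an illustration of how one feature can be reinforced by referring to the other, the following is a minimal sketch of attention between two token sequences. It is not the module used in the proposed model: the function names (`cross_attend`, `complementary_features`), the absence of learned projection weights, the residual addition, and the concatenation of the two reinforced sequences are all assumptions made for the sake of a short, self-contained example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(query_feat, ref_feat):
    """Reinforce query_feat by attending to ref_feat.

    Both arguments are lists of d-dimensional tokens (e.g. time frames).
    Assumption: no learned query/key/value projections, for brevity.
    """
    d = len(query_feat[0])
    reinforced = []
    for q in query_feat:
        # Scaled dot-product similarity between this token and every reference token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in ref_feat]
        weights = softmax(scores)
        # Weighted average of the reference tokens.
        context = [sum(w * v[j] for w, v in zip(weights, ref_feat)) for j in range(d)]
        # Residual addition keeps the original token and adds the borrowed context.
        reinforced.append([qi + ci for qi, ci in zip(q, context)])
    return reinforced


def complementary_features(stft_tokens, cqt_tokens):
    """Reinforce each feature with the other, then join them along the token axis.

    Assumption: concatenation stands in for the 'comprehensive feature';
    the actual fusion used in the proposed model may differ.
    """
    stft_reinforced = cross_attend(stft_tokens, cqt_tokens)
    cqt_reinforced = cross_attend(cqt_tokens, stft_tokens)
    return stft_reinforced + cqt_reinforced


if __name__ == "__main__":
    # Toy inputs: two STFT tokens and one CQT token, each 2-dimensional.
    stft = [[1.0, 0.0], [0.0, 1.0]]
    cqt = [[1.0, 1.0]]
    for token in complementary_features(stft, cqt):
        print(token)
```

In a trained model, the queries, keys, and values would normally be produced by learned linear projections, and the resulting comprehensive feature would be passed to the shallow CNN classifier.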