Learning a well-performing image annotation model usually requires a large number of labeled samples. Although unlabeled samples are readily available and abundant, manually annotating large numbers of images is a laborious task. In this article, we propose a novel semi-supervised approach based on adaptive weighted fusion for automatic image annotation, which exploits both labeled and unlabeled data to improve annotation performance. First, two different classifiers, built on a support vector machine and a convolutional neural network, respectively, are trained on different features extracted from the labeled data; the two classifiers thus independently represent different feature views. Then, the corresponding features of unlabeled images are extracted and fed into the two classifiers to obtain their respective semantic annotations. At the same time, the confidence of each image annotation is measured by an adaptive weighted fusion strategy. The images and their semantic annotations with high confidence are then added to the training set, and the classifiers are retrained until a stopping condition is reached. As a result, we obtain a strong classifier that makes full use of the unlabeled data. Finally, we conduct experiments on four datasets, namely Corel 5K, IAPR TC12, ESP Game, and NUS-WIDE, and measure the performance of our approach with standard criteria, including precision, recall, F-measure, N+, and mAP. The experimental results show that our approach outperforms many state-of-the-art approaches.
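The training loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the SVM and CNN views are stood in for by two simple nearest-centroid classifiers over synthetic feature views, and the fusion weights (derived here from each view's labeled-set accuracy) and the confidence threshold are illustrative assumptions.

```python
import math
import random

def make_view(label, shift):
    # Synthetic 2-D feature for one view: class c is centered at (c*shift, c*shift).
    return [random.gauss(label * shift, 0.5), random.gauss(label * shift, 0.5)]

class CentroidClassifier:
    """Stand-in for the SVM/CNN view classifiers in the abstract (assumption)."""

    def fit(self, X, y):
        self.centroids = {}
        for c in set(y):
            pts = [x for x, yc in zip(X, y) if yc == c]
            self.centroids[c] = [sum(col) / len(pts) for col in zip(*pts)]
        return self

    def predict_proba(self, x):
        # Softmax over negative centroid distances -> pseudo-probabilities.
        d = {c: -math.dist(x, m) for c, m in self.centroids.items()}
        z = sum(math.exp(v) for v in d.values())
        return {c: math.exp(v) / z for c, v in d.items()}

def accuracy(clf, X, y):
    correct = 0
    for x, yc in zip(X, y):
        p = clf.predict_proba(x)
        if max(p, key=p.get) == yc:
            correct += 1
    return correct / len(y)

def self_train(Xl1, Xl2, y, Xu1, Xu2, threshold=0.9, rounds=3):
    """Co-training-style loop: pseudo-label high-confidence unlabeled
    samples via weighted fusion of the two views, then retrain."""
    for _ in range(rounds):
        c1 = CentroidClassifier().fit(Xl1, y)
        c2 = CentroidClassifier().fit(Xl2, y)
        # Adaptive weights from each view's labeled-set accuracy (assumption).
        a1, a2 = accuracy(c1, Xl1, y), accuracy(c2, Xl2, y)
        w1, w2 = a1 / (a1 + a2), a2 / (a1 + a2)
        newly = []
        for i, (x1, x2) in enumerate(zip(Xu1, Xu2)):
            p1, p2 = c1.predict_proba(x1), c2.predict_proba(x2)
            fused = {c: w1 * p1[c] + w2 * p2[c] for c in p1}
            label, conf = max(fused.items(), key=lambda kv: kv[1])
            if conf >= threshold:
                newly.append((i, label))
        if not newly:
            break
        # Move confidently pseudo-labeled samples into the labeled pool.
        for i, label in newly:
            Xl1.append(Xu1[i]); Xl2.append(Xu2[i]); y.append(label)
        taken = {i for i, _ in newly}
        Xu1 = [x for i, x in enumerate(Xu1) if i not in taken]
        Xu2 = [x for i, x in enumerate(Xu2) if i not in taken]
    return c1, c2, (w1, w2)
```

A real instantiation would replace `CentroidClassifier` with the SVM and CNN views and stop when no unlabeled sample clears the confidence threshold, which the `break` above mirrors.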