In recent years, the number of people who endanger their lives under the mental burden of depression has been increasing rapidly. Online social networks (OSNs) give researchers another perspective for detecting individuals suffering from depression. However, existing machine learning-based depression detection studies still achieve relatively low classification performance, which suggests that their feature engineering leaves significant room for improvement. In this paper, we manually build and publish a large dataset on Sina Weibo (a leading OSN with the largest number of active users in the Chinese community), namely the Weibo User Depression Detection Dataset (WU3D). It includes more than 20,000 normal users and more than 10,000 depressed users, all of whom are labeled and rechecked by professionals following the DSM-5 criteria for depression. We then summarize and propose ten statistical features by analyzing users' text, social behavior, and posted pictures. In addition, text-based word features are extracted using the popular pretrained model XLNet. We fuse these features from heterogeneous modalities and apply a multitask learning scheme to train our proposed deep neural network classification model, FusionNet (FN). The experimental results show that FN has an excellent ability to recognize depressed users on the OSN, achieving the highest F1-score of 0.9772 on the test set. Compared to existing studies, the proposed method has better classification performance and robustness to unbalanced training samples, as well as reasonable training and inference time. Our work provides a method for fusing multimodal information to detect individual-level depression and can serve as a reference for similar studies on other OSNs.