In this paper, we tackle a new learning paradigm called learning from complementary labels, where the training data specifies classes that instances do not belong to, instead of their true labels. In general, complementary labels are cheaper to collect than ordinary supervised labels, because the annotator does not need to select the correct class from a number of candidates. While current state-of-the-art methods design various loss functions to train competitive models from this limited supervision, they overlook learning from the data and the model themselves, which often contain useful information that can improve the performance of complementary-label learning. We propose a novel learning framework that integrates self-supervised learning and self-distillation into complementary-label learning. Building on the general complementary learning framework, we employ an entropy regularization term to encourage sharper (lower-entropy) network outputs. Then, to learn more from the data, we use self-supervised learning based on rotation and transformation operations as a plug-in auxiliary task to learn more transferable representations. Finally, knowledge distillation is introduced to extract the “dark knowledge” from a network and use it to guide the training of a student network. In extensive experiments, our method outperforms several state-of-the-art approaches in accuracy.
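To make the first ingredient concrete, the following is a minimal sketch of a complementary-label loss combined with an entropy regularizer. It is an illustration under stated assumptions, not the exact objective used in this work: the loss form -log(1 - p[ȳ]) is one simple choice among the loss functions used in complementary-label learning, and the function names and the weight `lam` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs, eps=1e-12):
    """Shannon entropy of a predicted class distribution (lower means sharper)."""
    return -sum(p * math.log(p + eps) for p in probs)

def complementary_loss(probs, comp_label, eps=1e-12):
    """Assumed loss form: penalize probability mass on the class that the
    instance is known NOT to belong to."""
    return -math.log(1.0 - probs[comp_label] + eps)

def regularized_loss(logits, comp_label, lam=0.1):
    """Complementary loss plus an entropy penalty. `lam` is a hypothetical
    weight; a larger value pushes the outputs toward a sharper distribution."""
    probs = softmax(logits)
    return complementary_loss(probs, comp_label) + lam * entropy(probs)
```

For example, `regularized_loss([5.0, 0.0, 0.0], comp_label=1)` is smaller than `regularized_loss([0.0, 0.0, 0.0], comp_label=1)`: the confident prediction places little mass on the excluded class and also has low entropy.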