Optical and SAR image registration is essential for multi-modal remote sensing image fusion. Recently, deep matching networks have outperformed traditional methods on image matching. However, due to the significant differences between optical and SAR images, the performance of existing deep learning methods still leaves room for improvement. This paper proposes a self-distillation feature learning network (SDNet) for optical and SAR image registration that improves performance through both network structure and network optimization. First, we explore the impact of different weight-sharing strategies on optical and SAR image matching, and we design a partially unshared feature learning network for multi-modal feature learning. It has fewer parameters than a fully unshared network and more flexibility than a fully shared one. Second, the limited binary supervision (matching or non-matching) is insufficient to train deep matching networks for optical-SAR image registration. We therefore propose a self-distillation feature learning method that exploits richer similarity information, such as the similarity ordering among a series of non-matching patch pairs, to strengthen network training and improve matching accuracy. Finally, existing deep learning methods force the features of matching optical and SAR image patches to be similar, which discards discriminative information and degrades matching performance. We therefore introduce an auxiliary reconstruction task that optimizes the feature learning network to retain more discriminative information. Extensive experiments demonstrate the effectiveness of the proposed method on multi-modal image registration.
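
To make the partially unshared structure concrete, the following is a minimal sketch in PyTorch (the abstract names no framework, and the layer counts, channel widths, and class names here are illustrative assumptions, not the paper's architecture). The idea is a two-branch encoder whose shallow layers are modality-specific while the deeper layers share weights:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartiallyUnsharedEncoder(nn.Module):
    """Sketch of a partially unshared two-branch encoder: separate shallow
    stems for the optical and SAR patches, then shared deep layers. This
    uses fewer parameters than two fully separate networks while remaining
    more flexible than a fully shared (Siamese) one."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        def stem():  # modality-specific (unshared) shallow layers
            return nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )
        self.opt_stem = stem()   # optical branch weights
        self.sar_stem = stem()   # SAR branch weights (independent)
        self.shared = nn.Sequential(  # weights shared across modalities
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, opt_patch: torch.Tensor, sar_patch: torch.Tensor):
        f_opt = self.shared(self.opt_stem(opt_patch))
        f_sar = self.shared(self.sar_stem(sar_patch))
        # L2-normalize so cosine similarity reduces to a dot product
        return F.normalize(f_opt, dim=1), F.normalize(f_sar, dim=1)
```

Where the split between unshared and shared layers falls is exactly the kind of design choice the weight-sharing study in the paper is meant to settle.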
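
The abstract states only that self-distillation supplies richer similarity supervision, such as the ordering among non-matching patch pairs; the concrete loss below is an assumption for illustration. One common way to realize this is to let a teacher copy of the network (e.g., an exponential moving average of the student's weights) score all optical-SAR pairs in a batch, and train the student to reproduce the teacher's softened similarity distribution, which encodes how the non-matching candidates rank against each other:

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(f_opt, f_sar, f_opt_t, f_sar_t,
                           tau_student: float = 0.1, tau_teacher: float = 0.05):
    """Hypothetical self-distillation objective over similarity orderings.

    f_opt, f_sar:     L2-normalized student features, shape (B, D)
    f_opt_t, f_sar_t: features from a frozen teacher copy, shape (B, D)
    Matching pairs sit on the diagonal of the B x B similarity matrix;
    the off-diagonal entries are the non-matching pairs whose relative
    ordering provides the extra supervision."""
    sim_student = f_opt @ f_sar.t()            # student similarities, B x B
    with torch.no_grad():
        sim_teacher = f_opt_t @ f_sar_t.t()    # teacher similarities, B x B
    # Row-wise distributions over all candidate SAR patches; the sharper
    # teacher temperature emphasizes the teacher's similarity ordering.
    log_p_student = F.log_softmax(sim_student / tau_student, dim=1)
    p_teacher = F.softmax(sim_teacher / tau_teacher, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")
```

Unlike a binary matching label, the target distribution here carries graded information: a non-matching pair that the teacher scores as nearly similar contributes differently from one it scores as clearly dissimilar.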
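
Finally, the auxiliary reconstruction task can be sketched as a small decoder that regenerates the input patch from its feature vector, so that matching-driven training does not discard all modality-specific detail. The decoder shape below (a transposed-convolution stack producing 32x32 patches) is an illustrative assumption:

```python
import torch.nn as nn

class ReconstructionHead(nn.Module):
    """Sketch of the auxiliary reconstruction branch: decode the patch back
    from its feature vector. Minimizing a reconstruction loss alongside the
    matching loss pushes the encoder to retain discriminative information
    that a pure similarity constraint would erase."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 128 * 4 * 4)
        self.deconv = nn.Sequential(  # 4x4 -> 8x8 -> 16x16 -> 32x32
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, feat):
        x = self.fc(feat).view(-1, 128, 4, 4)
        return self.deconv(x)  # reconstructed patch, shape (B, 1, 32, 32)
```

A plausible overall objective then combines the three terms, e.g. a matching loss plus weighted self-distillation and reconstruction (MSE) losses; the weighting scheme is not specified in the abstract and would be a tuning choice.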