Although railway transportation has evolved rapidly in recent years, manual maintenance of rail infrastructure remains labor-intensive and tedious. Computer vision, as one of the automatic inspection methods, shows promise for railway inspection tasks. In this study, we proposed a novel rail surface defect detection method based on a self-reference template and similarity evaluation. First, a self-reference template was generated by extracting the domain information together with the longitudinal direction of the rail image. We then defined an image structural similarity index and compared every transversal row with the background template. A coarse-to-fine segmentation method was proposed to locate the defect. In the coarse step, rows that differed strongly from the template were selected adaptively using the Otsu algorithm. In the fine step, the exact position of the defect was determined using both gradient magnitude and grayscale information. Our method was evaluated on a public rail surface defect dataset that includes two types of data. The experimental results showed that our method detected type-I and type-II defects with 89.04% and 89.61% recall, and 98.46% and 97.87% precision, respectively. These results indicate that our method achieved higher accuracy than established detection algorithms.
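
The coarse row-selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: the template is taken as the per-column median over all rows, the structural similarity index (which the abstract does not define) is replaced by a mean absolute difference, and the function names `column_median_template`, `row_dissimilarity`, `otsu_threshold`, and `coarse_defect_rows` are hypothetical. The image is assumed to be a list of transversal rows stacked along the longitudinal direction. The fine step based on gradient magnitude and grayscale information is not sketched.

```python
from statistics import median


def column_median_template(image):
    """Assumed stand-in for the self-reference template:
    the per-column median over all transversal rows."""
    return [median(col) for col in zip(*image)]


def row_dissimilarity(row, template):
    """Assumed stand-in for the similarity index:
    mean absolute difference between one row and the template."""
    return sum(abs(p - t) for p, t in zip(row, template)) / len(template)


def otsu_threshold(values):
    """Return the cut that maximizes the between-class variance (Otsu's criterion)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    best_cut, best_var = ordered[0], -1.0
    low_sum = 0.0
    for i in range(1, n):
        low_sum += ordered[i - 1]
        if ordered[i] == ordered[i - 1]:
            continue  # a cut between equal values would not separate them
        w0, w1 = i / n, (n - i) / n
        mu0, mu1 = low_sum / i, (total - low_sum) / (n - i)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_cut = (ordered[i - 1] + ordered[i]) / 2
    return best_cut


def coarse_defect_rows(image):
    """Indices of rows whose dissimilarity to the template exceeds the Otsu cut."""
    template = column_median_template(image)
    scores = [row_dissimilarity(row, template) for row in image]
    cut = otsu_threshold(scores)
    return [i for i, score in enumerate(scores) if score > cut]


# Synthetic 8 x 5 grayscale patch; rows 3 and 4 contain a dark spot.
rail = [
    [100, 120, 140, 120, 100],
    [101, 119, 141, 120, 100],
    [100, 121, 139, 121, 99],
    [100, 60, 50, 118, 100],
    [99, 55, 45, 119, 101],
    [100, 120, 140, 120, 100],
    [102, 120, 138, 119, 100],
    [100, 119, 140, 121, 101],
]
print(coarse_defect_rows(rail))  # [3, 4]
```

In this sketch a defect-free image, in which every row has the same score, yields no selected rows, because no score exceeds the returned cut.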