Automated diagnosis of infant hip dysplasia is strongly affected by variation among infants and among ultrasound machines. Hip sonographic images of 493 infants, acquired on various ultrasound machines, were collected in the Department of Orthopedics of Yangzhou Maternal and Child Health Care Service Centre. We propose a semi-supervised learning method that combines a feature pyramid network (FPN) with a contrastive learning scheme based on a Siamese architecture. In the pre-training step, the Siamese network learns from a large amount of unlabeled ultrasound images; a small amount of data annotated with anatomical structures is then used to train the model for landmark identification and standard plane recognition. The method was evaluated on our collected dataset. In landmark identification, it achieved a mean Dice similarity coefficient (DSC) of 0.7873 and a mean Hausdorff distance (HD) of 5.0102, compared with a mean DSC of 0.7734 and a mean HD of 6.1586 for the model without contrastive learning. In standard plane recognition, the accuracy, precision, and recall were 95.4%, 91.64%, and 94.86%, respectively, with an area under the curve (AUC) of 0.982. The proposed semi-supervised deep learning method follows Graf's principle and makes better use of a large volume of ultrasound images from various devices and infants. It can identify the landmarks of infant hips more accurately than manual operators, thereby improving the efficiency of diagnosing infant hip dysplasia.
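
For readers unfamiliar with the two landmark metrics reported above, the following is a minimal sketch of how a Dice similarity coefficient and a symmetric Hausdorff distance can be computed for one predicted and one reference binary mask. It is an illustration under stated assumptions, not the evaluation code used in this study: the mask representation (nested lists of 0/1), the function names `foreground`, `dice`, and `hausdorff`, the use of all foreground pixels rather than boundary pixels, and the pixel unit of the distance are all assumptions made here.

```python
from math import hypot
from typing import List, Set, Tuple

Mask = List[List[int]]   # assumed representation: rows of 0/1 pixel labels
Point = Tuple[int, int]  # (row, col) pixel coordinate


def foreground(mask: Mask) -> Set[Point]:
    """Return the (row, col) coordinates of all non-zero pixels."""
    return {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}


def dice(pred: Mask, target: Mask) -> float:
    """Dice similarity coefficient: 2 * |A & B| / (|A| + |B|)."""
    a, b = foreground(pred), foreground(target)
    if not a and not b:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff(pred: Mask, target: Mask) -> float:
    """Symmetric Hausdorff distance between foreground pixel sets, in pixels."""
    a, b = foreground(pred), foreground(target)
    if not a or not b:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def directed(src: Set[Point], dst: Set[Point]) -> float:
        # Largest distance from any point in src to its nearest point in dst.
        return max(
            min(hypot(p[0] - q[0], p[1] - q[1]) for q in dst) for p in src
        )

    return max(directed(a, b), directed(b, a))


if __name__ == "__main__":
    # Toy masks: the prediction is the reference shifted one pixel to the left.
    pred = [[0, 1, 1, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 0]]
    target = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 0, 0]]
    print(dice(pred, target), hausdorff(pred, target))
```

A higher DSC indicates greater overlap with the reference annotation, while a lower HD indicates a smaller worst-case deviation, which is why the two reported changes (DSC up, HD down) point in the same direction. The brute-force nearest-point search is quadratic in the number of foreground pixels, so it is suitable only for small illustrative masks.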