Automatic vertebrae localization and identification in computed tomography (CT) scans is of great value for computer-aided diagnosis of spine diseases. To overcome the drawbacks of approaches that rely on hand-crafted, low-level features and on a priori assumptions about the field of view of the spine structure, an automatic method is proposed that localizes and identifies vertebrae by combining deep stacked sparse autoencoder (SSAE) contextual features with a structured regression forest (SRF). The method employs an SSAE to learn deep contextual image features instead of hand-crafted ones, building larger-range input samples to improve their contextual discriminative ability. In the localization and identification stage, it incorporates the SRF model to localize the whole spine and then screens the vertebrae within the image, thereby relaxing the assumption that a particular part of the spine is visible in the field of view. Finally, the output distribution of the SRF and the properties of spine CT scans are combined in a two-stage progressive refinement strategy, in which mean-shift kernel density estimation and the Otsu method, rather than a Markov random field (MRF), are adopted to reduce model complexity and refine the vertebra localization results. Extensive evaluation was performed on a challenging data set of 98 spine CT scans. Compared with the hidden Markov model and a method based on a convolutional neural network (CNN), the proposed approach effectively and automatically locates and identifies spinal targets in CT scans, achieving higher localization accuracy and lower model complexity without requiring any assumptions about the field of view.
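The two refinement tools named above can be sketched in isolation. The following is a hedged illustration, not the authors' implementation: it applies 1-D mean-shift mode seeking to hypothetical vertebra candidate coordinates and a from-scratch Otsu threshold to hypothetical response scores; the bandwidth, bin count, and toy data are all assumptions for demonstration only.

```python
# Illustrative sketch of the abstract's two refinement tools (not the paper's code).
import numpy as np

def mean_shift_1d(points, bandwidth, iters=50):
    """Shift each point toward a mode of the Gaussian kernel density estimate."""
    modes = points.astype(float).copy()
    for _ in range(iters):
        for i, m in enumerate(modes):
            w = np.exp(-0.5 * ((points - m) / bandwidth) ** 2)
            modes[i] = np.sum(w * points) / np.sum(w)
    return modes

def otsu_threshold(values, bins=256):
    """Return the histogram threshold maximizing between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    cum_p = np.cumsum(p)
    cum_mean = np.cumsum(p * centers)
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = cum_p[k - 1], 1.0 - cum_p[k - 1]
        if w0 == 0 or w1 == 0:
            continue  # one class empty: between-class variance undefined
        m0 = cum_mean[k - 1] / w0
        m1 = (cum_mean[-1] - cum_mean[k - 1]) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[k - 1]
    return best_t

# Toy data: candidate z-coordinates clustered around two vertebra centers
# (hypothetical values, standing in for SRF vote positions).
rng = np.random.default_rng(0)
z = np.concatenate([rng.normal(10, 1, 50), rng.normal(50, 1, 50)])
modes = mean_shift_1d(z, bandwidth=3.0)  # each point converges near 10 or 50

# Toy response scores with two well-separated groups.
scores = np.concatenate([np.full(100, 10.0), np.full(100, 90.0)])
t = otsu_threshold(scores)  # falls between the two groups
```

In this sketch mean-shift collapses the candidate cloud onto density peaks (one per vertebra centroid), while the Otsu threshold separates strong from weak responses without fitting an MRF, which is consistent with the abstract's complexity argument.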