A deep learning (DL) network for 2D-based breast mass segmentation in unenhanced dedicated breast CT images was developed and validated, and its robustness in radiomic feature stability and diagnostic performance compared to manual annotations of multiple radiologists was investigated. 93 mass-like lesions were extensively augmented and used to train the network (n = 58 masses), which was then tested (n = 35 masses) against manual ground truth of a qualified breast radiologist with experience in breast CT imaging using the Conformity coefficient (with a value equal to 1 indicating a perfect performance). Stability and diagnostic power of 672 radiomic descriptors were investigated between the computerized segmentation, and 4 radiologists' annotations for the 35 test set cases. Feature stability and diagnostic performance in the discrimination between benign and malignant cases were quantified using intraclass correlation (ICC) and multivariate analysis of variance (MANOVA), performed for each segmentation case (4 radiologists and DL algorithm). DL-based segmentation resulted in a Conformity of 0.85 ± 0.06 against the annotated ground truth. For the stability analysis, although modest agreement was found among the four annotations performed by radiologists (Conformity 0.78 ± 0.03), over 90% of all radiomic features were found to be stable (ICC>0.75) across multiple segmentations. All MANOVA analyses were statistically significant (p ≤ 0.05), with all dimensions equal to 1, and Wilks’ lambda ≤0.35. In conclusion, DL-based mass segmentation in dedicated breast CT images can achieve high segmentation performance, and demonstrated to provide stable radiomic descriptors with comparable discriminative power in the classification of benign and malignant tumors to expert radiologist annotation.