The speed of sound (SoS) has great potential as a quantitative imaging biomarker because it is sensitive to pathological changes in tissue. This paper presents a target-aware deep neural (TAD) network that quantitatively reconstructs an SoS image from pulse-echo phase-shift maps acquired with a single conventional ultrasound probe. In the proposed TAD network, the reconstruction process is guided by feature maps created from segmented target images to improve accuracy and contrast. In addition, the feature extraction process uses phase-difference information instead of the raw pulse-echo radio frequency (RF) data, which makes the reconstruction robust against noise in the pulse-echo data. The TAD network outperforms the fully convolutional network in root mean square error (RMSE), contrast-to-noise ratio (CNR), and structural similarity index (SSIM) in the presence of nearby reflectors. The measured RMSE and CNR are 5.4 m/s and 22 dB, respectively, at a tissue attenuation coefficient of 2 dB/cm/MHz, which are improvements of 72% and 13 dB, respectively, over the state-of-the-art design. In the in vivo test, the proposed method classifies the tissues in the neck area using SoS with a p-value below 0.025. The proposed TAD network is the most accurate and robust single-probe SoS image reconstruction method reported to date. The accuracy and robustness demonstrated by the proposed SoS imaging method open up the possibility of widespread clinical application of the single-probe SoS imaging system.
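As an illustration of the two headline metrics, the following minimal sketch computes RMSE (in m/s) and CNR (in dB) for a reconstructed SoS map against a reference map. The abstract does not define CNR, so the formula used here, 20·log10(|μ_target − μ_background| / sqrt(σ²_target + σ²_background)), is an assumption and only one of several common definitions. The function names `rmse` and `cnr_db`, the synthetic 64×64 phantom, and the 5 m/s noise level are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def rmse(reconstructed, reference):
    """Root mean square error between two SoS maps, in the maps' units (m/s)."""
    diff = np.asarray(reconstructed, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def cnr_db(image, target_mask, background_mask):
    """Contrast-to-noise ratio in dB.

    Assumed definition (not confirmed by the paper):
    20 * log10(|mean_t - mean_b| / sqrt(var_t + var_b)).
    """
    image = np.asarray(image, dtype=float)
    target = image[np.asarray(target_mask, dtype=bool)]
    background = image[np.asarray(background_mask, dtype=bool)]
    contrast = abs(target.mean() - background.mean())
    noise = np.sqrt(target.var() + background.var())
    return float(20.0 * np.log10(contrast / noise))


# Hypothetical phantom: 1540 m/s background with a 1580 m/s square inclusion.
rng = np.random.default_rng(0)
reference = np.full((64, 64), 1540.0)
target_mask = np.zeros((64, 64), dtype=bool)
target_mask[24:40, 24:40] = True
reference[target_mask] = 1580.0

# Stand-in for a network output: the reference plus Gaussian error (5 m/s std).
reconstructed = reference + rng.normal(0.0, 5.0, reference.shape)

print(f"RMSE: {rmse(reconstructed, reference):.2f} m/s")
print(f"CNR:  {cnr_db(reconstructed, target_mask, ~target_mask):.1f} dB")
```

Because CNR depends on how the target and background regions are chosen and on the exact formula, values from this sketch are not directly comparable to the 22 dB figure reported above.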