Synthetic aperture radar (SAR) is a unique remote sensing instrument that images ocean surface waves in two dimensions at high spatial resolution, regardless of sunlight and weather conditions. However, because the imaging process is nonlinear, ocean wave spectra cannot be retrieved directly from SAR data. The emergence of deep learning (DL) techniques provides a new paradigm for addressing this challenge. In this paper, a deep-learning-based model, called Wave-Spec-CNN, is proposed for retrieving omni-directional ocean wave spectra from SAR data. The model is constructed using approximately 21,000 collocations of Sentinel-1 Interferometric Wide swath mode images matched with global in-situ buoy data. It adapts the convolutional neural network (CNN) to accommodate the multi-valued nature of omni-directional ocean wave spectra, enhances performance by integrating a calibration branch, and further incorporates physical characteristics into the training process. The retrieved significant wave height (SWH) is consistent with buoy measurements in the range of 0.5 m to 6 m, with a root-mean-square error (RMSE) of 0.51 m on the validation dataset, comparable to traditional physics-based methods. For mean wave period (MWP) and peak frequency (PF), the RMSEs are 1.24 s and 0.03 Hz, respectively. The retrieved omni-directional ocean wave spectra also allow swell and windsea components to be separated and compared individually with those derived from in-situ buoy data; the RMSEs of the respective SWH comparisons are 0.46 m and 0.42 m. This research represents an initial endeavor to apply DL to the long-standing challenge of SAR inversion for ocean wave spectra, and it provides valuable insights for employing DL in multi-parameter inversion tasks in remote sensing.
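
As an illustration of how the integral parameters quoted above relate to an omni-directional spectrum, the following minimal sketch computes SWH, MWP, and PF from a one-dimensional frequency spectrum and evaluates an RMSE against reference values. It is not the paper's implementation. The function names, the Gaussian test spectrum, and the choice of the first-moment mean period (m0/m1) are assumptions made here for illustration; the definition of MWP used in the paper is not stated in this abstract.

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoidal integration of y over x (x ascending)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def integral_wave_parameters(freq, spec):
    """Return (SWH [m], MWP [s], PF [Hz]) for an omni-directional spectrum.

    freq : frequencies in Hz, ascending
    spec : spectral density in m^2/Hz on the same grid

    Assumption: MWP is taken as the first-moment period m0/m1.
    """
    freq = np.asarray(freq, dtype=float)
    spec = np.asarray(spec, dtype=float)
    m0 = _trapezoid(spec, freq)          # zeroth spectral moment (variance)
    m1 = _trapezoid(freq * spec, freq)   # first spectral moment
    swh = 4.0 * np.sqrt(m0)              # significant wave height
    mwp = m0 / m1                        # mean wave period (assumed Tm01)
    pf = float(freq[np.argmax(spec)])    # frequency of the spectral peak
    return swh, mwp, pf


def rmse(estimate, reference):
    """Root-mean-square error between two equally sized sequences."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean((estimate - reference) ** 2)))


if __name__ == "__main__":
    # Hypothetical single-peaked spectrum centred on 0.1 Hz (illustrative only).
    f = np.linspace(0.03, 0.50, 200)
    s = 2.0 * np.exp(-0.5 * ((f - 0.10) / 0.02) ** 2)
    swh, mwp, pf = integral_wave_parameters(f, s)
    print(f"SWH = {swh:.2f} m, MWP = {mwp:.2f} s, PF = {pf:.3f} Hz")
```

Applying `integral_wave_parameters` to both a retrieved and a buoy spectrum for each collocation, and then passing the two resulting series to `rmse`, would give the kind of per-parameter error statistic reported above.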