Single-channel blind separation (SCBS) of co-frequency signals has long been a challenging task in blind signal processing. Most effective approaches are still formulated in signal space, which entails complicated signal processing and high computational cost. To reduce the complexity and improve the separation accuracy of SCBS, we propose a new waveform separation-demodulation scheme. As the first essential step, we design a convolutional time-domain network with squeeze-and-excitation blocks (CNSE) to carry out waveform separation in latent feature space; the separated results are then demodulated by the low-complexity per-survivor processing (LC-PSP) method. Our network adopts the backbone structure of encoder, separator, and decoder. Notably, we deepen the encoder and decoder of the network to emphasize the low-level features and further weight these features by means of a channel attention module. The encoder-decoder structure helps to eliminate the influence of noise during signal feature extraction and signal reconstruction. In our experiments, CNSE outperforms a long short-term memory network and the convolutional time-domain audio separation network in terms of signal-to-distortion ratio, signal-to-interference ratio, and symbol error rate. Additionally, to reduce the influence of the truncation effect in separation, we propose a data pre-processing method called redundancy protection (RP), and our trials show that RP improves the performance of CNSE over the whole signal-to-noise ratio range. We also verify that, compared with the traditional PSP algorithm, our method has better noise robustness and higher separation efficiency, especially when processing more complex mixtures.
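
To make the channel attention step concrete, the following is a minimal sketch of a squeeze-and-excitation gate applied to a 1-D feature map. It assumes the standard formulation: global average pooling over time, a two-layer bottleneck with ReLU and sigmoid, and channel-wise scaling. The function name `squeeze_excite`, the reduction ratio `r`, the random weights, and the omission of bias terms are illustrative assumptions and not the configuration used in CNSE.

```python
import math
import random


def squeeze_excite(features, w1, w2):
    """Reweight the channels of a 1-D feature map (illustrative sketch).

    features: list of C channels, each a list of T samples.
    w1: (C // r) x C weights of the bottleneck layer (hypothetical values).
    w2: C x (C // r) weights of the expansion layer (hypothetical values).
    Bias terms are omitted for brevity.
    """
    # Squeeze: global average pooling over time, one scalar per channel.
    z = [sum(ch) / len(ch) for ch in features]
    # Excitation, layer 1: linear map followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, layer 2: linear map followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
        for row in w2
    ]
    # Scale: multiply every sample of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Toy usage: C = 4 channels, T = 8 time steps, reduction ratio r = 2.
random.seed(0)
C, T, r = 4, 8, 2
features = [[random.gauss(0, 1) for _ in range(T)] for _ in range(C)]
w1 = [[random.gauss(0, 0.5) for _ in range(C)] for _ in range(C // r)]
w2 = [[random.gauss(0, 0.5) for _ in range(C // r)] for _ in range(C)]
scaled, gates = squeeze_excite(features, w1, w2)
print(gates)  # one attention weight per channel
```

Each gate lies strictly between 0 and 1, so a channel is attenuated in proportion to how little the learned weights judge it to matter. In a trained network, `w1` and `w2` would be learned jointly with the encoder and decoder rather than drawn at random.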