Recent years have witnessed many practical applications of supervised deep learning in seismic processing. However, weak generalization prevents its widespread use for coherent noise attenuation on large-scale prestack data sets. This is particularly true when addressing strong near-surface scattered noise in land seismic data. To alleviate this problem, we combine deep learning with an offset-vector tile (OVT) partitioning method to suppress strong scattered noise. With OVT partitioning, the seismic data are sampled uniformly in space, offering a favorable foundation for network learning. Specifically, the probability distribution of the reflections is more stationary than that of the noise, making it easier for the network to learn the reflections. Accordingly, we train the network with a direct signal learning strategy rather than the commonly used residual learning strategy. To construct high-quality training labels, we adopt the 3D continuous wavelet transform (3D CWT), which can exploit the 3D spatial correlation in OVT gathers. A network trained with these labels produces results similar to those of 3D CWT but with much higher efficiency. To further improve denoising performance, we propose a training sample construction approach that uses middle-offset OVT volumes with varying azimuths, exploiting the relatively high signal-to-noise ratio of middle-offset data. A field data experiment demonstrates that the proposed method also generalizes well. Although only six middle-offset gathers are used for training, the trained network effectively processes 1260 OVTs in a timely manner.
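To make the OVT partitioning idea concrete, the following is a minimal Python sketch of how traces might be assigned to tiles by binning their offset vectors, and how a nominal offset and azimuth could be attached to each tile (for example, to select middle-offset tiles with varying azimuths). It is an illustration under stated assumptions, not the procedure used in this work: the function names `ovt_indices` and `tile_offset_azimuth`, the tile dimensions `tile_dx` and `tile_dy`, and the axis convention (x east, y north, azimuth measured clockwise from north) are all hypothetical.

```python
import numpy as np


def ovt_indices(src_xy, rec_xy, tile_dx, tile_dy):
    """Assign each trace to an offset-vector tile (hypothetical helper).

    src_xy, rec_xy : (n, 2) source and receiver coordinates.
    tile_dx, tile_dy : assumed tile dimensions along x and y.
    Returns integer tile indices (ix, iy) for every trace.
    """
    src_xy = np.asarray(src_xy, dtype=float)
    rec_xy = np.asarray(rec_xy, dtype=float)
    offset = rec_xy - src_xy  # offset vector of each trace, shape (n, 2)
    ix = np.floor(offset[:, 0] / tile_dx).astype(int)
    iy = np.floor(offset[:, 1] / tile_dy).astype(int)
    return ix, iy


def tile_offset_azimuth(ix, iy, tile_dx, tile_dy):
    """Nominal absolute offset and azimuth (degrees) at each tile center.

    Azimuth is measured clockwise from the +y axis (an assumed convention).
    """
    cx = (np.asarray(ix) + 0.5) * tile_dx  # tile-center offset along x
    cy = (np.asarray(iy) + 0.5) * tile_dy  # tile-center offset along y
    offset = np.hypot(cx, cy)
    azimuth = np.degrees(np.arctan2(cx, cy)) % 360.0
    return offset, azimuth


# Example usage with made-up geometry: two traces, 200 m x 200 m tiles.
sources = [[0.0, 0.0], [100.0, 100.0]]
receivers = [[250.0, -50.0], [150.0, 900.0]]
ix, iy = ovt_indices(sources, receivers, 200.0, 200.0)
nominal_offset, nominal_azimuth = tile_offset_azimuth(ix, iy, 200.0, 200.0)
```

All traces sharing the same `(ix, iy)` pair would form one OVT volume with an approximately constant offset and azimuth, which is what allows tiles to be grouped by offset range when building training samples.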