Effective objective video quality assessment (VQA) metrics are highly desirable but difficult to obtain, owing to the 3D spatiotemporal regularities of natural videos and the small scale of existing video quality databases. In this paper, we propose a general-purpose no-reference VQA framework based on weakly supervised learning with a convolutional neural network (CNN) and a resampling strategy. First, an eight-layer CNN is trained by weakly supervised learning to model the relationship between the deformations of the 3D discrete cosine transform of video blocks and the corresponding weak labels assigned by a full-reference (FR) VQA metric. The CNN thereby inherits the quality assessment capacity of the FR-VQA metric, and effective features of distorted videos can be extracted through the trained network. Then, we map the frequency histogram computed from the quality score vectors predicted by the trained network onto perceptual quality. In particular, to improve the performance of the mapping function, we transfer the frequency histograms of distorted images and videos to resample the training set. Experiments are carried out on several widely used VQA databases. The results demonstrate that the proposed method is on a par with some state-of-the-art VQA metrics and has promising robustness.
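
As an illustrative sketch only, and not the authors' implementation, two of the stages named above (the 3D discrete cosine transform of a video block and the frequency histogram of predicted block scores) might be computed as follows. The function names, the `block[t][y][x]` layout, the orthonormal DCT-II scaling, the score range [0, 1], and the number of bins are all assumptions made for illustration; the CNN, the weak labels, and the mapping function are omitted.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1D sequence (assumed scaling)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, v in enumerate(values))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_3d(block):
    """Separable 3D DCT-II of a block indexed as block[t][y][x]."""
    T, H, W = len(block), len(block[0]), len(block[0][0])
    # Pass 1: transform every row along x.
    cur = [[dct_1d(block[t][y]) for y in range(H)] for t in range(T)]
    # Pass 2: transform every column along y.
    nxt = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for x in range(W):
            col = dct_1d([cur[t][y][x] for y in range(H)])
            for y in range(H):
                nxt[t][y][x] = col[y]
    cur = nxt
    # Pass 3: transform every temporal line along t.
    out = [[[0.0] * W for _ in range(H)] for _ in range(T)]
    for y in range(H):
        for x in range(W):
            line = dct_1d([cur[t][y][x] for t in range(T)])
            for t in range(T):
                out[t][y][x] = line[t]
    return out


def frequency_histogram(scores, num_bins, lo=0.0, hi=1.0):
    """Normalized histogram of per-block scores over [lo, hi]."""
    counts = [0] * num_bins
    for s in scores:
        idx = int((s - lo) / (hi - lo) * num_bins)
        idx = min(max(idx, 0), num_bins - 1)  # clamp edge values into range
        counts[idx] += 1
    total = len(scores)
    if total == 0:
        return [0.0] * num_bins
    return [c / total for c in counts]
```

With the orthonormal scaling assumed here, a constant block concentrates all of its energy in the DC coefficient, and the histogram entries sum to one regardless of how many blocks a video contributes, which is what makes the histogram a fixed-length descriptor for videos of different lengths.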