Super-resolution (SR) techniques reconstruct a high-resolution image from low-resolution images and are widely used to enhance image and video quality as high-resolution, high-frame-rate media develop rapidly. Recently, neural network (NN)-based methods have achieved much better reconstruction quality than classical approaches. However, the high computational complexity and large memory footprint of NNs limit the throughput and scalability of these SR systems. In this work, we first analyze several key issues in the design of NN-based SR systems. We then propose a three-level systematic optimization methodology for SR systems that reduces computation overhead while preserving image quality. At the algorithm level, we introduce image blocking to SR tasks and develop a block-wise SR algorithm that combines NN inference and interpolation, guided by a consistent image block evaluation metric. The configurable hybrid parameters allow the SR algorithm to trade off computation overhead against image quality flexibly. At the operator level, we focus on the transposed convolution operators commonly used for upsampling in SR neural networks and propose an efficient Winograd-based transposed convolution acceleration method. By converting the transposed convolution into sub-convolutions and specializing the Winograd algorithm for them, this method enables unified Winograd transformations and simplified data access patterns. At the data level, we propose a novel quantization method for Winograd-aware SR neural networks that improves accuracy after quantization. Comprehensive evaluations demonstrate the effectiveness of these optimizations. Our SR system substantially reduces the number of multiplications, scales well, and supports 4K@120fps and 8K@30fps outputs with acceptable image quality degradation.
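
To make the operator-level idea concrete, the following minimal sketch shows, in one dimension, how a transposed convolution can be split into sub-convolutions that are then evaluated with the standard Winograd F(2,3) algorithm. It is an illustration under stated assumptions, not the method proposed in this work: the stride of 2, the 6-tap kernel, the 1-D setting, and all function names (`transposed_conv1d`, `winograd_conv1d_3tap`, `transposed_conv1d_winograd`) are hypothetical choices made only for the example.

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (2 outputs, 3-tap filter).
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)


def transposed_conv1d(x, w, stride):
    """Reference: each input sample scatters a scaled copy of the kernel."""
    y = np.zeros((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        y[i * stride:i * stride + len(w)] += xi * w
    return y


def winograd_conv1d_3tap(x, h):
    """Full convolution of x with a 3-tap filter h, tiled with F(2,3)."""
    n_out = len(x) + 2
    n_tiles = (n_out + 1) // 2
    # Zero-pad so that every tile can read 4 consecutive samples.
    padded = np.zeros(2 * n_tiles + 2)
    padded[2:2 + len(x)] = x
    # Filter transform, computed once. F(2,3) evaluates a correlation,
    # so the taps are flipped to obtain a convolution.
    u = G @ h[::-1]
    y = np.zeros(2 * n_tiles)
    for t in range(n_tiles):
        v = BT @ padded[2 * t:2 * t + 4]      # input transform
        y[2 * t:2 * t + 2] = AT @ (u * v)     # 4 multiplications, 2 outputs
    return y[:n_out]


def transposed_conv1d_winograd(x, w, stride=2):
    """Assumed case: stride 2 and 6 taps, giving two 3-tap sub-convolutions."""
    assert stride == 2 and len(w) == 6
    y = np.zeros((len(x) - 1) * stride + len(w))
    for p in range(stride):
        # Output phase p only ever sees the kernel taps w[p], w[p+2], w[p+4].
        y[p::stride] = winograd_conv1d_3tap(x, w[p::stride])
    return y


# Usage: the decomposed Winograd version matches the direct reference.
rng = np.random.default_rng(0)
x, w = rng.standard_normal(7), rng.standard_normal(6)
assert np.allclose(transposed_conv1d(x, w, 2), transposed_conv1d_winograd(x, w))
```

In this sketch every output phase is an ordinary convolution with a same-sized sub-kernel, so a single set of F(2,3) transform matrices serves all phases, and each tile spends 4 element-wise multiplications where a direct 3-tap evaluation of two outputs would spend 6. A 2-D version for SR networks would nest the same transforms along rows and columns.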