Visual saliency prediction has improved significantly with the advent of convolutional neural networks (CNNs), but the gain in prediction accuracy comes at a high computational cost. In this paper, we present a lightweight saliency prediction model based on scaled-up CNNs that uses image-activity-guided collaborative learning of global and local information at multiple scales. We use a pseudo-siamese network with a scaled-up network (EfficientNet) as the backbone, whose two branches capture global saliency features and high-level local features, respectively. Concretely, we first use image-complexity-related activity features (Image Activity Measure) as a low-level local saliency prior, and then feed the input images and the activity maps to scaled-up CNN modules to learn high-level features in a multi-scale collaborative manner. Through extensive evaluation, we show that the proposed method achieves competitive and consistent results on challenging benchmark datasets, with better prediction performance, fewer trainable parameters, and faster inference. Moreover, the proposed model places low demands on platform computing capability, which broadens the range of scenarios in which saliency prediction can be deployed.
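
The abstract does not state how the activity maps are computed. As a rough illustration only, the sketch below assumes the classical definition of the Image Activity Measure (the mean absolute difference between horizontally and vertically adjacent pixels) and adds a per-pixel variant that could serve as a low-level local prior. The function names `image_activity_measure` and `activity_map` are hypothetical and are not taken from the paper.

```python
import numpy as np


def image_activity_measure(gray):
    """Global activity: summed absolute neighbor differences divided by pixel count.

    Assumes a 2-D grayscale array; this is the classical IAM definition,
    not necessarily the exact variant used by the authors.
    """
    gray = np.asarray(gray, dtype=np.float64)
    rows, cols = gray.shape
    vertical = np.abs(np.diff(gray, axis=0)).sum()    # differences between rows
    horizontal = np.abs(np.diff(gray, axis=1)).sum()  # differences between columns
    return (vertical + horizontal) / (rows * cols)


def activity_map(gray):
    """Per-pixel activity (hypothetical local variant).

    Each pixel receives the absolute difference to its lower neighbor plus the
    absolute difference to its right neighbor; border pixels without such a
    neighbor receive no contribution from that direction.
    """
    gray = np.asarray(gray, dtype=np.float64)
    amap = np.zeros_like(gray)
    amap[:-1, :] += np.abs(np.diff(gray, axis=0))
    amap[:, :-1] += np.abs(np.diff(gray, axis=1))
    return amap


if __name__ == "__main__":
    checker = np.array([[0.0, 1.0], [1.0, 0.0]])
    print(image_activity_measure(checker))
    print(activity_map(checker))
```

By construction, the mean of `activity_map(gray)` equals `image_activity_measure(gray)`, so under these assumptions the map is a spatial decomposition of the global score. A flat image yields zero activity everywhere, while textured regions yield high values.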