Since the infrared (IR) image and the visible (VI) image capture different characteristics of the same scene, it is inappropriate to fuse them with the same representation and similar features. Gradient transfer fusion (GTF), based on the ℓ1 norm, addresses this issue effectively. This paper demonstrates that the ℓ2 norm can address it equally well, through our novel fusion model. We formulate the fusion task as an ℓ2-norm optimization problem, in which the first term constrains the fused image to have pixel intensities similar to those of the IR image, and the second term forces the fused image to have a gradient distribution similar to that of the VI image. Because directly optimizing this ℓ2-norm objective yields an over-smoothed fused image, we introduce two weights into the objective function, inspired by the weighted least squares filtering (WLSF) framework. Unlike ℓ1-norm-based methods such as GTF, our method admits an explicit mathematical formula relating the source images to the fusion result, since the ℓ2 norm is differentiable; this makes it both effective and efficient. This explicit formula not only sets our method apart from current fusion methods but also gives it a lower computational cost than most of them. Experimental results demonstrate that our method outperforms GTF and most state-of-the-art fusion methods in terms of visual quality and evaluation metrics; our fused images resemble IR images enriched with abundant VI appearance information.
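An ℓ2 objective of the kind described above (an intensity term toward the IR image plus a gradient term toward the VI image) can be minimized in closed form by solving a sparse linear system. The following is a minimal sketch of the unweighted case only, omitting the paper's WLSF-inspired weights; the function name, the single trade-off parameter `lam`, and the forward-difference operators are illustrative assumptions, not details taken from the paper:

```python
import numpy as np
from scipy.sparse import identity, kron, eye, diags
from scipy.sparse.linalg import spsolve

def l2_fuse(ir, vi, lam=1.0):
    """Minimize ||f - ir||^2 + lam * ||grad(f) - grad(vi)||^2.

    Setting the gradient of the objective to zero gives the normal
    equations (I + lam * L) f = ir + lam * L vi, where
    L = Dx^T Dx + Dy^T Dy is the discrete Laplacian, so the fused
    image is the solution of one sparse linear system.
    (Illustrative sketch; the paper's weighted model is omitted.)
    """
    h, w = ir.shape

    def diff(n):
        # Forward-difference operator of shape (n-1, n).
        return diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1],
                     shape=(n - 1, n))

    Dx = kron(eye(h), diff(w))   # horizontal gradients
    Dy = kron(diff(h), eye(w))   # vertical gradients
    L = (Dx.T @ Dx + Dy.T @ Dy).tocsc()
    A = (identity(h * w) + lam * L).tocsc()
    b = ir.ravel() + lam * (L @ vi.ravel())
    return spsolve(A, b).reshape(h, w)
```

With `lam = 0` the solution reproduces the IR image exactly; as `lam` grows, the fused image's gradients approach those of the VI image while its overall intensity stays anchored to the IR image.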