The fusion of high-resolution multispectral images (HrMSI) and low-resolution hyperspectral images (LrHSI) has been acknowledged as a promising way to generate a high-resolution hyperspectral image (HrHSI), which is essential for the precise recognition and cataloguing of the underlying materials. To improve LrHSI–HrMSI fusion performance, in this article we propose a novel nonnegative matrix factorization inspired deep unrolling network, dubbed NMF-DuNet, for fusing LrHSI and HrMSI. To this end, we first formulate a variational fusion model regularized by a nonnegative sparse prior, solve it via gradient descent, and unroll the resulting iterations into a deep network. Both the nonnegativity constraint on the coefficient matrices and the orthogonality constraint on the proposed transform coefficients are incorporated into the method. Moreover, the fusion of HrMSI and LrHSI heavily depends on an imaging model that describes the spectral and spatial degradation of the HSI; in practice, this imaging model is often unknown. We therefore represent the degradation model implicitly via a proposed network, and both the degradation model and the sparse prior are jointly optimized during the training of the network. Instead of being hand-crafted, all the parameters of NMF-DuNet are learned end-to-end. Compared with previous state-of-the-art model-based and learning-based fusion approaches, the hardware-friendly NMF-DuNet achieves superior performance while requiring far fewer trainable parameters and less storage space, and it preserves real-time performance.
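To make the unrolling idea concrete, the following is a minimal sketch (not the authors' implementation) of the core building block the abstract describes: a projected gradient-descent iteration on a nonnegative factorization objective, where each iteration corresponds to one layer of an unrolled network. The dictionary `D`, the step size, and the stage count are illustrative assumptions; in NMF-DuNet such quantities would be learned end-to-end.

```python
import numpy as np

def unrolled_pgd_nmf(Y, D, num_stages=10, step=None):
    """Hypothetical sketch of unrolled projected gradient descent for the
    nonnegative coefficient matrix A in Y ~= D @ A.

    Each stage mirrors one layer of an unrolled network: a gradient step
    on the data-fit term ||Y - D A||_F^2, followed by a projection onto
    the nonnegative orthant (a ReLU), enforcing the nonnegativity prior.
    In a trained network, `step` would be a learnable per-layer parameter.
    """
    if step is None:
        # Conservative step size from the Lipschitz constant of the gradient.
        step = 1.0 / (np.linalg.norm(D, ord=2) ** 2)
    A = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(num_stages):
        grad = D.T @ (D @ A - Y)              # gradient of the data-fit term
        A = np.maximum(A - step * grad, 0.0)  # nonnegative projection (ReLU)
    return A
```

Unrolling fixes the number of iterations in advance, which is what allows the step sizes (and, in the full method, the degradation operators) to be treated as trainable network parameters rather than hand-tuned constants.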