Quantitative estimation of multi-material density images is an important goal of spectral CT imaging. However, material decomposition is a poorly conditioned nonlinear inverse problem, and maximum-likelihood model-based material decomposition yields very noisy material density estimates. One increasingly popular noise-reduction strategy is to apply deep neural networks to multi-material image formation. The most common loss function is the mean squared error with respect to supervised target images, such as ground truth or higher-dose acquisitions. However, we argue that the mean squared error loss has several limitations for multi-material image formation. In this work, we present a new loss function that uses multiple noise realizations and places separate weights on covariance and bias for joint denoising of all material bases. By modulating these weights, it is possible to tune the image quality of the network output. To demonstrate the proposed approach, we simulated a water/calcium/gadolinium spectral CT imaging scenario and used a deep neural network for multi-material image denoising. Our results show that changing the weights of the proposed loss function makes it possible to control the tradeoff between variance and bias for individual materials, as well as the bias coupling between materials.
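
The abstract does not state the exact form of the loss, so the following is only a minimal sketch of one way a loss with separate bias and covariance weights could be written. The function name `bias_covariance_loss`, the array layout (realizations, materials, pixels), and the weight matrices `w_bias` and `w_cov` are all assumptions made for illustration, not the authors' implementation. The sketch estimates the per-material bias and the cross-material covariance from several noise realizations of the network output, then weights each term separately.

```python
import numpy as np


def bias_covariance_loss(outputs, target, w_bias, w_cov):
    """Hypothetical weighted bias/covariance loss (illustrative sketch only).

    outputs: (R, M, N) network outputs for R noise realizations,
             M material bases, and N pixels.
    target:  (M, N) supervised target material images.
    w_bias:  (M, M) assumed weights on bias and cross-material bias terms.
    w_cov:   (M, M) assumed weights on variance and cross-material covariance.
    """
    outputs = np.asarray(outputs, dtype=float)
    target = np.asarray(target, dtype=float)
    w_bias = np.asarray(w_bias, dtype=float)
    w_cov = np.asarray(w_cov, dtype=float)

    n_real, _, n_pix = outputs.shape

    # Mean over noise realizations and its deviation from the target (bias).
    mean_out = outputs.mean(axis=0)          # (M, N)
    bias = mean_out - target                 # (M, N)

    # Fluctuation of each realization about the mean.
    dev = outputs - mean_out                 # (R, M, N)

    # Pixel-averaged cross-material covariance (population form, divide by R).
    cov = np.einsum("rin,rjn->ij", dev, dev) / (n_real * n_pix)   # (M, M)

    # Pixel-averaged outer product of the bias between materials.
    bias_outer = np.einsum("in,jn->ij", bias, bias) / n_pix       # (M, M)

    return float(np.sum(w_bias * bias_outer) + np.sum(w_cov * cov))
```

In this sketch, setting both weight matrices to the identity recovers the ordinary mean squared error summed over materials, since mean squared error is the sum of squared bias and variance. The diagonal entries weight each material's own bias or variance, while the off-diagonal entries weight the cross-material terms.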