End-to-end learning is an advanced framework in deep learning that combines feature extraction and pattern recognition (classification, clustering, etc.) in a unified learning structure. However, end-to-end deep networks face several challenges, including overfitting, vanishing gradients, computational complexity, information loss across layers, and weak robustness to noisy data and features. To address these challenges, this paper presents Deep and Wide Nonnegative Matrix Factorization (DWNMF) with embedded regularization for the feature extraction stage of end-to-end models. DWNMF aims to extract more robust features and prevent overfitting through embedded regularization. To this end, DWNMF augments the input data with noisy versions of itself, forming diverse channels, and then extracts the features of all channels in parallel using distinct network branches. The model's parameters capture the intrinsic hierarchical features within the channels of complex data objects. Finally, the features extracted from the different channels are aggregated into a single feature space to perform the classification task. To embed regularization in the DWNMF model, some NMF neurons in the layers are substituted with random neurons, increasing the stability and robustness of the extracted features. Experimental results confirm that the DWNMF model extracts more robust features, prevents overfitting, and achieves better classification accuracy than state-of-the-art methods.
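The following is a minimal, illustrative sketch of the pipeline the abstract describes, not the authors' implementation: noisy augmented channels, parallel NMF branches, substitution of some learned basis vectors ("NMF neurons") with random ones, and aggregation of channel features for classification. The dataset (`load_digits`), the noise levels, the `random_fraction` parameter, and the least-squares re-encoding step are all assumptions chosen to keep the sketch short and runnable with scikit-learn.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Nonnegative input data: 8x8 digit images flattened to 64 features.
X, y = load_digits(return_X_y=True)

# Augmented channels: the clean input plus noisy copies
# (additive Gaussian noise, clipped to keep the data nonnegative).
# Noise levels are illustrative, not taken from the paper.
noise_levels = [0.0, 0.5, 1.0]
channels = [np.clip(X + rng.normal(0.0, s, X.shape), 0, None)
            for s in noise_levels]

def branch_features(Xc, n_components=16, random_fraction=0.25, seed=0):
    """One 'wide' branch: NMF feature extraction on a single channel,
    with a fraction of the learned basis vectors replaced by random
    nonnegative vectors -- an illustrative stand-in for the paper's
    random-neuron regularization."""
    nmf = NMF(n_components=n_components, init="nndsvda",
              max_iter=400, random_state=seed)
    nmf.fit(Xc)
    H = nmf.components_.copy()  # learned basis (the "NMF neurons")

    # Substitute some basis vectors with fixed random ones.
    k = int(random_fraction * n_components)
    idx = rng.choice(n_components, size=k, replace=False)
    H[idx] = rng.random((k, Xc.shape[1]))

    # Re-encode the channel against the regularized basis; least
    # squares followed by clipping is a shortcut for a proper
    # nonnegative update, used here only to keep the sketch compact.
    W, *_ = np.linalg.lstsq(H.T, Xc.T, rcond=None)
    return np.clip(W.T, 0, None)

# Extract features in parallel branches, then aggregate by
# concatenating all channel features into a single feature space.
F = np.hstack([branch_features(Xc, seed=i)
               for i, Xc in enumerate(channels)])

# Downstream classification on the aggregated features.
F_tr, F_te, y_tr, y_te = train_test_split(F, y, test_size=0.3,
                                          random_state=0)
clf = LogisticRegression(max_iter=2000).fit(F_tr, y_tr)
print("test accuracy:", clf.score(F_te, y_te))
```

In this reading, depth would come from stacking such factorization stages and width from the parallel per-channel branches; the sketch shows only a single shallow stage of each.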