As a powerful paradigm, deep learning (DL) models have been used in many applications for classification tasks on images, text, and audio. DL models can learn task-driven features from big data. However, DL models are fully deterministic and cannot handle uncertain and imprecise data. They are often sensitive to noise and do not perform well in domains where the data are vague. Moreover, when the feature set is large or the data are high-dimensional and contain irrelevant and redundant features, the classification performance of DL models decreases because the models are trained on irrelevant features. To obtain reliable results in such high-dimensional problems, DL models require a large amount of data, which usually grows exponentially with the number of features. Data uncertainty and irrelevant features therefore lead to low performance of DL models in classification tasks. This paper proposes an optimized fuzzy deep learning (OFDL) model for data classification based on the Non-Dominated Sorting Genetic Algorithm II (NSGA-II). OFDL uses NSGA-II to optimize the combination of DL and fuzzy learning in multi-modal learning. To achieve effective classification, OFDL first performs intelligent feature selection by finding the best trade-offs between two conflicting objective functions: minimizing the number of features and maximizing the accuracy (maximizing the weights of the selected features). Next, to optimize backpropagation and the fuzzy membership functions, OFDL uses the Pareto-optimal solutions that NSGA-II finds for their respective objective functions. Furthermore, the fusion layer in OFDL fuses the optimized views of DL and fuzzy learning, providing a high-level representation of the inputs and optimal features for classification tasks in which the data contain high uncertainty and noise.
This functionality is valuable during classification, since identifying and selecting appropriate features ensures prompt and correct class assignment. It also provides deep insight into how to tackle the effect of ambiguous data and the uncertainty of each feature in classification tasks. The evaluation of OFDL shows good performance in terms of F-measure, accuracy, recall, precision, and True Positive Rate (TPR) compared with fuzzy classifiers. Furthermore, OFDL achieves higher classification accuracy than earlier fuzzy DNN models.
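The trade-off between the two feature-selection objectives can be illustrated with a minimal sketch of Pareto dominance, the ordering relation on which NSGA-II is built. This is not the OFDL implementation: it only extracts the first non-dominated front from a fixed list of candidates and omits the rest of NSGA-II (ranking of later fronts, crowding distance, selection, crossover, and mutation). The names `Candidate`, `dominates`, and `pareto_front`, as well as the (feature count, accuracy) values, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate feature subset summarized as (number_of_features, accuracy).
# Hypothetical representation; the values below are illustrative, not results.
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a Pareto-dominates b.

    Objective 1: number of features (minimize).
    Objective 2: accuracy (maximize).
    a dominates b when it is no worse on both objectives
    and strictly better on at least one.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates = [(3, 0.80), (5, 0.90), (5, 0.85), (8, 0.90), (10, 0.93)]
front = pareto_front(candidates)
print(front)
```

In this toy example, the subsets with 3, 5 (accuracy 0.90), and 10 features remain on the front: each one either uses fewer features or reaches higher accuracy than every other surviving candidate, so none of them is strictly preferable to the others without an additional preference between the two objectives.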