This study investigated the anaerobic co-digestion of waste activated sludge with wheat straw. Four novel two-dimensional mathematical models (TDMMs) and an artificial neural network (ANN) were used to simulate and predict biogas production from the co-digestion process. A moth-flame optimization (MFO) technique was used to identify the optimal structure of the proposed multilayer feedforward neural network (MFFNN) for predicting the produced biogas, and the results obtained from the TDMMs and the ANN were then compared. The experimental results showed that co-digestion at a 7% mixing ratio (straw to sludge, by weight) improved the C/N ratio to 35 and gave the highest biogas yield (15-fold higher than sludge mono-digestion), along with the largest reductions in total solids (TS), total volatile solids (TVS) and chemical oxygen demand (COD), at 58.06%, 66.55% and 74.67%, respectively. All four TDMMs correlated highly with the experimental data; among them, the logistic kinetic model represented the experimental data best. However, the training, validation and testing of the MFFNN-MFO model yielded higher correlation coefficients than any of the other models, indicating that it is the most useful tool for modeling the biogas production process. These findings can support decision-makers in establishing sustainable development strategies that use eco-friendly technologies for efficient power generation from biomass residues, and can help in predicting process behavior.
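To illustrate what fitting a logistic kinetic model to cumulative biogas data can look like, the following is a minimal sketch rather than the study's own formulation. It assumes the modified logistic form widely used in the biogas literature, B(t) = P / (1 + exp(4·Rm·(λ − t)/P + 2)), which may differ from the TDMMs proposed here. The data are synthetic, the parameter values are arbitrary, and the names `logistic_biogas`, `fit_grid` and `pearson_r` are hypothetical helpers introduced only for this example.

```python
import math

def logistic_biogas(t, P, Rm, lam):
    """Modified logistic curve (assumed form, not taken from the study).

    t   : digestion time (days)
    P   : ultimate cumulative biogas yield
    Rm  : maximum biogas production rate (yield units per day)
    lam : lag-phase duration (days)
    """
    return P / (1.0 + math.exp(4.0 * Rm * (lam - t) / P + 2.0))

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_grid(days, observed):
    """Coarse grid search minimising the sum of squared errors.

    The grid ranges are arbitrary illustration values; a real analysis
    would use a proper nonlinear least-squares routine.
    """
    best = None
    for P in range(400, 601, 20):
        for Rm in range(20, 61, 5):
            for lam in range(0, 7):
                sse = sum(
                    (logistic_biogas(t, P, Rm, lam) - o) ** 2
                    for t, o in zip(days, observed)
                )
                if best is None or sse < best[0]:
                    best = (sse, P, Rm, lam)
    return best

# Synthetic data for illustration only (NOT measurements from the study):
# a logistic curve with a small deterministic perturbation.
days = list(range(0, 31, 3))
observed = [
    logistic_biogas(t, 500.0, 40.0, 3.0) * (1.0 + 0.02 * math.sin(t))
    for t in days
]

sse, P_fit, Rm_fit, lam_fit = fit_grid(days, observed)
predicted = [logistic_biogas(t, P_fit, Rm_fit, lam_fit) for t in days]
r = pearson_r(observed, predicted)
print(f"P={P_fit}, Rm={Rm_fit}, lag={lam_fit}, r={r:.4f}")
```

The grid search keeps the sketch dependency-free; with SciPy available, `scipy.optimize.curve_fit` would be a more usual choice for estimating the three parameters.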