Abstract

Accurate prediction of software effort is important for planning, scheduling, and allocating resources. However, software effort estimation remains a challenging task. Although numerous estimation models have been proposed, few come close to accurately predicting software development effort. To improve accuracy, machine learning techniques have recently been employed to predict software development effort from relatively large software repositories. However, several issues remain unresolved, and this paper aims to address four of them. First, feature selection methods often neglect the information-rich variables present in the dataset. Second, important features are selected through statistical methods, which lack domain knowledge. Third, missing values in the data, which significantly influence the prediction outcome, are not handled efficiently. Fourth, the majority of the literature neglects advanced evaluation measures, which thoroughly assess the ability of learning models to produce accurate results. To address these issues, this paper proposes a machine learning-based model that not only allows effective preprocessing of the data but also provides highly accurate predictions with a minimal error rate. The aim is to identify the attributes (predictors) in large software repositories that are most influential in the estimation of effort. In addition, we apply the mean magnitude of relative error (MMRE) for better performance analysis.
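For readers unfamiliar with the measure, MMRE is conventionally defined as the mean over all projects of |actual effort - predicted effort| / actual effort. The following is a minimal sketch of that standard definition, not the implementation used in this paper; the function name `mmre` and the sample effort values are hypothetical and chosen only for illustration.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error (MMRE) over paired effort values.

    Illustrative sketch of the standard definition; assumes every
    actual effort value is strictly positive.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one project is required")
    if any(a <= 0 for a in actual):
        raise ValueError("actual effort values must be positive")

    # Magnitude of relative error for each project:
    # MRE_i = |actual_i - predicted_i| / actual_i
    mres = [abs(a - p) / a for a, p in zip(actual, predicted)]

    # MMRE is the arithmetic mean of the per-project MRE values.
    return sum(mres) / len(mres)


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [100.0, 200.0, 400.0]
predicted_effort = [110.0, 180.0, 400.0]
score = mmre(actual_effort, predicted_effort)
```

A lower MMRE indicates predictions that are closer to the actual effort in relative terms; a value of 0 means every prediction matched exactly.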