Abstract

Although most machine learning and deep learning models expect the data to follow a Gaussian distribution for better predictive capability, in practice this assumption often does not hold. In this paper, we explore the efficacy of an Artificial Neural Network (ANN) model, evaluated with a K-fold cross-validation strategy, in dealing with an imbalanced dataset. The objective of our work is to propose a robust, scalable, shallow ANN model that can handle data imbalance and train on a large amount of data in less time. We demonstrate our model on the Redshift dataset from astronomy, where we consider the entire redshift span (0 < z < 7), which is highly skewed. A logarithmic transformation is applied to the redshift to tackle the data imbalance. We use the error metrics Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Normalized Median Absolute Deviation (NMAD) to measure the effectiveness of the ANN model. The results show that the proposed model is feasible and reliable, with a satisfactory RMSE (0.89) and NMAD (0.05) on a highly imbalanced dataset.
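The target transformation, the K-fold split, and the three error metrics named above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the exact form of the logarithmic transformation or of NMAD, so the sketch assumes log(1 + z) for the transformation and the convention common in photometric-redshift work for NMAD, 1.4826 × median(|Δz − median(Δz)|) with Δz = (z_pred − z_true)/(1 + z_true). All function names are hypothetical.

```python
import numpy as np

def to_log(z):
    """Forward target transform, assumed here to be log(1 + z)."""
    return np.log1p(np.asarray(z, dtype=float))

def from_log(y):
    """Inverse of to_log: maps model outputs back to redshift."""
    return np.expm1(np.asarray(y, dtype=float))

def rmse(z_true, z_pred):
    """Root Mean Squared Error."""
    z_true, z_pred = np.asarray(z_true, float), np.asarray(z_pred, float)
    return float(np.sqrt(np.mean((z_pred - z_true) ** 2)))

def mae(z_true, z_pred):
    """Mean Absolute Error."""
    z_true, z_pred = np.asarray(z_true, float), np.asarray(z_pred, float)
    return float(np.mean(np.abs(z_pred - z_true)))

def nmad(z_true, z_pred):
    """Normalized Median Absolute Deviation (assumed photo-z convention)."""
    z_true, z_pred = np.asarray(z_true, float), np.asarray(z_pred, float)
    dz = (z_pred - z_true) / (1.0 + z_true)
    return float(1.4826 * np.median(np.abs(dz - np.median(dz))))

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for a shuffled K-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]
```

In a pipeline of this kind, a model would be fitted on `to_log(z)` within each training fold, and its predictions mapped back with `from_log` before computing the metrics on the held-out fold.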
