Data heterogeneity poses a major challenge for Federated Learning (FL), where data across participating clients follows non-independent and identically distributed (non-IID) patterns. Despite the prominence of this issue, comprehensive studies of the efficacy of FL algorithms under different configurations of hyperparameters and models are lacking. This paper focuses on three state-of-the-art algorithms: FedAvg, FedProx, and SCAFFOLD. Our experiments vary the degree of data imbalance and the number of local training steps to assess their effects on model training. The algorithms and hyperparameters were evaluated with ResNet50, AlexNet, and DenseNet121 on the CIFAR-10 dataset. Experimental results reveal that FedAvg and SCAFFOLD generally outperform FedProx, although FedProx excels under extreme non-IID data distributions combined with minimal local training. We also observe that models with complex architectures such as ResNet50 are more susceptible to data imbalance, while simpler models such as AlexNet prove more robust. This study provides valuable insights into how different hyperparameter configurations affect FL in non-IID scenarios, contributing to the future development of more efficient and robust FL algorithms.
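As an illustration of how "varying degrees of data imbalance" can be induced, the following sketch partitions a labeled dataset across clients using Dirichlet label skew, a common protocol for simulating non-IID splits of CIFAR-10. The abstract does not state which partitioning scheme the paper uses, so this is an assumption; the function name `dirichlet_partition` and the concentration parameter `alpha` (smaller values yield more extreme non-IID skew) are illustrative, not taken from the paper.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    Smaller alpha -> more extreme non-IID (each client dominated by a
    few classes); larger alpha -> closer to an IID split.
    Illustrative sketch; not the paper's stated partitioning method.
    """
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Draw per-client proportions for this class from Dirichlet(alpha).
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        # Convert proportions into split points over this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ci) for ci in client_indices]

# Toy example: 10 classes (as in CIFAR-10), 100 samples each, 5 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
```

With `alpha=0.1` most clients receive samples concentrated in a few classes, approximating the extreme non-IID regime where the abstract reports FedProx excelling; `alpha=100` would approximate an IID split.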