Low latency, resource efficiency, and data privacy are crucial requirements in modern communication networks. Federated learning can address these requirements by keeping data at the network edge and training on it in parallel across edge devices, thereby preserving data privacy and reducing latency. For a large-scale federated learning task spanning many devices, challenges arise from device heterogeneity, data variability, and limited network resources. Careful selection of the edge devices participating in federated learning is therefore essential for resilient, reliable, and resource-efficient edge networks. In this context, this paper proposes an optimal device selection method that minimizes redundant data training and improves network resource utilization without degrading federated learning performance over resource-constrained edge networks. The proposed method aims to minimize network resource demands while maximizing the data diversity captured by the aggregated model. The performance of the proposed federated learning framework is evaluated on EMNIST, a publicly available image dataset of handwritten digits that extends MNIST. Experimental results indicate that the proposed framework matches the accuracy convergence of conventional federated learning while reducing device usage by up to 50% and resource utilization by up to 30% to reach 99% of the achievable accuracy. The proposed method is therefore well suited to resource-constrained edge networks.
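
The abstract does not spell out the selection algorithm itself; as a rough illustration of the kind of objective it describes (maximizing data diversity per unit of network resource consumed), the sketch below greedily adds the device whose local label distribution most increases the entropy of the aggregated label mix per unit of resource cost, until a budget is exhausted. All names here (`select_devices`, `label_dists`, `costs`, `budget`) are hypothetical placeholders, not taken from the paper.

```python
import numpy as np

def entropy(counts):
    """Shannon entropy of a label-count vector (0 for an empty vector)."""
    total = counts.sum()
    if total == 0:
        return 0.0
    p = counts / total
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def select_devices(label_dists, costs, budget):
    """Greedy device selection (illustrative, not the paper's method):
    each step picks the device that most increases the entropy (diversity)
    of the aggregated label distribution per unit of resource cost,
    stopping when no affordable device improves diversity.

    label_dists : (n_devices, n_classes) per-device label counts
    costs       : (n_devices,) resource cost of training on each device
    budget      : total resource budget for one round
    """
    n_devices, n_classes = label_dists.shape
    selected = []
    remaining = set(range(n_devices))
    agg = np.zeros(n_classes)   # aggregated label counts of selected devices
    spent = 0.0
    while remaining:
        base = entropy(agg)
        best, best_score = None, 0.0
        for i in remaining:
            if spent + costs[i] > budget:
                continue  # device would exceed the resource budget
            gain = entropy(agg + label_dists[i]) - base
            score = gain / costs[i]  # diversity gained per unit cost
            if score > best_score:
                best, best_score = i, score
        if best is None:  # nothing affordable adds diversity
            break
        selected.append(best)
        remaining.discard(best)
        agg += label_dists[best]
        spent += costs[best]
    return selected

# Example: 20 devices, 10 classes (as in EMNIST digits), random costs.
rng = np.random.default_rng(0)
dists = rng.integers(0, 100, size=(20, 10)).astype(float)
costs = rng.uniform(1.0, 5.0, size=20)
print(select_devices(dists, costs, budget=15.0))
```

Under this toy objective, redundant devices (those whose data mirrors what is already aggregated) yield little entropy gain and are naturally skipped, which is one plausible way the abstract's goals of reduced device usage and preserved data diversity could be realized.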