Internet of Things (IoT) technology has transformed many industries by enabling real-time data collection, analysis, and decision-making across interconnected devices. However, implementing Federated Learning (FL) in heterogeneous industrial IoT environments raises several challenges: maintaining model accuracy with non-Independent and Identically Distributed (non-IID) datasets and straggler IoT devices, ensuring computation and communication efficiency, and addressing weight aggregation issues. In this study, we propose an Uncertainty-Aware Federated Reinforcement Learning (UA-FedRL) method that dynamically selects the number of epochs for each client in order to manage heterogeneous industrial IoT devices effectively and to improve accuracy, computation efficiency, and communication efficiency. Additionally, we introduce the Predictive Weighted Average Aggregation (PWA) method, which tackles weight aggregation issues in heterogeneous industrial IoT scenarios by adjusting the weight of each individual model according to its quality. UA-FedRL thereby addresses the inherent complexities of implementing FL in heterogeneous industrial IoT environments. Extensive simulations in complex IoT environments show that UA-FedRL outperforms existing approaches on both the MNIST and CIFAR-10 datasets in terms of accuracy, communication efficiency, and computation efficiency. The UA-FedRL algorithm attains an accuracy of 96.83% on the MNIST dataset and 62.75% on the CIFAR-10 dataset even when 90% of the IoT devices are stragglers, attesting to its robust performance and adaptability across datasets.
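To make the idea of quality-based aggregation concrete, the following is a minimal illustrative sketch and not the PWA method itself, whose scoring and prediction steps are not specified here. It assumes each client reports a flattened parameter vector and a non-negative quality score (a hypothetical proxy such as validation accuracy), and it applies ordinary score-weighted averaging. The function name `quality_weighted_average` and its arguments are assumptions introduced for illustration.

```python
from typing import List, Sequence


def quality_weighted_average(
    client_params: Sequence[Sequence[float]],
    quality_scores: Sequence[float],
) -> List[float]:
    """Average client parameter vectors, weighting each by its quality score.

    Assumption: quality_scores are non-negative numbers supplied by the
    caller (for example, validation accuracy); how the paper's PWA method
    derives them is not modeled here.
    """
    if not client_params:
        raise ValueError("need at least one client model")
    if len(client_params) != len(quality_scores):
        raise ValueError("one quality score per client is required")
    if any(score < 0 for score in quality_scores):
        raise ValueError("quality scores must be non-negative")

    total = sum(quality_scores)
    if total <= 0:
        raise ValueError("quality scores must sum to a positive value")

    dim = len(client_params[0])
    if any(len(params) != dim for params in client_params):
        raise ValueError("all clients must share the same parameter shape")

    # Normalize the scores so the aggregation coefficients sum to 1.
    coeffs = [score / total for score in quality_scores]

    # Coordinate-wise weighted sum of the client parameter vectors.
    return [
        sum(c * params[i] for c, params in zip(coeffs, client_params))
        for i in range(dim)
    ]


# Three hypothetical clients; the first is trusted twice as much as the others.
aggregated = quality_weighted_average(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [0.5, 0.25, 0.25],
)
print(aggregated)  # prints [2.5, 3.5]
```

With equal scores this reduces to a plain unweighted mean of the client models, so any departure from that baseline comes entirely from how the quality scores are assigned.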