The rapid development of parallel architectures and their growing heterogeneity have inspired researchers to devise new scheduling strategies that distribute workloads efficiently across these architectures to improve performance. This paper presents a comprehensive study on optimizing resource utilization for large-scale problems by employing architecture-aware scheduling techniques. We conducted a series of experiments to measure execution times on various architectures with different problem sizes. Each experiment was repeated multiple times to minimize measurement variance. The findings from these experiments are used to develop a scheduling strategy that enables faster completion of larger data-parallel problems while maximizing resource utilization. The proposed approach improves performance by 16.7% for large data sizes. This improvement can enhance computational efficiency and reduce costs in high-performance computing environments.
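
The measure-then-schedule idea summarized above can be illustrated with a minimal Python sketch. It is not the scheduling strategy developed in this paper. It only assumes that repeated timings are available per architecture and problem size, reduces them to medians to dampen variance, and then either picks the fastest architecture or splits a data-parallel workload in proportion to measured throughput. All function names, the architecture labels, and the timing values are hypothetical and chosen for illustration.

```python
from statistics import median


def summarize(timings):
    """Collapse repeated runs into one median time per (architecture, size)."""
    return {key: median(runs) for key, runs in timings.items()}


def fastest_architecture(summary, size):
    """Return the architecture with the lowest median time for a given size."""
    candidates = {arch: t for (arch, s), t in summary.items() if s == size}
    return min(candidates, key=candidates.get)


def proportional_split(summary, size, total_items):
    """Split total_items across architectures in proportion to throughput.

    Assumption: throughput measured at `size` is representative of the
    workload being split. This is a simplification for illustration.
    """
    rates = {arch: size / t for (arch, s), t in summary.items() if s == size}
    total_rate = sum(rates.values())
    shares = {arch: int(total_items * r / total_rate) for arch, r in rates.items()}
    # Give any rounding remainder to the architecture with the highest rate.
    fastest = max(rates, key=rates.get)
    shares[fastest] += total_items - sum(shares.values())
    return shares


# Made-up timings in seconds, for illustration only (not data from the study).
timings = {
    ("cpu", 1000): [0.10, 0.12, 0.11],
    ("gpu", 1000): [0.30, 0.28, 0.29],
    ("cpu", 1000000): [40.0, 42.0, 41.0],
    ("gpu", 1000000): [10.0, 10.5, 10.25],
}
summary = summarize(timings)
small_choice = fastest_architecture(summary, 1000)
large_choice = fastest_architecture(summary, 1000000)
shares = proportional_split(summary, 1000000, 1000)
```

With these invented numbers, the small problem is assigned to the architecture with lower overhead and the large problem favors the higher-throughput one, which is the kind of size-dependent decision an architecture-aware scheduler has to make.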