The multi-threaded design of applications has added a new dimension to execution concurrency. The Network-on-Chip (NoC) has evolved as the communication backbone of multiprocessor Systems-on-Chip (MPSoCs) for exploiting Thread-Level Parallelism (TLP). Mapping the threads of various applications onto the processing cores is crucial to utilizing the resources of a many-core system. An application's performance depends primarily on the mapping strategy, but it is also influenced significantly by the underlying network architecture of the NoC. The present work proposes a run-time application mapping technique based on the Cube-Tree-Hybrid (CTH) topology that focuses on balancing the traffic load across the network. Exploiting the low network diameter of CTH, together with its high scalability and path diversity, significantly decreases the mapping overhead, yielding a 31% reduction in normalized execution time over the prevalent algorithms on existing mapping platforms. Rigorous experiments with varying application and network sizes are carried out across mesh, ZMesh, torus, and Mesh-of-Tree (MoT) networks. The experimental observations indicate that the proposed algorithm sustains application performance with a 15%-21% improvement in overall energy consumption over existing state-of-the-art techniques across various benchmark workloads.