An implicit unstructured grid density-based solver–PRAVAHA based on a parallel variant of the lower upper symmetric Gauss-Seidel (LUSGS) method is developed to compute large-scale engineering problems. A four-layered parallel algorithm is designed to efficiently compute three-dimensional turbulent flows on massively parallel modern multiple instruction multiple data-stream (MIMD) computational hardware. This data parallel approach achieves multiple layers of parallelism including continuity of flow solution, transfer of solution gradients, and calculation of drag/lift/solution residuals, right up to the innermost implicit LUSGS solver sub-routine, which is relatively less explored in the literature. Domain decomposition is performed using the METIS software based on multi-level graph partitioning algorithms. Non-blocking message-passing interface library functions are used to manage inter-processor communication through explicit message passing, efficiently. Super-linear scalability of the parallel solver is established on the current state-of-the-art supercomputing facility, the 838 teraflops PARAM seva on up to 6144 cores. Linear or even super-linear speedup on problems of significant size is observed even on ad-hoc parallel computing platforms like workstations and multi-node clusters, for turbulent flow simulations.