Abstract Combustion simulation is complex and computationally expensive because it couples fundamental chemical kinetics with multidimensional Computational Fluid Dynamics (CFD) models. This paper presents our effort to port a real-world supersonic combustion simulation application to a heterogeneous architecture consisting of multi-core CPUs and Intel Many Integrated Core (MIC) coprocessors. We add scalable OpenMP parallelization to exploit the large number of cores on the CPUs and MIC coprocessors, and apply single-thread performance optimizations to improve computational efficiency. We also apply a CPU-MIC collaborative algorithm, together with a series of techniques that improve data transfer efficiency and load balance. Performance is evaluated on the Tianhe-2 supercomputer. On a single node, the optimized CPU-only version is 8.33 times faster than the baseline version, and the CPU + MIC heterogeneous version is a further 3.07 times faster than the optimized CPU-only version. The resulting code scales effectively to 5120 nodes (998,400 cores) on a mesh with 27.46 billion cells. Although our optimizations reduce the total number of floating-point operations by about a factor of 10, the heterogeneous version still achieves a sustained double-precision floating-point performance of 0.46 Pflops on 5120 nodes. This demonstrates petascale heterogeneous computing capability for real-world supersonic combustion problems.
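The headline figures above imply a few derived quantities that are not stated explicitly. The following is a minimal sketch that computes them from the quoted numbers only. The function name `derived_figures` and the constant names are illustrative, not taken from the application. The combined speedup assumes the two single-node speedups were measured on the same problem and therefore multiply, and the per-node values are plain averages over the 5120-node run.

```python
# Figures quoted in the abstract.
CPU_OPT_SPEEDUP = 8.33     # optimized CPU-only vs. baseline (single node)
HETERO_SPEEDUP = 3.07      # CPU + MIC vs. optimized CPU-only (single node)
NODES = 5120
TOTAL_CORES = 998_400
MESH_CELLS = 27.46e9       # 27.46 Giga cells
SUSTAINED_FLOPS = 0.46e15  # 0.46 Pflops, double precision


def derived_figures():
    """Return quantities implied by the quoted figures (illustrative only)."""
    return {
        # Assumption: both speedups refer to the same single-node test case.
        "combined_speedup": CPU_OPT_SPEEDUP * HETERO_SPEEDUP,
        "cores_per_node": TOTAL_CORES // NODES,
        # Averages over all nodes at the largest scale.
        "cells_per_node": MESH_CELLS / NODES,
        "gflops_per_node": SUSTAINED_FLOPS / NODES / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the heterogeneous version is about 25.6 times faster than the baseline on a single node, each node contributes 195 cores, and the largest run averages roughly 5.36 million cells and 89.8 Gflops per node.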