Organic contaminants such as polycyclic aromatic compounds (PACs) in industrial effluents can not only persist in wastewater but also transform into more toxic and mobile substituted heterocyclic products during treatment. Predicting the occurrence of PACs and their heterocyclic derivatives (HPACs) in coking wastewater is therefore important for reducing the environmental risks to receiving water bodies. Although HPACs can be monitored through sampling and analysis, the characterisation techniques are costly and time-consuming. In this study, we propose three distinct kernel-based machine learning (ML) models for predicting PACs, including substituted HPACs and alkylated PACs, in coking wastewater. Using routinely measured wastewater quality data as model inputs, we predicted the occurrence of 14 HPACs in the final effluent with an R² of 0.83. Further performance assessment of the support vector regression (SVR) model showed a mean absolute logarithmic error (MALE) of 0.46 and a root mean square error (RMSE) of 0.073 ng/L. By comparison, the k-nearest neighbor and random forest models achieved R² values of 0.75 and 0.76, respectively, for HPAC prediction. Feature analysis revealed that the superior predictive performance of the SVR model stemmed from the greater weight (81%) it assigned to dissolved organic carbon (DOC) and total ammonia, input variables that could capture the secondary transformations likely occurring in the treatment plant. Partial dependence plots indicated that ammonia levels above 120 mg/L and DOC levels of 50-60 mg/L were associated with higher concentrations of dissolved HPACs in coking effluent. This work highlights the capability of kernel-based ML models to capture nonlinear wastewater chemistry and offers a tool for monitoring trace organic contaminants released in coking wastewater effluents.
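The three performance metrics reported above (R2, MALE and RMSE) can be illustrated with a minimal sketch. It is not the study's code. It assumes that MALE is the mean absolute difference between the base-10 logarithms of predicted and observed concentrations; the abstract does not state the log base or the exact formula. The function names and the sample concentrations are hypothetical and chosen only for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of the inputs (e.g. ng/L)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def male(observed, predicted):
    """Mean absolute logarithmic error.

    Assumption: base-10 logarithms of strictly positive concentrations;
    the source does not specify the base.
    """
    log_errors = [abs(math.log10(p) - math.log10(o))
                  for o, p in zip(observed, predicted)]
    return sum(log_errors) / len(log_errors)


# Hypothetical HPAC concentrations in ng/L (illustrative values only).
observed = [0.12, 0.30, 0.45, 0.80]
predicted = [0.10, 0.33, 0.50, 0.70]

print("R2   =", r_squared(observed, predicted))
print("RMSE =", rmse(observed, predicted), "ng/L")
print("MALE =", male(observed, predicted))
```

Under this assumed definition, a MALE of 1.0 would mean that predictions are off by one order of magnitude on average, which is why a logarithmic error complements RMSE when concentrations span several orders of magnitude.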