Abstract

Degradation of coatings and structural materials by high-temperature corrosion in molten salt environments is a major concern for critical infrastructure applications and limits their commercial viability. Selecting high-value coatings and structural (construction part) materials is challenging, and a data-centric approach may therefore accelerate materials discovery and improve data practices. This research uses a machine learning (ML) approach to estimate the corrosion rates of materials operated under high-temperature conditions (e.g., nuclear, geothermal, dry/wet oxidation, and solar applications), with a focus on nuclear thermochemical cycles. Published data on materials (structural and coating materials), their composition and manufacturing, and the corrosion environment were gathered and analysed. The analysis demonstrated that the random forest regression model is highly precise compared with the other models. The assessment indicates that only a very limited set of materials is likely to survive a high-temperature corrosive environment for extended periods of exposure. Although a larger, higher-quality dataset is required to predict corrosion rates accurately, the findings demonstrate the value of ML regression and data mining capabilities for corrosion data analysis. Given the research gap in material selection strategies, the proposed research will be critical to advancing data analytics approaches that exploit material properties for high-temperature corrosion applications.

Graphical Abstract