We propose strategies that couple natural language processing with deep learning to enhance machine capability for corrosion-resistant alloy design. First, the accuracy of machine learning models for materials datasets is often limited by their inability to incorporate textual data. Manual extraction of numerical parameters from descriptions of alloy processing or experimental methodology inevitably reduces information density. To overcome this, we have developed a fully automated natural language processing approach that transforms textual data into a form suitable as input to a deep neural network. This approach has yielded a pitting potential prediction accuracy substantially beyond the state of the art. Second, we have implemented a deep learning model with a transformed-input feature space, in which alloy compositions are replaced by a set of numerical descriptors based on the physical and chemical properties of the constituent elements. This helped identify the descriptors that are most critical to enhancing the pitting potential of the alloys. In particular, configurational entropy, atomic packing efficiency, local electronegativity differences, and atomic radii differences proved to be the most critical.
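As an illustration of how a composition can be mapped to property-based descriptors, the following minimal sketch computes three quantities related to those named above, using definitions common in the alloy design literature: the ideal configurational entropy of mixing, a relative atomic radius mismatch, and a composition-weighted electronegativity spread. These formulas, the function names, and the elemental values in the lookup tables are assumptions made for this example; the abstract does not specify the exact definitions or data sources used in the work.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

# Illustrative elemental data (assumed for this sketch, not taken from the paper).
# Radii are approximate metallic radii in pm; electronegativities are Pauling values.
ATOMIC_RADIUS_PM = {"Fe": 126, "Cr": 128, "Ni": 124, "Co": 125, "Mn": 127}
ELECTRONEGATIVITY = {"Fe": 1.83, "Cr": 1.66, "Ni": 1.91, "Co": 1.88, "Mn": 1.55}


def normalize(composition):
    """Convert element amounts (any units) to atomic fractions that sum to 1."""
    total = sum(composition.values())
    return {el: amt / total for el, amt in composition.items() if amt > 0}


def configurational_entropy(composition):
    """Ideal mixing entropy: -R * sum(c_i * ln c_i), in J/(mol*K)."""
    fractions = normalize(composition)
    return -R * sum(c * math.log(c) for c in fractions.values())


def radius_mismatch(composition, radii=ATOMIC_RADIUS_PM):
    """Relative size mismatch: sqrt(sum(c_i * (1 - r_i / r_mean)^2))."""
    fractions = normalize(composition)
    r_mean = sum(c * radii[el] for el, c in fractions.items())
    return math.sqrt(sum(c * (1 - radii[el] / r_mean) ** 2
                         for el, c in fractions.items()))


def electronegativity_spread(composition, chi=ELECTRONEGATIVITY):
    """Weighted standard deviation: sqrt(sum(c_i * (chi_i - chi_mean)^2))."""
    fractions = normalize(composition)
    chi_mean = sum(c * chi[el] for el, c in fractions.items())
    return math.sqrt(sum(c * (chi[el] - chi_mean) ** 2
                         for el, c in fractions.items()))


def descriptors(composition):
    """Bundle the three descriptors into one feature dictionary."""
    return {
        "configurational_entropy": configurational_entropy(composition),
        "radius_mismatch": radius_mismatch(composition),
        "electronegativity_spread": electronegativity_spread(composition),
    }


if __name__ == "__main__":
    # Hypothetical equiatomic five-component alloy used only as a demonstration.
    alloy = {"Fe": 20, "Cr": 20, "Ni": 20, "Co": 20, "Mn": 20}
    for name, value in descriptors(alloy).items():
        print(f"{name}: {value:.4f}")
```

A feature vector of this kind could replace the raw composition as model input, so that the contribution of each descriptor to the predicted pitting potential can be examined directly.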