Abstract

In the big data era, data-driven machine learning plays a vital role in the development of energy materials. Innovative research on high-quality energy material databases, appropriate material feature descriptors, and efficient prediction and generation models is urgently needed and will bring significant breakthroughs to the discovery of new energy materials.

To realize the sustainable development of society, advanced materials for energy storage and conversion are urgently needed. For a long time, the development of new materials has relied heavily on tedious trial-and-error experiments, whose long cycles and high costs fall far short of modern requirements for advanced materials. With the rapid development of supercomputers and the wide application of density functional theory, high-precision first-principles calculations have been widely used in materials design. With the help of quantum mechanics, researchers can quickly select the best-performing materials from thousands of candidates, which greatly shortens the development of new materials. However, the cost of such ab initio calculations is still high, and they can become intractable for large-scale complex systems. These disadvantages limit their application in further materials discovery.

In the big data era, with the explosive growth of new material samples, the effective management and use of existing data is the key to accelerating materials design. The greatest challenge is therefore how to evaluate and analyze existing datasets quickly and efficiently to uncover hidden rules. Artificial intelligence (AI) technology, which can extract information from massive data, has been applied in all aspects of daily life and has gradually penetrated the natural sciences, where it helps with experimental scheme design and functional material screening. Machine learning is the core of AI as well as the key to realizing computer intelligence.
Based on machine learning, computers can automatically learn the relationships between the characteristics and properties of data samples, which are often beyond human cognition, and discover the hidden laws behind high-dimensional data. The learned model can then be used to predict the properties of unknown samples and to generate potentially new ones. In recent years, with the rapid growth of material databases, machine learning has also been applied in materials science. Its introduction has greatly reduced the cost of high-throughput computational material screening and accelerated the research and development of energy materials. Meanwhile, feature analysis of the built models can further deepen our understanding of material structure–property relationships and help to explore new theoretical mechanisms.

At present, machine learning-assisted material development methods have been used in the field of new energy materials. For example, several breakthroughs have been made in photovoltaic materials, thermoelectric materials, lithium-ion batteries, catalytic hydrogen production, and so on [1-3]. In photovoltaic material discovery, machine learning models, combined with appropriate data-mining and feature-extraction algorithms, have already been used to predict optical band gaps, photoelectric conversion efficiencies, and other related properties, avoiding tedious density functional theory calculations. At the same time, several effective generation algorithms can automatically generate potential high-performance materials in the prescreening process. For example, a machine learning model can search the full space of perovskite photovoltaic materials by screening key parameters such as band gap and formation energy, after which unexplored high-performance photovoltaic materials can be generated.
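
The screening step described above, filtering candidates on predicted band gap and formation energy, might be sketched as follows. This is a minimal illustration and not the method of any cited work: the `Candidate` schema, the band gap window of 1.0 to 1.8 eV, and the use of a non-positive formation energy as a stability proxy are all assumptions, and the candidate labels and numbers are invented placeholders rather than real predictions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate material with model-predicted properties (hypothetical schema)."""
    formula: str
    band_gap_ev: float          # predicted band gap, eV
    formation_energy_ev: float  # predicted formation energy, eV/atom


def screen(candidates, gap_window=(1.0, 1.8), max_formation_energy=0.0):
    """Keep candidates whose predicted band gap lies inside gap_window and
    whose predicted formation energy does not exceed max_formation_energy.
    The result is sorted by formation energy, lowest first."""
    low, high = gap_window
    kept = [
        c for c in candidates
        if low <= c.band_gap_ev <= high
        and c.formation_energy_ev <= max_formation_energy
    ]
    return sorted(kept, key=lambda c: c.formation_energy_ev)


# Placeholder inputs: labels and numbers are invented for illustration only.
pool = [
    Candidate("candidate-A", 1.50, -0.30),
    Candidate("candidate-B", 2.40, -0.50),  # band gap outside the window
    Candidate("candidate-C", 1.20, 0.15),   # positive formation energy
    Candidate("candidate-D", 1.70, -0.10),
]

shortlist = screen(pool)
print([c.formula for c in shortlist])
```

In practice the predicted values would come from a trained regression model rather than being typed in, and the thresholds would be chosen for the target application.
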
In addition, after model training, weight analysis can help us understand the sample characteristics in depth, which is of great significance for inspiring and designing new materials.

In addition to photovoltaic materials, thermoelectric materials are another kind of clean energy material, converting thermal energy directly into electrical energy. Thermoelectric performance is related to the Seebeck coefficient, thermal conductivity, electrical conductivity, and many other properties of the material. Through natural language processing and first-principles calculations, several experimental and computational databases for thermoelectric materials have been developed. These databases collect hundreds of thermoelectric materials together with their thermoelectric figure of merit (ZT), Seebeck coefficient, thermal conductivity, and other related information at different temperatures. With these samples, researchers have trained predictive models to explore the most important factors governing thermoelectric conversion properties such as ZT or the Seebeck coefficient. Machine learning models are also widely used to predict the lattice thermal conductivity, one of the most important factors determining the conversion between thermal and electrical energy. Beyond periodic solid materials, machine learning algorithms have significant advantages in large-scale and irregular systems.

In the field of new energy materials, energy storage is as crucial as energy conversion. The lithium-ion battery is the most important energy storage device, yet it remains unsatisfactory in energy density, power density, cycle life, cost, and safety. Compared with traditional approaches, AI methods could significantly accelerate the development of new battery systems.
By learning from data in the literature or existing databases, efficient machine learning models can be used to develop novel electrode materials, greatly improving screening efficiency and identifying reliable new materials. Currently, machine learning has shown excellent performance in predicting the properties of electrolyte and electrode materials, as well as battery state and lifetime.

In addition, catalysts play a vital role in the new energy industry, for example in the photodissociation of water to produce hydrogen, carbon dioxide reduction, and fuel cells. Efficient catalyst materials are therefore highly desired to achieve higher reaction efficiency. In the design of new catalytic materials, machine learning models can screen catalysts by rapidly predicting the adsorption energies of crucial intermediates on the catalysts. At the same time, through weight analysis of the model parameters, we can better understand the relationship between catalyst structure and performance.

In short, machine learning plays a vital role in the development of energy materials. The explosive growth of high-performance algorithms and material databases provides fertile ground for the booming of machine learning in energy materials development. However, as a new approach, machine learning faces challenges as well as opportunities.

First, because machine learning is a data-driven method, sufficient data is a necessary condition for ensuring the accuracy of the trained model. With the development of materials genome engineering, more and more commercial or open-source material databases have been developed. However, many published data have still not been collected and collated. Compared with databases in stock exchange, public transportation, or biomedicine, most material databases are limited in versatility, normalization, and scale. At the same time, few databases collect negative data and materials with poor performance, which also play important roles in model training.
To solve these problems, researchers should, on the one hand, establish a generally accepted data storage standard and develop an open collaboration framework for machine-readable data, which would standardize data and promote data sharing. On the other hand, the development of specialized natural language processing technology would also help enlarge the scale of material databases. For example, text recognition and mining technology has already been applied to chemistry and materials science.

Second, the performance of a trained model in predicting material properties largely depends on an appropriate description of the materials. In most situations, feature selection and material description rely on the intuition of researchers. Generally, descriptors are constructed by encoding structural or electronic attributes, such as mass, atomic number, atomic type, electronic band gap, dielectric constant, work function, electron density, electron affinity, and so on. Researchers' understanding of the problem therefore plays a decisive role in feature engineering, which in turn affects the trained model. However, even excellent researchers cannot exhaustively identify all the best features and the most efficient encodings. Automated feature engineering has therefore been proposed. Compared with the manual approach, automated feature engineering is more efficient and repeatable. For example, deep learning provides an opportunity for automatic feature extraction and continuous training, which can reduce the incompleteness of manual operation and is an important future research trend.

Third, it is also important to select an appropriate machine learning algorithm or combination of algorithms before training the model. The selection depends not only on the training datasets, including their size, distribution, and internal correlation, but also on the problems to be solved.
There is no universal algorithm suitable for all problems, and sometimes multiple algorithms have to be integrated to obtain effective models. In addition, as databases expand, time consumption should also be taken into account during algorithm selection and model optimization.

Besides, interpretable machine learning models and material generation models are also important directions of machine learning in the field of energy materials. The machine learning model has long been considered a “black box” connecting input and output, and it is difficult to extract knowledge from the model and summarize it into general scientific laws. The interpretability of machine learning models is therefore also a key challenge. Developing interpretable algorithms, converting models into formulas, and summarizing scientific laws are important future directions.

In conclusion, as a relatively new direction in computational materials science, data-driven machine learning material screening methods will provide great opportunities for the design of new energy materials. Meanwhile, chemical synthesis robots equipped with artificial intelligence technology have also attracted attention in recent years [4-6]. Combining machine learning material screening with intelligent manufacturing will further accelerate the development of new energy materials (Figures 1 and 2).

This study was supported by the National Natural Science Foundation of China (22003046 and 22071172) and the research program “A Multi-Scale and High-Efficiency Computing Platform for Advanced Functional Materials,” funded by Haihe Laboratory in Tianjin (Grant No. 22HHXCJC00007).

The authors declare no conflicts of interest.
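
As an illustration of the second challenge discussed above, where descriptors are built by encoding attributes such as mass and atomic number, a simple hand-crafted composition descriptor might look like the sketch below. The function name `composition_descriptor`, the choice of three features, and the small `ELEMENT_DATA` table are assumptions made for this example; real studies typically use far richer feature sets.

```python
# Atomic number and standard atomic mass (u) for a few elements.
# The table is deliberately tiny; extend it for real use.
ELEMENT_DATA = {
    "Li": (3, 6.94),
    "O": (8, 15.999),
    "P": (15, 30.974),
    "Fe": (26, 55.845),
    "Co": (27, 58.933),
}


def composition_descriptor(composition):
    """Return [mean Z, mean mass, max Z - min Z] for a composition given as
    {element symbol: atom count}. Means are weighted by atomic fraction."""
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must contain at least one atom")
    mean_z = sum(ELEMENT_DATA[el][0] * n for el, n in composition.items()) / total
    mean_mass = sum(ELEMENT_DATA[el][1] * n for el, n in composition.items()) / total
    z_values = [ELEMENT_DATA[el][0] for el in composition]
    return [mean_z, mean_mass, float(max(z_values) - min(z_values))]


features = composition_descriptor({"Li": 1, "Co": 1, "O": 2})  # LiCoO2
print(features)
```

A fixed-length vector like this can be fed to any standard regression model. Because the features are chosen by hand, the sketch also shows why the result depends on the researcher's intuition, which is the limitation that automated feature engineering aims to reduce.
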
