The frequency-dependent optical spectrum is pivotal for a broad range of applications, from material characterization to optoelectronics and energy harvesting. Data-driven surrogate models trained on density functional theory (DFT) data have alleviated the scalability limitations of DFT while preserving its chemical accuracy, expediting material discovery. However, prevailing machine learning (ML) efforts often focus on scalar properties such as the band gap, overlooking the complexities of optical spectra. In this work, we employ deep graph neural networks (GNNs) to predict the frequency-dependent, complex-valued dielectric function across the infrared, visible, and ultraviolet ranges directly from crystal structures. We explore multiple architectures for the spectral multioutput representation of the dielectric function and use several multifidelity learning strategies, such as transfer learning and fidelity embedding, to address the scarcity of high-fidelity DFT data. Additionally, we model key solar cell absorption efficiency metrics and demonstrate that these metrics are learned more accurately when they are incorporated as a learning bias into the training of the frequency-dependent absorption coefficient. This study demonstrates that multioutput and multifidelity ML techniques enable accurate predictions of optical spectra from crystal structures, providing a versatile tool for rapidly screening materials for optoelectronics, optical sensing, and solar energy applications across a broad frequency range.