Abstract

The ability to auto-generate databases of optical properties holds great promise for data-driven materials discovery in optoelectronic applications. We present a cognate set of experimental and computational data that describes key features of optical absorption spectra. This includes an auto-generated database of 18,309 records of experimentally determined UV/vis absorption maxima, λmax, and associated extinction coefficients, ε, where present. This database was produced by applying the text-mining toolkit ChemDataExtractor to 402,034 scientific documents. High-throughput electronic-structure calculations using fast (simplified Tamm-Dancoff approach) and traditional (time-dependent) density functional theory were executed to predict λmax and oscillator strengths, f (related to ε), for a subset of validated compounds. Paired quantities of these computational and experimental data show strong correlations in λmax, f and ε, paving the way for reliable in silico calculations of additional optical properties. The total dataset of 8,488 unique compounds, and a subset of 5,380 compounds with both experimental and computational data, are available in MongoDB, CSV and JSON formats. These can be queried using Python, R, Java, and MATLAB for data-driven optoelectronic materials discovery.
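As a rough illustration of how paired experimental and computed λmax values from a JSON export might be compared, the sketch below filters records that carry both values and computes a Pearson correlation coefficient. The field names (`compound`, `lambda_max_exp_nm`, `lambda_max_calc_nm`) and the sample values are placeholders invented for this example; they are not the schema or contents of the published database.

```python
import json
from math import sqrt

# Placeholder records: field names and numbers are hypothetical,
# chosen only to illustrate the pairing step.
SAMPLE_JSON = """
[
  {"compound": "A", "lambda_max_exp_nm": 350.0, "lambda_max_calc_nm": 342.0},
  {"compound": "B", "lambda_max_exp_nm": 420.0, "lambda_max_calc_nm": 431.0},
  {"compound": "C", "lambda_max_exp_nm": 515.0, "lambda_max_calc_nm": 498.0},
  {"compound": "D", "lambda_max_exp_nm": 610.0, "lambda_max_calc_nm": null}
]
"""


def paired_records(records):
    """Keep only records that have both an experimental and a computed value."""
    return [
        r for r in records
        if r.get("lambda_max_exp_nm") is not None
        and r.get("lambda_max_calc_nm") is not None
    ]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


records = json.loads(SAMPLE_JSON)
pairs = paired_records(records)
exp_values = [r["lambda_max_exp_nm"] for r in pairs]
calc_values = [r["lambda_max_calc_nm"] for r in pairs]
r_value = pearson_r(exp_values, calc_values)
print(f"{len(pairs)} paired records, Pearson r = {r_value:.3f}")
```

With a real export, the inline string would be replaced by the contents of the downloaded file, and the field names adjusted to whatever the actual schema uses.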

Highlights

  • Materials databases that comprise cognate experimental and computational data place calculations in an advantageous position: their associated wavefunctions can be used to generate many more data, with confidence that these data are reliable, so that the database is further enriched with appropriate information. Given this potential vantage point for computational data, a pipeline that auto-generates materials databases of cognate experimental and computational data was deemed strategically useful. Realizing this goal is the subject of this paper, which presents a new materials database of UV/vis absorption spectral attributes[6]. Its experimental component has been auto-generated by mining text from documents in the scientific literature, and each record pertains to a chemical material, its peak absorption wavelength(s), λmax, and the molar extinction coefficient of each peak, ε.

  • These data are coupled to the results of a computational pipeline that uses fast and slow quantum-chemical methods, within a high-throughput computational framework, to produce comparable UV/vis absorption spectral metrics: λmax and the oscillator strength, f, a metric related to ε

  • HTML and XML article formats were included in the data extraction by restricting the download to articles released after the year 2000



Background & Summary

Progress in materials science is driven by the publication of articles in scientific journals, where results are presented in tables, figures and continuous prose. This paper presents a new materials database of UV/vis absorption spectral attributes[6] whose experimental component has been auto-generated by mining text from documents in the scientific literature; each record pertains to a chemical material, its peak absorption wavelength(s), λmax, and the molar extinction coefficient of each peak, ε. These data are coupled to the results of a computational pipeline that uses fast (approximate) and slow (traditional) quantum-chemical methods, within a high-throughput computational framework, to produce comparable UV/vis absorption spectral metrics: λmax and the oscillator strength, f, a metric related to ε (see Fig. 1). This work moves towards the ultimate goal of data-driven materials discovery for optical and optoelectronic applications.
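The text describes f only as "a metric related to ε". One common textbook way to make that relation concrete is f ≈ 4.32 × 10⁻⁹ ∫ ε(ν̃) dν̃, with ε in L mol⁻¹ cm⁻¹ and ν̃ in cm⁻¹. The sketch below applies it to a single absorption band under two assumptions that are not taken from the paper: the band is Gaussian, and no refractive-index correction is applied. The function names and the input values are illustrative only.

```python
from math import log, pi, sqrt

# Textbook prefactor linking integrated molar absorptivity
# (L mol^-1 cm^-2) to the dimensionless oscillator strength.
F_PER_INTEGRATED_EPSILON = 4.32e-9


def wavelength_nm_to_wavenumber(lambda_nm):
    """Convert a wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / lambda_nm


def gaussian_band_oscillator_strength(eps_max, fwhm_cm1):
    """Estimate f for one band, assuming a Gaussian band shape.

    eps_max  : peak molar extinction coefficient (L mol^-1 cm^-1)
    fwhm_cm1 : full width at half maximum of the band (cm^-1)
    """
    # Area under a Gaussian = peak height * FWHM * sqrt(pi / (4 ln 2))
    band_area = eps_max * fwhm_cm1 * sqrt(pi / (4.0 * log(2.0)))
    return F_PER_INTEGRATED_EPSILON * band_area


# Illustrative inputs, not values from the database.
peak_wavenumber = wavelength_nm_to_wavenumber(500.0)
f_estimate = gaussian_band_oscillator_strength(eps_max=5.0e4, fwhm_cm1=3.0e3)
print(f"peak at {peak_wavenumber:.0f} cm^-1, estimated f = {f_estimate:.3f}")
```

Because the estimate needs a band width as well as ε at the peak, a record that reports only λmax and ε cannot be converted to f without an extra assumption about band shape.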

