The National Institute for Materials Science, Japan, has been developing a materials data platform linked with a materials data repository system to enable rapid searching for new materials using materials informatics. Converting raw data into a human-legible, machine-readable data file that includes metainformation is one of the key preparation steps prior to data analysis. The tools developed by the authors can convert raw data to a structured data package that consists of mandatory and measurement-characterization metadata, primary and raw parameters, and formatted numerical data (FND). The FND are expressed as a matrix type with robust flexibility. This flexibility is achieved by applying the data analysis architecture of schema-on-read rather than schema-on-write based on de jure standards, such as ISO documents. The primary parameters are carefully selected from the raw parameters, and their vocabularies are transformed from instrument-dependent terms to general terms that everyone can readily understand. The converted data are linked with, for example, specimen information, process information, specimen handling records, and the electronic laboratory notebook. Using this raw-to-repository (R2R) conversion flow, the authors demonstrated that they can generate and store interoperable data files of x-ray photoelectron spectroscopy (XPS) spectra and depth profiles, powder x-ray diffraction patterns, (scanning) transmission electron microscope images, transmission electron diffraction patterns, electron energy-loss spectroscopy spectra, and calculated electron inelastic mean free path data. Linking measurement data to other required information ensures experimentally repeatable, replicable, and reproducible results.
The mandatory and characterization metadata are used for quick electronic searching, and primary and raw parameters are convenient for setting up measurement conditions and useful for reproducibility/replicability and replicability/repeatability, respectively. The FND are human legible and can be read by machine using parser software, which gives the data a long usable lifetime. The authors also developed a system for semiautomatic data transfer from an instrument-controlling personal computer (PC) that is isolated from the communication network; it uses the scripting capability of a Wi-Fi-capable Secure Digital (SD) card, so the PC itself stays offline. They are developing further software for on-demand data manipulation after R2R data conversion. To date, it has been possible to perform XPS peak separation using an automated information compression technique without any a priori assumption. By combining R2R conversion with a high-throughput data collection system and an automated data analysis routine, highly reproducible data acquisition and data analysis can be achieved with minimal human interaction. At this early stage, the authors demonstrate automated peak separation processing for XPS C 1s and O 1s narrow spectra of polyethylene terephthalate with very high reproducibility.
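As an illustration only, the package structure described above might be sketched in Python as follows. This is not the authors' tool: the key names, the `VOCABULARY` mapping, the functions `build_package` and `read_column`, and all values are hypothetical, and JSON is assumed here merely as a convenient human-legible, machine-readable container. The sketch shows primary parameters being selected from raw parameters and renamed to general terms, and the FND matrix being stored with its own column labels so that a reader interprets it at read time (schema-on-read).

```python
import json

# Hypothetical mapping from instrument-dependent terms to general terms.
VOCABULARY = {
    "XRaySrc": "x_ray_source",
    "PassEn": "pass_energy_eV",
}


def build_package(raw_params, columns, rows, mandatory, characterization):
    """Assemble a structured data package from already-parsed raw data."""
    # Primary parameters: the subset of raw parameters with a general-term name.
    primary = {VOCABULARY[k]: v for k, v in raw_params.items() if k in VOCABULARY}
    return {
        "mandatory_metadata": mandatory,
        "characterization_metadata": characterization,
        "primary_parameters": primary,
        "raw_parameters": dict(raw_params),  # kept unchanged alongside the primary set
        "formatted_numerical_data": {
            "columns": list(columns),          # labels travel with the matrix
            "matrix": [list(r) for r in rows],
        },
    }


def read_column(package, name):
    """Schema-on-read: interpret the matrix using the labels stored with it."""
    fnd = package["formatted_numerical_data"]
    index = fnd["columns"].index(name)
    return [row[index] for row in fnd["matrix"]]


# Made-up example values, for illustration only.
package = build_package(
    raw_params={"XRaySrc": "Al K-alpha", "PassEn": 20.0, "VendorFlag7": 1},
    columns=["binding_energy_eV", "intensity_counts"],
    rows=[(290.0, 120.0), (289.9, 135.0), (289.8, 128.0)],
    mandatory={"title": "example XPS C 1s spectrum"},
    characterization={"method": "XPS"},
)

# Round-trip through text to mimic storing the package and parsing it later.
restored = json.loads(json.dumps(package))
intensities = read_column(restored, "intensity_counts")
```

Because the column labels are stored with the matrix, a measurement with a different number or kind of columns needs no change to the reader, which is one way to picture the flexibility attributed to the FND.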