Abstract
Background
Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations. Integrating data requires semantically rich information. In this paper an end-to-end use of semantically rich data in computational chemistry is demonstrated utilizing the Chemical Markup Language (CML) framework. Semantically rich data is generated by the NWChem computational chemistry software with the FoX library and utilized by the Avogadro molecular editor for analysis and visualization.
Results
The NWChem computational chemistry software has been modified and coupled to the FoX library to write CML compliant XML data files. The FoX library was expanded to represent the lexical input files and molecular orbitals used by the computational chemistry software. Draft dictionary entries and a format for molecular orbitals within CML CompChem were developed. The Avogadro application was extended to read in CML data, and display molecular geometry and electronic structure in the GUI allowing for an end-to-end solution where Avogadro can create input structures, generate input files, NWChem can run the calculation and Avogadro can then read in and analyse the CML output produced. The developments outlined in this paper will be made available in future releases of NWChem, FoX, and Avogadro.
Conclusions
The production of CML compliant XML files for computational chemistry software such as NWChem can be accomplished relatively easily using the FoX library. The CML data can be read in by a newly developed reader in Avogadro and analysed or visualized in various ways. A community-based effort is needed to further develop the CML CompChem convention and dictionary. This will enable the long-term goal of allowing a researcher to run simple “Google-style” searches of chemistry and physics and have the results of computational calculations returned in a comprehensible form alongside articles from the published literature.
Electronic supplementary material
The online version of this article (doi:10.1186/1758-2946-5-25) contains supplementary material, which is available to authorized users.
Highlights
Multidisciplinary integrated research requires the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations.
Our experience of using FoX to introduce Chemical Markup Language (CML) output into NWChem has reinforced our understanding of various best-practice guidelines for integrating FoX into large-scale computational chemistry simulation codes.
This cue is absent in the case of external converters that parse an output file designed only for human consumption. If this hint to developers is insufficient, the use of FoX provides a second line of defence against breakage of the CML output: such damage will result in the NWChem code failing to compile or to run simple test cases.
Summary
The key to successful multidisciplinary integrated research is often the ability to couple the diverse sets of data obtained from a range of complex experiments and computer simulations to solve scientific problems that are intractable when only one technique is employed. Raw data (for example, the recorded NMR signal or calculated molecular orbitals) is only a subset of all the scientific data generated from an experiment or computer simulation. This acquired data is processed and analyzed with a barrage of tools to extract the important observables, i.e., derived data, and to help create the scientific interpretation. While the experimental community has been working to develop and deploy data standards, this is less true in the computational chemistry community.
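As a rough illustration of what consuming semantically rich CML data can look like, the sketch below reads element symbols and Cartesian coordinates from a small CML molecule fragment using only Python's standard library. The fragment is hand-written for illustration and is not actual NWChem output, the function name `read_atoms` is hypothetical, and the code makes no claim about how the Avogadro CML reader is implemented.

```python
import xml.etree.ElementTree as ET

# Namespace used by CML documents.
CML_NS = "http://www.xml-cml.org/schema"

# Hypothetical, hand-written CML fragment (assumed for illustration only;
# real NWChem/FoX output contains far more, e.g. CompChem job metadata).
SAMPLE_CML = """
<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
</molecule>
"""


def read_atoms(cml_text):
    """Return a list of (element, x, y, z) tuples from a CML string."""
    root = ET.fromstring(cml_text.strip())
    atoms = []
    # Walk every <atom> element in the CML namespace, wherever it is nested.
    for atom in root.iter("{%s}atom" % CML_NS):
        atoms.append(
            (
                atom.get("elementType"),
                float(atom.get("x3")),
                float(atom.get("y3")),
                float(atom.get("z3")),
            )
        )
    return atoms


if __name__ == "__main__":
    for element, x, y, z in read_atoms(SAMPLE_CML):
        print(f"{element:2s} {x:8.3f} {y:8.3f} {z:8.3f}")
```

Because each value is carried by a named attribute rather than by its position in a free-form text listing, a reader of this kind does not depend on the layout of a human-oriented output file.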