Abstract

Inconsistent polymer indexing, caused by the lack of uniformity in how polymer names are expressed, is a major challenge for the widespread use of polymer-related data resources and limits the broad application of materials informatics to innovation across broad classes of polymer science and polymer-based materials. The current solution of using a variety of different chemical identifiers has proven insufficient to address the challenge and is not intuitive for researchers. This work proposes a multi-algorithm-based mapping methodology, ChemProps, that is optimized to solve the polymer indexing issue and is designed to be easy to update both in depth and in width. A RESTful API is provided for lightweight data exchange and easy integration across data systems. A weight factor is assigned to each algorithm to generate scores for candidate chemical names, and the weight factors are optimized to maximize the minimum score difference between the ground truth chemical name and the other candidate chemical names. Ten-fold validation is used on the 160 training data points to prevent overfitting. The resulting set of weight factors achieves 100% test accuracy on the 54 test data points. The weight factors will evolve as ChemProps grows. With ChemProps, other polymer databases can remove duplicate entries and enable a more accurate “search by SMILES” function by using ChemProps as a common name-to-SMILES translator through API calls. ChemProps is also an excellent tool for auto-populating polymer properties thanks to its easy-to-update design.
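
The weighting scheme described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the toy data, the use of three algorithms instead of twelve, the helper names, and the coarse grid search, which stands in for whatever optimizer the authors actually used. It only shows the objective itself, which is to pick weights that maximize the smallest gap between the ground truth name's score and the best competing candidate's score.

```python
from itertools import product

def candidate_scores(weights, features):
    """Weighted sum of per-algorithm match values for each candidate name."""
    return {
        name: sum(w * f for w, f in zip(weights, values))
        for name, values in features.items()
    }

def margin(weights, features, truth):
    """Score of the ground truth name minus the best other candidate's score."""
    scores = candidate_scores(weights, features)
    best_other = max(s for name, s in scores.items() if name != truth)
    return scores[truth] - best_other

def min_margin(weights, dataset):
    """Objective to maximize: the worst-case margin over all training queries."""
    return min(margin(weights, feats, truth) for feats, truth in dataset)

# Hypothetical toy data: each query maps candidate names to the outputs of
# three made-up matching algorithms (1 = match, 0 = no match).
dataset = [
    ({"polystyrene": [1, 0, 1], "polypropylene": [0, 1, 1]}, "polystyrene"),
    ({"poly(vinyl chloride)": [0, 1, 1], "poly(vinyl acetate)": [1, 0, 0]},
     "poly(vinyl chloride)"),
]

# Coarse grid search as a stand-in optimizer (an assumption, not the paper's method).
grid = [0.0, 0.5, 1.0]
best_weights = max(product(grid, repeat=3), key=lambda w: min_margin(w, dataset))

print(best_weights, min_margin(best_weights, dataset))
```

A positive worst-case margin means every ground truth name ranks first on the training data; a larger margin leaves more room for noisy inputs before the ranking flips.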

Highlights

  • Inconsistent polymer indexing, caused by the lack of uniformity in how polymer names are expressed, is a major challenge for the widespread use of polymer-related data resources and limits the broad application of materials informatics to innovation across broad classes of polymer science and polymer-based materials

  • We propose ChemProps, a database with a RESTful Application Programming Interface (API) that takes in common polymer names and returns chemical identifiers, tolerating differences in how the names are expressed

  • In this work, we propose a mapping methodology based on twelve algorithms, named ChemProps, that is optimized to solve the polymer indexing issue that routinely impedes the progress of Materials Informatics for polymer-based systems

Summary

Introduction

Significant advances in computing power in the past decade have given birth to many data-driven approaches including Materials Informatics, which facilitates understanding of processing-structure–property relationships […] Materials Informatics requires data to be reliable, uniform and stored in a controlled manner [5]. This seemingly simple requirement has posed many challenges for polymeric materials data due to prevalent use of different naming conventions and abbreviations for polymers.

[…] they confound attempts to curate data in the robust and consistent manner essential for indexing into databases for Materials Informatics, which is considered a major impediment for the adoption of machine learning techniques [6, 7]. Because of the lack of uniformity in expression of polymer names in publications and data sets, exploration of the data via search and visualization tools becomes problematic, leading to difficulties in using a polymer data resource as a viable tool in […]

Search inputs:

  • Search type: use “pol” for polymer or “fil” for filler
  • Chemical name to locate
  • Optional abbreviation to locate
  • Optional trade name to locate
  • Optional specific SMILES value to locate
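
The search inputs listed above could be assembled into a request roughly as follows. This is a minimal sketch under stated assumptions: the endpoint URL and the field names (`search_type`, `chemical_name`, `abbreviation`, `trade_name`, `smiles`) are hypothetical placeholders, since the text here gives neither the real address nor the real parameter names. Only the "pol"/"fil" values and the split between required and optional inputs come from the source. The sketch builds the query string and sends nothing over the network.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the actual ChemProps address is not given in the text.
BASE_URL = "https://example.org/chemprops"

def build_query(search_type, chemical_name, abbreviation=None,
                trade_name=None, smiles=None):
    """Build a search URL from the inputs described in the text.

    All field names used here are illustrative assumptions.
    """
    if search_type not in ("pol", "fil"):
        raise ValueError('search_type must be "pol" (polymer) or "fil" (filler)')
    if not chemical_name:
        raise ValueError("chemical_name is required")

    params = {"search_type": search_type, "chemical_name": chemical_name}
    optional = {
        "abbreviation": abbreviation,
        "trade_name": trade_name,
        "smiles": smiles,
    }
    # Only include the optional fields the caller actually supplied.
    params.update({k: v for k, v in optional.items() if v is not None})
    return f"{BASE_URL}?{urlencode(params)}"

url = build_query("pol", "poly(methyl methacrylate)", abbreviation="PMMA")
print(url)
```

Passing the abbreviation alongside the chemical name gives the service a second handle on the same polymer, which is the kind of tolerance to naming differences the paper describes.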
