Abstract

The three databases PubChem, ChemSpider, and UniChem capture the majority of open chemical structure records, with February 2018 totals of 95, 63, and 154 million, respectively. Collectively, they constitute a massively enabling resource for cheminformatics, chemical biology, and drug discovery. As meta‐portals, they subsume and link out to most of the public bioactivity data extracted from the literature and from screening center assay results. They therefore present three different entry points, while the many independent resources they subsume present a fourth in the form of standalone databases. Because this creates a complex picture, it is important for users to have at least some appreciation of the differential content so that they can judge utility for the task at hand. This turns out to be challenging. By comparing the three resources in detail, this review assesses their differences, some of which are not obvious. One is that coverage differs significantly between their 587, 282, and 38 contributing sources, respectively. This raises not only the “who‐has‐what” question but also the question of “why”, since the reason any particular inclusion is considered valuable is rarely made explicit. Also confusing is that sources nominally in common (i.e., having the same submitter name) can have significantly different structure counts, not only across the three portals but also relative to their standalone instantiations. An assessment of a series of examples indicates that differences in loading dates and structural standardization are the main causes of this inter‐portal discordance.
