Background
PubChem is an open archive consisting of a set of three primary public databases (BioAssay, Compound, and Substance). It contains information on a broad range of chemical entities, including small molecules, lipids, carbohydrates, and (chemically modified) amino acid and nucleic acid sequences (including siRNA and miRNA). Currently (as of Nov. 2015), PubChem contains more than 150 million depositor-provided chemical substance descriptions, 60 million unique chemical structures, and 225 million biological activity test results provided from over 1 million biological assay records.

Description
Many PubChem records (substances, compounds, and assays) include depositor-provided cross-references to scientific articles in PubMed. Some PubChem contributors provide bioactivity data extracted from scientific articles. Literature-derived bioactivity data complement high-throughput screening (HTS) data from the concluded NIH Molecular Libraries Program and other HTS projects. Some journals provide PubChem with information on chemicals that appear in their newly published articles, enabling concurrent publication of scientific articles in journals and associated data in public databases. In addition, PubChem links records to PubMed articles indexed with the Medical Subject Heading (MeSH) controlled vocabulary thesaurus.

Conclusion
Literature information, both provided by depositors and derived from MeSH annotations, can be accessed using PubChem’s web interfaces, enabling users to explore information available in literature related to PubChem records beyond typical web search results.

Graphical abstract
Literature information for PubChem records is derived from various sources.