Abstract
This commentary describes two simple procedures, using commercially available software packages, that greatly facilitate the creation and replication of data sets intended for quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) studies. Used properly, the procedures allow individual chemical structures to be captured from the Chemical Abstracts Service (CAS) SciFinder software in a computer-readable format that is recognized by most chemical database and computational software packages. The researcher need not draw a chemical structure to create a Molecular Design Limited (MDL) mol file, the 2D connection table format most commonly used to create the chemical depiction of a compound or drug. The MDL mol format is needed so that properties can be calculated from the chemical structure alone. All that is required is that the compound or drug be located in SciFinder. The procedures are described in considerable detail because the key steps for capturing structures from CAS SciFinder through the use of Accelrys' Accord for Excel software are not documented in either package. Also described is a batch procedure that allows CAS SciFinder to be searched for the exact chemical structures of up to 25 compounds. Without this procedure, SciFinder can be searched for an exact chemical structure only one compound at a time, using a query consisting of a drawn structure. Both the single-mode structure retrieval and the batch-mode compound search procedures save the researcher creating or replicating QSAR/QSPR data sets a very significant amount of time, and may enable structure searches that previously might not have been attempted because of researcher time constraints. These procedures neither increase nor decrease the cost to the user of searches against the SciFinder software.
These costs are determined by CAS policy and depend on the number of structures/compounds searched. Locating a compound or drug in SciFinder is most accurately done using the CAS Registry Number. The CAS Registry Number uniquely identifies a specific compound and salt form. Older, deleted CAS Registry Numbers for a specific compound may be encountered, but a search on the current (or an older) CAS Registry Number will always bring up the correct compound. If different salt forms of the same compound exist in the CAS databases, they will have different CAS Registry Numbers. By contrast, a search by compound or drug name may fail. IUPAC names for compounds are of no value in searches against SciFinder because IUPAC names are not listed in the records. Commonly used drug names work fairly well. However, variant or misspelled drug names are frequently found in the scientific literature. A deviation of even a single letter or number between a search input name and the names stored in the CAS databases will result in a search failure. Searches using drug trade names (as for very recent drugs) or company code numbers (as for early discovery-stage compounds) fail more frequently than searches using common names.
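Because searches by CAS Registry Number are the most reliable, it can help to screen a list of numbers for transcription errors before any search is run. The sketch below is illustrative only and is not part of the procedures described in this commentary. It assumes the published CAS check-digit convention (the last digit equals the position-weighted sum of the preceding digits, counted from the right, modulo 10) and takes the batch limit of 25 from the text above. The function names `is_valid_cas_rn` and `batches` are hypothetical, and the code does not contact SciFinder in any way.

```python
import re
from typing import Iterator, List

# Format of a CAS Registry Number: 2-7 digits, 2 digits, 1 check digit.
CAS_RN_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

# Batch size taken from the text above (up to 25 compounds per batch search).
BATCH_LIMIT = 25


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Return True if the string has the CAS RN format and a correct check digit."""
    match = CAS_RN_PATTERN.match(cas_rn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight each digit by its position counted from the right, starting at 1.
    weighted_sum = sum(
        position * int(digit)
        for position, digit in enumerate(reversed(body), start=1)
    )
    return weighted_sum % 10 == check_digit


def batches(items: List[str], size: int = BATCH_LIMIT) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    candidates = ["7732-18-5", "50-78-2", "50-78-3"]  # the last one is deliberately wrong
    valid = [rn for rn in candidates if is_valid_cas_rn(rn)]
    for group in batches(valid):
        print(group)
```

A check of this kind catches only malformed or mistyped numbers. It cannot tell whether a well-formed number refers to the intended compound or salt form, which still has to be confirmed in the retrieved record.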