Abstract

In quantitative structure-activity relationship (QSAR) studies, modifying the lead molecule may produce a wide range of biological activity. Some molecules may be so inactive that measurement of ED50 (the application rate at which the compound exhibits 50% control of the target organism) is impractical. Therefore, for some compounds, we may know only that the ED50 is greater than the highest application rate tested, i.e., the biological activity data for some molecules will be censored. Methods for analyzing QSAR data sets containing censored observations have been discussed previously, where it was shown that censored data points provide useful information, although less information than if the data were not censored. The present paper goes beyond data analysis to discuss the effect of possible censoring on optimal experimental designs. Biological test data for a set of molecules are presented (including some censored observations) along with a statistically significant regression equation relating structure and activity. However, the molecules on which the regression analysis was based were not an optimal subset of the molecules for which predictions were required. Therefore, sharper estimates of the regression parameters were needed to make good predictions for the diversity of compounds of interest. Consequently, additional molecules were chosen (for synthesis, testing and incorporation into the data analysis) using the theory of optimal experimental design (D-optimal and related information criteria). The possibility of information loss due to censoring was taken into account for those new molecules for which the uncertainty limits on predictions indicated that a molecule might be too inactive to yield an actual ED50. However, simply deleting all potentially inactive molecules from the selection was undesirable because good parameter estimates are favored by having a wide (relative to experimental error) range of activity. Therefore, an optimal balance between possible information loss due to censoring and information gain by having a wide range of activities was desired. The mathematics of this compromise are developed in this paper.
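
The trade-off described above can be illustrated with a minimal sketch. It is not the derivation developed in the paper; it assumes a standard right-censored normal (Tobit-type) regression model for log ED50 with a known error standard deviation, in which a candidate molecule's contribution to the expected information matrix is scaled by a weight between 0 and 1. The function names, the single-descriptor model, and the standardized margin z = (log of highest rate tested - predicted log ED50) / sigma are all assumptions introduced for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def censoring_weight(z):
    """Fraction of regression information expected to be retained.

    Assumption: log ED50 is normal with known sigma and is right-censored
    at the highest rate tested; z is the standardized distance from the
    predicted log ED50 up to that censoring limit.
    """
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))      # P(observed)
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(censored)
    if survival == 0.0:
        return 1.0  # censoring is essentially impossible
    hazard = norm_pdf(z) / survival
    # Observed part contributes cdf; censored part contributes
    # survival * hazard * (hazard - z) = pdf * (hazard - z).
    return cdf + norm_pdf(z) * (hazard - z)


def d_criterion(xs, zs):
    """Determinant of the weighted information matrix (up to 1/sigma^2)
    for a hypothetical model with an intercept and one descriptor x."""
    s0 = s1 = s2 = 0.0
    for x, z in zip(xs, zs):
        w = censoring_weight(z)
        s0 += w
        s1 += w * x
        s2 += w * x * x
    return s0 * s2 - s1 * s1
```

Under these assumptions, censoring_weight(0.0) evaluates to 0.5 + 1/pi (about 0.82), so a molecule with an even chance of being censored still carries most of its information. A candidate far out in descriptor space raises d_criterion through its leverage but lowers it through a smaller weight, which is the kind of balance the abstract refers to.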
