We present a method for computing confidence in predictions for N-nitrosamines based on the Carcinogenic Potency Categorization Approach (CPCA). Our method captures local structural variations surrounding the nitrosamine core, which can significantly influence potency and may introduce uncertainty into predictions that rely on these features.

We use continuous-valued fingerprints to conduct a specialized neighborhood analysis that groups nitrosamines with similar local features. Using a reference dataset of 7679 potential Nitrosamine Drug Substance Related Impurities (NDSRIs) with pre-computed CPCA-derived Acceptable Intake (AI) limits, we gauge the prediction confidence for a given query N-nitrosamine by evaluating the distances to neighboring NDSRIs and the distribution of CPCA-derived potency categories among them. Our methodology allows for a nuanced assessment of CPCA's discrete four-level outcomes (i.e., AI limits of 18/26.5, 100, 400, and 1500 ng). It differentiates robust predictions from potentially uncertain ones, for instance cases where low confidence arises from rare structural features in the query nitrosamine, a distinction that is helpful in regulatory decision-making.

In our analysis of 30 nitrosamines with animal carcinogenicity data, we often observed lower confidence scores when experimental TD50 values disagreed significantly with CPCA-calculated potency. Moreover, lower confidence scores were associated with greater variability in the predicted α-carbon hydroxylation potential of neighboring compounds. In a list of 265 NDSRIs with established regulatory AI limits, approximately 68% received strong confidence scores for accurate CPCA potency class predictions. However, 8% received poor confidence scores for their potency class predictions and also lacked sufficient neighbor support because of uncommon structural features.
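The neighborhood idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Euclidean distance metric, the neighbor count `k`, the `max_distance` cutoff, the simple agreement fraction, and the function name `neighbor_confidence` are all assumptions chosen for illustration, and the two-dimensional "fingerprints" in the usage lines are made-up toy values.

```python
import math
from collections import Counter


def neighbor_confidence(query_fp, query_category, ref_fps, ref_categories,
                        k=3, max_distance=1.0):
    """Toy confidence score from the k nearest reference compounds.

    All parameters and the scoring rule are hypothetical; the abstract
    does not specify the actual metric, cutoff, or scoring function.
    """
    # Euclidean distance from the query to every reference fingerprint.
    distances = [math.dist(query_fp, fp) for fp in ref_fps]
    # Indices of the k closest reference compounds.
    nearest = sorted(range(len(ref_fps)), key=lambda i: distances[i])[:k]
    # Neighbors beyond the cutoff are not counted as support.
    supporting = [i for i in nearest if distances[i] <= max_distance]
    if not supporting:
        # No close neighbors, e.g. a query with rare structural features.
        return {"agreement": 0.0, "n_supporting": 0, "category_counts": {}}
    # Distribution of potency categories among supporting neighbors.
    counts = Counter(ref_categories[i] for i in supporting)
    # Fraction of supporting neighbors that share the query's category.
    agreement = counts[query_category] / len(supporting)
    return {"agreement": agreement,
            "n_supporting": len(supporting),
            "category_counts": dict(counts)}


# Made-up reference set: toy fingerprints with toy potency categories.
ref_fps = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
ref_categories = [1, 1, 2, 4]

well_supported = neighbor_confidence((0.05, 0.05), 1, ref_fps, ref_categories)
rare_query = neighbor_confidence((10.0, 10.0), 1, ref_fps, ref_categories)
```

In this toy setup, a query surrounded by close neighbors that mostly share its category gets a high agreement fraction, while a query far from every reference compound gets no supporting neighbors at all. That mirrors, in a very simplified way, the two sources of low confidence mentioned in the abstract: disagreement among neighbors and insufficient neighbor support.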