While increased attention is being paid to the impact of data quality in cell-line sensitivity and toxicology modeling, no systematic study has yet evaluated the comparability of independent cytotoxicity measurements on a large scale. Here, we estimate the experimental uncertainty of public cytotoxicity data from ChEMBL version 19. We applied stringent filtering criteria to assemble a curated data set comprising pIC50 data for compound-cell line systems measured in independent laboratories. The estimated experimental uncertainty corresponded to a mean unsigned error (MUE) of 0.61-0.76, a median unsigned error (MedUE) of 0.51-0.58, and a standard deviation of 0.76-1.00 pIC50 units. The experimental uncertainty (σE) estimated from all pairs of cytotoxicity measurements with a ΔpIC50 value lower than 2.5 was 0.59-0.77 pIC50 units, and thus 21-60% and 21-26% higher than that of pKi and pIC50 ligand-protein data (σE = 0.47-0.48 pKi units and σE = 0.57-0.61 pIC50 units, respectively). The σE value estimated from the pairs of pIC50 values measured with metabolic assays was 0.98, whereas it was 0.69 for the 1388 pIC50 pairs measured using exactly the same experimental setup. The maximum achievable Pearson correlation coefficient (R² Pearson max.) of in silico models trained on cytotoxicity data from different laboratories was estimated to be 0.51-0.85, considerably lower than the value of 1 that corresponds to perfect predictions, which hints at the maximum performance one can expect from computational cytotoxicity predictions. The lowest concordance between pairs of measurements was found for the drugs paclitaxel, methotrexate, zidovudine, and docetaxel, and for the cell lines HepG2, NCI-H460, L1210, and CCRF-CEM, hinting at a particular sensitivity of those systems to the experimental setup. 
The highest concordance was estimated for the compound-cell line system HL-60-etoposide (σE = 0.70), and the lowest for L1210-methotrexate (σE = 1.68). We found that annotation errors are responsible for the high discordance observed for some pairs of measurements, which underscores the importance of data curation when automatically extracting cytotoxicity data from public databases. Likewise, these results highlight the importance of estimating compound cytotoxicity with assays that provide complementary biological information (i.e., metabolic, clonogenic, and cell membrane integrity assays), especially when the mechanism of action of the test compounds is unknown. This analysis can inform guidelines on the reliability of cytotoxicity data from public databases, which could ultimately prove valuable for modeling purposes and guide the reporting of data in the literature.
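As a rough illustration of the summary statistics quoted above, the following minimal Python sketch computes the MUE, MedUE, and a spread estimate from pairs of independent pIC50 measurements. The function name `pair_statistics`, the toy input values, and the specific estimator are assumptions made for illustration and are not taken from the study: the spread is computed as the root-mean-square of the pairwise differences (which does not depend on the order of the two measurements in a pair), and σE is obtained by dividing it by √2 on the assumption that both measurements carry the same independent error.

```python
import math
import statistics


def pair_statistics(pairs):
    """Summarize the concordance of paired pIC50 measurements.

    `pairs` is a list of (pIC50_a, pIC50_b) tuples, each holding two
    independent measurements of the same compound-cell line system.
    All names and the estimator choice are illustrative assumptions.
    """
    diffs = [a - b for a, b in pairs]
    abs_diffs = [abs(d) for d in diffs]

    # Mean and median unsigned error of the pairwise differences.
    mue = statistics.fmean(abs_diffs)
    medue = statistics.median(abs_diffs)

    # Root-mean-square difference, used here as the spread of the pairs.
    sd_pairs = math.sqrt(statistics.fmean(d * d for d in diffs))

    # Assumption: both measurements have equal, independent error, so the
    # variance of a difference is twice the single-measurement variance.
    sigma_e = sd_pairs / math.sqrt(2)

    return {"MUE": mue, "MedUE": medue, "SD": sd_pairs, "sigma_E": sigma_e}


# Hypothetical toy data, not values from ChEMBL.
example_pairs = [(6.0, 6.5), (7.2, 6.8), (5.0, 5.9)]
stats = pair_statistics(example_pairs)
```

A filter such as the ΔpIC50 < 2.5 cutoff mentioned above could be applied to `pairs` before calling the function; the sketch leaves that step to the caller.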