In a recent comparative exercise, different approaches to the prediction of rodent carcinogenicity were challenged on a common set of chemicals bioassayed by the U.S. National Toxicology Program (NTP). The approaches were of very different natures. Some prediction systems looked for relationships between carcinogenicity and other, more quickly detectable biological events (activity-activity relationships, AAR); others sought structure-activity relationships (SAR). To give an objective evaluation of the results of the exercise, we analyzed the rodent results and the predictions with multivariate data analysis methods. The calculated performances varied according to the carcinogenicity classification adopted for the chemicals. When the four rodent results were summarized into a final + or − call, the Tennant approach (an AAR method) showed the best performance (about 75% accuracy), whereas the best SAR systems had 60–65% accuracy. A common limitation of almost all the systems was a lack of specificity (too many false positives). Based on these results, better concordance was obtained when the input information was the very costly biological data (closer to the final endpoint), rather than the inexpensive knowledge of the chemical structure (farther from the endpoint). However, when the rodent results were summarized into a carcinogenicity classification that maintained, to some extent, the gradation intrinsic to the original experimental data, the performance of the AAR systems declined, and the SAR approaches performed better. The evaluation of the various approaches was further complicated by a fundamental difference among the approaches themselves: some were ‘pure’ prediction methods (i.e. their predictions were rigorously based on information that did not include carcinogenicity); others (e.g. Tennant, Weisburger) used ‘mixed’ information, including known carcinogenicity results from experiments performed before the NTP bioassays. As far as the SAR systems are concerned, their sets of predictions showed a fundamental similarity, in spite of the extremely different procedures adopted to treat the chemical formula (the initial information): very simple calculations (Benigni), expert intuition (Weisburger and Lijinsky), and sophisticated computer programs (TOPKAT and CASE). The results of the Bakale Ke method, based on experimental measurement of chemical electrophilicity, and of the Salmonella typhimurium mutagenicity assay were similar to the patterns of predictions of the SAR methods.
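
The accuracy and specificity figures quoted above can be illustrated with a minimal sketch of how such concordance measures are typically computed from paired + / − calls. This is not the multivariate analysis used in the study; the function name `concordance_metrics` and the eight example calls are hypothetical and chosen only for illustration.

```python
import math


def concordance_metrics(observed, predicted):
    """Compare predicted +/- calls with observed +/- calls.

    Returns accuracy, sensitivity, and specificity as fractions.
    A measure whose denominator is zero is returned as NaN.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    pairs = list(zip(observed, predicted))
    tp = sum(o == "+" and p == "+" for o, p in pairs)  # carcinogens called +
    tn = sum(o == "-" and p == "-" for o, p in pairs)  # noncarcinogens called -
    fp = sum(o == "-" and p == "+" for o, p in pairs)  # false positives
    fn = sum(o == "+" and p == "-" for o, p in pairs)  # false negatives

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical calls for eight chemicals (illustrative only, not study data).
observed = ["+", "+", "+", "+", "-", "-", "-", "-"]
predicted = ["+", "+", "+", "-", "+", "+", "-", "-"]

metrics = concordance_metrics(observed, predicted)
print(metrics)
```

In this invented example, two of the four noncarcinogens are predicted positive, so specificity falls to 0.5 while sensitivity stays at 0.75. This is the pattern described above as a lack of specificity.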