Collective intelligence, the "wisdom of the crowd," seeks to improve the quality of judgments by aggregating multiple individual inputs. Here, we evaluate the success of collective intelligence strategies applied to probabilistic diagnostic judgments. We compared the performance of individual and collective intelligence judgments on two series of clinical cases requiring probabilistic diagnostic assessments, or "forecasts." We assessed the quality of forecasts using Brier scores, which compare forecasts to observed outcomes. On both sets of cases, the collective intelligence answers outperformed nearly every individual forecaster or team. The improved performance of collective intelligence was mediated by improvements in both the resolution and the calibration of the probabilistic assessments. In a secondary analysis examining the effect of varying the number of individual inputs to the collective intelligence answers, using two different data sources, nearly identical curves were found in the two data sets: an 11-12% improvement when averaging two independent inputs, a 15% improvement when averaging four independent inputs, and only small incremental improvements with further increases in the number of inputs. Our results suggest that applying collective intelligence strategies to probabilistic diagnostic forecasts is a promising approach to improving diagnostic accuracy and reducing diagnostic error.
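As a rough illustration of the scoring and aggregation described above (a minimal sketch only; the case outcomes, forecast values, and simple unweighted averaging rule shown here are assumptions for illustration, not the study's actual data or protocol):

```python
import numpy as np

# Hypothetical example: 5 cases with binary observed outcomes (1 = diagnosis confirmed).
outcomes = np.array([1, 0, 1, 1, 0])

# Hypothetical probabilistic forecasts from three independent forecasters (one row per forecaster).
forecasts = np.array([
    [0.7, 0.4, 0.6, 0.8, 0.3],
    [0.9, 0.2, 0.5, 0.6, 0.5],
    [0.6, 0.3, 0.8, 0.7, 0.2],
])

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and observed outcomes (lower is better)."""
    return np.mean((probs - outcomes) ** 2)

# Brier score for each individual forecaster.
individual_scores = [brier_score(f, outcomes) for f in forecasts]

# Collective intelligence forecast: average the individual probabilities for each case, then score.
collective = forecasts.mean(axis=0)
collective_score = brier_score(collective, outcomes)

print("Individual Brier scores:", np.round(individual_scores, 3))
print("Collective Brier score: ", round(collective_score, 3))
```

The averaging rule here is the simplest possible aggregation; the study's pipeline may weight or transform inputs differently, and real improvements will depend on how independent and how well calibrated the individual forecasters are.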