Standard data analysis pipelines for digital PCR estimate the concentration of a target nucleic acid by digitizing the end-point fluorescence of the parallel micro-PCR reactions, using an automated hard threshold. While it is known that misclassification has a major impact on the concentration estimate and substantially reduces accuracy, the uncertainty of this classification is typically ignored. We introduce a model-based clustering method to estimate the probability that the target is present or absent in a partition, conditional on its observed fluorescence and the distributional shape in no-template control samples. This methodology acknowledges the inherent uncertainty of the classification and provides a natural measure of precision, both at the level of the individual partition and at the level of the global concentration. We illustrate our method on genetically modified organism, inhibition, dynamic range, and mutation detection experiments. We show that our method provides concentration estimates with accuracy similar to or better than that of the current standard, along with a more realistic measure of precision. The individual partition probabilities and diagnostic density plots further allow for some quality control. An R implementation of our method, called Umbrella, is available, providing a more objective and automated data analysis procedure for absolute dPCR quantification.
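To make the contrast between hard thresholding and probabilistic classification concrete, the following is a minimal sketch and not the Umbrella implementation. It assumes the usual Poisson correction for dPCR, in which the mean number of copies per partition is -ln(1 - p) for a positive fraction p. The function names, the threshold argument, and the idea of averaging per-partition probabilities into an expected positive fraction are illustrative assumptions; the abstract does not specify how the method combines partition probabilities.

```python
import math


def poisson_copies_per_partition(positive_fraction):
    """Poisson correction: mean copies per partition from the positive fraction."""
    if not 0.0 <= positive_fraction < 1.0:
        raise ValueError("positive fraction must lie in [0, 1)")
    return -math.log(1.0 - positive_fraction)


def hard_threshold_estimate(fluorescence, threshold):
    """Classify each partition with a single cut-off, then apply the correction."""
    positives = sum(1 for value in fluorescence if value > threshold)
    return poisson_copies_per_partition(positives / len(fluorescence))


def soft_estimate(posterior_positive):
    """Assumed alternative: use the mean per-partition probability of being
    positive as the expected positive fraction (hypothetical, for illustration)."""
    expected_fraction = sum(posterior_positive) / len(posterior_positive)
    return poisson_copies_per_partition(expected_fraction)
```

With a hard threshold, a partition whose fluorescence lies between the two clusters counts fully as positive or negative; in the soft variant it contributes only its estimated probability, so ambiguous partitions shift the estimate less abruptly.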