Although most modern techniques and analysis methods in multiparameter flow cytometry (MFC) allow for increased dimensionality in the characterization and quantification of cell populations, most MFC applications depend on flow cytometers that measure a relatively small number (<16) of parameters. When more markers need to be acquired than there are available parameters, the markers are commonly distributed over multiple independent measurements that share a backbone of common markers. Several methods have been proposed to impute values for combinations of markers that were not measured simultaneously. These imputation methods are frequently used without proper validation or knowledge of their effects on data analysis. We evaluated how well existing imputation software (Infinicyt, CyTOFmerge, CytoBackBone, and cyCombine) approximates known measured expression data in terms of similarity in visual appearance, cell expression, and gating. To do so, we split MFC samples from different datasets into separate measurements with partially overlapping markers and recalculated the missing marker expression. Of the assessed packages, CyTOFmerge approximated the known expression most accurately in terms of similar expression values and concordance with manual gating, with a mean F-score between 0.53 and 0.87 when retrieving cell populations in different datasets. Nevertheless, performance remained inadequate for all methods, with only limited similarity at the cell level. In conclusion, the use of imputed MFC data should take such limitations into account and include independent validation of results to justify conclusions.
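The gating concordance reported above can be illustrated with a small sketch. It assumes that the F-score is the standard F1 score (the harmonic mean of precision and recall) and that a cell population is defined by a simple one-dimensional threshold gate; the abstract specifies neither. The function names, the threshold, and the expression values are hypothetical, chosen only for illustration, and are not taken from the study.

```python
def gate(values, threshold):
    """Return True for each cell whose marker expression exceeds the threshold.

    A single-threshold gate is an assumption made for illustration only.
    """
    return [v > threshold for v in values]


def f_score(reference, predicted):
    """F1 score of predicted gate membership against reference membership."""
    tp = sum(r and p for r, p in zip(reference, predicted))
    fp = sum((not r) and p for r, p in zip(reference, predicted))
    fn = sum(r and (not p) for r, p in zip(reference, predicted))
    if tp == 0:
        # No cell is correctly retrieved, so the score is defined as zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical expression values for one marker in six cells:
# the values actually measured, and the values an imputation method returned.
measured = [0.1, 0.9, 0.8, 0.2, 0.7, 0.3]
imputed = [0.2, 0.8, 0.4, 0.6, 0.9, 0.1]
threshold = 0.5  # hypothetical gate boundary

reference_gate = gate(measured, threshold)
imputed_gate = gate(imputed, threshold)
score = f_score(reference_gate, imputed_gate)
print(f"F-score: {score:.2f}")
```

In this toy case, two of the three truly positive cells are retrieved and one negative cell is wrongly included, so precision and recall are both 2/3. A score in this range shows how a population can look broadly similar in aggregate while individual cells are still misassigned.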