Many real data analyses involve two-sample comparisons in location or in distribution. Most existing methods assume that the observations within each group are independent and identically distributed. In some applications, however, the observed data are not identically distributed; instead, each observation is associated with an unobserved parameter, and it is these parameters that are identically distributed within a group. To address this challenge, we propose a novel two-sample testing procedure that combines the g-modeling density estimation introduced by Efron with the two-sample Kolmogorov-Smirnov test. We also propose efficient bootstrap algorithms to estimate the statistical significance of such tests. We demonstrate the utility of the proposed approach with two biostatistical applications: the analysis of surgical nodes data with a binomial model, and differential expression analysis of single-cell RNA sequencing data with a zero-inflated Poisson model.
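
The setting described above can be written out schematically. The following is only a sketch under assumed notation: the symbols $\theta_i$, $\vartheta_j$, $f_i$, $g_1$, $g_2$, $\hat G_1$, $\hat G_2$ and $D$ are introduced here for illustration and do not appear in the abstract, and the exact form of the test statistic used in the paper may differ.

```latex
\begin{align*}
X_i \mid \theta_i &\sim f_i(\cdot \mid \theta_i), &
\theta_i &\overset{\text{iid}}{\sim} g_1, & i &= 1,\dots,n_1, \\
Y_j \mid \vartheta_j &\sim f_j(\cdot \mid \vartheta_j), &
\vartheta_j &\overset{\text{iid}}{\sim} g_2, & j &= 1,\dots,n_2, \\[4pt]
H_0 &: g_1 = g_2
\quad\text{versus}\quad
H_1 : g_1 \neq g_2, \\[4pt]
D &= \sup_{\theta}\,\bigl|\hat G_1(\theta) - \hat G_2(\theta)\bigr| .
\end{align*}
```

Here the observations $X_i$ and $Y_j$ need not be identically distributed because the sampling densities $f_i$ may vary across observations (for instance, a binomial model with a different number of trials for each subject), while the latent parameters are identically distributed within each group. Under this reading, $\hat G_1$ and $\hat G_2$ would be the distribution functions of g-modeling estimates of $g_1$ and $g_2$, and $D$ is a Kolmogorov-Smirnov-type distance between them whose significance is assessed by bootstrap.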