Abstract

The recent success of generative adversarial networks and variational learning suggests that training a classification network may work well for the classical two-sample problem, which asks to differentiate two densities given finite samples from each. Network-based methods have the computational advantage of scaling to large datasets. This paper considers using the classification logit function, provided by a trained classification neural network and evaluated on the held-out test split of the two datasets, to compute a two-sample statistic. To analyze the approximation and estimation error of the logit function in differentiating near-manifold densities, we introduce a new result on near-manifold integral approximation by neural networks. We then show that the logit function provably differentiates two sub-exponential densities provided the network is sufficiently parametrized, and that for densities on or near a manifold, the required network complexity scales only with the intrinsic dimension. In experiments, the network logit test outperforms previous network-based tests that use classification accuracy, and also compares favorably to certain kernel maximum mean discrepancy tests on synthetic datasets and hand-written digit datasets.
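The logit test statistic described above can be sketched as follows: train a classifier to distinguish samples from the two datasets, then compare the mean logits on a held-out split. This is a minimal illustration only; the paper trains a neural network, whereas this sketch substitutes a linear logistic model, and the split sizes, optimizer, and threshold choices here are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit_two_sample_stat(X, Y, n_iter=500, lr=0.1, seed=0):
    """Two-sample statistic from a trained classifier's logit function.

    Train a classifier to tell samples of X (label 1) from samples of Y
    (label 0) on one half of the data, then return the difference of mean
    logits on the held-out halves. A large value suggests the two
    underlying densities differ. Sketch only: a linear logistic model
    trained by plain gradient descent stands in for the paper's network.
    """
    rng = np.random.default_rng(seed)
    X, Y = rng.permutation(X), rng.permutation(Y)
    nx, ny = len(X) // 2, len(Y) // 2
    Xtr, Xte = X[:nx], X[nx:]
    Ytr, Yte = Y[:ny], Y[ny:]

    # Assemble training set with a bias feature and 0/1 labels.
    A = np.hstack([np.vstack([Xtr, Ytr]), np.ones((nx + ny, 1))])
    y = np.concatenate([np.ones(nx), np.zeros(ny)])

    # Gradient descent on the logistic loss.
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        w -= lr * A.T @ (sigmoid(A @ w) - y) / len(y)

    # Evaluate logits on the held-out split and compare means.
    logit = lambda Z: np.hstack([Z, np.ones((len(Z), 1))]) @ w
    return logit(Xte).mean() - logit(Yte).mean()
```

In practice, the significance of such a statistic would be calibrated, e.g. by a permutation test over the pooled samples; that step is omitted here for brevity.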
