Abstract

In this paper, we treat the problem of testing for normality as a binary classification problem and construct a feedforward neural network that can act as a powerful normality test. We show that by changing its decision threshold, we can control the frequency of false non-normal predictions and thus make the network behave more like a standard statistical test. We also find the optimal decision thresholds that minimize the total error probability for each sample size. Experiments conducted on samples with no more than 100 elements suggest that our method is more accurate and more powerful than the selected standard tests of normality for almost all types of alternative distributions and sample sizes. In particular, the neural network was the most powerful method for testing normality of samples with fewer than 30 elements, regardless of the alternative distribution type, and its total accuracy increased with the sample size. Additionally, when the optimal decision thresholds were used, the network was very accurate for larger samples with 250–1000 elements. With an AUROC of almost 1, the network was the most accurate method overall. Since normality of the data is an assumption of numerous statistical techniques, the network constructed in this study has very high potential for use in the everyday practice of statistics, data analysis, and machine learning.
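
The abstract does not specify the network's architecture, input representation, or training distributions, so the Python sketch below is only an illustration of the general idea, not the authors' method. It trains a small feedforward classifier (scikit-learn's MLPClassifier) on standardized, sorted samples of a fixed size, then calibrates the decision threshold on the "non-normal" probability using fresh normal samples so that the frequency of false non-normal predictions matches a chosen level (0.05 here). The sample size, hidden-layer sizes, choice of alternative distributions, and feature construction are all assumptions.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
N_TRAIN, SAMPLE_SIZE, ALPHA = 20_000, 50, 0.05  # assumed settings, not taken from the paper


def make_sample(normal: bool) -> np.ndarray:
    # Alternatives are an assumed mix of common non-normal distributions.
    if normal:
        x = rng.normal(size=SAMPLE_SIZE)
    else:
        kind = rng.integers(4)
        if kind == 0:
            x = rng.uniform(size=SAMPLE_SIZE)
        elif kind == 1:
            x = rng.exponential(size=SAMPLE_SIZE)
        elif kind == 2:
            x = rng.laplace(size=SAMPLE_SIZE)
        else:
            x = rng.standard_t(df=3, size=SAMPLE_SIZE)
    # Standardize and sort so the input is free of location and scale.
    x = (x - x.mean()) / x.std()
    return np.sort(x)


# Balanced training set: label 1 means "non-normal".
X = np.stack([make_sample(normal=(i % 2 == 0)) for i in range(N_TRAIN)])
y = np.array([0 if i % 2 == 0 else 1 for i in range(N_TRAIN)])

clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
clf.fit(X, y)

# Calibrate the decision threshold on fresh normal samples so that the empirical
# frequency of false non-normal predictions is approximately ALPHA.
calib = np.stack([make_sample(normal=True) for _ in range(5_000)])
scores = clf.predict_proba(calib)[:, 1]
threshold = np.quantile(scores, 1 - ALPHA)


def test_normality(sample: np.ndarray) -> bool:
    # Returns True if the sample is flagged as non-normal at the chosen threshold.
    z = np.sort((sample - sample.mean()) / sample.std())
    return clf.predict_proba(z.reshape(1, -1))[0, 1] > threshold

Lowering or raising the threshold trades false non-normal predictions against power, which is the mechanism by which threshold values minimizing the total error probability could be searched for separately at each sample size.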
