Signal detection is one of the main challenges in data science. As often happens in data analysis, the signal may be corrupted by noise. A wide range of techniques aims to extract the relevant degrees of freedom from data, but some problems remain difficult. This is notably the case for signal detection in nearly continuous spectra when the signal-to-noise ratio is sufficiently small. This paper follows a recent line of work that tackles this issue with field-theoretical methods. Previous analyses focused on equilibrium Boltzmann distributions for an effective field representing the degrees of freedom of the data, and established a relation between signal detection and Z2-symmetry breaking. In this paper, we consider a stochastic field framework inspired by the so-called ‘model A’, and show that whether or not an equilibrium state can be reached is correlated with the shape of the dataset. In particular, by studying the renormalization group of the model, we show that the weak ergodicity prescription is always broken for sufficiently small signals when the data distribution is close to the Marchenko–Pastur law. This enables the definition of a detection threshold in the regime where the signal-to-noise ratio is sufficiently small.