Many methods have been proposed for inferring the parameters of the Ising model from given data, that is, a set of configurations generated according to the model itself. However, little attention has been paid so far to the data itself, e.g. how the data is generated, or whether the inference error obtained with one data set could be smaller than with another. In this paper we discuss the data quality problem in the inverse Ising problem, using the kinetic Ising model as a benchmark. We quantify the quality of data using the effective rank of the correlation matrix, and show that, for coupling reconstruction, data gathered in an out-of-equilibrium regime has better quality than data gathered in equilibrium. We also propose a matrix-perturbation-based method for tuning the quality of given data and for removing bad-quality (i.e. redundant) configurations from the data.
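To make the notion of effective rank concrete, the following is a minimal sketch, not the paper's implementation. It assumes one common definition (the exponential of the Shannon entropy of the normalized singular values) and an equal-time connected correlation matrix; the abstract does not specify either choice, and the function names `effective_rank` and `correlation_matrix` are hypothetical.

```python
import numpy as np


def effective_rank(matrix):
    """Effective rank, assumed here to be exp(Shannon entropy) of the
    normalized singular values. The paper may use a different definition."""
    s = np.linalg.svd(np.asarray(matrix, dtype=float), compute_uv=False)
    total = s.sum()
    if total == 0:
        return 0.0
    p = s / total
    p = p[p > 0]  # drop exact zeros so that log(p) is defined
    return float(np.exp(-np.sum(p * np.log(p))))


def correlation_matrix(configs):
    """Equal-time connected correlation matrix of an (M, N) array of +/-1
    spins (M configurations, N spins). Assumption: the paper may instead
    use a time-delayed correlation matrix."""
    s = np.asarray(configs, dtype=float)
    ds = s - s.mean(axis=0)
    return ds.T @ ds / s.shape[0]


rng = np.random.default_rng(0)

# Synthetic illustration only (not data from the paper):
# independent spins carry information about every direction in spin space.
independent = rng.choice([-1, 1], size=(2000, 10))

# Fully redundant data: every configuration is all-up or all-down,
# so the correlation matrix has rank one.
redundant = rng.choice([-1, 1], size=(2000, 1)) * np.ones((1, 10))

er_independent = effective_rank(correlation_matrix(independent))
er_redundant = effective_rank(correlation_matrix(redundant))
print(er_independent, er_redundant)
```

Under these assumptions, the independent configurations give an effective rank close to the number of spins, while the redundant configurations give a value close to one, which illustrates why a low effective rank signals redundant data.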