Abstract

We propose a new method to learn the structure of a Gaussian graphical model with finite sample false discovery rate control. Our method builds on the knockoff framework of Barber and Candès for linear models. We extend their approach to the graphical model setting by using a local (node-based) and a global (graph-based) step: we construct knockoffs and feature statistics for each node locally, and then solve a global optimization problem to determine a threshold for each node. We then estimate the neighbourhood of each node, by comparing its feature statistics to its threshold, resulting in our graph estimate. Our proposed method is very flexible, in the sense that there is freedom in the choice of knockoffs, feature statistics and the way in which the final graph estimate is obtained. For any given data set, it is not clear a priori what choices of these hyperparameters are optimal. We therefore use a sample-splitting-recycling procedure that first uses half of the samples to select the hyperparameters, and then learns the graph using all samples, in such a way that the finite sample FDR control still holds. We compare our method to several competitors in simulations and on a real data set.
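To make the thresholding idea concrete, here is a minimal sketch of the knockoff+ threshold of Barber and Candès for a single vector of feature statistics, as used in the linear-model setting on which the method builds. It is not the graphical-model procedure itself: the node-wise thresholds described above come from a global optimization problem, which is not reproduced here. The function names `knockoff_plus_threshold` and `select` are hypothetical, and the sketch assumes the feature statistics `W` have already been computed, with large positive values indicating evidence for a true signal.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Return the smallest t in {|W_j| : W_j != 0} such that
    (1 + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q,
    or +inf if no such t exists (illustrative helper, single node only)."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes, in increasing order.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Conservative estimate of the false discovery proportion at level t.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")

def select(W, q):
    """Indices whose feature statistic is at least the knockoff+ threshold."""
    t = knockoff_plus_threshold(W, q)
    return [j for j, w in enumerate(W) if w >= t]
```

In the graphical setting, one such vector of statistics would exist per node, and the selected indices would play the role of that node's estimated neighbourhood; how the per-node selections are combined into a final graph is one of the design choices the abstract leaves open.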
