Abstract

Multivariate count data, such as sequencing reads in genomics, are often connected to a clinical phenotype of interest. We develop a flexible framework for dimension reduction in regression, with predictors that are correlated counts, by modeling the conditional distribution of the predictors, given the response, using a pairwise Poisson graphical model. This new framework, called network-based inverse regression for counts, allows us to derive a sufficient reduction of the predictors, while adjusting for the dependence structure among them. We propose a regularized criterion for estimating both the reduction structure and the network structure. The estimation algorithm can be implemented efficiently on a parallel computer. We also introduce an adaptive version and a sparse variant of the proposed procedure. The methods are evaluated on simulated data and are applied to a gut microbiome sequencing dataset. Supplementary materials for this article are available online.
