Logistic Regression Models for Aggregated Data

T Whitaker, B Beranger, S A Sisson

doi:10.1080/10618600.2021.1895816

Abstract

Logistic regression models are a popular and effective method to predict the probability of categorical response data. However, inference for these models can become computationally prohibitive for large datasets. Here we adapt ideas from symbolic data analysis to summarize the collection of predictor variables into histogram form, and perform inference on this summary dataset. We develop ideas based on composite likelihoods to derive an efficient one-versus-rest approximate composite likelihood model for histogram-based random variables, constructed from low-dimensional marginal histograms obtained from the full histogram. We demonstrate that this procedure can achieve classification rates comparable to those of the standard full data multinomial analysis and of state-of-the-art subsampling algorithms for logistic regression, but at a substantially lower computational cost. Performance is explored through simulated examples, and analyses of large supersymmetry and satellite crop classification datasets. Supplementary materials for this article are available online.
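The idea of fitting a logistic model to a histogram summary instead of the raw records can be illustrated with a small sketch. It is not the composite likelihood procedure described in the abstract: it assumes a binary response and a single predictor, and it swaps in a much simpler midpoint approximation, where every observation in a bin is treated as if it sat at the bin centre. The simulated data, the coefficient values, the number of bins and the helper `fit_logistic` are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_logistic(X, successes, trials, n_iter=25):
    """Newton-Raphson fit of a binomial logistic model on grouped counts."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (successes - trials * p)                   # score vector
        hess = (X * (trials * p * (1.0 - p))[:, None]).T @ X    # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (assumption: one Gaussian predictor, binary response).
rng = np.random.default_rng(0)
n = 100_000
true_beta = np.array([-0.5, 1.2])
x = rng.normal(size=n)
y = rng.random(n) < 1.0 / (1.0 + np.exp(-(true_beta[0] + true_beta[1] * x)))

# Full-data fit: one row per observation.
X_full = np.column_stack([np.ones(n), x])
beta_full = fit_logistic(X_full, y.astype(float), np.ones(n))

# Histogram summary: per-bin totals and per-bin counts of y == 1.
edges = np.linspace(x.min(), x.max(), 41)       # 40 equal-width bins
trials, _ = np.histogram(x, bins=edges)
successes, _ = np.histogram(x[y], bins=edges)
mid = 0.5 * (edges[:-1] + edges[1:])

# Aggregated fit: one row per non-empty bin, located at the bin midpoint.
keep = trials > 0
X_hist = np.column_stack([np.ones(keep.sum()), mid[keep]])
beta_hist = fit_logistic(X_hist,
                         successes[keep].astype(float),
                         trials[keep].astype(float))

print("full data :", beta_full)
print("histogram :", beta_hist)
```

The aggregated fit works on at most 40 rows instead of 100,000, which is where the computational saving comes from. The midpoint shortcut ignores how observations are spread within each bin, so it should be read only as a rough stand-in for a likelihood built on the histogram itself.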

Journal: Journal of Computational and Graphical Statistics
Publication Date: Apr 20, 2021
Citations: 8
License type: cc-by

Similar Papers

Phylogenetic Analyses of Large Data Sets: Approaches Using the Angiosperms
Douglas E. Soltis ... Pamela S. Soltis
01 Jan 1999

Secondary Data Analysis of Large Data Sets in Urology: Successes and Errors to Avoid
Bruce J Schlomer ... Hillary L Copp
Journal of Urology | VOL. 191
17 Oct 2013

Hierarchical models facilitate spatial analysis of large data sets: a case study on invasive plant species in the northeastern United States
A M Latimer ... H Sang Jr
Ecology Letters | VOL. 12
12 Jan 2009

Meeting Big Data challenges with visual analytics
Victoria Louise Lemieux ... Lyse Rowledge
Records Management Journal | VOL. 24
15 Jul 2014
