Abstract

We consider the problem of modelling count data with excess zeros using Zero-Inflated Poisson (ZIP) regression. Recently, various regularization methods have been developed for variable selection in ZIP models. Among these, EM LASSO is a popular method for simultaneous variable selection and parameter estimation. However, EM LASSO suffers from estimation inefficiency and selection inconsistency. To remedy these problems, we propose a set of EM adaptive LASSO methods using a variety of data-adaptive weights. We show theoretically that the new methods identify the true model consistently, and that the resulting estimators can be as efficient as the oracle estimator. The methods are further evaluated through extensive synthetic experiments and applied to a German health care demand dataset.
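
For orientation, the standard ZIP regression model and a generic adaptive LASSO objective can be written as follows. This is a sketch in assumed notation, not taken from the paper: the symbols $\mu_i$, $\pi_i$, $x_i$, $z_i$, $\theta = (\beta, \gamma)$, $\lambda_n$, $w_j$, $\tilde\theta_j$ and $\delta$ are illustrative choices, and the weight shown is the usual inverse-power form; the paper considers a variety of data-adaptive weights that may differ from it.

```latex
\begin{align*}
P(Y_i = 0 \mid x_i, z_i) &= \pi_i + (1 - \pi_i)\, e^{-\mu_i}, \\
P(Y_i = k \mid x_i, z_i) &= (1 - \pi_i)\, \frac{e^{-\mu_i} \mu_i^{k}}{k!}, \qquad k = 1, 2, \dots, \\
\log \mu_i &= x_i^{\top} \beta, \qquad \operatorname{logit}(\pi_i) = z_i^{\top} \gamma, \\
\hat{\theta} &= \arg\min_{\theta} \Big\{ -\ell_n(\theta) + \lambda_n \sum_{j} w_j \lvert \theta_j \rvert \Big\},
\qquad w_j = \frac{1}{\lvert \tilde{\theta}_j \rvert^{\delta}}, \ \delta > 0.
\end{align*}
```

Here $\ell_n$ denotes the ZIP log-likelihood and $\tilde\theta$ an initial estimate. Setting every $w_j = 1$ gives an ordinary LASSO penalty.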

Highlights

  • Modern research studies routinely collect information on a broad array of outcomes, including count measurements with an excess of zeros

  • We propose a set of Expectation Maximization (EM) adaptive LASSO methods using a variety of data-adaptive weights

  • Tang et al. [6] showed that the EM adaptive LASSO (i.e., AMAZonn - EM AL) enjoys the so-called oracle properties: it identifies the true model consistently, and the resulting estimator is as efficient as the oracle estimator
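
To make the two ingredients named in the highlights concrete, the sketch below shows a textbook E-step for a ZIP model and one common choice of data-adaptive weights. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names `zip_e_step` and `adaptive_weights`, the log and logit links, and the inverse-power weight form are all assumptions introduced here.

```python
import numpy as np


def zip_e_step(y, X, Z, beta, gamma):
    """E-step of a textbook EM for ZIP regression (hypothetical helper).

    Returns, for each observation, the posterior probability that it is a
    structural zero given the current parameter values.
    """
    mu = np.exp(X @ beta)                      # Poisson mean, log link (assumed)
    pi = 1.0 / (1.0 + np.exp(-(Z @ gamma)))    # inflation probability, logit link (assumed)
    p_zero = pi + (1.0 - pi) * np.exp(-mu)     # total probability of observing a zero
    # Only observed zeros can be structural zeros; positive counts get 0.
    return np.where(y == 0, pi / p_zero, 0.0)


def adaptive_weights(initial_coef, power=1.0, eps=1e-8):
    """Inverse-power adaptive LASSO weights (one common choice, assumed here).

    Coefficients with small initial estimates receive large penalties.
    eps guards against division by zero.
    """
    return 1.0 / (np.abs(initial_coef) + eps) ** power


# Toy usage with intercept-only design matrices.
y = np.array([0, 3])
X = np.ones((2, 1))
Z = np.ones((2, 1))
tau = zip_e_step(y, X, Z, beta=np.zeros(1), gamma=np.zeros(1))
weights = adaptive_weights(np.array([2.0, 0.5]))
```

In a full EM adaptive LASSO, an M-step would then fit weighted, penalized Poisson and logistic regressions using `tau` as the soft labels; that step is omitted here because its details depend on the solver and on the weights actually used.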

Summary

A Note on the Adaptive LASSO for Zero-Inflated Poisson Regression

JP Morgan Chase & Co., USA; NBCUniversal, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, USA; Eli Lilly and Company, USA

EM LASSO suffers from estimation inefficiency and selection inconsistency. To remedy these problems, we propose a set of EM adaptive LASSO methods using a variety of data-adaptive weights. The methods are further evaluated through extensive synthetic experiments and applied to a German health care demand dataset.

Introduction
Methods
Oracle Properties
Simulation Studies
Application to German Health Care Demand Data
Discussion
