Abstract

Variable selection for count data is one of the main challenges in applying the Poisson regression model when the explanatory variables are correlated. To estimate the coefficients and perform variable selection simultaneously, the Lasso penalty has been successfully applied in Poisson regression. However, the Lasso has two major limitations. First, in the p > n case, the Lasso selects at most n variables before it saturates, because of the nature of the convex optimization problem. This seems to be a limiting feature for a variable selection method. Moreover, the Lasso is not well-defined unless the bound on the L1-norm of the coefficients is smaller than a certain value. Second, if there is a group of variables among which the pairwise correlations are very high, the Lasso tends to select only one variable from the group and does not care which one is selected. To address these issues, we propose the elastic net method for penalized Poisson regression, which accounts for the correlation between explanatory variables and aims to provide consistent variable selection at the same time. Real-world data and a simulation study show that the elastic net often outperforms the Lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in the model together.

Highlights

  • In modern data analysis problems, the number of parameters often exceeds the number of observations, leading to high-dimensional problems. Health, finance, economics and sports, to mention a few, are some of the areas that have benefited greatly from the ever-increasing level of technology

  • The elastic net is proposed and applied to the penalized Poisson regression model

  • The elastic net and the least absolute shrinkage and selection operator (Lasso) are compared using simulation studies and a real data application


Summary

Introduction

In modern data analysis problems, the number of parameters often exceeds the number of observations, leading to high-dimensional problems. Finance, economics and sports, to mention a few, are some of the areas that have benefited greatly from the ever-increasing level of technology. This has produced an enormous amount of data that grows in two dimensions: the number of variables and the number of observations. The Lasso helps to increase model interpretability by eliminating irrelevant variables that are not associated with the response variable; in this way, overfitting is reduced. This is the aspect of most interest here, because the focus of this paper is on the feature selection task [6].
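As a rough illustration of how a penalty can remove irrelevant variables from a Poisson model, the following is a minimal sketch and not the paper's implementation. It assumes a glmnet-style objective (mean Poisson negative log-likelihood plus lam * (alpha * L1 + (1 - alpha) / 2 * squared L2)) and solves it with plain proximal gradient descent. The function names, the simulated data, and the values of `lam`, `alpha`, `step`, and `n_iter` are all hypothetical choices made for the example.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink toward zero and clip at zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def poisson_elastic_net(X, y, lam, alpha, step=0.02, n_iter=3000):
    """Proximal-gradient sketch for elastic-net penalized Poisson regression.

    Assumed objective (not taken from the paper):
        mean(exp(eta) - y * eta)
        + lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
    with eta = intercept + X @ beta and an unpenalized intercept.
    """
    n, p = X.shape
    intercept, beta = 0.0, np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(intercept + X @ beta)   # fitted Poisson means (log link)
        resid = mu - y
        # Gradient of the smooth part: likelihood term plus ridge term.
        grad = X.T @ resid / n + lam * (1.0 - alpha) * beta
        intercept -= step * resid.mean()
        # The L1 part is handled by soft-thresholding, which yields exact zeros.
        beta = soft_threshold(beta - step * grad, step * lam * alpha)
    return intercept, beta

# Simulated data (hypothetical): only the first two of ten predictors matter.
rng = np.random.default_rng(0)
n, p = 1000, 10
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[:2] = 0.5
y = rng.poisson(np.exp(0.5 + X @ true_beta))

intercept_hat, beta_hat = poisson_elastic_net(X, y, lam=0.4, alpha=0.5)
print(np.round(beta_hat, 3))
```

Setting `alpha=1.0` in this sketch reduces the penalty to the Lasso, while values below 1 add the ridge term that is associated with the grouping effect. In practice a tuned solver and cross-validated penalty values would be preferable to the fixed step size and penalty used here.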

Literature Review
Penalized Poisson Regression Model
Elastic-net
Model Testing
Data Description
Empirical Result
Findings
Conclusion
