Abstract

Partially linear models (PLMs) are important generalizations of linear models and are very useful for analyzing high-dimensional data. Compared to linear models, PLMs possess the desirable flexibility of nonparametric regression models because they contain both linear and nonlinear components. Variable selection for PLMs plays an important role in practical applications and has been studied extensively for the linear component. For the nonlinear component, however, variable selection has been well developed only for PLMs with extra structural assumptions, such as additive PLMs and generalized additive PLMs. There is currently an unmet need for variable selection methods applicable to general PLMs without structural assumptions on the nonlinear component. In this paper, we propose a new variable selection method based on learning gradients for general PLMs without any assumption on the structure of the nonlinear component. The proposed method uses reproducing kernel Hilbert spaces to learn the gradients and the group-lasso penalty to select variables. In addition, a block coordinate descent algorithm is proposed, and theoretical properties are established, including selection consistency and estimation consistency. The performance of the proposed method is further evaluated via simulation studies and illustrated with real data applications.
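As a rough illustration of the group-lasso and block coordinate descent machinery the abstract refers to, the sketch below applies blockwise proximal coordinate descent to a generic group-lasso objective. It is a minimal stand-in, not the paper's RKHS gradient-learning estimator: the function names and the least-squares surrogate are assumptions for illustration.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Group soft-thresholding: shrink the whole block v toward zero by t."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def group_lasso_bcd(X, y, groups, lam, n_iter=500, tol=1e-8):
    """Blockwise proximal coordinate descent for the generic group lasso
        (1 / (2 n)) * ||y - X beta||^2  +  lam * sum_g ||beta_g||_2,
    where `groups` is a list of integer index arrays, one per block."""
    n, _ = X.shape
    beta = np.zeros(X.shape[1])
    resid = y.astype(float).copy()            # residual y - X @ beta
    # blockwise Lipschitz constants of the smooth part's gradient
    L = [np.linalg.norm(X[:, g], 2) ** 2 / n for g in groups]
    for _ in range(n_iter):
        max_change = 0.0
        for g, Lg in zip(groups, L):
            grad = -X[:, g].T @ resid / n      # gradient w.r.t. block g
            new_bg = group_soft_threshold(beta[g] - grad / Lg, lam / Lg)
            delta = new_bg - beta[g]
            if np.any(delta != 0.0):
                resid -= X[:, g] @ delta       # keep residual consistent
                beta[g] = new_bg
                max_change = max(max_change, np.abs(delta).max())
        if max_change < tol:
            break
    return beta
```

In the paper's setting, the same blockwise soft-thresholding update would presumably act on the coefficient groups associated with each predictor's gradient function in the RKHS expansion, so that an entire gradient function can be shrunk to zero at once.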

Highlights

  • Related developments include Mukherjee and Wu [29] and De Brabanter et al. [7], which mainly focus on estimating the gradient functions for regression or classification

  • We examine the performance of the proposed variable selection method for partially linear models, comparing it against other popular variable selection methods in the literature, including the variable selection method for additive models proposed by [47], COSSO [26], the model-free gradient learning method of [49], and the regular gradient learning method of [30]

  • The idea of gradient learning has become popular for variable selection because it is model-free [49]

Summary

Method

When the number of predictors d is large, variable selection plays a crucial role in strengthening these three pillars. To this end, we assume that the true partially linear model (2.1) is sparse in the sense that some elements of z and some elements of w have no effect on the response variable. To conduct variable selection in both the linear and nonlinear components, we propose a penalized procedure that minimizes the following objective function over the vector of parameters β = (β1, · · · , βp)T and the functions g = (g1, · · · , gq)T, where p + q = d. The parameter τn in the kernel weights is not treated as a tuning parameter; it can be set to the median of the pairwise distances among all the sample points [30].
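The median heuristic for τn is concrete enough to illustrate directly. The sketch below computes it from the sample points of the nonlinear covariates; the Gaussian form of the kernel weights is an assumption for illustration, since this excerpt does not show the paper's exact weight function.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def median_tau(W):
    """Median heuristic for tau_n: the median of the pairwise Euclidean
    distances among all sample points, as suggested in [30].
    W is an (n, q) array of observations of the nonlinear covariates."""
    return np.median(pdist(W))

def kernel_weights(W, tau):
    """Kernel weights between sample pairs. The Gaussian form below is an
    assumed illustration; the paper's exact weight function is not shown
    in this excerpt."""
    sq_dists = squareform(pdist(W, metric="sqeuclidean"))
    return np.exp(-sq_dists / (2.0 * tau ** 2))
```

Because τn is fixed by this heuristic rather than cross-validated, the tuning effort concentrates on the penalty parameter of the group-lasso term.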

Implementation
Tuning
Asymptotic theory
Simulation studies
Methods
Real data applications
Digit recognition data
Japanese industrial chemical firms data
Summary