Abstract

The lasso model has been widely used for model selection in data mining, machine learning, and high-dimensional statistical analysis. However, with the ultrahigh-dimensional, large-scale data sets now collected in many real-world applications, it is important to develop algorithms that solve the lasso efficiently at this scale. Discarding features before or during optimization (feature screening) is a powerful technique for increasing efficiency and addressing the Big Data challenge. This paper proposes a family of hybrid safe–strong rules (HSSR) that incorporate safe screening rules into the sequential strong rule (SSR) to remove unnecessary computational burden. Two instances of HSSR, SSR-Dome and SSR-BEDPP, are presented for the standard lasso problem; SSR-BEDPP is further extended to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. Extensive numerical experiments with synthetic and real data sets are conducted for both the standard lasso and the group lasso problems, and the results show that the proposed hybrid rules can substantially outperform existing state-of-the-art rules.
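
To make the hybrid screening idea concrete, the sketch below is a minimal Python illustration (not the paper's implementation): it applies the sequential strong rule only to features that a safe rule has not already eliminated. The function name, signature, and the (1/2n)-scaled lasso parameterization are assumptions for illustration; the safe-rule mask would come from a bound such as BEDPP or Dome, which is not computed here, and a KKT check on the discarded features is still required afterward because the strong rule by itself is not safe.

```python
import numpy as np

def hybrid_screen(X, y, beta_prev, lam_prev, lam_cur, safe_keep):
    """One hybrid safe-strong screening step along a decreasing lambda path.

    X         : (n, p) standardized design matrix
    y         : (n,) response vector
    beta_prev : (p,) lasso solution at the previous value lam_prev
    lam_prev  : previous lambda on the path
    lam_cur   : current (smaller) lambda
    safe_keep : (p,) boolean mask from a safe rule (e.g. BEDPP or Dome);
                False entries are guaranteed inactive at lam_cur
    Returns a boolean mask of features surviving both screens; these are
    passed to the solver, followed by a KKT check on the discarded set.
    """
    n = len(y)
    r = y - X @ beta_prev                    # residual at the previous solution
    keep = safe_keep.copy()                  # start from the safe-rule survivors
    idx = np.flatnonzero(keep)
    # Sequential strong rule (Tibshirani et al., 2012): discard feature j
    # when |x_j' r| / n < 2*lam_cur - lam_prev, under the
    # (1/2n)||y - X beta||^2 + lam*||beta||_1 parameterization.
    corr = np.abs(X[:, idx].T @ r) / n
    keep[idx] = corr >= 2 * lam_cur - lam_prev
    return keep
```

The point of the hybrid ordering is that the safe rule is cheap to evaluate, so the more expensive strong-rule correlations (and the subsequent solver work and KKT checks) are only carried out on the features the safe rule could not rule out.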
