Variable selection and estimation with the seamless-L0 penalty

Lee Dicker, Baosheng Huang, Xihong Lin

Research output: Contribution to journal, Article, peer-reviewed

78 citations (Scopus)

Abstract

Penalized least squares procedures that directly penalize the number of variables in a regression model (L0 penalized least squares procedures) enjoy nice theoretical properties and are intuitively appealing. On the other hand, L0 penalized least squares methods also have significant drawbacks: implementation is NP-hard and is not computationally feasible when the number of variables is even moderately large. One of the challenges is the discontinuity of the L0 penalty. We propose the seamless-L0 (SELO) penalty, a smooth function on [0, ∞) that very closely resembles the L0 penalty. The SELO penalized least squares procedure is shown to consistently select the correct model and to be asymptotically normal, provided the number of variables grows more slowly than the number of observations. SELO is efficiently implemented using a coordinate descent algorithm. Since tuning parameter selection is crucial to the performance of the SELO procedure, we propose a BIC-like tuning parameter selection method for SELO, and show that it consistently identifies the correct model while allowing the number of variables to diverge. Simulation results show that the SELO procedure with BIC tuning parameter selection performs well in a variety of settings, outperforming other popular penalized least squares procedures by a substantial margin. Using SELO, we analyze a publicly available HIV drug resistance and mutation dataset and obtain interpretable results.
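
The abstract does not state the penalty itself. As a minimal sketch, assuming the form p(β; λ, τ) = (λ / log 2) · log(|β| / (|β| + τ) + 1) with τ > 0 (recalled from the published paper and not given on this page), the following compares SELO with the L0 penalty. The function names `l0_penalty` and `selo_penalty` and the parameter values are hypothetical, chosen only for illustration.

```python
import math


def l0_penalty(beta, lam):
    """L0 penalty for one coefficient: lam if beta is nonzero, else 0."""
    return lam if beta != 0 else 0.0


def selo_penalty(beta, lam, tau):
    """Assumed SELO form: (lam / log 2) * log(|beta| / (|beta| + tau) + 1).

    tau > 0 controls how closely the curve hugs the L0 step:
    the smaller tau is, the sharper the rise away from beta = 0.
    """
    a = abs(beta)
    return (lam / math.log(2)) * math.log(a / (a + tau) + 1.0)


if __name__ == "__main__":
    lam, tau = 1.0, 0.01  # illustrative values, not taken from the paper
    for beta in (0.0, 0.001, 0.01, 0.1, 1.0, 10.0):
        print(
            f"beta={beta:<6} "
            f"L0={l0_penalty(beta, lam):.3f} "
            f"SELO={selo_penalty(beta, lam, tau):.3f}"
        )
```

Under this assumed form the penalty is exactly zero at β = 0, is continuous there, and approaches λ from below as |β| grows, which is the sense in which it resembles the L0 penalty without its discontinuity.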

Original language: English
Pages (from-to): 929-962
Number of pages: 34
Journal: Statistica Sinica
Volume: 23
Issue number: 2
DOI
Publication status: Published - Apr 2013
