High-Dimensional Regression Under Correlated Design: An Extensive Simulation Study


Ahmed S. E., Kim H., Yildirim G., YÜZBAŞI B.

25th International Workshop on Matrices and Statistics (IWMS), Funchal, Portugal, 6-9 June 2016, pp. 145-175

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1007/978-3-030-17519-1_11
  • City of Publication: Funchal
  • Country of Publication: Portugal
  • Pages: pp. 145-175
  • Keywords: Correlated design, Penalized and non-penalized methods, High-dimensional data, Monte Carlo, VARIABLE SELECTION, ADAPTIVE LASSO, VIEW
  • Affiliated with İnönü University: Yes

Abstract

Regression problems where the number of predictors, p, exceeds the number of responses, n, have become increasingly important in many diverse fields in the last couple of decades. In the classical case of "small p and large n," the least squares estimator is a practical and effective tool for estimating the model parameters. However, in this so-called Big Data era, models have the characteristic that p is much larger than n. Statisticians have developed a number of regression techniques for dealing with such problems, such as the Lasso by Tibshirani (J R Stat Soc Ser B Stat Methodol 58:267-288, 1996), the SCAD by Fan and Li (J Am Stat Assoc 96(456):1348-1360, 2001), the LARS algorithm by Efron et al. (Ann Stat 32(2):407-499, 2004), the MCP estimator by Zhang (Ann Stat 38:894-942, 2010), and a tuning-free regression algorithm by Chatterjee (High dimensional regression and matrix estimation without tuning parameters, 2015, https://arxiv.org/abs/1510.07294). In this paper, we investigate the relative performance of some of these methods for parameter estimation and variable selection by analyzing real and synthetic data sets. Through an extensive Monte Carlo simulation study, we also compare the relative performance of these methods under a correlated design matrix.