High Dimensional Data Analysis: Integrating Submodels


Ahmed S. E., YÜZBAŞI B.

BIG AND COMPLEX DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS, pp. 285-304, 2017 (Journal Indexed in SCI)

  • Volume Number:
  • Publication Date: 2017
  • DOI: 10.1007/978-3-319-41573-4_14
  • Journal Name: BIG AND COMPLEX DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS
  • Pages: pp. 285-304

Abstract

We consider efficient prediction in sparse high dimensional data. In high dimensional settings where d >> n, many penalized regularization strategies have been suggested for simultaneous variable selection and estimation. However, different strategies yield different submodels with d_i < n, where d_i is the number of predictors included in the ith submodel. Some procedures may select a submodel with a larger number of predictors than others. Because of the trade-off between model complexity and prediction accuracy, statistical inference after model selection becomes both extremely important and challenging in high dimensional data analysis. For this reason, we suggest shrinkage and pretest strategies to improve the prediction performance of two selected submodels. Such a pretest and shrinkage strategy is constructed by shrinking an overfitted model estimator in the direction of an underfitted model estimator. Numerical studies indicate that our post-selection pretest and shrinkage strategy improves the prediction performance of the selected submodels.
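
The abstract does not state the estimators' formulas, so the following is only a minimal sketch of the general idea of combining two post-selection estimators. It assumes a classical pretest rule and a Stein-type (optionally positive-part) shrinkage weight of the form 1 - (k - 2)/T. The function names, the test statistic T (`test_stat`), the dimension k, and the critical value are all hypothetical inputs chosen for illustration and may differ from what the chapter actually uses. Both coefficient vectors are assumed to be aligned on the same predictors, with zeros in the underfitted vector for predictors it excludes.

```python
import numpy as np


def pretest_estimator(beta_over, beta_under, test_stat, critical_value):
    """Pretest rule (assumed form): keep the underfitted estimate unless the
    test statistic exceeds the critical value, then use the overfitted one."""
    beta_over = np.asarray(beta_over, dtype=float)
    beta_under = np.asarray(beta_under, dtype=float)
    if test_stat > critical_value:
        return beta_over.copy()
    return beta_under.copy()


def shrinkage_estimator(beta_over, beta_under, test_stat, k, positive_part=True):
    """Stein-type shrinkage (assumed form): move the overfitted estimate
    toward the underfitted one with weight 1 - (k - 2) / test_stat."""
    if k < 3:
        raise ValueError("k must be at least 3 for a Stein-type weight")
    if test_stat <= 0:
        raise ValueError("test_stat must be positive")
    beta_over = np.asarray(beta_over, dtype=float)
    beta_under = np.asarray(beta_under, dtype=float)
    weight = 1.0 - (k - 2) / test_stat
    if positive_part:
        # Truncate at zero so the result never overshoots the underfitted estimate.
        weight = max(weight, 0.0)
    return beta_under + weight * (beta_over - beta_under)


# Illustrative values only; these are not results from the chapter.
beta_over_example = np.array([1.0, 2.0, 0.5, 0.3])   # larger submodel
beta_under_example = np.array([1.2, 1.8, 0.0, 0.0])  # smaller submodel
combined = shrinkage_estimator(beta_over_example, beta_under_example,
                               test_stat=4.0, k=4)
```

A large test statistic leaves the overfitted estimate nearly unchanged, while a small one pulls the result toward the underfitted estimate; the positive-part option stops the weight from becoming negative.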