High Dimensional Data Analysis: Integrating Submodels

Ahmed, Syed; YÜZBAŞI, BAHADIR

doi:10.1007/978-3-319-41573-4_14

Ahmed S. E., YÜZBAŞI B.

BIG AND COMPLEX DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS, pp.285-304, 2017 (SCI-Expanded)

Publication Type: Article / Full Article
Volume Number:
Publication Date: 2017
DOI Number: 10.1007/978-3-319-41573-4_14
Journal Name: BIG AND COMPLEX DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS
Indexes Covering the Journal: Science Citation Index Expanded (SCI-EXPANDED)
Page Numbers: pp.285-304
Keywords: Monte Carlo simulation, Pretest, penalty and shrinkage strategies, Sparse regression models, MODEL SELECTION, VARIABLE SELECTION, REGRESSION-MODELS, ORACLE PROPERTIES, ADAPTIVE LASSO, SHRINKAGE, PENALTY
Affiliated with İnönü Üniversitesi: Yes

Abstract

We consider efficient prediction in sparse high dimensional data. In high dimensional data settings where d >> n, many penalized regularization strategies have been suggested for simultaneous variable selection and estimation. However, different strategies yield different submodels with d_i < n, where d_i denotes the number of predictors included in the ith submodel. Some procedures may select a submodel with a larger number of predictors than others. Because of the trade-off between model complexity and prediction accuracy, the statistical inference of model selection becomes extremely important and challenging in high dimensional data analysis. For this reason, we suggest shrinkage and pretest strategies to improve the prediction performance of two selected submodels. Such a pretest and shrinkage strategy is constructed by shrinking an overfitted model estimator in the direction of an underfitted model estimator. The numerical studies indicate that our post-selection pretest and shrinkage strategy improved the prediction performance of the selected submodels.
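
The construction described in the abstract, shrinking an overfitted submodel estimator toward an underfitted one, can be illustrated with a minimal sketch. The Stein-type weight 1 - (k - 2)/T and the hard-threshold pretest rule used here are generic textbook forms assumed for illustration; the abstract does not state the authors' exact formulas. The function names, the test statistic `test_stat`, the count `n_extra` of predictors that appear only in the larger submodel, and the convention that both coefficient vectors are aligned on the same predictors (with zeros where the smaller submodel excludes a predictor) are all assumptions.

```python
from typing import List, Sequence


def _check_aligned(beta_over: Sequence[float], beta_under: Sequence[float]) -> None:
    # Assumption: both estimates are indexed by the same predictors.
    if len(beta_over) != len(beta_under):
        raise ValueError("coefficient vectors must have the same length")


def pretest_estimator(
    beta_over: Sequence[float],
    beta_under: Sequence[float],
    test_stat: float,
    critical_value: float,
) -> List[float]:
    """Hypothetical pretest rule: use the underfitted estimate unless the
    test statistic exceeds the critical value, in which case use the
    overfitted estimate."""
    _check_aligned(beta_over, beta_under)
    chosen = beta_under if test_stat <= critical_value else beta_over
    return [float(b) for b in chosen]


def shrinkage_estimator(
    beta_over: Sequence[float],
    beta_under: Sequence[float],
    test_stat: float,
    n_extra: int,
    positive_part: bool = True,
) -> List[float]:
    """Hypothetical Stein-type rule: move the overfitted estimate toward the
    underfitted one by the weight 1 - (n_extra - 2) / test_stat."""
    _check_aligned(beta_over, beta_under)
    if test_stat <= 0:
        raise ValueError("test_stat must be positive")
    if n_extra < 3:
        raise ValueError("n_extra must be at least 3 for this weight")
    weight = 1.0 - (n_extra - 2) / test_stat
    if positive_part:
        # Truncate at zero so the estimate never overshoots the underfitted one.
        weight = max(weight, 0.0)
    return [u + weight * (o - u) for o, u in zip(beta_over, beta_under)]


# Illustrative values only, not taken from the paper.
beta_over = [1.0, 2.0, 0.5]    # estimate from the larger submodel
beta_under = [1.2, 1.8, 0.0]   # estimate from the smaller submodel, zero-padded
combined = shrinkage_estimator(beta_over, beta_under, test_stat=10.0, n_extra=4)
```

Under these assumptions, a large `test_stat` gives a weight close to 1 and the combined estimate stays near the overfitted one, while a small `test_stat` drives the positive-part weight to 0 and returns the underfitted estimate.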