Time: Thursday, March 29, 2018, 4:00-5:00 p.m.

Venue: Academic Lecture Hall, 2nd Floor, Mathematics Building

Speaker: Dr. Zhen Pang (厐震), The Hong Kong Polytechnic University


This talk concerns variable screening when highly correlated variables are present in high-dimensional models. The elastic net procedure (Zou and Hastie, 2005), which was designed for this situation, may select the highly correlated variables but tends to include too many truly irrelevant variables. We propose a novel cluster feature selection (CFS) procedure, based on the elastic net and linear correlation variable screening, that enjoys the benefits of both methods. When calculating the correlation between a predictor and the response, we consider the group of highly correlated predictors rather than individual predictors, in contrast to the usual linear correlation variable screening. Within each correlated group, we apply the elastic net to select variables and estimate their coefficients. This avoids the drawback of the LASSO (Tibshirani, 1996), which can mistakenly eliminate truly non-zero coefficients of highly correlated variables. After the cluster feature selection procedure is applied, the maximum absolute sample correlation coefficient between clusters becomes smaller, and any common model selection method such as SIS (Fan and Lv, 2008) or the LASSO can then be applied to improve the results. Extensive numerical examples, including pure simulation examples and semi-real examples, are presented to show the good performance of our procedure.
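
As a rough illustration of the cluster-level screening idea described in the abstract, the following is a minimal sketch and not the speaker's CFS implementation. It makes several assumptions that the abstract does not specify: predictors are grouped whenever their absolute sample correlation exceeds a hypothetical `threshold`, a group is scored by the largest absolute correlation between any of its members and the response, and the top `keep` groups are retained. The within-group elastic net step is omitted, and all function and parameter names are illustrative.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlated_clusters(columns, threshold):
    """Group column indices linked by |correlation| > threshold (union-find)."""
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(pearson(columns[i], columns[j])) > threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(columns)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def screen_clusters(columns, y, threshold, keep):
    """Rank clusters by their largest |corr(x_j, y)| and keep the top `keep`.

    The max-based group score is an assumption made for this sketch.
    """
    clusters = correlated_clusters(columns, threshold)

    def score(cluster):
        return max(abs(pearson(columns[j], y)) for j in cluster)

    return sorted(clusters, key=score, reverse=True)[:keep]


# Toy data: predictors 0 and 1 share a latent factor z and drive the response;
# predictors 2 to 5 are independent noise.
random.seed(0)
n = 200
z = [random.gauss(0, 1) for _ in range(n)]
columns = [[zi + random.gauss(0, 0.1) for zi in z] for _ in range(2)]
columns += [[random.gauss(0, 1) for _ in range(n)] for _ in range(4)]
y = [columns[0][i] + columns[1][i] + random.gauss(0, 1) for i in range(n)]

selected = screen_clusters(columns, y, threshold=0.8, keep=1)
print(selected)
```

In this toy setting the two predictors that share the latent factor are treated as a single unit during screening, so neither is discarded merely because the other already explains the response. In a fuller pipeline, an elastic net fit would then be run within each retained group.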