Title: Information-Adaptive Lasso for Text Data
Speaker: Prof. Cathy Y. Chen (陈怡璇)
Time: October 23, 2023, 14:30-15:30
Venue: Room 319, Weige Hall (维格堂), Main Campus
Abstract:
Economic analysis using text data has expanded rapidly over the last few years. In this paper, we present a novel method, the information-adaptive Lasso (IA-Lasso), designed specifically for text data. Our approach extends the traditional lasso to handle the heterogeneous informativeness of individual words or phrases in such data. To regularize the information capacity of words, we introduce an information-adaptive weighting function that facilitates consistent text selection and estimation. We establish the oracle properties of IA-Lasso under mild regularity conditions on the sequence of informativeness conveyed by each word. Additionally, we provide a Bayesian interpretation of the informativeness variable, linking it to the distribution of a random waiting time. To evaluate its performance, we conduct a comprehensive simulation experiment comparing the IA-Lasso with the adaptive lasso proposed by Zou (2006). We use the FOMC statements to demonstrate the predictive performance of text data and the resulting sparsity in the text representation.
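For orientation, the benchmark method named in the abstract, the adaptive lasso of Zou (2006), can be written as a weighted lasso. The notation below (response $y$, design matrix $X$, coefficients $\beta$, tuning parameters $\lambda$ and $\gamma$, initial estimate $\hat\beta^{\mathrm{init}}$) is assumed here for illustration and is not taken from the talk. The abstract does not give the form of the IA-Lasso weighting function, so only the benchmark is shown.

```latex
% Adaptive lasso (Zou, 2006): an L1 penalty with data-driven weights.
% Notation is assumed for illustration; p is the number of candidate words or phrases.
\hat{\beta}
  = \arg\min_{\beta}\;
    \lVert y - X\beta \rVert_2^2
    + \lambda \sum_{j=1}^{p} \hat{w}_j \,\lvert \beta_j \rvert,
\qquad
\hat{w}_j = \frac{1}{\lvert \hat{\beta}^{\mathrm{init}}_j \rvert^{\gamma}},
\quad \gamma > 0 .
```

According to the abstract, IA-Lasso instead uses an information-adaptive weighting function tied to the informativeness of each word.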
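Biography: Dr. Cathy Y. Chen (陈怡璇) is a professor at the Business School of the University of Glasgow, UK. Her research covers several areas, including "public data scraping", "text mining for financial and accounting analysis", and "risk modeling and management". She has professional experience in risk modeling and management in the banking industry. In recent years she has focused on the development and application of text mining techniques.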
Host: Yan Jigao (严继高)