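Author: 吴桐
Name in Hanyu Pinyin: Wu Tong
Student ID: 2021000003036
Institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
Phone: 18916119980
Email: 920698959@qq.com
Year of enrollment: 2021-9
Degree category: Professional master's
Training level: Master's student
First-level discipline: Applied Statistics
Discipline code: 0252
Degree conferred: Master of Economics
First supervisor: 牛成英
First supervisor's name in Hanyu Pinyin: Niu Chengying
First supervisor's affiliation: 兰州财经大学 (Lanzhou University of Finance and Economics)
First supervisor's title: Professor
Title: 基于LASSO的PSM变量选择与应用
English title: PSM Variable Selection and Application Based on LASSO
Keywords: LASSO; propensity score matching (PSM); LASSO-PSM; variable selection; job type
Foreign-language keywords: LASSO; Propensity Score Matching (PSM); LASSO-PSM; Variable Selection; Job Type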
Abstract

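Randomized experiments are the gold standard for causal-effect analysis under the counterfactual framework, but in empirical research based on observational data the sample units often cannot satisfy the requirement of random assignment, for a variety of reasons. Propensity score matching (PSM) is a common method for processing study data into "randomized controlled experiment data". It aims to reduce bias and confounding in observational data and is now widely used in many fields.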

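With high-dimensional data, however, the specification of the propensity score matching model directly affects the balance between the matched treatment and control samples. In particular, adding too many variables in order to control potential confounders, improve matching quality and strengthen model robustness introduces correlation among the variables and brings problems such as the curse of dimensionality, differences in support, and multiple testing. The end result is poorly balanced matches and unreliable causal-effect estimates, so the model variables must be chosen carefully when PSM is used in order to make the matching results more reliable.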

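Among variable selection methods, the least absolute shrinkage and selection operator (LASSO) has the distinctive advantage of selecting features automatically. Applying this advantage to PSM, the thesis proposes a LASSO-based PSM model, the LASSO-PSM model, which addresses the subjectivity of model specification and the curse of dimensionality in traditional propensity score matching. The results show that the LASSO-PSM model is feasible and that its matching results are better balanced than those of the PSM model.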

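The LASSO-PSM model is applied to an empirical study of job-type preference and the factors behind job choice. The model is first used to screen the variables related to workers' employment preferences, and the screened variables are then used to compute the causal effects of job type on workers' living conditions (income, psychological stress, health status and happiness). The study finds that the variables selected by the LASSO-PSM model are more meaningful in practice and that income differs significantly among workers in different types of jobs.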

English abstract

Randomized experiments are considered the gold standard for causal effect analysis within the counterfactual framework. However, in empirical studies based on observational data, the sample units often cannot meet the requirements of random allocation, for a variety of reasons. Propensity Score Matching (PSM) is a commonly used method to process study data into "randomized controlled trial data," aiming to reduce bias and interference from confounding factors in observational data. PSM is now widely applied in numerous fields.

With high-dimensional data, however, the specification of the propensity score matching model directly affects the balance of matching results between the treatment and control groups. In particular, adding too many variables in order to control potential confounders, improve matching quality, and enhance model robustness can introduce correlation among the variables. This, in turn, causes problems such as the curse of dimensionality, support differences, and multiple testing, and ultimately leads to poorly balanced matching results and unreliable estimates of causal effects. Therefore, when using PSM, it is essential to select model variables carefully to make the matching results more reliable.

Among variable selection methods, the unique advantage of the Least Absolute Shrinkage and Selection Operator (LASSO) is its ability to select features automatically. By bringing this advantage into Propensity Score Matching (PSM), a LASSO-based PSM model, namely the LASSO-PSM model, is proposed to address the subjectivity of model specification and the curse of dimensionality in traditional propensity score matching. The results indicate that the LASSO-PSM model is feasible and that its matching results are better balanced than those of the PSM model.
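As a rough illustration of the variable selection step described in this paragraph, the sketch below fits an L1-penalised logistic regression for treatment assignment and keeps the covariates whose coefficients are not shrunk to zero. It is a minimal standard-library sketch under stated assumptions, not the thesis's implementation: the helper names (`lasso_logistic`, `propensity_scores`), the proximal gradient solver, the penalty value `lam=0.05` and the synthetic data are all hypothetical choices made for illustration. In practice a library routine such as scikit-learn's `LogisticRegression` with an L1 penalty would normally be used instead.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty: shrink v toward zero by t.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_logistic(X, treat, lam, step=0.5, iters=300):
    """Fit L1-penalised logistic regression by proximal gradient descent.
    Returns (intercept, coefficients); the intercept is not penalised."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(iters):
        g0, g = 0.0, [0.0] * p
        for xi, ti in zip(X, treat):
            r = sigmoid(b0 + sum(b * x for b, x in zip(beta, xi))) - ti
            g0 += r / n
            for j in range(p):
                g[j] += r * xi[j] / n
        b0 -= step * g0
        beta = [soft_threshold(b - step * gj, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta

def propensity_scores(X, b0, beta):
    # Estimated probability of treatment for every unit.
    return [sigmoid(b0 + sum(b * x for b, x in zip(beta, xi))) for xi in X]

# Synthetic illustration (made-up data): 6 standardised covariates,
# of which only the first two actually drive treatment assignment.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(6)] for _ in range(300)]
treat = [1 if random.random() < sigmoid(1.5 * x[0] - 1.5 * x[1]) else 0
         for x in X]

b0, beta = lasso_logistic(X, treat, lam=0.05)
selected = [j for j, b in enumerate(beta) if b != 0.0]
scores = propensity_scores(X, b0, beta)
print("selected covariate indices:", selected)
```

The penalty strength controls how many covariates survive: a larger `lam` zeroes out more coefficients. Choosing it (for example by cross-validation) is left out of this sketch.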

In an empirical study on job-type preference and the factors influencing job selection, the LASSO-PSM model was applied to select the variables related to workers' job preferences. The selected variables were then used to calculate the causal effects of different job types on workers' living conditions (economic income, psychological pressure, health status, and sense of happiness). The findings indicate that the variables selected by the LASSO-PSM model are more meaningful in practice, and that economic income differs significantly among workers in different types of jobs.
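The matching and effect-estimation step mentioned here can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not say which matching algorithm, caliper or balance statistic the thesis uses, so 1:1 nearest-neighbour matching with replacement, a caliper of 0.05 and the standardised mean difference are illustrative choices. The function names and the tiny data set are hypothetical and made up for the example.

```python
import statistics

def match_nearest(scores, treat, caliper=0.05):
    """1:1 nearest-neighbour matching with replacement on the propensity score.
    Returns (treated_index, control_index) pairs; treated units with no
    control within the caliper are dropped."""
    controls = [i for i, t in enumerate(treat) if t == 0]
    pairs = []
    for i, t in enumerate(treat):
        if t != 1:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
    return pairs

def att(outcome, pairs):
    # Average treatment effect on the treated: mean outcome gap over pairs.
    return statistics.mean(outcome[i] - outcome[j] for i, j in pairs)

def smd(x, treat_idx, ctrl_idx):
    """Standardised mean difference of one covariate between two groups."""
    xt = [x[i] for i in treat_idx]
    xc = [x[j] for j in ctrl_idx]
    pooled = ((statistics.variance(xt) + statistics.variance(xc)) / 2) ** 0.5
    return (statistics.mean(xt) - statistics.mean(xc)) / pooled

# Made-up toy data: 3 treated units followed by 4 control units.
scores = [0.80, 0.60, 0.30, 0.78, 0.62, 0.33, 0.10]  # propensity scores
treat = [1, 1, 1, 0, 0, 0, 0]
outcome = [10, 8, 5, 7, 6, 4, 3]      # e.g. an income measure
age = [40, 35, 30, 41, 33, 31, 22]    # one covariate for the balance check

pairs = match_nearest(scores, treat)
effect = att(outcome, pairs)

treated = [i for i, t in enumerate(treat) if t == 1]
all_controls = [i for i, t in enumerate(treat) if t == 0]
smd_before = smd(age, treated, all_controls)
smd_after = smd(age, [i for i, _ in pairs], [j for _, j in pairs])
print(pairs, effect, round(smd_before, 3), round(smd_after, 3))
```

A standardised mean difference closer to zero after matching indicates better covariate balance, which is the sense in which the abstract compares the LASSO-PSM and PSM matching results.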

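Degree type: Master's
Defense date: 2024-05-25
Place of degree conferral: 甘肃省兰州市 (Lanzhou, Gansu Province)
Research direction: Big data analysis
Language: Chinese
Total pages: 67
Number of references: 64
Collection number: 0005637
Confidentiality level: Public
Chinese Library Classification number: C8/413
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/36689
Collection: 统计与数据科学学院 (School of Statistics and Data Science)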
Recommended citation
GB/T 7714
吴桐. 基于LASSO的PSM变量选择与应用[D]. 甘肃省兰州市: 兰州财经大学, 2024.
Files in this item
File name/size: 2021000003036.pdf (1072KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.