Integrated Model of GLM and Neural Network and Its Application

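Author	Li Renxiang (李仁祥)
Name in Hanyu Pinyin	Li Renxiang
Student ID	2018000003091
Institution	Lanzhou University of Finance and Economics (兰州财经大学)
Phone	18811750669
Email	972094360@qq.com
Enrollment date	2018-9
Degree category	Academic master's degree
Degree level	Master's student
Discipline category	Science
First-level discipline	Statistics
Research direction	Mathematical Statistics
Discipline code	0714Z4
First supervisor	Meng Shengwang (孟生旺)
First supervisor's name in Hanyu Pinyin	Meng Shengwang
First supervisor's institution	Renmin University of China (中国人民大学)
First supervisor's title	Professor
Title	Integrated Model of GLM and Neural Network and Its Application (GLM与神经网络的集成模型及其应用)
English title	Integrated Model of GLM and Neural Network and Its Application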
Keywords	Deep neural network; Generalized linear model; Residual correction; Integrated model
English keywords	Deep Neural Network; Generalized Linear Model; Residual Correction; Integration Model
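Abstract	The generalized linear model (GLM) is the standard model in non-life insurance actuarial science and is widely used in non-life ratemaking and reserve evaluation. For claim frequency prediction, the commonly used GLMs are the Poisson regression model and the negative binomial regression model. Although GLMs are highly interpretable, they do not capture complex relationships in the data well. Neural network models, by contrast, predict well and can reveal such complex relationships. The main goal of this thesis is to combine the GLM with neural network models so as to improve predictive performance while keeping the model interpretable. The thesis first introduces several claim frequency models and, to address variable selection, builds a Lasso-GAMLSS (Generalized Additive Models for Location, Scale and Shape) model. It then introduces several deep learning neural networks into non-life actuarial modeling; an analysis of real motor insurance data shows that the neural network models outperform the traditional claim frequency models. Building on these results, the GLM is combined with a neural network by introducing the CANN (Combined Actuarial Neural Network) model. Because the GAM (Generalized Additive Model) fits better than the GLM, the CANN is extended to a CANN based on the GAM. Because the CNN (Convolutional Neural Network) can select variables automatically and performs well when there are many variables, the CANN is also extended to a CANN based on the CNN. To improve the interpretability of the CANN model, the thesis proposes an ensemble model in which deep neural networks correct the GLM residuals. It first studies the approach theoretically, explaining the advantages of the ensemble model, and then builds it. First, a GLM produces an initial prediction of the loss data, and the residuals between the observed and predicted values are computed. Several neural network models are then used to predict the loss data, and the better-performing models and their optimal parameters are selected. With the loss-related factors as independent variables and the residuals as the dependent variable, improved BP (Back Propagation), DNN (Deep Neural Network) and CNN models are fitted to the residuals; the residuals predicted by the deep neural networks are then used to correct the GLM predictions, giving the final prediction. Following the ensemble idea, suitable residual-correction models are selected to build two ensemble models: the first combines the residual-correction models using a DNN, and the second combines the prediction models using linear regression. The empirical study finds that the residual-correction models outperform the GLM and perform close to the neural network models, that the ensemble models outperform any single residual-correction model, and that the second ensemble method outperforms the first. The resulting ensemble model improves motor insurance claim frequency prediction while retaining the interpretability of the traditional GLM.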
English abstract	The generalized linear model is the standard model in non-life insurance actuarial science and is widely used in non-life insurance ratemaking and reserve evaluation. In the prediction of claim frequency, the Poisson regression model and the negative binomial regression model are commonly used. Although the generalized linear model has good explanatory power, it cannot reflect the complex relationships in the data well. The neural network model performs well and can reveal these complex relationships. The main goal of this paper is to combine the GLM (generalized linear model) with the neural network model to obtain better predictions while keeping the model's explanatory power. This paper first introduces several claim frequency models and establishes a Lasso-GAMLSS (generalized additive models for location, scale and shape) model to address variable selection. It then introduces several deep learning neural networks into non-life insurance actuarial work. Analysis of actual vehicle insurance data shows that the neural network models perform better than the traditional claim frequency models. Based on these results, the generalized linear model and the neural network model are combined by introducing the CANN (combined actuarial neural network) model. Because the GAM (generalized additive model) fits better than the generalized linear model, the CANN is extended to a CANN based on the generalized additive model. Because the CNN (convolutional neural network) model can select variables automatically and performs well when there are many variables, the CANN is also extended to a CANN based on the CNN model. To improve the explanatory power of the CANN model, this paper proposes an integrated model in which deep learning neural networks correct the GLM residuals. It first studies the approach from a theoretical perspective, explains the advantages of the integrated model, and then establishes it. First, the generalized linear model is used to predict the loss data, and the residual between the predicted value and the observed value is calculated. Several neural network models are used to predict the loss data, and the models with relatively good results, together with their optimal parameters, are selected. Taking the loss influencing factors as independent variables and the residuals as the dependent variable, improved BP (back propagation), DNN (deep neural network) and CNN models are established to fit the residuals, and the prediction of the generalized linear model is then corrected by the residual predicted by the deep neural network to obtain the final prediction. Based on the idea of integration, two integrated models are established by selecting appropriate models from the residual correction models. The first method integrates the residual correction models with a DNN model, and the second method integrates the prediction models with a linear regression model. The empirical study finds that the residual correction models perform better than the generalized linear model and similarly to the neural network models, that the integrated models perform better than any single residual correction model, and that the second integration method performs better than the first. The resulting integrated model improves the prediction of vehicle insurance claim frequency while retaining the explanatory power of the traditional GLM.
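The residual-correction step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the data are synthetic, the function names (`fit_poisson_glm`, `fit_residual_net`, and so on) are hypothetical, and the residual model is a single hidden layer with fixed random weights and a ridge-fitted output layer, used as a lightweight stand-in for the BP, DNN and CNN models the thesis fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic claim counts (illustrative only, not the thesis data).
n = 2000
X = rng.normal(size=(n, 2))
true_rate = np.exp(-1.0 + 0.3 * X[:, 0] + 0.5 * np.sin(2.0 * X[:, 1]))
y = rng.poisson(true_rate)


def fit_poisson_glm(X, y, n_iter=25):
    """Fit a log-link Poisson GLM by iteratively reweighted least squares."""
    A = np.column_stack([np.ones(len(X)), X])   # add an intercept column
    beta = np.zeros(A.shape[1])
    beta[0] = np.log(y.mean())                  # start at the intercept-only fit
    for _ in range(n_iter):
        eta = A @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu                 # working response
        WA = A * mu[:, None]                    # IRLS weights equal mu for the log link
        beta = np.linalg.solve(A.T @ WA, WA.T @ z)
    return beta


def glm_predict(beta, X):
    A = np.column_stack([np.ones(len(X)), X])
    return np.exp(A @ beta)


def fit_residual_net(X, resid, n_hidden=32, ridge=1e-3):
    """Stand-in residual model: random tanh hidden layer, ridge output layer."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    w_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ resid)
    return W, b, w_out


def net_predict(params, X):
    W, b, w_out = params
    return np.tanh(X @ W + b) @ w_out


beta = fit_poisson_glm(X, y)
glm_pred = glm_predict(beta, X)
resid = y - glm_pred                            # observed minus GLM prediction
net = fit_residual_net(X, resid)
corrected = glm_pred + net_predict(net, X)      # GLM prediction plus fitted residual

mse_glm = np.mean((y - glm_pred) ** 2)
mse_corrected = np.mean((y - corrected) ** 2)
print(f"in-sample MSE  GLM: {mse_glm:.4f}  corrected: {mse_corrected:.4f}")
```

The comparison above is in-sample only, so it says nothing about out-of-sample gains; a held-out set would be needed for that. An additive correction can also push a predicted frequency below zero, which a real implementation would have to handle. The GLM coefficients in `beta` remain available for interpretation after the correction.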
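Degree type	Master's
Defense date	2021-05-15
Place of degree conferral	Lanzhou, Gansu Province
Language	Chinese
Total pages	70
Total references	50
Collection number	0003537
Confidentiality level	Public
CLC number	O212/9
Document type	Thesis
Item identifier	http://ir.lzufe.edu.cn/handle/39EH0E1M/29663
Collection	School of Statistics and Data Science
Recommended citation (GB/T 7714)	Li Renxiang. Integrated Model of GLM and Neural Network and Its Application[D]. Lanzhou, Gansu Province. Lanzhou University of Finance and Economics, 2021.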

Files in this item
File name / size	Document type	Version type	Access type	License
兰州财经大学李仁祥毕业论文终稿上传 (1724KB)	Thesis		Open access	CC BY-NC-SA