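Author: 韩旭昊
Name (Hanyu Pinyin): Han XuHao
Student ID: 2021000003008
Institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
Phone: 18893484067
Email: 2739344932@qq.com
Year of enrollment: 2021-9
Degree category: Professional master's
Level of study: Master's student
First-level discipline: Applied Statistics
Discipline code: 0252
First supervisor: 赵煜
First supervisor (Hanyu Pinyin): Zhao Yu
First supervisor's institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
First supervisor's title: Professor
Second supervisor: 陈文凯
Second supervisor (Hanyu Pinyin): Chen WenKai
Second supervisor's institution: 甘肃省地震局 (Gansu Earthquake Agency)
Second supervisor's title: Professor-level senior engineer
Title: 基于Stacking集成学习的地震人员死亡评估
English title: Research on Earthquake Casualties Assessment Based on Stacking Ensemble Learning
Keywords: earthquake; fatalities; Stacking ensemble learning
English keywords: Earthquake; Casualties; Stacking Ensemble Learning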
Abstract

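    Rapid assessment of earthquake fatalities is critical to earthquake emergency response. Knowing the post-earthquake death toll accurately, and using a scientific assessment to guide emergency rescue, matters greatly for lowering economic losses and reducing deaths caused by delayed rescue. Mainland China lies between the circum-Pacific seismic belt and the Eurasian seismic belt and experiences frequent earthquake disasters. Many factors influence earthquake fatalities, including earthquake magnitude, geographic environment, and population density, and they bear directly on the death toll: magnitude determines the severity of the disaster; higher regional population density can lead to more deaths; and secondary disasters can raise the toll further. With these considerations in mind, this thesis uses earthquake disaster loss assessment data for mainland China from 1950 to 2022, selects the factors influencing earthquake fatalities with the random forest algorithm, and builds a rapid assessment model for earthquake fatalities based on the Stacking ensemble learning algorithm, providing technical support for emergency command decisions by governments at all levels and emergency management departments. The model helps deploy rescue resources promptly after an earthquake and minimize disaster losses. The main work is as follows: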
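    1. Selection of earthquake damage data. Basic data were collected for historical destructive earthquakes in mainland China, including intensity maps, fatalities, and the population of the affected areas. The data were organized as needed to obtain the affected area and population density of each intensity zone, and their accuracy was checked.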
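    2. The factors influencing earthquake fatalities were selected using random forest, classification and regression tree (CART), gradient boosting decision tree (GBDT), and adaptive boosting (AdaBoost) algorithms. The performance of each algorithm was evaluated by cross-validation, and the influencing factors were analyzed by feature importance to select the most relevant features.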
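    3. Destructive earthquakes that occurred between 1950 and 2022 were selected, and the death counts were log-transformed. A generative adversarial network (GAN) was used to expand the data set to 2,000 records. The factors selected by the importance analysis were fed into Lasso, SVR, XGBoost, RF, and LightGBM models for training, with parameter values determined by grid search. These models were used to predict earthquake death counts, and comparing their RMSE, MAE, and MAPE values showed that LightGBM predicted best and the Lasso multiple linear regression model worst. A Stacking ensemble model was then built on these models: the outputs of several base learners serve as inputs to another model (the meta-learner), which makes the final prediction. By the evaluation metrics, LightGBM-Stacking gave the most accurate predictions.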
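    4. To analyze the assessment performance of the LightGBM-Stacking model, validation earthquakes were selected at random. To account for the differing effects of earthquakes across regions, mainland China was divided into three regions (northwest, southwest, and east), and samples were classified by seismic intensity. To increase sample diversity, the samples were expanded with a GAN. The model results were analyzed and compared with other assessment methods to verify the model's performance and accuracy. The novelty of this approach lies in accounting for region-specific earthquake conditions: detailed regional division and sample classification, together with the use of a GAN, improve the model's robustness and generalization ability.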
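
Item 2 above ranks candidate factors by tree-based feature importance and checks each algorithm with cross-validation. A minimal Python sketch of that idea with scikit-learn might look like the following. The data are synthetic, the factor names (`magnitude`, `pop_density`, `intensity_area`, `noise_1`, `noise_2`) are hypothetical placeholders rather than the thesis's variables, and only the random forest is shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300

# Hypothetical candidate factors; NOT the thesis's actual variables or data.
names = ["magnitude", "pop_density", "intensity_area", "noise_1", "noise_2"]
X = rng.normal(size=(n, len(names)))

# Synthetic log-fatality target that depends on the first three columns only.
y = 2.0 * X[:, 0] + 1.5 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.1, size=n)

# Evaluate the algorithm with 5-fold cross-validation before trusting its ranking.
model = RandomForestRegressor(n_estimators=100, random_state=0)
cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()

# Fit on all data and rank factors by impurity-based importance.
model.fit(X, y)
ranking = sorted(zip(names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)

# Keep the top three factors (the cutoff of three is an assumption of this sketch).
selected = [name for name, _ in ranking[:3]]

print(f"mean CV R^2: {cv_r2:.3f}")
for name, importance in ranking:
    print(f"{name:>15s}  {importance:.3f}")
print("selected:", selected)
```

The same loop could be repeated for CART, GBDT, and AdaBoost regressors, comparing their cross-validated scores before choosing whose importance ranking to use.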
 

English abstract

  Rapid assessment of fatalities in earthquake disasters is crucial to earthquake emergency response. An accurate picture of post-earthquake fatalities, together with scientific assessment to guide emergency rescue work, is of great significance in reducing economic losses and minimizing deaths caused by untimely rescue. Mainland China is located between the circum-Pacific seismic belt and the Eurasian seismic belt, where earthquake disasters occur frequently. Many factors affect earthquake fatalities, such as earthquake magnitude, geographic environment, and population density. These factors directly affect the death toll: the magnitude is directly related to the severity of the disaster; a higher regional population density may lead to more deaths; and secondary disasters may further increase fatalities. In view of these considerations, this thesis uses earthquake disaster loss assessment data from 1950 to 2022 in mainland China, applies the random forest algorithm to select the factors affecting earthquake fatalities, and establishes a rapid assessment model of earthquake fatalities based on the Stacking ensemble learning algorithm, which provides technical support for emergency command and decision-making by governments at all levels and emergency management departments. The model helps to deploy rescue resources in time after an earthquake and to minimize disaster losses. The main work is as follows:
  1. Selection of earthquake damage data. Basic data such as historical destructive earthquake intensity maps, deaths, and the population of the affected areas in mainland China were collected and organized as needed to obtain the affected area and population density in each intensity zone, and the accuracy of the data was screened.
  2. The factors affecting earthquake fatalities were selected using the Random Forest, Classification and Regression Tree (CART), Gradient Boosting Decision Tree (GBDT), and Adaptive Boosting (AdaBoost) algorithms. The performance of each algorithm was evaluated by cross-validation, and the most relevant features were chosen on the basis of feature-importance analysis.
  3. Destructive earthquakes occurring between 1950 and 2022 were selected, the number of fatalities was log-transformed, and the data were expanded to 2,000 records using a GAN. The factors selected by the importance analysis were used to train Lasso multiple linear regression, SVR, XGBoost, RF, and LightGBM models, with parameter values determined by grid search. These models were used to predict the number of earthquake deaths, and analysis of their RMSE, MAE, and MAPE values showed that LightGBM predicted best and the Lasso multiple linear regression model worst. A Stacking ensemble model was then constructed from these models: the outputs of multiple base learners are used as inputs to another model (the meta-learner), which makes the final prediction. According to the evaluation metrics, LightGBM-Stacking gave the most accurate predictions.
  4. To analyze the assessment performance of the LightGBM-Stacking model, validation earthquakes were selected at random. To account for the differing impacts of earthquakes on different regions, mainland China was divided into three regions (northwest, southwest, and east), and the samples were classified by seismic intensity. To increase sample diversity, the samples were expanded using a GAN. The model results were analyzed and compared with other assessment methods to validate the performance and accuracy of the model. The innovation of this method is that it considers the region-specific circumstances of earthquakes and improves the robustness and generalization ability of the model through careful regional division and sample classification, together with the application of a GAN.
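
Item 3 above describes stacking: several base learners are trained, and their outputs become the inputs of a meta-learner that makes the final prediction on a log-transformed death count. A rough Python sketch of that structure with scikit-learn follows. Everything specific here is an assumption for illustration: the data are synthetic, `GradientBoostingRegressor` stands in for XGBoost and LightGBM, the meta-learner is a plain linear regression, `log1p` is used as the log transform, and the hyperparameters are fixed rather than grid-searched.

```python
import numpy as np
from sklearn.ensemble import (GradientBoostingRegressor, RandomForestRegressor,
                              StackingRegressor)
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in features and death counts; NOT the thesis's data.
X = rng.normal(size=(n, 3))
raw = np.exp(1.0 + 0.8 * X[:, 0] + 0.5 * X[:, 1] * X[:, 2]
             + rng.normal(scale=0.2, size=n))
deaths = np.maximum(1, np.round(raw))   # at least one death per event
y = np.log1p(deaths)                    # log transform (log1p is an assumption)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Base learners; the gradient-boosting model is a swapped-in stand-in.
base_learners = [
    ("lasso", make_pipeline(StandardScaler(), Lasso(alpha=0.01))),
    ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gbdt", GradientBoostingRegressor(random_state=0)),
]

# The meta-learner is trained on out-of-fold predictions of the base learners.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=LinearRegression(), cv=5)
stack.fit(X_tr, y_tr)
pred = stack.predict(X_te)

# Evaluation metrics named in the abstract, computed on the log scale.
rmse = float(np.sqrt(mean_squared_error(y_te, pred)))
mae = float(mean_absolute_error(y_te, pred))
mape = float(np.mean(np.abs(y_te - pred) / y_te))
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}  MAPE={mape:.1%}")
```

Training the meta-learner on out-of-fold predictions (`cv=5`) keeps it from simply memorizing base-learner fits on data they have already seen, which is the usual reason for preferring stacking over averaging the base models directly.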

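Degree type: Master's
Defense date: 2024-05-25
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 62
Total references: 57
Call number: 0005609
Confidentiality level: Public
CLC number: C8/385
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/37070
Collection: School of Statistics and Data Science
Recommended citation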
GB/T 7714
韩旭昊. 基于Stacking集成学习的地震人员死亡评估[D]. 甘肃省兰州市. 兰州财经大学,2024.
Files in this item: 2021级韩旭昊含授权页学位论文.pdf (2311 KB); document type: thesis; access: open access; license: CC BY-NC-SA
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.