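Author: 周尧民
Name (Hanyu Pinyin): Zhou Yaomin
Student ID: 2019000003027
Training institution: Lanzhou University of Finance and Economics (兰州财经大学)
Phone: 13893612323
Email: 598127220@qq.com
Enrollment date: 2019-9
Degree category: Academic master's
Training level: Master's student
Discipline category: Economics
First-level discipline: Applied Economics
Research direction: Statistics
Discipline code: 020208
Degree conferred: Master of Economics
First supervisor: 黄恒君
First supervisor (Hanyu Pinyin): Huang Hengjun
First supervisor's institution: Lanzhou University of Finance and Economics (兰州财经大学)
First supervisor's title: Professor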
Title: 基于“分解-聚类-集成”的PM2.5时空预测研究及其应用
English title: Spatio-temporal prediction of PM2.5 based on decomposition-clustering-integration and its application
Keywords: spatiotemporal prediction (时空预测); modal decomposition (模态分解); time series clustering (时间序列聚类); Laplacian operator (拉普拉斯算子); LSTM neural network (LSTM神经网络)
Abstract

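As cities develop and their populations concentrate, urban car ownership keeps rising and surrounding factories emit air pollutants. Together these degrade the urban environment and seriously affect residents' travel and health. This motivates the use of urban big data, such as air quality data, meteorological data, and spatial points of interest (POI), to build an accurate air quality prediction model that helps residents plan their travel and supports government decisions on environmental protection.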

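Building the air quality prediction model from both the temporal and the spatial dimension broadens the research perspective and puts the idea of data fusion into practice by combining time series with spatial information. Taking PM2.5 concentration as an example, this thesis explores how to extract features of the PM2.5 series in the temporal and spatial dimensions, incorporates them into the prediction model, and dynamically combines the temporal and spatial predictions to improve forecasting performance. The main work is as follows: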

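First, the theory and algorithms behind spatio-temporal models are examined, including the handling of missing values and outliers and feature selection based on correlation theory. On this basis, the thesis studies in depth the current algorithms used for PM2.5 prediction, including modal decomposition, time series clustering, and deep neural networks, and builds the theoretical framework of the PM2.5 spatio-temporal prediction model.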

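Second, the spatio-temporal prediction model is built. After analyzing the characteristics of PM2.5 prediction in the temporal and spatial dimensions, a temporal predictor and a spatial predictor are constructed separately. In the temporal dimension, modal decomposition extracts the fluctuation characteristics of the PM2.5 data, a time series clustering algorithm reconstructs the components, and the temporal predictor is built on the ELSTM model. In the spatial dimension, the Laplacian operator extracts the spatial relationships among monitoring stations from a graph-model perspective, and the spatial predictor is built on them. Finally, XGBoost dynamically aggregates the two sets of results, completing the LX-M-CEEMDAN-VMD-LSTM model.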

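Third, PM2.5 concentration series are predicted using air pollutant concentration data, meteorological data, and geographic information for Lanzhou. In the temporal prediction module, a two-layer decomposition combining CEEMDAN and VMD extracts the time series information. Reconstructing the components by clustering both improves the accuracy of the time series prediction and further simplifies the model. In the spatial prediction module, the Laplacian matrix effectively extracts the spatial features of the data and improves spatial prediction accuracy. Feature importances extracted with XGBoost are used to combine temporal and spatial features dynamically, compensating for the weaknesses of each dimension. The superiority and effectiveness of the model are assessed with three evaluation metrics, root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), together with the DM test. The results show that the prediction accuracy of the proposed model improves significantly and that it outperforms the comparison models on every metric.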
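The spatial step described above relies on a graph Laplacian over monitoring stations. The abstract does not say how the station graph is weighted, so the following is only a minimal sketch under stated assumptions: edge weights come from a Gaussian kernel on pairwise distance, and the coordinates, the `sigma` bandwidth, the PM2.5 readings, and the function name `graph_laplacian` are all hypothetical illustrations rather than values or code from the thesis.

```python
import numpy as np

def graph_laplacian(coords, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W for a set of stations.

    Assumption: W uses a Gaussian kernel on squared Euclidean distance.
    The thesis may define station adjacency differently.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise squared distances via broadcasting: shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    W = np.exp(-dist2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)      # no self-loops
    D = np.diag(W.sum(axis=1))    # degree matrix
    return D - W

# Hypothetical planar station coordinates in km (not from the thesis).
stations = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
L = graph_laplacian(stations, sigma=1.5)

# Hypothetical PM2.5 readings, one per station.
pm25 = np.array([35.0, 40.0, 55.0, 80.0])

# (L @ x)[i] = sum_j W[i, j] * (x[i] - x[j]): how much station i
# deviates from its distance-weighted neighbours.
spatial_feature = L @ pm25
```

A vector such as `spatial_feature` could then be supplied to a spatial predictor as an additional input. Whether the thesis uses the Laplacian in exactly this way is not stated in the abstract.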

English abstract

With the development of cities and the agglomeration of their populations, the number of cars in cities continues to increase, and air pollutants emitted by surrounding factories have worsened the urban environment and seriously affected residents' travel and health. Urban big data, such as air quality data, meteorological data, and spatial POI, can therefore be used to build an accurate air quality prediction model that helps residents make travel plans and assists the government in making environmental protection decisions.

Approaching the air quality prediction model from both the time dimension and the space dimension not only enriches the research perspective but also applies the concept of data fusion, integrating time series with spatial information. Taking PM2.5 pollutant concentration as an example, this paper explores feature extraction methods for the PM2.5 sequence in the time and space dimensions, incorporates the extracted features into the prediction model, and dynamically combines the predictions of the two dimensions to improve prediction performance. The main work is as follows:

Firstly, the paper explores the theoretical algorithms of spatiotemporal models, including the processing of missing data and outliers, and feature selection using correlation theory. On this basis, it studies in depth various cutting-edge algorithms for PM2.5 prediction, including modal decomposition, time series clustering, and deep neural networks, and constructs the theoretical framework of the PM2.5 spatiotemporal prediction model.

Secondly, the paper builds the spatiotemporal prediction model, constructing a temporal predictor and a spatial predictor on the basis of an analysis of the characteristics of PM2.5 prediction in the time and space dimensions. In the time dimension, modal decomposition is used to extract the fluctuation characteristics of the PM2.5 data, a time series clustering algorithm is used to reconstruct the components, and the temporal predictor is built on the ELSTM model. In the space dimension, the Laplacian operator is used to extract the spatial relationships among monitoring stations from the perspective of a graph model, so as to construct the spatial predictor. Finally, XGBoost is used to dynamically aggregate the two sets of results, completing the construction of the LX-M-CEEMDAN-VMD-LSTM model.

Thirdly, the PM2.5 concentration sequence is predicted using Lanzhou air pollutant concentration data, meteorological data, and geographic information. In the temporal prediction module, CEEMDAN and VMD are combined into a two-level decomposition method to extract time series information, followed by cluster-based reconstruction, which not only improves the accuracy of the time series prediction but also further simplifies the model. In the spatial prediction module, the Laplacian matrix effectively extracts the spatial features of the data and improves the accuracy of the spatial prediction. The importance of the various features is extracted with XGBoost, and the temporal and spatial features are dynamically combined to make up for the deficiencies of each dimension. Root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE), together with the DM test, are used to assess the advantages and effectiveness of the model. The empirical results show that the prediction accuracy of the proposed model is significantly improved and that it is superior to the control models on all indicators.
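For reference, the three error metrics named above can be written out directly. This is a minimal sketch using their standard definitions. The function names and the sample values are hypothetical and are not results from the thesis, and the DM test is not covered here.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any observed value is zero. This sketch assumes
    strictly positive concentrations.
    """
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

# Hypothetical observed and predicted PM2.5 values (illustration only).
observed = [50.0, 100.0, 200.0]
predicted = [45.0, 110.0, 190.0]

scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
}
```

Lower values are better for all three. RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the observed level.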

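Degree type: Master's
Defense date: 2022-05-15
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 66
Number of references: 48
Collection number: 0004157
Confidentiality level: Public
Chinese Library Classification: C8/290
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/32156
Collection: School of Statistics and Data Science (统计与数据科学学院)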
Recommended citation
GB/T 7714
周尧民. 基于“分解-聚类-集成”的PM2.5时空预测研究及其应用[D]. 甘肃省兰州市. 兰州财经大学,2022.
Files in this item
File name/size: 10741_2019000003027_ (6273KB)
Document type: Thesis
Version type: not specified
Access type: Not yet open
License: CC BY-NC-SA
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.