Prediction of PM2.5 Concentration Based on Multi-View Data Fusion: A Case Study of Lanzhou City

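Author	廖若雯
Name (Hanyu Pinyin)	LiaoRuowen
Student ID	2021000003017
Institution	Lanzhou University of Finance and Economics
Phone	18717781328
Email	rwliao0405@163.com
Year of enrollment	2021-9
Degree category	Professional master's degree
Level of study	Master's student
First-level discipline	Statistics
Discipline code	0252
First supervisor	黄恒君
First supervisor (Hanyu Pinyin)	HuangHengjun
First supervisor's affiliation	School of Statistics and Data Science, Lanzhou University of Finance and Economics
First supervisor's title	Professor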
Title	Prediction of PM2.5 Concentration Based on Multi-View Data Fusion: A Case Study of Lanzhou City
English title	Prediction of PM2.5 concentration based on multi-view data fusion: A case study of Lanzhou City
Keywords	PM2.5; data fusion; graph convolutional network; graph attention network; prediction
Keywords (foreign language)	PM2.5; Data fusion; Graph convolutional network; Graph attention network; Prediction
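Abstract	Of the three critical battles China has set out, the battle for pollution prevention and control bears directly on public health and on sustainable social and economic development. Air quality is one of its priorities. In recent years, industrial emissions, vehicle exhaust, dense populations and other factors have worsened air quality in many regions of China. The continued deterioration has seriously affected urban development and residents' health and has made air quality a focus of public concern. Fine particulate matter with a diameter of 2.5 µm or less (PM2.5) is a principal pollutant in air quality monitoring and a main constraint on improving air quality. This thesis therefore takes PM2.5 concentration as its prediction target and studies the air quality prediction problem. It reviews PM2.5 concentration prediction methods in light of the current state and direction of research, from traditional statistical methods through machine learning to the deep learning methods that many researchers now use, analyzes the strengths and weaknesses of the existing literature, and finds that the accuracy of existing prediction models leaves considerable room for improvement.
The main work of the thesis is accordingly to build, from air quality data, meteorological data, points of interest (POI) and road network data, a multi-view graph convolutional gated recurrent unit (MGCN-GRU) model framework and a multi-view graph attention long short-term memory (MGATs-LSTM) model framework for spatio-temporal prediction of PM2.5 concentration, using data from Lanzhou, Gansu Province as the case study. Building on existing literature, the thesis examines the correlations between PM2.5 concentration and temporal features, other pollutant features, meteorological features, POI features and road network structure features. Considering the temporal and spatial dimensions separately, it hypothesizes that time-series features introduce temporal dependence into PM2.5 prediction and that spatial features introduce spatial correlation. It aims to explore how these features influence PM2.5 prediction, which should help improve predictive performance, and tests this through experiments. For non-time-series data such as POI and road networks, current feature extraction methods overlook the hierarchical relationships between different categories of POI and the road network. To address this, we propose learning feature representations of the non-time-series data through graph structures and using them as auxiliary information in PM2.5 concentration prediction. Finally, the prediction models are evaluated with five metrics: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and the coefficient of determination (R²).
The results show that the spatio-temporal prediction models that add spatial features on top of multi-view data fusion predict PM2.5 concentration more accurately than the other models. To demonstrate the effect of multi-view data fusion on predictive performance, ablation experiments on urban PM2.5 concentration prediction were also carried out. They indicate that data fusion offers important advantages for urban PM2.5 concentration prediction and can help us better understand and predict air quality in support of decision-making and environmental improvement.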
English abstract	Among the three critical battles set out by China, the battle for pollution prevention and control bears on public health and on the sustainable development of society and the economy. Air quality is one of the key issues in this battle and has a serious impact on urban development and residents' health. In recent years, air quality problems in many parts of China have become increasingly serious because of industrial emissions, vehicle exhaust, dense populations and other factors. Deteriorating air quality has seriously affected the development of cities and the health of residents, and air quality has become a matter of wide public concern. Fine particulate matter with a diameter less than or equal to 2.5 µm (PM2.5) is a major pollutant in air quality monitoring and a major factor restricting the improvement of air quality. This thesis therefore takes PM2.5 concentration as the prediction target and studies air quality prediction. Based on the current research status and development direction of PM2.5 concentration prediction, it summarizes the various prediction methods, from traditional statistical methods to machine learning methods and, with the development and application of deep learning, the deep learning methods that many scholars now use. After analyzing the advantages and disadvantages of the existing literature, it finds that there is still much room for improvement in the accuracy of existing prediction models.
The main research content of this thesis is therefore to construct, from air quality data, meteorological data, point of interest (POI) data and road network data, a multi-view graph convolutional gated recurrent unit (MGCN-GRU) model framework and a multi-view graph attention long short-term memory (MGATs-LSTM) model framework for the spatio-temporal prediction of PM2.5 concentration. On the basis of existing literature, the thesis explores the correlation between PM2.5 concentration and temporal features, other pollutant features, meteorological features, POI features and road network structure features. Across the two dimensions of time and space, it hypothesizes that time-series features introduce temporal dependence into the prediction of PM2.5 concentration and that spatial features introduce spatial correlation. The thesis aims to explore the impact of these features on the prediction of PM2.5 concentration, which should help improve prediction performance, and verifies this through experiments. Taking PM2.5 concentration data from Lanzhou, Gansu Province as an example, the MGCN-GRU and MGATs-LSTM model frameworks are built from air quality data, meteorological data, POI data and road network data to predict PM2.5 concentration in time and space. For non-time-series data such as POI and road networks, current feature extraction methods ignore the hierarchical relationships between different types of POI and the road network. To solve this problem, we propose using graph structures to learn feature representations of non-time-series data and applying them as auxiliary information in the prediction of PM2.5 concentration. Finally, the prediction models are evaluated using five metrics: mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE) and the coefficient of determination (R²).
The results show that the spatio-temporal prediction models that add spatial features on the basis of multi-view data fusion predict PM2.5 concentration more accurately than the other models. To demonstrate the impact of multi-view data fusion on prediction performance, ablation experiments were conducted for urban PM2.5 concentration prediction. They indicate that data fusion has important advantages in urban PM2.5 concentration prediction and can help us better understand and predict air quality to support decision-making and improve environmental quality.
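The five evaluation metrics named in the abstract follow standard definitions and can be sketched as below. This is a minimal illustration, not the thesis's own evaluation code: the function name `evaluate` and the sample concentration values are hypothetical, MAPE is reported in percent, and the observed values are assumed to be nonzero and not all equal.

```python
import math


def evaluate(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (percent) and R2 for two equal-length sequences.

    Hypothetical helper for illustration; assumes every observed value is
    nonzero (MAPE) and that the observed values are not all identical (R2).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # R2 compares the residual sum of squares with the variance around the mean.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Made-up observed and predicted PM2.5 values (µg/m³), not data from the thesis.
observed = [40.0, 50.0, 60.0, 80.0]
predicted = [44.0, 45.0, 66.0, 76.0]
metrics = evaluate(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower MAE, MSE, RMSE and MAPE and an R² closer to 1 indicate a better fit, which is the sense in which the abstract compares the models.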
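Degree type	Master's
Defense date	2024-05
Place of degree conferral	Lanzhou, Gansu Province
Language	Chinese
Total pages	84
Total references	58
Call number	0005618
Confidentiality level	Public
Chinese Library Classification number	C8/394
Document type	Thesis
Item identifier	http://ir.lzufe.edu.cn/handle/39EH0E1M/36849
Collection	School of Statistics and Data Science
Recommended citation (GB/T 7714)	Liao Ruowen. Prediction of PM2.5 concentration based on multi-view data fusion: a case study of Lanzhou City[D]. Lanzhou, Gansu Province: Lanzhou University of Finance and Economics, 2024.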

Files in this item
File name/size	Document type	Version type	Access type	License
2021000003017.pdf (5563KB)	Thesis		Open access	CC BY-NC-SA