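Author: 于婷
Name (Hanyu Pinyin): Yu Ting
Student ID: 2019000003017
Institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
Phone: 15706059028
Email: Fearless_yt@163.com
Year of enrollment: 2019-9
Degree category: Academic master's
Training level: Master's student
Discipline category: Science
First-level discipline: Statistics
Research direction: Mathematical Statistics
Discipline code: 0714Z3
First supervisor: 孟生旺
First supervisor (Hanyu Pinyin): Meng Shengwang
First supervisor's institution: 中国人民大学 (Renmin University of China)
First supervisor's title: Professor
Title: 多模态数据驱动的组合预测方法研究及应用
English title: Research on multimodal data driven combined forecasting method and its application
Keywords: container throughput; airport passenger flow forecasting; decomposition; integration; secondary decomposition; network search information; cuckoo search algorithm; sparrow search algorithm
English keywords: Container Throughput; Network Search Information; Cuckoo Search Algorithm; Sparrow Search Algorithm; Airport Passenger Flow Forecast; Decomposition-Integration; Quadratic Decomposition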
Abstract

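To forecast time series data more accurately, this thesis proposes new combined forecasting methods within the decomposition-integration framework and builds three forecasting models: one based on primary decomposition, one based on secondary decomposition, and one based on network search information.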

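First, this thesis proposes a combined forecasting model based on EEMD-PR/PSO-LSSVR-PM. Ensemble empirical mode decomposition (EEMD) decomposes the data into several intrinsic mode functions (IMFs) of different frequencies to reduce the complexity of the data. Least squares support vector regression optimized by particle swarm optimization (PSO-LSSVR) then predicts each IMF separately, and univariate polynomial regression (PR) predicts the residual term, which carries the trend. Finally, a perceptron model (PM) integrates the component forecasts nonlinearly to give the final prediction. In the empirical analysis, the research objects are the ten large ports in China with the highest container throughput in 2019. K-Means clustering first divides the ports into three classes according to their data characteristics, and Guangzhou, Yingkou, and Shanghai are selected as the representative ports, one from each class. The proposed combined forecasting method is then applied to these ports.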
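The decompose, predict, and integrate flow of this first model can be illustrated with a minimal Python sketch. It is not the thesis implementation: a moving average stands in for EEMD, a least-squares autoregression stands in for PSO-LSSVR, and the component forecasts are simply added rather than combined by a perceptron. All function names and the `window`, `degree`, and `p` parameters are hypothetical.

```python
import numpy as np


def split_trend(y, window=12):
    """Stand-in for EEMD: return (fluctuation, trend) with fluctuation + trend == y."""
    y = np.asarray(y, dtype=float)
    left = window // 2
    right = window - 1 - left
    padded = np.pad(y, (left, right), mode="edge")
    trend = np.convolve(padded, np.ones(window) / window, mode="valid")
    return y - trend, trend


def forecast_trend(trend, degree=2):
    """Polynomial regression (the PR step): fit on the time index, extrapolate one step."""
    t = np.arange(len(trend))
    coefs = np.polyfit(t, trend, degree)
    return float(np.polyval(coefs, len(trend)))


def forecast_fluctuation(x, p=12):
    """Least-squares autoregression of order p, a simple stand-in for PSO-LSSVR."""
    n = len(x)
    # Column k holds lag k + 1 of the target x[p:].
    lags = np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])
    design = np.column_stack([np.ones(n - p), lags])
    coef, *_ = np.linalg.lstsq(design, x[p:], rcond=None)
    latest = x[-1 : -p - 1 : -1]  # x[n-1], x[n-2], ..., x[n-p]
    return float(coef[0] + coef[1:] @ latest)


def forecast_next(y, window=12, degree=2, p=12):
    """One-step-ahead forecast: decompose, predict each part, then integrate."""
    fluct, trend = split_trend(y, window)
    # Additive integration; the thesis instead learns a perceptron-based combination.
    return forecast_trend(trend, degree) + forecast_fluctuation(fluct, p)


# Synthetic monthly series with a linear trend and a 12-month cycle.
t = np.arange(120)
y = 100 + 2 * t + 10 * np.sin(2 * np.pi * t / 12)
print(forecast_next(y))
```

Predicting the smooth trend and the oscillating part with different models is the point of the framework: each component is simpler than the raw series, so a simpler model can fit it.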

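Second, building on primary decomposition, a secondary decomposition-integration forecasting model is proposed. After the primary decomposition reduces the complexity of the original data, the model further mines the latent features of the data and reconstructs the subsequences to avoid error accumulation during prediction. Step 1: EEMD decomposes the original airport passenger flow series, and the resulting subsequences are reconstructed into high-, medium-, and low-frequency sequences. Step 2: because the high- and medium-frequency sequences fluctuate strongly and rapidly, variational mode decomposition (VMD) decomposes them further into subsequences that are less complex and easier to predict. Step 3: a BP neural network optimized by the cuckoo search algorithm (CS-BP) predicts every subsequence, with the optimal lag order of the network determined adaptively by trial and error. Step 4: the forecasts of the high-frequency subsequences and of the medium-frequency subsequences are each integrated with a CS-BP model to give the high-frequency and medium-frequency forecasts. Finally, the high-, medium-, and low-frequency forecasts are integrated with a CS-BP model into the final forecast.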
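The trial-and-error choice of lag order in Step 3 can be sketched as follows. This is an illustration only: each candidate lag is scored by validation error, and a linear least-squares model stands in for the CS-BP network. The names `lagged_matrix`, `validation_rmse`, and `select_lag`, and the `n_val` hold-out size, are hypothetical.

```python
import numpy as np


def lagged_matrix(x, p):
    """Rows are [x[t-1], ..., x[t-p]]; targets are x[t]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.column_stack([x[p - k - 1 : n - k - 1] for k in range(p)])
    return X, x[p:]


def validation_rmse(x, p, n_val):
    """Fit on all but the last n_val samples, score on the last n_val."""
    X, y = lagged_matrix(x, p)
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    X_tr, y_tr = X[:-n_val], y[:-n_val]
    X_va, y_va = X[-n_val:], y[-n_val:]
    # Linear model as a stand-in for the CS-BP network.
    coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    return float(np.sqrt(np.mean((X_va @ coef - y_va) ** 2)))


def select_lag(x, candidates, n_val=24):
    """Try every candidate lag order and keep the one with the lowest validation RMSE."""
    scores = {p: validation_rmse(x, p, n_val) for p in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# A pure sine obeys a two-term recursion, so one lag is not enough to predict it.
x = np.sin(2 * np.pi * np.arange(120) / 12)
best, scores = select_lag(x, candidates=[1, 2, 3])
print(best, scores)
```

Scoring on held-out data rather than on the training fit keeps the search from always preferring the largest lag.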

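Finally, a new combined forecasting method based on network search information is proposed within the decomposition-integration framework. First, mean impact value (MIV) and time-lag correlation analysis are used to screen network search keywords related to airport passenger throughput; the optimal lag of each keyword is determined from the correlation between its search volume and the original air passenger flow data, and the keywords are then synthesized into a composite search index. Second, ICEEMDAN decomposes both the airport passenger throughput and the composite search index into several sub-mode sequences, which are reconstructed into high-, medium-, and low-frequency sequences according to their sample entropy. With the different frequency components of the search index as auxiliary inputs, the high- and medium-frequency sequences of airport passenger throughput are predicted with a BP neural network optimized by the sparrow search algorithm (SSA-BP), while the low-frequency sequence is predicted with an autoregressive distributed lag model. Finally, SSA-BP integrates the forecasts of the different frequency sequences into the final forecast.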
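The time-lag correlation step can be illustrated with a short sketch: for each candidate lag, a keyword's search volume is shifted against the passenger series, and the lag with the strongest correlation is kept. The function names and the `max_lag` parameter are hypothetical, and both series are assumed to be of equal length and sampled at the same frequency. How the thesis weights keywords into the composite index is not stated in the abstract, so that step is omitted.

```python
import numpy as np


def lag_correlations(search, target, max_lag):
    """Pearson correlation between search[t] and target[t + lag] for each lag."""
    search = np.asarray(search, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(target)
    return {
        lag: float(np.corrcoef(search[: n - lag], target[lag:])[0, 1])
        for lag in range(max_lag + 1)
    }


def best_lead(search, target, max_lag=6):
    """Return the lag (in periods) at which search volume best anticipates the target."""
    corrs = lag_correlations(search, target, max_lag)
    lag = max(corrs, key=lambda k: abs(corrs[k]))
    return lag, corrs[lag]


# Synthetic check: the target repeats the search series three periods later.
rng = np.random.default_rng(0)
search = rng.normal(size=120)
target = np.concatenate([np.zeros(3), search[:-3]])
print(best_lead(search, target))
```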

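Empirical studies with the proposed combined forecasting methods show that they achieve high accuracy and robustness in both port container throughput forecasting and airport passenger flow forecasting.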

English abstract

To forecast time series data with higher accuracy, this thesis proposes new combined forecasting methods based on the “decomposition-integration” framework and establishes forecasting models based on primary decomposition, secondary decomposition, and network search information. Empirically, the combined forecasting methods significantly improve forecasting accuracy and show better robustness.

First, a combined forecasting model based on EEMD-PR/PSO-LSSVR-PM is proposed. Ensemble empirical mode decomposition (EEMD) decomposes the data into multiple intrinsic mode functions (IMFs) with different frequencies to reduce the complexity of the data. Least squares support vector regression optimized by particle swarm optimization (PSO-LSSVR) then predicts each IMF separately, univariate polynomial regression (PR) predicts the residual, which carries the trend, and a perceptron model (PM) integrates the individual forecasts nonlinearly to obtain the final prediction. In the empirical analysis, the ten large ports with the highest container throughput in China in 2019 are taken as the research objects. K-Means clustering classifies the ports into three categories according to their data characteristics, and Guangzhou, Yingkou, and Shanghai are selected as the representative ports, one from each category. Empirical forecasting is then carried out with the proposed combined forecasting method.

Second, a secondary decomposition-integration forecasting model is established. After the primary decomposition reduces the complexity of the original data, the potential features of the data are mined further, and error accumulation during prediction is avoided by reconstructing the subsequences. In the first step, the original airport passenger flow is decomposed by ensemble empirical mode decomposition (EEMD), and the resulting subsequences are reconstructed into high-, medium-, and low-frequency sequences. In the second step, because the high- and medium-frequency sequences fluctuate strongly and rapidly, variational mode decomposition (VMD) decomposes them further into subsequences that are less complex and easier to predict. In the third step, a BP neural network optimized by the cuckoo search algorithm (CS-BP) predicts all the subsequences, and the optimal lag period of the neural network is determined adaptively by trial and error. In the fourth step, the predicted values of the high-frequency and the medium-frequency subsequences are each integrated with a CS-BP model to obtain the high-frequency and medium-frequency predictions. Finally, a CS-BP model integrates the high-, medium-, and low-frequency predictions into the final predicted value.

Finally, this thesis proposes a new “decomposition-integration” combined forecasting method based on network search information. First, mean impact value (MIV) and time-lag correlation analysis are used to screen the network search keywords related to airport passenger throughput; the optimal lag period of each keyword is determined from the correlation between its search volume and the original air passenger flow data, and a composite search index is then synthesized. Second, the airport passenger throughput and the composite search index are each decomposed into several sub-mode sequences by the ICEEMDAN method and reconstructed into high-, medium-, and low-frequency sequences according to the sample entropy of the subsequences. Using the different frequency components of the search index as auxiliary input information, the high- and medium-frequency sequences of airport passenger throughput are predicted by a BP neural network optimized by the sparrow search algorithm (SSA-BP), while the low-frequency sequence is predicted by an autoregressive distributed lag model. Finally, the predictions for the different frequency sequences are integrated with SSA-BP to obtain the final predicted value.
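The regrouping of sub-mode sequences by sample entropy can be sketched as below. The abstract does not give the entropy parameters or the grouping thresholds. The sketch therefore assumes the common defaults of embedding dimension `m = 2` and tolerance `0.2` times the standard deviation, and it splits the entropy-ranked components into three equal groups. Both choices, and all function names, are hypothetical.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy with Chebyshev distance and tolerance r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i : i + length] for i in range(n - m)])
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return int(np.sum(dist <= r)) - len(templates)  # drop self-matches

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return float(-np.log(a / b))


def regroup_by_entropy(components):
    """Sum components into low/mid/high groups by ascending sample entropy."""
    comps = np.asarray(components, dtype=float)
    entropy = np.array([sample_entropy(c) for c in comps])
    order = np.argsort(entropy)  # least complex first
    low, mid, high = np.array_split(order, 3)
    return {
        "low": comps[low].sum(axis=0),
        "mid": comps[mid].sum(axis=0),
        "high": comps[high].sum(axis=0),
    }


# Three synthetic components: a trend, a slow cycle, and white noise.
rng = np.random.default_rng(0)
t = np.arange(200)
parts = [np.linspace(0.0, 1.0, 200), np.sin(2 * np.pi * t / 25), rng.normal(size=200)]
groups = regroup_by_entropy(parts)
print({name: series.shape for name, series in groups.items()})
```

Higher sample entropy indicates a less regular series, so white noise lands in the high-frequency group while trends and slow cycles land in the lower groups.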

The combined forecasting methods proposed in this thesis are applied in empirical forecasting studies. The results show that they achieve high forecasting accuracy and robustness in both port container throughput forecasting and airport passenger flow forecasting.

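Degree type: Master's
Defense date: 2022-05-15
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 68
Total references: 48
Collection number: 0004147
Confidentiality level: Public
Chinese Library Classification number: O212/25
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/32419
Collection: School of Statistics and Data Science (统计与数据科学学院)
Recommended citation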
GB/T 7714
于婷. 多模态数据驱动的组合预测方法研究及应用[D]. 甘肃省兰州市. 兰州财经大学,2022.
Files in this item
File name/size  Document type  Version type  Access type  License
2019000003017.pdf (4011KB)  Thesis  Open access  CC BY-NC-SA

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.