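Author: 邴贵英
Name in Hanyu Pinyin: Bing Guiying
Student ID: 2021000003002
Institution: Lanzhou University of Finance and Economics
Phone: 18419154109
Email: 2321415594@qq.com
Year of enrollment: 2021-9
Degree category: Professional master's degree
Level of study: Master's student
First-level discipline: Applied Statistics
Discipline code: 0252
First supervisor: 孙景云
First supervisor (Hanyu Pinyin): Sun Jingyun
First supervisor's institution: Lanzhou University of Finance and Economics
First supervisor's title: Professor
Title: “聚类-降维”策略下基于多源数据的铜期货价格预测
English title: Copper futures price forecasting based on multi-source heterogeneous data under the strategy of "Clustering-Dimensionality Reduction"
Keywords: copper price forecasting; Google Trends; CNN; text analysis; KELM
Foreign-language keywords: Copper price prediction; Google Trends; CNN; Text analysis; KELM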
Abstract

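       Since the start of the 21st century, growing volumes of financial capital have entered commodity markets, and futures prices have fluctuated frequently and sharply. Shanghai copper futures serve as the bellwether for China's spot copper price, and their price swings create uncertainty for the stakeholders concerned. China is the world's largest copper consumer and copper plays an important role in its industrial system, so sharp swings in international copper prices are transmitted to the domestic market and significantly affect China's industrial economy. Meanwhile, with the rapid development of Internet technology and the non-ferrous metal futures market, investors can collect richer and more timely information online in real time and make investment decisions in the futures market accordingly. In-depth study of how copper futures prices behave is therefore particularly important. Against this background, analyzing the external factors behind copper price fluctuations and improving the accuracy of copper price forecasts has become a new research topic.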

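       Building on the existing literature, this thesis studies Shanghai copper futures on the Shanghai futures market and international copper futures, respectively. The main work is as follows. (1) Multi-scale data are used: macroeconomic data, the Baidu Index, and historical price data serve as the macroeconomic factors and the investor-level attention features that drive changes in Shanghai copper futures prices, with the aim of reducing forecast bias and improving accuracy at the source. The thesis also proposes a new hybrid model for forecasting Shanghai copper futures prices, SC-KPCA-KELM. The multi-source data set is first grouped by hierarchical (system) clustering (SC); kernel principal component analysis (KPCA) then extracts features from the clustering results; finally, the extracted principal features are used as predictors, and comparative experiments verify their effectiveness for monthly price forecasting of Shanghai copper futures. (2) Applying a KELM model on top of the "clustering-dimensionality reduction" approach gave good forecasting performance, so the same strategy is retained: features are extracted with "K-means-KPCA" from multi-source data (macroeconomic data, Google Trends, and historical price data), and a K-means-KPCA-KELM hybrid model forecasts weekly international copper futures prices; comparative experiments then verify the model's effectiveness. (3) Internet information technology has made enough online data available to reflect the factors driving the copper futures market, and intelligent optimization algorithms can effectively improve forecasting accuracy. A new data-driven hybrid model for international copper price forecasting, K-means-KPCA-GWO-KELM, is therefore proposed; it uses online media text, Google Trends, macroeconomic data, and historical price data to mine the information in these multi-scale data and improve the accuracy of weekly international copper futures price forecasts. A convolutional neural network (CNN) is used to show the explanatory power of online news headlines for international copper prices, and variational mode decomposition (VMD) is used to build effective time-series indicators from the CNN output. The CNN series, Google Trends, macroeconomic data, and historical copper prices are taken as input variables, and a K-means-KPCA-KELM model is built for the empirical study.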
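The "clustering-dimensionality reduction" step described above can be illustrated with a minimal NumPy sketch. It assumes that the clustering groups the predictor variables (columns) and that KPCA with an RBF kernel is then applied within each group; the abstract does not state these details. The function names `kmeans`, `rbf_kpca`, and `cluster_then_reduce`, the kernel choice, and all parameter values are hypothetical and are not taken from the thesis.

```python
import numpy as np


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def rbf_kpca(X, n_components=1, gamma=None):
    """Kernel PCA with an RBF kernel; returns the projected training samples."""
    n = X.shape[0]
    if gamma is None:
        gamma = 1.0 / X.shape[1]  # assumed default, not from the thesis
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    K = np.exp(-gamma * sq)
    # Center the kernel matrix in feature space.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[idx], vecs[:, idx]
    # Projection of training sample i on component c is sqrt(lambda_c) * v_c[i].
    return vecs * np.sqrt(np.maximum(vals, 0.0))


def cluster_then_reduce(X, k, n_components=1):
    """Group the columns of X by k-means, then run KPCA inside each group."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each variable
    labels = kmeans(Z.T, k)                   # each column is one point
    blocks = [
        rbf_kpca(Z[:, labels == j], n_components)
        for j in range(k)
        if np.any(labels == j)
    ]
    return np.hstack(blocks), labels
```

Calling `cluster_then_reduce(X, k=3)` on a samples-by-variables array returns one extracted feature per non-empty cluster, which could then be passed to a regressor such as KELM. Replacing `kmeans` with a hierarchical clustering routine would give the SC variant of the same idea.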

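       The empirical results show the following. (1) Compared with other forecasting models, the proposed hybrid model for Shanghai copper futures, which combines macroeconomic data and Baidu search information under the SC-KPCA method, performs better in both level and directional accuracy. The SC-KPCA-KELM method on the mixed data set has the lowest MAPE (4.219%), the lowest RMSE (0.059), and the highest DS (62.963%). Under the SC-KPCA method, the model based on the mixed data set and KELM has the better forecasting ability. (2) For international copper futures, the results are analyzed at the data level and the method level. At the data level, the mixed data set is significantly better than either the economic data set or the GSVI data set in both level and directional accuracy, which indicates that the mixed data combine the strengths of both and achieve the best performance in level and directional accuracy. At the method level, the mixed data set with the "K-means-KPCA" method and KELM performs better, with the lowest MAPE (5.42%), the lowest RMSE (546.99), and the highest DA (74.35%). (3) Adding text features as predictors to the mixed data set that fuses Google Trends and macroeconomic indicator features effectively improves the accuracy of international copper price forecasts, and the KELM model optimized by the grey wolf optimizer is more accurate than the original model. Combining the two improves information utilization while optimizing the model structure.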

English Abstract

       As more financial capital has entered the commodity market since the start of the 21st century, futures prices have shown frequent and violent fluctuations. Shanghai copper futures act as the "weather vane" of China's copper spot price, and their price fluctuations bring uncertainty to relevant stakeholders. China is the world's largest copper consumer, and copper occupies an important position in its industrial system; violent fluctuations in international copper prices are therefore transmitted to China and have a significant impact on its industrial economy. At the same time, with the rapid development of Internet technology and the non-ferrous metal futures market, investors can collect richer and more timely information in real time through the Internet and make corresponding investment decisions in the futures market. It is therefore particularly important to study in depth how copper futures prices behave. In this context, analyzing the external factors that contribute to copper price fluctuations and improving the accuracy of copper price forecasts has become a new topic.

      Based on the current literature, this thesis takes Shanghai copper futures on the Shanghai futures market and international copper futures as its research objects. The main research work is as follows. (1) Using multi-scale data, macroeconomic data, the Baidu Index, and historical price data serve as the macroeconomic factors and the investor micro-attention features affecting price changes in Shanghai copper futures, so as to reduce forecast bias and improve prediction accuracy at the source. In addition, this thesis proposes a new hybrid model for Shanghai copper futures price prediction: SC-KPCA-KELM. First, systematic (hierarchical) clustering (SC) is carried out on the multi-source data set; kernel principal component analysis (KPCA) is then used to extract features from the clustering results; finally, the extracted main features are used as predictors, and their effectiveness in monthly price prediction of Shanghai copper futures is verified by comparison. (2) On the basis of the "clustering-dimensionality reduction" framework, prediction with the KELM model obtained good performance. The strategy is therefore retained: features are extracted with "K-means-KPCA" from multi-source data (macroeconomic data, Google Trends, and historical price data), the K-means-KPCA-KELM hybrid model is used to predict the weekly price of international copper futures, and the effectiveness of the prediction model is verified by comparison. (3) The emergence of Internet information technology means that there is enough online data to reflect the factors driving the copper futures market, and intelligent optimization algorithms can effectively improve the prediction accuracy of the model. Therefore, a new data-driven hybrid model for international copper price forecasting, K-means-KPCA-GWO-KELM, is proposed. It uses online media text, Google Trends, macroeconomic data, and historical price data to mine the information in these multi-scale data more deeply and so improve the accuracy of weekly international copper futures price forecasts. A convolutional neural network (CNN) is used to illustrate the explanatory power of online news headlines for international copper price forecasts, and variational mode decomposition (VMD) is used to construct effective time-series indicators from the CNN output. Using the CNN series, Google Trends, macroeconomic data, and historical copper price data as input variables, the K-means-KPCA-KELM model is constructed for the empirical research.
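The KELM predictor named in the models above can be sketched in a few lines of NumPy. This uses the standard kernel extreme learning machine formulation, in which the output weights are beta = (Omega + I/C)^(-1) y with Omega the kernel matrix of the training inputs. The RBF kernel, the class name `KELM`, and the default values of `C` and `gamma` are assumptions for illustration; the abstract does not give the thesis's kernel or hyperparameters. In the GWO variant, an optimizer would search over `C` and `gamma`, which this sketch leaves fixed.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Kernel extreme learning machine for regression (illustrative sketch)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        omega = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = omega.shape[0]
        # Closed-form output weights: beta = (Omega + I/C)^(-1) y
        self.beta = np.linalg.solve(omega + np.eye(n) / self.C, np.asarray(y, dtype=float))
        return self

    def predict(self, X):
        # f(x) = [K(x, x_1), ..., K(x, x_N)] @ beta
        return rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma) @ self.beta
```

A typical use would be `KELM(C=1e4, gamma=2.0).fit(features_train, price_train).predict(features_test)`, where the feature arrays are hypothetical names for the extracted predictors.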

      The empirical results show the following. (1) Compared with other forecasting models, the Shanghai copper futures price prediction model proposed in this study achieves better performance in both level and directional accuracy by combining macroeconomic data and Baidu search information under the "SC-KPCA" framework. The SC-KPCA-KELM method based on the mixed data set has the lowest MAPE (4.219%), the lowest RMSE (0.059), and the highest DS (62.963%). Under the SC-KPCA framework, the prediction model based on the mixed data set and the KELM method has better prediction ability. (2) For international copper futures price forecasting, the analysis is carried out at the data level and the method level. At the data level, the mixed data set is significantly better than the economic data set or the GSVI data set in both level and directional accuracy. This suggests that the mixed data combine their strengths and achieve the best prediction performance in both level and directional accuracy. At the method level, the mixed data set with the "K-means-KPCA" framework and KELM obtains better prediction performance, with the lowest MAPE (5.42%), the lowest RMSE (546.99), and the highest DA (74.35%). (3) When text features are added as predictors to the mixed data set that integrates Google Trends and macroeconomic indicator features, the accuracy of international copper price prediction improves effectively, and the KELM model optimized by the grey wolf optimizer has higher prediction accuracy than the original model. Combining the advantages of the two improves the efficiency of information utilization while optimizing the model structure.
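The three criteria reported above can be computed as in the following sketch. MAPE and RMSE follow their usual definitions. The directional measure uses one common definition, the share of periods in which the predicted move from the previous actual value has the same sign as the actual move; the abstract does not define DS or DA, so this definition and the function names are assumptions.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def directional_accuracy(actual, predicted):
    """Percentage of periods whose predicted direction matches the actual one."""
    hits = 0
    for t in range(1, len(actual)):
        actual_move = actual[t] - actual[t - 1]
        predicted_move = predicted[t] - actual[t - 1]
        if actual_move * predicted_move >= 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)
```

MAPE is scale-free, whereas RMSE depends on the units of the price series, which is consistent with the two RMSE values reported above differing by several orders of magnitude.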

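Degree type: Master's
Defense date: 2024-05-25
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 82
Number of references: 43
Holdings number: 0005603
Confidentiality level: Public
Chinese Library Classification number: C8/379
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/37120
Collection: School of Statistics and Data Science
Recommended citation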
GB/T 7714
邴贵英. “聚类-降维”策略下基于多源数据的铜期货价格预测[D]. 甘肃省兰州市. 兰州财经大学,2024.
Files in this item
File: 10741_2021000003002_ (1512KB); document type: thesis; access: open access; license: CC BY-NC-SA
File name: 10741_2021000003002_LW.pdf
Format: Adobe PDF