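Author: Han Yunlong (韩运龙)
Name (Hanyu Pinyin): hanyunlong
Student ID: 2021000010004
Institution: Lanzhou University of Finance and Economics (兰州财经大学)
Phone: 15040258507
Email: hanhanyl@163.com
Year of enrollment: 2021-9
Degree category: Academic master's
Training level: Master's student
Discipline category: Management
First-level discipline: Management Science and Engineering
Discipline direction:
Discipline code: 1201
Degree conferred: Master of Management
First supervisor: Shang Qingsheng (尚庆生)
First supervisor (Hanyu Pinyin): shangqingsheng
First supervisor's institution: Lanzhou University of Finance and Economics (兰州财经大学)
First supervisor's title: Professor
Title: 祁连山可持续发展知识图谱构建研究
English title: Research on the Knowledge Graph of Sustainable Development in Qilian Mountains
Keywords (Chinese): 祁连山; 可持续发展; 知识图谱; 实体识别
Keywords (English): Qilian Mountains; sustainable development; knowledge graph; entity recognition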
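Abstract (translated from the Chinese)

With rapid population growth and economic development, global crises such as climate change and resource depletion are deepening, and sustainable development across the social, economic, and ecological domains is drawing growing attention. Sustainable development of the natural ecological environment in particular has long been a popular research topic. The Qilian Mountains are an important ecological security barrier and water source for western China, so their sustainable development is vital to the construction of ecological civilization in western China and in the country as a whole. Under a changing climate and the impact of human activity, the Qilian Mountains face increasingly serious environmental problems and sustainable development challenges, which makes a systematic synthesis and study of the region's sustainable development status an important and urgent task. Rapid progress in natural language processing and deep learning offers new ways to process information on sustainable development in the Qilian Mountains, and knowledge graphs provide an intuitive and effective tool for ecological and environmental research and for the sustainable development field. This thesis therefore applies the knowledge graph approach to build a knowledge graph of sustainable development in the Qilian Mountains, supporting intelligent research, knowledge question answering, and knowledge reasoning on the topic. The main contents are as follows:

(1) A dataset of sustainable development information for the Qilian Mountains is established. Chinese journal articles published in the CNKI database serve as the main data source. Literature related to sustainable development in the Qilian Mountains is retrieved and processed to build a raw dataset, which is used to analyze research hotspots in that literature. Once the entity types and relation types are fixed, the text is annotated and then converted into a specific format to produce the labeled dataset. The labeled dataset contains 6 entity categories and 6 relation categories and provides the data for the subsequent knowledge graph construction.

(2) A lightweight ALBERT-BiLSTM-Attention-CRF model that incorporates an attention mechanism is proposed. The model adds an attention layer after the BiLSTM feature extraction layer to address problems of the BiLSTM model. Because the named entity corpus in this thesis is relatively small, the lightweight ALBERT layer, with its small number of parameters, helps the model achieve better performance. In comparison experiments against other named entity recognition models, the proposed model shows clear gains in both accuracy and F1 score, which demonstrates its effectiveness and feasibility.

(3) The knowledge graph of sustainable development in the Qilian Mountains is constructed. Using the fixed entity and relation types, entities are recognized with the ALBERT-BiLSTM-Attention-CRF model. Based on the features of the extracted entities, relations between entities are extracted with rule templates to form <entity, relation, entity> triples. Finally, the Neo4j graph database is used to store and visualize the knowledge graph.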
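The annotation step in item (1) can be illustrated with a small sketch. The abstract says only that the text is converted "into a specific format"; the character-level BIO scheme below is a common choice for Chinese named entity recognition and is an assumption here, as are the function names and the entity labels `LOC` and `PROBLEM`, which are illustrative and not the thesis's six categories.

```python
from typing import List, Tuple

# Hypothetical annotation span: (start, end_exclusive, entity_type).
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Convert character-level entity spans into per-character BIO tags."""
    tags = ["O"] * len(text)
    for start, end, label in sorted(spans):
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters
    return list(zip(text, tags))


def to_conll(pairs: List[Tuple[str, str]]) -> str:
    """Render one sentence as 'character tag' lines, one character per line."""
    return "\n".join(f"{char} {tag}" for char, tag in pairs)


# Illustrative sentence and labels; not taken from the thesis dataset.
sentence = "祁连山冰川退缩"
annotated = spans_to_bio(sentence, [(0, 3, "LOC"), (3, 7, "PROBLEM")])
print(to_conll(annotated))
```

Tagging per character avoids depending on a word segmenter, which is one reason this layout is common for Chinese sequence labeling.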

English Abstract

With rapid population growth and economic development, global crises such as climate change and resource depletion are deepening, and people are paying more and more attention to sustainable development in the social, economic, and ecological fields. In particular, sustainable development of the natural ecological environment has long been a hot topic in academic research. As an important ecological security barrier and water source in western China, the Qilian Mountains are crucial to the construction of ecological civilization in western China and in the country as a whole. Due to climate change and the impact of human activities, the Qilian Mountains are facing increasingly serious environmental problems and sustainable development challenges, and it has become an important and urgent task to systematically summarize and study the current state of sustainable development in the region. The rapid development of natural language processing and deep learning provides a new direction for processing sustainable development information about the Qilian Mountains, and knowledge graphs provide an intuitive and effective tool for ecological and environmental research and for the sustainable development field. This thesis therefore uses the knowledge graph approach to construct a knowledge graph of sustainable development in the Qilian Mountains, providing support for intelligent research, knowledge question answering, and knowledge reasoning on the topic. The main contents of this thesis are as follows:

(1) Establish a sustainable development information dataset for the Qilian Mountains. Using Chinese journal articles published in the CNKI database as the main data source, the relevant literature on sustainable development in the Qilian Mountains is retrieved and processed to establish a raw dataset for analyzing research hotspots in that literature. After the entity types and relation types are determined, the text data are labeled and then converted into a specific format to complete the labeled dataset. The labeled dataset contains six entity types and six relation types, providing data support for the subsequent construction of the knowledge graph.

(2) Propose a lightweight ALBERT-BiLSTM-Attention-CRF model with an attention mechanism. The model introduces an attention layer after the BiLSTM feature extraction layer to address the problems of the BiLSTM model. Because the named entity corpus in this thesis is relatively small, the lightweight ALBERT layer, with its fewer parameters, enables the model to achieve better performance. Compared with other named entity recognition models, the proposed model achieves clearly higher accuracy and F1 score, which demonstrates its effectiveness and feasibility.

(3) Construct a knowledge graph of sustainable development in the Qilian Mountains. Using the determined entity types and relation types, entities are recognized with the ALBERT-BiLSTM-Attention-CRF model. Based on the extracted entity features, entity relations are extracted with rule templates to construct <entity, relation, entity> triples, and the Neo4j graph database is used to store and visualize the knowledge graph.
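Step (3) can be sketched as follows: entities recognized in a sentence are paired, a rule template keyed on entity types assigns a relation, and each resulting triple is turned into a Cypher `MERGE` statement for Neo4j. This is a minimal sketch under stated assumptions, not the thesis's implementation: the rule table, the entity types (`LOC`, `PROBLEM`, `MEASURE`), the relation names, the `Entity` node label, and both function names are hypothetical. The code only builds query strings and parameters; running them against a database is left to whatever Neo4j client is in use.

```python
from typing import Dict, List, Tuple

Entity = Tuple[str, str]        # (surface text, entity type)
Triple = Tuple[str, str, str]   # (head, relation, tail)

# Hypothetical rule templates: (head type, tail type) -> relation name.
RULES: Dict[Tuple[str, str], str] = {
    ("LOC", "PROBLEM"): "FACES",
    ("MEASURE", "PROBLEM"): "MITIGATES",
}


def extract_triples(entities: List[Entity]) -> List[Triple]:
    """Pair entities from one sentence and keep the pairs a rule matches."""
    triples: List[Triple] = []
    for head, head_type in entities:
        for tail, tail_type in entities:
            relation = RULES.get((head_type, tail_type))
            if relation is not None and head != tail:
                triples.append((head, relation, tail))
    return triples


def to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a Cypher MERGE statement and its parameters for one triple."""
    head, relation, tail = triple
    # The relation name is interpolated into the query text rather than
    # passed as a parameter, so it is validated first.
    if not relation.isidentifier():
        raise ValueError(f"unsafe relation name: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


# Illustrative entities; not taken from the thesis dataset.
entities = [("祁连山", "LOC"), ("冰川退缩", "PROBLEM"), ("生态补偿", "MEASURE")]
triples = extract_triples(entities)
for triple in triples:
    query, params = to_cypher(triple)
    print(query, params)
```

`MERGE` is used instead of `CREATE` so that loading the same triple twice does not duplicate nodes or relationships.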

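Degree type: Master's
Defense date: 2024-05-18
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 70
Total references: 70
Call number: 0006284
Confidentiality level: Public
Chinese Library Classification number: C93/88
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/36370
Collection: School of Information Engineering and Artificial Intelligence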
Recommended citation
GB/T 7714
韩运龙. 祁连山可持续发展知识图谱构建研究[D]. 甘肃省兰州市. 兰州财经大学,2024.
Files in this item
File name/size: 2021000010004.pdf (1814KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA
 
