A Study of Aspect-Level Sentiment Analysis Incorporating External Knowledge

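Author	Zhao Jinyu (赵金雨)
Name in Hanyu Pinyin	zhaojinyu
Student ID	2021000010016
Training institution	Lanzhou University of Finance and Economics
Phone	15554888681
Email	lyzjyshr@163.com
Enrollment date	2021-9
Degree category	Academic master's
Level of study	Master's student
Discipline category	Management
First-level discipline	Management Science and Engineering
Research direction	None
Discipline code	1201
Degree awarded	Master of Management
First supervisor	Li Qiang (李强)
First supervisor's name in Hanyu Pinyin	liqiang
First supervisor's institution	Lanzhou University of Finance and Economics
First supervisor's title	Professor
Title	A Study of Aspect-Level Sentiment Analysis Incorporating External Knowledge
English title	A study of aspect-level sentiment analysis incorporating external knowledge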
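Keywords	Aspect-level sentiment analysis; External knowledge; Graph convolutional networks; Information enhancement; Multichannel; Feature fusion
Foreign-language keywords	Aspect-level sentiment analysis; External knowledge; Graph convolutional networks; Information enhancement; Multichannel; Feature fusion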
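Abstract	In recent years, with the rapid development of the Internet and the arrival of the 5G era, major e-commerce platforms and social apps have acquired huge user bases. This has produced large volumes of unstructured text data that are rich in sentiment and have great commercial potential. To mine users' sentiment tendencies from these data and quickly grasp their views on commercial products, social institutions, political life and other matters, text sentiment analysis has become a research topic of wide interest. Many machine learning techniques have already been applied to text sentiment analysis, but many problems remain to be solved. Text sentiment analysis mainly comprises aspect-level, sentence-level and document-level sentiment analysis. Given the particular nature of text data, aspect-level sentiment analysis is better suited to the sentiment-bearing text currently generated on the Internet. It analyzes sentiment toward specific aspect terms and gives individuals, companies and relevant departments more precise data support when they implement a measure, which makes it highly valuable in practice. Most current sentiment analysis research concentrates on mining the dependency relations of the syntactic dependency tree from the sentence itself. It considers only whether a dependency exists, treats all dependency relations equally, and makes little use of external knowledge related to the text. To address these problems, the main work of this thesis is as follows: (1) Most models in current aspect-level sentiment analysis research neither extract syntactic information fully nor integrate the position information of the text adequately. To address this, the thesis proposes a graph convolutional neural network model that fuses external knowledge and is enhanced with position information (A graph neural network model enhanced by fusion of external knowledge and location information, KL-GCN). The model consists of a semantic information extraction module and a syntactic information extraction module. In the syntactic module, we compile part-of-speech statistics for the dataset and construct a part-of-speech matrix, taking full account of how words that shape a sentence's sentiment, such as negation words and degree words, affect sentiment classification. An external sentiment lexicon is then used to assign a sentiment score to every word in the sentence, and a sentiment score matrix is constructed to emphasize the weight of sentiment words. In this way the model learns the dependency relations of the sentence thoroughly and obtains feature vectors rich in syntactic information. In the semantic module, the text words and aspect words are first encoded with BERT and position information is added to obtain position-aware word vectors; a GRU-CNN network then extracts the semantic features of the text. Finally, an attention-based feature fusion module fuses the two sets of features to strengthen the feature representation. (2) To address the insufficient use of external knowledge in current models, the thesis proposes a multi-channel graph convolutional network model that fuses external knowledge (Fusion of External Knowledge Multi-Channel Graph Convolutional Networks, FEKM-GCN), which fuses three kinds of feature vectors: syntactic, semantic and external knowledge. The model first extracts the syntactic and semantic information of the sentence separately. It then embeds the external knowledge into the model, uses a self-attention mechanism to obtain a score matrix for the external knowledge, and feeds this matrix into a graph convolution to extract external knowledge features. The three kinds of features are passed to a feature fusion module so that the three channels can learn from one another in a complementary way. Experimental results show that the model extracts the semantic and syntactic information of the text more effectively, and that the external information further enriches the text information and improves the model's accuracy.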
English abstract	In recent years, with the rapid development of the Internet and the arrival of the 5G era, major e-commerce platforms and social apps have acquired huge user bases. This has led to the emergence of a large amount of unstructured text data, which contains rich sentiment and has very high commercial potential. In order to use these data to mine users' sentiment tendencies and quickly grasp their views on commercial products, social systems, political life and other matters, text sentiment analysis has become a widely discussed research topic in the field of natural language processing. Many machine-learning-based methods for text sentiment analysis already exist, but many problems remain to be solved. Text sentiment analysis mainly includes aspect-level, sentence-level and document-level sentiment analysis. Considering the particular nature of text data, aspect-level sentiment analysis is more appropriate for the sentiment-bearing text currently generated on the Internet. This approach analyzes sentiment toward specific aspect words, which is valuable in practical applications because it provides more accurate data support for individuals, companies and relevant departments when implementing a measure. Most current sentiment analysis research focuses on mining dependency trees for the dependencies between context words and aspect words based on the sentence itself. It considers only whether a dependency exists, treats all dependencies equally, and makes little use of external sentiment knowledge related to the text. To address these problems, the main work carried out in this thesis is as follows:
1. To address the problem that most models in current aspect-level sentiment analysis research neither extract syntactic information sufficiently nor fuse the position information of the text sufficiently, this thesis proposes a graph convolutional neural network model fused with external knowledge and enhanced with position information (KL-GCN). The model is divided into a semantic information extraction module and a syntactic information extraction module. In the syntactic information extraction module, we compile part-of-speech statistics for the dataset, construct a part-of-speech matrix, and take full account of the influence on sentiment classification of words that affect a sentence's sentiment, such as negation words and degree words. An external sentiment lexicon is then used to assign a sentiment score to each word in the sentence, and a sentiment score matrix is constructed to highlight the weight of sentiment words. Through this method the model fully learns the dependency relations of the sentence and obtains feature vectors containing rich syntactic information. In the semantic information extraction module, the text words and aspect words are first encoded by the BERT model and position information is added to obtain position-aware word vectors; the semantic features of the text are then extracted by a GRU-CNN network. Finally, a feature fusion module based on the attention mechanism is constructed to fuse the two parts and enhance the feature vector representation.
2. To address the problem that current models do not make sufficient use of external knowledge, this thesis proposes a Fusion of External Knowledge Multi-Channel Graph Convolutional Networks (FEKM-GCN) model that fuses three kinds of feature vectors: syntactic, semantic and external knowledge. The model first extracts the syntactic and semantic information in the sentence separately. It then embeds the external knowledge into the model, uses the self-attention mechanism to obtain a score matrix of the external knowledge, and feeds this matrix into a graph convolution to extract external knowledge features. The three kinds of features are input into the feature fusion module so that the features of the three channels can be learned in a complementary way. The experimental results show that the model extracts the semantic and syntactic information of the text better, and that the external information further enriches the text information and improves the accuracy of the model.
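Degree type	Master's
Defense date	2024-05-18
Place of degree conferral	Lanzhou, Gansu Province
Language	Chinese
Total pages	78
Total references	71
Collection number	0006296
Confidentiality level	Public
Chinese Library Classification number	C93/100
Document type	Thesis
Item identifier	http://ir.lzufe.edu.cn/handle/39EH0E1M/36379
Collection	School of Information Engineering and Artificial Intelligence
Recommended citation (GB/T 7714)	Zhao Jinyu. A Study of Aspect-Level Sentiment Analysis Incorporating External Knowledge[D]. Lanzhou, Gansu Province: Lanzhou University of Finance and Economics, 2024.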
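
The abstract describes weighting a sentence's dependency relations with lexicon-derived sentiment scores before graph convolution, but it gives no formulas. The following is a minimal sketch of that general idea only, not the thesis's KL-GCN implementation. The function names, the toy sentence, the lexicon scores, the dependency arcs and the weighting rule 1 + |s_i| + |s_j| are all assumptions made for illustration.

```python
import numpy as np

def sentiment_weighted_adjacency(edges, scores):
    """Build a symmetric adjacency matrix from dependency arcs.

    Each arc (i, j) gets weight 1 + |s_i| + |s_j|, so arcs touching
    sentiment-bearing words count more. This rule is a hypothetical
    choice, not one taken from the thesis.
    """
    n = len(scores)
    adj = np.eye(n)  # self-loops keep each word's own features
    for i, j in edges:
        w = 1.0 + abs(scores[i]) + abs(scores[j])
        adj[i, j] = adj[j, i] = w
    return adj

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: row-normalise, aggregate, project, ReLU."""
    deg = adj.sum(axis=1, keepdims=True)  # > 0 because of the self-loops
    return np.maximum((adj / deg) @ feats @ weight, 0.0)

# Toy input (all values invented for the example).
tokens = ["the", "food", "is", "not", "good"]
scores = [0.0, 0.0, 0.0, -0.5, 0.8]        # stand-in for lexicon scores
edges = [(0, 1), (1, 4), (2, 4), (3, 4)]   # stand-in for parser output

adj = sentiment_weighted_adjacency(edges, scores)
rng = np.random.default_rng(0)
feats = rng.normal(size=(len(tokens), 8))  # stand-in for BERT word vectors
weight = rng.normal(size=(8, 4))           # stand-in for learned parameters
hidden = gcn_layer(adj, feats, weight)
print(hidden.shape)
```

In this sketch the arc between "not" and "good" receives the largest weight, which is one simple way to let negation and sentiment words dominate the aggregation around the aspect word; a real model would learn `weight` and would take the scores and arcs from an actual lexicon and dependency parser.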

Files in this item
File name/size	Document type	Version type	Access type	License
10741_2021000010016_（1951KB）	Thesis		Open access	CC BY-NC-SA