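Author: 丁申宇
Author name (Hanyu Pinyin): Ding Shenyu
Student ID: 2019000010006
Training institution: Lanzhou University of Finance and Economics (兰州财经大学)
Phone: 13893450169
Email: shenyu_ding@163.com
Enrollment date: 2019-9
Degree category: Academic master's
Training level: Master's student
Discipline category: Management
First-level discipline: Management Science and Engineering
Research direction:
Discipline code: 1201
First supervisor: 王玉珍
First supervisor (Hanyu Pinyin): Wang Yuzhen
First supervisor's institution: Lanzhou University of Finance and Economics (兰州财经大学)
First supervisor's title: Professor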
Title: 基于BERT-LDA的在线评论细粒度情感分析--以手机产品为例
English title: Fine-grained emotion analysis of online reviews based on BERT-LDA: taking smartphone products as an example
Keywords: sentiment analysis; fine-grained sentiment analysis; BERT-LDA model; key phrases
Foreign-language keywords: Sentiment analysis; Fine-grained emotion analysis; BERT-LDA model; Key phrases
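Abstract

Sentiment analysis identifies the emotional intent that users express in text and is an active area of natural language processing. Most traditional sentiment analysis, however, works at a coarse granularity and focuses on the overall sentiment polarity of a product, so it cannot capture individual product features or the fine-grained emotions attached to them. To address this shortcoming, this thesis approaches the problem from two angles: product features and emotion classification. The product-feature work analyzes specific features of a product to identify the attributes that consumers care about. The emotion-classification work subdivides the emotion lexicon further: instead of following the traditional binary polarity classification, it divides emotions into finer classes and computes the intensity of each emotion word, making consumers' detailed emotional tendencies explicit. Combining product features with fine-grained emotion gives a clearer picture of what consumers specifically require of each feature and thereby helps merchants provide better personalized service.

The main research content consists of the following four parts:

(1) A key-phrase set for mobile phone products is constructed. Based on key-phrase structures and a mobile phone feature dictionary, the thesis builds a set of mobile phone key phrases, which is a precondition for high-precision extraction of phone features and emotion words.

(2) A BERT-LDA model is constructed. First, the similarity value range of the BERT (Bidirectional Encoder Representations from Transformers) model is trained; then the BERT model is used to extract similar phrases; finally, the extracted phrases are fed into an LDA model to obtain evaluation objects and evaluation words. High-quality similar phrases play a major role in lowering the perplexity of the LDA model and also help improve the accuracy of its topic extraction.

(3) A fine-grained emotion lexicon based on mobile phone feature words is established. To classify emotion words and compute their intensity, the thesis uses the Chinese Affective Lexicon Ontology (DUTIR) compiled by the Information Retrieval Laboratory of Dalian University of Technology. Building on the key-phrase set and the feature dictionary, it constructs a fine-grained emotion lexicon for mobile phone review features, adds a "doubt" (疑) emotion class to give eight classes of emotion words, and expands the lists of modifiers and negation words.

(4) The fine-grained sentiment analysis model is applied. Using review texts for one phone brand as an example, the BERT-LDA model identifies the features and emotion words that attract the most consumer attention, and a fine-grained emotion calculation method computes the emotion class and intensity for each feature. This gives a detailed view of consumers' emotional tendencies toward different features, which helps stores improve their reputation, increase sales, and provide better personalized service.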
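
The phrase-selection idea in part (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes that "similar phrases" are chosen by comparing embedding vectors against a similarity threshold, and it uses toy three-dimensional vectors in place of real BERT embeddings. The names `cosine` and `filter_similar_phrases`, the example phrases, and the threshold 0.9 are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def filter_similar_phrases(seed_vectors, candidates, threshold):
    """Keep candidates whose best similarity to any seed key phrase
    reaches the threshold (hypothetical helper, not from the thesis)."""
    kept = []
    for phrase, vec in candidates.items():
        best = max(cosine(vec, seed) for seed in seed_vectors.values())
        if best >= threshold:
            kept.append(phrase)
    return kept


# Toy vectors standing in for BERT embeddings (assumed values).
seeds = {
    "battery lasts long": [0.9, 0.1, 0.0],
    "screen is clear": [0.0, 0.9, 0.2],
}
candidates = {
    "battery is durable": [0.8, 0.2, 0.1],
    "display looks sharp": [0.1, 0.8, 0.3],
    "delivery was fast": [0.1, 0.1, 0.9],
}

similar = filter_similar_phrases(seeds, candidates, threshold=0.9)
print(similar)
```

In the pipeline the abstract describes, the retained phrases would then be passed to an LDA model; the off-topic phrase about delivery is dropped here because it is not close to any seed phrase.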

English abstract

Emotion analysis identifies the emotional intention that users express in text data and is a popular area of natural language processing. However, traditional emotion analysis is mostly coarse-grained and mainly explores the overall emotional polarity of a product, so it cannot capture product characteristics and the corresponding fine-grained emotions. To make up for this deficiency, this paper studies the problem from two perspectives: product characteristics and emotion classification. The study of product characteristics analyzes specific product features to identify the attributes that consumers value. The study of emotion classification further subdivides the emotion dictionary: rather than following the traditional binary classification of emotional polarity, it divides emotions into finer classes and calculates the emotional intensity of emotion words to clarify the detailed tendencies of consumer emotion. Combining product characteristics with fine-grained emotion gives a clearer understanding of consumers' specific requirements for product characteristics and helps businesses provide better personalized services.

The main research content of this paper includes the following four parts:

(1) A collection of key phrases for mobile phone products is built. Based on the structure of key phrases and a dictionary of mobile phone features, this paper builds a collection of mobile phone key phrases, which creates the preconditions for high-precision extraction of mobile phone features and emotion words.

(2) The BERT-LDA model is constructed. First, the similarity value range in the BERT (Bidirectional Encoder Representations from Transformers) model is trained; then the BERT model is applied to extract similar phrases; finally, the extraction results are input into the LDA model to obtain the evaluation objects and evaluation words. High-quality similar phrases play a great role in reducing the perplexity of the LDA model and also help to improve the accuracy of its topic extraction.

(3) A fine-grained emotion dictionary based on mobile phone feature words is established. To classify emotion words and calculate their emotional intensity, this paper adopts the Chinese emotional vocabulary ontology (DUTIR) compiled by the Information Retrieval Laboratory of Dalian University of Technology. On the basis of the mobile phone key phrase collection and the mobile phone feature dictionary, it builds a fine-grained emotion dictionary for mobile phone product evaluation features, adds a "doubt" emotion to form eight types of emotion words, and expands the modifiers and negation words.

(4) The fine-grained emotion analysis model is applied. Taking the review text of one mobile phone brand as an example, the BERT-LDA model is applied to find the features and emotion words that receive the most consumer attention, and the fine-grained emotion calculation method is applied to calculate the emotion class and emotional intensity of each feature. This gives a detailed understanding of consumers' emotional tendencies toward different features, which helps stores improve word of mouth, increase sales, and provide better personalized service.
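
The lexicon-based calculation outlined in parts (3) and (4) can be sketched roughly as follows. The abstract does not give the thesis's formula, so this is only an illustration under stated assumptions: each emotion word carries a class and an intensity, a modifier multiplies the intensity of the next emotion word, and a negation word flips its sign. The lexicon entries, modifier weights, and the function name `score_review` are hypothetical.

```python
from collections import defaultdict

# Hypothetical lexicon entries: word -> (emotion class, intensity).
EMOTION_LEXICON = {
    "clear": ("like", 5),
    "smooth": ("joy", 7),
    "laggy": ("disgust", 5),
}
# Hypothetical degree modifiers and negation words.
MODIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATIONS = {"not"}


def score_review(pairs):
    """Accumulate per-feature, per-emotion intensity.

    pairs: list of (feature, tokens), where tokens are the words
    describing that feature. Assumed rule: modifiers scale and
    negations flip the sign of the next emotion word.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for feature, tokens in pairs:
        weight, negated = 1.0, False
        for tok in tokens:
            if tok in NEGATIONS:
                negated = not negated
            elif tok in MODIFIERS:
                weight *= MODIFIERS[tok]
            elif tok in EMOTION_LEXICON:
                emotion, intensity = EMOTION_LEXICON[tok]
                value = intensity * weight
                scores[feature][emotion] += -value if negated else value
                weight, negated = 1.0, False  # reset after each emotion word
    return {feature: dict(emotions) for feature, emotions in scores.items()}


result = score_review([
    ("screen", ["very", "clear"]),
    ("system", ["not", "smooth"]),
    ("system", ["slightly", "laggy"]),
])
print(result)
```

A negative value here means the emotion was negated in the text; a real system would need a more careful rule for how negation maps onto the eight emotion classes.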

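Degree type: Master's
Defense date: 2022-05-29
Degree-granting location: Lanzhou, Gansu Province
Language: Chinese
Total pages: 71
Total references: 59
Collection number: 0004259
Confidentiality level: Public
Chinese Library Classification number: C93/65
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/32095
Collection: School of Information Engineering and Artificial Intelligence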
Recommended citation
GB/T 7714
丁申宇. 基于BERT-LDA的在线评论细粒度情感分析--以手机产品为例[D]. 甘肃省兰州市. 兰州财经大学,2022.
Files in this item
File name/size: 2019000010006.pdf (3712KB)
Document type: Thesis
Version type:
Access type: Not yet open
License: CC BY-NC-SA