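Author: 王宇辰
Name (Hanyu Pinyin): Wang Yuchen
Student ID: 2018000003131
Institution: Lanzhou University of Finance and Economics (兰州财经大学)
Phone: 18894001307
Email: grubbywyc123@163.com
Year of enrollment: 2018-09
Degree category: Professional master's
Level of study: Master's student
First-level discipline: Applied Statistics
Discipline code: 0252
Degree awarded: Master of Applied Statistics
First supervisor: 黄恒君
First supervisor (Hanyu Pinyin): Huang Hengjun
First supervisor's affiliation: School of Statistics, Lanzhou University of Finance and Economics
First supervisor's title: Professor
Title: 基于文本挖掘的数据类岗位人才需求分析
English title: Demand Analysis of Data Jobs Based on Text Mining
Keywords: Data Jobs; Text Mining; LDA Topic Model; Job Requirements Comparison (数据类岗位; 文本挖掘; LDA主题模型; 岗位需求对比)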
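Abstract

    With the arrival of the big data era and the growth and spread of higher education, the number of data professionals graduating from universities keeps increasing; meanwhile, as big data and artificial intelligence technologies develop, market demand for data jobs is also growing. Nevertheless, graduates still struggle to find satisfying work, and companies still struggle to recruit the talent they want. This thesis investigates how data professionals are recruited in the job market, compares the emphases of different data jobs, and identifies what employers require of data professionals, to provide a reference for job seekers who hope to work in data jobs.
    The thesis searches the 51job (前程无忧) recruitment platform with four keywords, "data analysis", "data mining", "data development", and "data operations", and uses a web crawler to collect job postings for study. First, descriptive statistics and visualization are used to compare the four job types by city-level demand, work experience requirements, and education requirements, to explore regional differences in demand, and on that basis to analyze how salary varies with experience requirements and with education requirements. Second, the recruitment text for the four job types is preprocessed and vectorized, an LDA topic model is built, the optimal number of topics is chosen according to model evaluation metrics, and the topic words of each topic are output. Finally, a Word2Vec model is used to expand the topic words extracted by LDA with semantically similar terms; keywords selected from both sources are drawn as word clouds, and the four job types are compared in terms of job responsibilities, job scenarios, and skill requirements.
    The results show that cities with high demand for data jobs include Shanghai, Hangzhou, and Nanjing in East China and Guangzhou, Shenzhen, and Dongguan in South China, with South China showing greater demand for data mining and data development jobs. For work experience, the vast majority of positions require more than one year of relevant experience. For education, a bachelor's degree has become the entry-level qualification for most data jobs, and it confers no competitive advantage when applying for data mining or data development work. The topic words extracted by the LDA and Word2Vec models show that data analysis jobs emphasize data sensitivity and the ability to build indicator systems; data mining jobs focus on designing and optimizing data analysis and mining algorithms and require the ability to use big data platforms; data development jobs focus on the development, operation, and maintenance of big data platforms and data computing architectures; and data operations jobs aim to guide business growth by combining data with the business. Data analysis and data operations lean toward business, while data mining and data development lean toward technology.
    Based on these results, the thesis aims to help job seekers who want to move into data jobs: job seekers, new graduates, and current students can choose a suitable job type and employment region according to their own circumstances, strengthen their skills in a targeted way according to market demand, and improve the employment rate.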
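    The LDA step described in the abstract can be illustrated with a minimal sketch. The record does not say which software the thesis used, so this is not the author's implementation: it is a small collapsed Gibbs sampler written with the Python standard library only. The toy corpus, the function names `fit_lda` and `top_words`, and the hyperparameters `alpha` and `beta` are all assumptions made for illustration, and the number of topics is fixed by hand instead of being selected with an evaluation metric.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on already-tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                       # total tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [
        [w for w, c in counts.most_common(n) if c > 0]
        for counts in topic_word
    ]


# Hypothetical toy "job posting" tokens, not data from the thesis.
docs = [
    ["sql", "report", "dashboard", "indicator", "report"],
    ["indicator", "dashboard", "sql", "business"],
    ["hadoop", "spark", "pipeline", "etl", "spark"],
    ["etl", "hadoop", "pipeline", "cluster"],
]
doc_topic, topic_word = fit_lda(docs, num_topics=2, iterations=100)
print(top_words(topic_word))
```

    On real postings the text would first need word segmentation and stop-word removal, and the topic count would be compared across several values, as the abstract describes.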
 

English abstract

    With the advent of the big data era and the development and popularization of higher education, the number of data talents trained by colleges and universities is increasing; as big data and artificial intelligence technology develop, market demand for data positions is also gradually increasing. However, graduates still find it difficult to land satisfactory jobs, and companies still find it difficult to recruit ideal talent. This paper explores the recruitment of data talents in the job market, compares the focus of different data positions, and identifies what enterprises require of data talents, so as to provide a reference for job seekers who want to work in data positions.
    This paper searches the 51job.com recruitment platform with four keywords, "data analysis", "data mining", "data development", and "data operation", and uses web crawler technology to obtain job information for research on data positions. First, descriptive statistics and visualization methods are used to compare the urban demand distribution, work experience requirements, and education requirements of these four types of positions, to explore the differences in demand for data positions across regions, and on this basis to analyze the trends of salary against work experience requirements and of salary against education requirements. Second, text preprocessing and text vectorization are carried out on the recruitment text of the four types of positions, an LDA topic model is constructed, the optimal number of topics is determined according to model evaluation indicators, and the subject words of each topic are output. Finally, the Word2Vec model is used to expand the subject words extracted by the LDA model with semantically similar terms, keywords filtered from the two are drawn as word clouds, and the four types of positions are compared in terms of job responsibilities, job scenarios, and skill requirements.

    The research results show that cities with large demand for data positions include Shanghai, Hangzhou, and Nanjing in eastern China and Guangzhou, Shenzhen, and Dongguan in southern China, and that southern China has greater demand for data mining and data development positions. In terms of work experience, most positions require more than 1 year of relevant work experience. In terms of education, a bachelor's degree has become the starting degree for most data positions, and it offers no competitive advantage when seeking data mining and data development jobs. The subject words extracted by the LDA topic model and the Word2Vec model show that data analysis positions emphasize data sensitivity and the ability to build an indicator system; data mining positions focus on the design and optimization of data analysis and mining algorithms and require the ability to use big data platforms; data development positions focus on the research and development and the operation and maintenance of big data platforms and data computing architectures; and data operation positions aim to guide business growth by combining data with the business. Data analysis and data operation are more business-oriented, while data mining and data development are more technology-oriented.

    Based on the above results, this paper aims to help job seekers who are committed to data positions: job seekers, fresh graduates, and current students can choose suitable position types and employment regions according to their own conditions, strengthen their abilities in a targeted way according to market demand, and increase the employment rate.
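    The keyword expansion step mentioned in the abstract, where topic words are extended with semantically similar terms, amounts to a nearest-neighbor lookup in an embedding space. The sketch below shows only that lookup, using cosine similarity over hand-written toy vectors. It does not train a Word2Vec model, and the vectors, the terms, and the function names `cosine` and `expand_terms` are hypothetical and not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_terms(seeds, vectors, top_n=2):
    """For each seed term, return its top_n most similar other terms."""
    expanded = {}
    for seed in seeds:
        if seed not in vectors:
            continue  # skip seeds that have no embedding
        scored = [
            (cosine(vectors[seed], vec), term)
            for term, vec in vectors.items()
            if term != seed
        ]
        scored.sort(reverse=True)
        expanded[seed] = [term for _, term in scored[:top_n]]
    return expanded


# Toy 3-dimensional embeddings (assumed values; a trained model
# would supply vectors with many more dimensions).
toy_vectors = {
    "spark": [0.9, 0.1, 0.0],
    "hadoop": [0.8, 0.2, 0.1],
    "hive": [0.7, 0.3, 0.0],
    "dashboard": [0.1, 0.9, 0.2],
    "report": [0.0, 0.8, 0.3],
}
neighbours = expand_terms(["spark", "dashboard"], toy_vectors, top_n=2)
print(neighbours)
```

    With a trained embedding model, the seed list would be the topic words output by LDA, and the returned neighbors would be candidate keywords for the word clouds.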
 

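Degree type: Master's
Defense date: 2022-05
Place of degree conferral: Lanzhou, Gansu Province
Language: Chinese
Total pages: 64
Total references: 47
Collection number: 0004295
Confidentiality level: Public
Chinese Library Classification number: C8/300
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/32310
Collection: School of Statistics and Data Science (统计与数据科学学院)
Recommended citation
GB/T 7714
王宇辰. 基于文本挖掘的数据类岗位人才需求分析[D]. 甘肃省兰州市. 兰州财经大学,2022.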
Files in this item
File name/size: 10741_2018000003131_ (2982 KB); document type: thesis; access: open access; license: CC BY-NC-SA
File name: 10741_2018000003131_王宇辰.pdf
Format: Adobe PDF
 

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.