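Author: 袁慎 (Yuan Shen)
Name in Hanyu Pinyin: yuanshen
Student ID: 2017000003117
Training institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
Phone: 15738812523
Email: 15738812523@163.com
Year of enrollment: 2017
Degree category: Academic master's
Training level: Master's student
Discipline category: Economics
First-level discipline: Applied Statistics
Discipline direction: Statistics
Discipline code: 020208
Degree conferred: Master of Economics
First supervisor: 韩君 (Han Jun)
First supervisor's name in Hanyu Pinyin: hanjun
First supervisor's institution: 兰州财经大学 (Lanzhou University of Finance and Economics)
First supervisor's title: Professor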
Title: 基于属性加权的聚类算法在银行客户细分中的应用研究
English title: Application of Clustering Algorithm Based on Attribute Weighting in Bank Customer Segmentation
Keywords: customer segmentation; K-Means clustering; Logistic stepwise regression; weight design; attribute-weighted clustering
Foreign-language keywords: Customer segmentation; K-Means clustering; Logistic stepwise regression; Weight design; Attribute weighted clustering
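Abstract

In the era of Internet finance, the expanding business scope of domestic banks, growth in customer numbers, the passage of time, and rapid advances in data collection and storage have produced a situation in which customer data is abundant but knowledge is scarce. Competition in banking is in essence competition for customer resources. Uncovering the potential market hidden behind large, multidimensional data, identifying customers' consumption preferences, and screening for and retaining customers who are likely to churn all call for an efficient, multidimensional, and precise customer segmentation model that can guide banks' decisions as they seek to maximize profit.

Clustering is the most widely used approach to customer segmentation. In practice, however, the traditional K-Means algorithm treats every attribute as contributing equally. It does not account for the different effects that different attributes may have on the clustering result, and it ignores their business meaning. To correct the resulting clustering bias and improve clustering quality, this thesis modifies K-Means: Logistic stepwise regression is used to select important attributes and assign them weights, so that data objects are measured differentially according to attribute contribution. The result is an attribute-weighted clustering algorithm, which is applied to bank customer segmentation.

The data consist of a full year of customer transaction records and related information, randomly sampled from a bank's database and CRM system. Customers are divided into three groups (low-end, mid-range, and high-end) by their monthly daily-average AUM (total financial assets) for the current month. With bringing revenue to the bank as the main research objective, segmentation is carried out along five dimensions: basic customer attribute information, customer identification information, customer value information, RFM information, and extreme-value information on customer transactions and account movements. The work has three stages.

In the first stage, variables are selected and fixed for each of the three groups using basic statistical analysis, trend analysis, business analysis, and correlation analysis. With whether the customer reaches the AUM asset threshold as the target variable, Logistic stepwise regression models are tried, compared, and interpreted in business terms, then evaluated and validated with ROC curves and lift curves, yielding interpretable and reliable variables and model coefficients.

In the second stage, the variables and model coefficients from the first stage are used, through a regression-based weight design method, to determine the weights of the attribute-weighted clustering algorithm. Traditional K-Means and the improved attribute-weighted algorithm are then each applied to the three groups in turn. Visual comparison of the two sets of results, a comparison of algorithm performance, and evaluation against validity criteria including separation, compactness, the CH index, and the silhouette coefficient demonstrate the superiority of the attribute-weighted clustering algorithm.

In the third stage, the customer segmentation algorithm based on attribute-weighted clustering divides the bank's customers into 13 subclasses. Customer value analysis of the segmentation results identifies the high-value categories that warrant priority maintenance, the churn-prone categories to be retained, the high-potential categories to be developed, and the low-value categories that may be let go. The thesis closes with recommendations to banks on maintaining and developing customers and optimizing resource allocation.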

English abstract

In the Internet finance era, the expansion of domestic banking business, the growth in the number of customers, the accumulation of data over time, and the rapid development of data collection and storage technologies have produced a phenomenon of "rich customer data but poor knowledge". The fierce competition in the banking industry is essentially competition for customer resources. How to tap the potential market hidden behind huge, multidimensional data, how to discover customers' consumption tendencies, and how to identify and retain customers who are easily lost are questions that urgently require an efficient, multidimensional, and accurate customer segmentation model to guide banks in maximizing corporate interests.

Clustering is the most widely used method in customer segmentation. However, the traditional K-Means algorithm treats all attributes as contributing equally in practical applications. It does not consider the different effects that different attributes may have on the clustering results, and it ignores business meaning. To address the clustering bias caused by the K-Means algorithm and improve the clustering effect, this thesis improves on K-Means: important attributes are selected and weighted by Logistic stepwise regression, so that data objects are measured differently according to attribute contribution, and a clustering algorithm based on attribute weighting is designed and applied to the bank customer segmentation scenario.

This thesis uses customers' annual transaction records and related information, randomly sampled from a bank's database and CRM system. Customers are divided into three groups (low-end, mid-range, and high-end) according to their monthly daily-average AUM (total financial assets) for the current month. The main research goal is to bring revenue to the bank. Customer segmentation is carried out along five dimensions: customer basic attribute information, customer identification information, customer value information, RFM information, and extreme-value information on customer transactions and account movements. There are three main stages.

In the first stage, basic statistical analysis, trend analysis, business analysis, correlation analysis, and other methods were used to select and determine variables for each of the three groups. With attainment of the AUM asset threshold as the target variable, Logistic stepwise regression models were tried, compared, and interpreted in business terms, then evaluated and validated with the ROC curve and the lift curve, finally yielding interpretable and reliable variables and model coefficients.

In the second stage, the weights of the attribute-weighted clustering algorithm were determined by the regression weight design method, based on the variables and model coefficients obtained in the first stage. The traditional K-Means algorithm and the improved attribute-weighted clustering algorithm were then each applied to the three groups of customers in turn. The visualized results of the two algorithms were displayed and compared, and a comparison of algorithm performance together with evaluation against validity criteria such as separation, compactness, the CH index, and the silhouette coefficient demonstrated the superiority of the attribute-weighted clustering algorithm.

In the third stage, the customer segmentation algorithm based on attribute-weighted clustering was applied, and bank customers were finally subdivided into 13 subclasses. Customer value analysis was performed on the segmentation results to determine the high-value customer categories that need priority maintenance, the easily lost customer categories that need to be retained, the potential customer categories that need to be developed, and the low-value customer categories that can be abandoned, and to provide advice to banks on maintaining and developing customers and optimizing resource allocation.
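
The abstract does not state the weight formula or the distance used, so the following is only a minimal sketch of how regression coefficients might drive an attribute-weighted K-Means. It assumes that each weight is the absolute coefficient divided by the sum of absolute coefficients, and that the weights enter a squared Euclidean distance. The function names, the coefficient values, the toy data, and the deterministic farthest-point initialization are all hypothetical choices made for illustration, not the thesis's implementation.

```python
def weights_from_coefficients(coefs):
    """Turn regression coefficients into attribute weights that sum to 1.

    Assumed scheme (not taken from the thesis): w_j = |b_j| / sum_i |b_i|.
    """
    total = sum(abs(c) for c in coefs)
    return [abs(c) / total for c in coefs]


def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance with one weight per attribute."""
    return sum(wj * (xj - yj) ** 2 for xj, yj, wj in zip(x, y, w))


def weighted_kmeans(points, k, w, max_iter=100):
    # Deterministic farthest-point initialization, chosen here only for
    # reproducibility; standard K-Means usually starts from random centers.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points,
                  key=lambda p: min(weighted_sq_dist(p, c, w) for c in centers))
        centers.append(list(far))

    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center under the weighted distance.
        labels = [min(range(k), key=lambda c: weighted_sq_dist(p, centers[c], w))
                  for p in points]
        # Update step: with per-attribute weights the best center is still
        # the plain mean of the cluster's members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical coefficients: attribute 0 matters, attribute 1 barely does.
coefs = [2.0, 0.01]
weights = weights_from_coefficients(coefs)

# Toy data: attribute 0 separates two groups; attribute 1 is large-scale noise.
points = [(0.0, 0.0), (0.1, 10.0), (0.2, -10.0),
          (5.0, 0.0), (5.1, 10.0), (4.9, -10.0)]

labels, centers = weighted_kmeans(points, 2, weights)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

On this toy data the down-weighted second attribute contributes little to the distance, so the clusters follow the first attribute. In a real setting the attributes would normally be standardized before weighting, and the clustering would be checked with validity indices such as the CH index and the silhouette coefficient named in the abstract.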

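Degree type: Master's
Defense date: 2020-05-24
Place of degree conferral: Lanzhou, Gansu Province
Research direction: Market research
Language: Chinese
Total pages: 74
Page numbers of manually pasted images in the print version: 0
Total figures: 0
Total tables: 0
Total references: 0
Collection number: 0003161
Confidentiality level: Public
Chinese Library Classification number: C8/229
Confidentiality period: 0
Document type: Thesis
Item identifier: http://ir.lzufe.edu.cn/handle/39EH0E1M/18960
Collection: School of Statistics and Data Science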
Recommended citation
GB/T 7714
袁慎. 基于属性加权的聚类算法在银行客户细分中的应用研究[D]. 甘肃省兰州市. 兰州财经大学,2020.
Files in this item
File name/size: 35226.pdf (1757KB)
Document type: Thesis
Access type: Open access
License: CC BY-NC-SA