Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/124723
|
Title: | Ensemble learning for customer retention prediction in online retailing (整體學習應用於線上零售的回購預測) |
Authors: | 佘欣玲 SHE, XIN-LING |
Contributors: | Chuang, Hao-Chun (莊皓鈞); Chou, Yen-Chun (周彥君); She, Xin-Ling (佘欣玲) |
Keywords: | Ensemble learning; Online retailers; Customer retention |
Date: | 2019 |
Issue Date: | 2019-08-07 16:09:20 (UTC+8) |
Abstract: | Customer retention plays an important role in customer relationship management. To reduce the cost of communicating with customers and to avoid over-marketing, understanding customer retention has become key to the performance of online retailers. This research addresses four questions about customer retention. First, how can online retailers construct customer behaviors and characteristics from records of transactions, returns, and cancellations? Second, how can two ensemble learning algorithms, XGBoost and LightGBM, be used to predict customer retention, and which performs better? Third, how can knowledge extracted from ensemble learning be combined with a Bayesian network to establish causal diagrams of how customer characteristics drive customer retention? Finally, how can the results of predictive models be evaluated from a business perspective, including a cost-benefit analysis of customer retention analytics? Whereas past research used far fewer features to predict customer retention, this research carries out extensive feature engineering that yields a total of 167 variables describing customer behaviors and characteristics. In addition, we show that both XGBoost and LightGBM achieve prediction accuracy of up to 90%, and we compare the models in depth. Furthermore, this study integrates ensemble learning with a Bayesian network to explore the relationship between important features and customer retention. Doing so helps retailers understand which characteristics affect customer retention and provides a list of potential repeat purchasers based on model predictions. Finally, this study conducts a cost-benefit analysis of the model predictions to help online retailers make profit-oriented digital marketing decisions, which avoids alienating customers through over-marketing and lowers the cost of communicating with members. |
Description: | Master's thesis, National Chengchi University (國立政治大學), Department of Management Information Systems (資訊管理學系), 107356005 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0107356005 |
Data Type: | thesis |
DOI: | 10.6814/NCCU201900463 |
Appears in Collections: | [Department of Management Information Systems (資訊管理學系)] Theses |
Files in This Item:
File | Description | Size | Format |
600501.pdf | | 1269Kb | Adobe PDF |
All items in 政大典藏 are protected by copyright, with all rights reserved.
|