Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/124140
|
Title: | 基於集成學習框架之信用違約預測-以信用卡客戶為例 The Credit Default Prediction Based on Ensemble Learning-The Case of Credit Card Customers |
Authors: | 陳靜怡 Chen, Ching-Yi |
Contributors: | 江彌修 Chiang, Mi-Hsiu 陳靜怡 Chen, Ching-Yi |
Keywords: | 信用風險 違約風險 信用卡客戶 集成學習 機器學習 Credit risk Default risk Credit card clients Ensemble learning Machine learning |
Date: | 2019 |
Issue Date: | 2019-07-01 10:47:37 (UTC+8) |
Abstract: | Credit risk, the risk that a borrower or counterparty fails to make a required payment, is one of the main sources of risk for financial institutions. This study builds a default early-warning model for credit card customers on the Blending and Stacking ensemble learning frameworks. The model predicts how likely existing customers are to default, so that an institution can take countermeasures before a default occurs and avoid the resulting losses. The ensemble models are benchmarked against their single base classifiers. The subjects are existing credit card holders of a major bank in Taiwan. The study uses payment data from October 2005, and the sample period runs from April to September 2005; the data include each cardholder's bill statement amounts, payment amounts, and monthly payment (default) records for that period, together with personal information. In addition to data preprocessing and feature engineering, the study applies the Synthetic Minority Oversampling Technique (SMOTE) to handle class imbalance. Models are evaluated with three metrics suited to credit risk management in practice: Type II error, F1-score, and the area under the ROC curve (AUC). The results show that the model built with the Stacking framework outperforms both the single base classifiers and the Blending framework on these metrics. They also indicate that ensemble learning can effectively improve classification performance, provided that the first-layer classifiers are chosen so that (1) the models are diverse and (2) their individual predictive performance does not differ too much. |
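The three evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the thesis's code. It assumes that default is the positive class (label 1) and that Type II error means the share of actual defaulters predicted as non-default (the false negative rate); the record does not state the thesis's convention. The function names, the toy labels and scores, and the 0.5 threshold are all hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (default) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def type_ii_error(y_true, y_pred):
    """Assumed definition: missed defaulters / all actual defaulters."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return fn / (fn + tp) if (fn + tp) else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the fraction of (defaulter, non-defaulter) pairs in which
    the defaulter receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 defaulters, 5 non-defaulters.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(type_ii_error(y_true, y_pred))  # one of three defaulters is missed
print(f1_score(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Type II error and F1-score depend on the chosen cut-off, while AUC is computed from the ranking of the scores alone. This is one reason to report a threshold-free metric alongside the threshold-based ones on imbalanced data.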
Description: | Master's thesis, National Chengchi University (國立政治大學), Department of Money and Banking (金融學系), 106352010 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0106352010 |
Data Type: | thesis |
DOI: | 10.6814/NCCU201900090 |
Appears in Collections: | [Department of Money and Banking (金融學系)] Theses
|
Files in This Item:
File | Size | Format |
201001.pdf | 3657Kb | Adobe PDF |
|