Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/145857
|
Title: | 基於10-K報表ESG情緒萃取之企業違約預測模型:應用語意分析遷移學習 Corporate Default Prediction Model with ESG Sentiment: Transfer Learning-Based Sentiment Analysis of 10-K Reports |
Authors: | 陳科穎 Chen, Ke-Ying |
Contributors: | 江彌修 Chiang, Mi-Hsiu 陳科穎 Chen, Ke-Ying |
Keywords: | BERT; FinBERT; 10-K; Machine Learning; Text Sentiment; ESG; Corporate Bankruptcy Prediction |
Date: | 2023 |
Issue Date: | 2023-07-06 16:46:19 (UTC+8) |
Abstract: | Bankruptcy prediction has long been an important topic in the finance literature. Earlier studies examined corporate default risk and potential bankruptcy factors with a variety of methods, mostly by fitting econometric regression models to accounting data from financial statements. That early work, however, rarely examined the importance of unstructured data for bankruptcy factors. More recent research has gradually added text feature extraction, applying text mining to extract sentiment from sources such as central bank meeting minutes (for example Fed minutes), news headlines and articles, industry research reports, 10-K filings, and sustainability reports. Sentiment scores extracted by a model are added as factors in training, in the hope of strengthening the model's predictive power.
This study builds machine learning models on structured and unstructured data to predict corporate bankruptcy and default. For the unstructured data, BERT (Bidirectional Encoder Representations from Transformers) and FinBERT (BERT for Financial Text Mining) are applied to the MD&A sections of the 10-K reports of US listed companies to extract two factors: positive and negative scores for the sentiment companies express about their operations, and a sentiment score for the importance management attaches to ESG-related discussion. We then observe whether these two factors improve the predictive power of the machine learning models. The empirical results show that adding the positive and negative sentiment scores and the ESG sentiment score raises AUC and recall, and that predictive power improves for all four models compared: Logistic, SVM, Random Forest, and XGBoost. Oversampling (SMOTE) is also found to address the sample imbalance problem and strengthen overall predictive power. Ensemble learning performs better than the linear models, and XGBoost gives the best predictions of all the models. |
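The abstract reports that SMOTE oversampling helps with the imbalance between defaulted and non-defaulted firms. The following is a minimal sketch of the interpolation idea behind SMOTE, not the thesis's implementation. The function name `smote`, its parameters, and the toy feature rows are assumptions made for illustration, and this variant picks each base point at random rather than cycling through every minority sample.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    randomly chosen minority row and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy rows of [leverage ratio, sentiment score]; not data from the thesis.
defaulted = [[0.90, -0.60], [0.80, -0.40], [0.85, -0.70]]
healthy = [[0.30, 0.20], [0.40, 0.10], [0.20, 0.50], [0.35, 0.30],
           [0.25, 0.40], [0.45, 0.00], [0.30, 0.35], [0.50, 0.15], [0.38, 0.22]]

# Add enough synthetic defaulted rows to match the size of the healthy class.
new_rows = smote(defaulted, n_new=len(healthy) - len(defaulted), k=2, seed=42)
balanced_minority = defaulted + new_rows
print(len(balanced_minority), len(healthy))  # prints "9 9"
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies. In practice the resampling would be applied to the training split only, so that evaluation metrics such as AUC and recall are computed on untouched data.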
Description: | Master's thesis; National Chengchi University; Department of Money and Banking; 110352008 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0110352008 |
Data Type: | thesis |
Appears in Collections: | [Department of Money and Banking] Theses
|
Files in This Item:
File | Description | Size | Format |
200801.pdf | | 2561Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|