Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/157843
Title: | Early-Warning Power of News Sentiment for Infrequent Rating Transitions: Examining FinBERT's Leading Predictive Ability under Sentiment Skew and Class Imbalance |
Authors: | 林威宇 Lin, Wei-Yu |
Contributors: | 江彌修 Chiang, Mi-Hsiu 林威宇 Lin, Wei-Yu |
Keywords: | Rare rating changes; Class-imbalanced deep learning; Sentiment extraction; SMOTE synthetic data |
Date: | 2025 |
Issue Date: | 2025-07-01 15:19:01 (UTC+8) |
Abstract: | Traditional financial variables make little use of the unstructured information in news. This study proposes an early-warning framework for corporate credit rating changes that combines transfer learning with a language model specialized for financial text. Using the finance-domain pre-trained model FinBERT, we extract sentiment features from news on S&P 500 constituent companies collected from the RavenPack database and construct a Corporate Credit Uncertainty (CCU) risk index, which quantifies the uncertainty that news sentiment may introduce into a firm's future credit standing. We compare BERT and FinBERT on three-class sentiment identification (positive, neutral, negative). Empirical results indicate that FinBERT significantly outperforms the general-purpose BERT model across several evaluation metrics and remains more stable on corpora with pronounced sentiment skew, which demonstrates the advantage of transfer learning in financial contexts. To address the severe class imbalance among the three rating-change labels, we apply the Synthetic Minority Over-sampling Technique (SMOTE) to augment minority-class samples and improve recognition of rare rating transitions. We then build an XGBoost classification model that combines the CCU index with traditional financial variables. The CCU index ranks first in feature importance, ahead of all conventional variables, making it the key predictive feature for credit rating changes. The approach captures soft information that traditional models miss and significantly improves the timeliness and accuracy of credit early warning under sentiment skew and class imbalance. |
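The oversampling step named in the abstract can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority sample is an interpolation between a real minority sample and one of its nearest minority neighbours. This is not the thesis's implementation. The function name `smote`, the parameter values, and the toy feature rows are assumptions made for illustration only.

```python
import random
from math import dist


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy rows: (leverage ratio, sentiment-based score) for a rare class.
downgrades = [(0.62, -0.40), (0.70, -0.55), (0.66, -0.35), (0.75, -0.60)]
new_rows = smote(downgrades, n_new=6, k=2)
```

In practice, oversampling of this kind is applied to the training split only, so that synthetic rows never leak into the evaluation data.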
Description: | Master's thesis, National Chengchi University, Department of Money and Banking, 112352035 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0112352035 |
Data Type: | thesis |
Appears in Collections: | [Department of Money and Banking] Theses
Files in This Item:
File | Size | Format |
203501.pdf | 4581Kb | Adobe PDF |
All items in 政大典藏 are protected by copyright, with all rights reserved.