National Chengchi University Institutional Repository (NCCUR): Item 140.119/127745
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/127745


    Title: 應用LDA主題模型於美國企業破產預測之研究
    Applying LDA Topic Modeling to U.S. Corporate Bankruptcy Prediction
    Authors: 許哲維
    Hsu, Che-Wei
    Contributors: 江彌修
    Chiang, Mi-Hsiu
    許哲維
    Hsu, Che-Wei
    Keywords: LDA
    Topic modeling
    Corporate bankruptcy prediction
    10-K
    Date: 2019
    Issue Date: 2019-12-06 09:25:54 (UTC+8)
    Abstract: In recent years, text mining has made it increasingly convenient to extract features from textual information. A growing number of studies in finance and accounting combine text mining with textual data such as corporate news and company announcements, aiming to draw more precise and timely information from the sentiment implicit in the text and thereby strengthen the explanatory power, predictive power, and stability of their models. However, little research in finance has focused on topic modeling or on the latent topics present in each document. This study applies LDA (Latent Dirichlet Allocation), a topic model used for latent semantic analysis, to classify the textual information in 10-K financial reports into topics. It then tests whether a variable formed by standardizing the frequencies of words under risk-related topics improves the accuracy of corporate bankruptcy prediction. The empirical results show that including this standardized risk-related topic variable improves the prediction of U.S. corporate bankruptcy over the period 1998 to 2017 under both Logit and Probit models.
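The pipeline described in the abstract (fit LDA, pick the words of a risk-related topic, standardize their frequency per document) can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the thesis's actual procedure: the toy token lists stand in for 10-K text, the collapsed Gibbs sampler is one common way to fit LDA, the risk topic is chosen as the topic that holds most occurrences of a hypothetical seed word ("default"), and the names `fit_lda` and `risk_topic_variable` are invented for this example.

```python
import random
from collections import Counter
from statistics import mean, pstdev


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA; returns per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic

    def update(d, w, k, step):
        doc_topic[d][k] += step
        topic_word[k][w] += step
        topic_total[k] += step

    # Start from a random topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            update(d, w, k, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)  # remove the token's current topic
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k_new
                update(d, w, k_new, +1)
    return topic_word


def risk_topic_variable(docs, topic_word, seed_word, top_n=5):
    """Z-score of each document's share of tokens that are risk-topic words."""
    # Assumption: the risk topic is the one holding most of the seed word.
    risk_topic = max(range(len(topic_word)), key=lambda k: topic_word[k][seed_word])
    risk_words = {w for w, c in topic_word[risk_topic].most_common(top_n) if c > 0}
    freq = [sum(w in risk_words for w in doc) / len(doc) for doc in docs]
    mu, sigma = mean(freq), pstdev(freq)
    if sigma == 0:  # no variation across documents: nothing to standardize
        return risk_words, [0.0] * len(docs)
    return risk_words, [(f - mu) / sigma for f in freq]


# Toy stand-ins for tokenized 10-K text (hypothetical data).
docs = [
    "default debt liquidity covenant default debt".split(),
    "debt covenant liquidity default impairment debt".split(),
    "product growth customer revenue product growth".split(),
    "revenue customer growth product market revenue".split(),
]
topic_word = fit_lda(docs, n_topics=2)
risk_words, z_scores = risk_topic_variable(docs, topic_word, seed_word="default")
print(sorted(risk_words))
print([round(z, 3) for z in z_scores])
```

In a setting like the one the abstract describes, the resulting `z_scores` column would enter a Logit or Probit bankruptcy model as one additional regressor alongside the financial covariates. The choice of seed word, number of topics, and `top_n` are all tuning decisions that this sketch does not try to justify.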
    Reference: [1] Altman, E. I. (1968). Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy. The Journal of Finance, 23(4), 589-609.
    [2] Aziz, S., Dowling, M. M., Hammami, H., & Piepenbrink, A. (2019). Machine Learning in Finance: A Topic Modeling Approach. Available at SSRN 3327277.
    [3] Beaver, W. H., McNichols, M. F., & Rhie, J. W. (2005). Have Financial Statements Become Less Informative? Evidence from the Ability of Financial Ratios to Predict Bankruptcy. Review of Accounting Studies, 10(1), 93-122.
    [4] Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3(Jan), 993-1022.
    [5] Bodnaruk, A., Loughran, T., & McDonald, B. (2015). Using 10-K Text to Gauge Financial Constraints. Journal of Financial and Quantitative Analysis, 50(4), 623-646.
    [6] Crosbie, P. J., & Bohn, J. R. (1999). Modeling Default Risk (KMV LLC).
    [7] Dyer, T., Lang, M., & Stice-Lawrence, L. (2017). The Evolution of 10-K Textual Disclosure: Evidence from Latent Dirichlet Allocation. Journal of Accounting and Economics, 64(2-3), 221-245.
    [8] Edison, H., & Carcel, H. (2019). Text Data Analysis Using Latent Dirichlet Allocation: An Application to FOMC Transcripts (No. 11). Bank of Lithuania.
    [9] Griffiths, T. L., & Steyvers, M. (2004). Finding Scientific Topics. Proceedings of the National Academy of Sciences, 101(suppl 1), 5228-5235.
    [10] Hansen, S., McMahon, M., & Prat, A. (2017). Transparency and Deliberation Within the FOMC: a Computational Linguistics Approach. The Quarterly Journal of Economics, 133(2), 801-870.
    [11] Hofmann, T. (1999, July). Probabilistic Latent Semantic Analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (pp. 289-296). Morgan Kaufmann Publishers Inc.
    [12] Loughran, T., & McDonald, B. (2011). When Is a Liability Not a Liability? Textual Analysis, Dictionaries, and 10‐Ks. The Journal of Finance, 66(1), 35-65.
    [13] Merton, R. C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. The Journal of Finance, 29(2), 449-470.
    [14] Moro, S., Cortez, P., & Rita, P. (2015). Business Intelligence in Banking: A Literature Analysis from 2002 to 2013 Using Text Mining and Latent Dirichlet Allocation. Expert Systems with Applications, 42(3), 1314-1324.
    [15] Odom, M. D., & Sharda, R. (1990, June). A Neural Network Model for Bankruptcy Prediction. In 1990 IJCNN International Joint Conference on Neural Networks (pp. 163-168). IEEE.
    [16] Ohlson, J. A. (1980). Financial Ratios and the Probabilistic Prediction of Bankruptcy. Journal of Accounting Research, 109-131.
    [17] Tsai, F. T., Lu, H. M., & Hung, M. W. (2016). The Impact of News Articles and Corporate Disclosure on Credit Risk Valuation. Journal of Banking & Finance, 68, 100-116.
    [18] Tsai, M. F., & Wang, C. J. (2017). On the Risk Prediction and Analysis of Soft Information in Finance Reports. European Journal of Operational Research, 257(1), 243-250.
    [19] Salton, G., & McGill, M. J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.
    [20] Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics, 6(2), 461-464.
    [21] Shumway, T. (2001). Forecasting Bankruptcy More Accurately: A Simple Hazard Model. The Journal of Business, 74(1), 101-124.
    [22] Timmermans, M., & Finance, M. (2014). US Corporate Bankruptcy Predicting Models (Doctoral dissertation, Master's thesis) [online]. Tilburg University, Tilburg. Available from: http://arno.uvt.nl/show.cgi
    [23] Zmijewski, M. E. (1984). Methodological Issues Related to the Estimation of Financial Distress Prediction Models. Journal of Accounting Research, 59-82.
    Description: Master's thesis
    National Chengchi University
    Department of Money and Banking
    106352002
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0106352002
    Data Type: thesis
    DOI: 10.6814/NCCU201901264
    Appears in Collections: [Department of Money and Banking] Theses

    Files in This Item:

    File        Size     Format
    200201.pdf  1929 KB  Adobe PDF


    All items in NCCUR are protected by copyright, with all rights reserved.