National Chengchi University Institutional Repository (NCCUR): Item 140.119/137297

Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/137297

Title:	Financial Risk-related News Detection and Named Entity Recognition via Transformer-based Language Models (基於注意力機制語言模型之財務風險文章偵測與實體辨識)
Author:	Lu, Jia-Yu (盧佳妤)
Contributors:	Tsai, Ming-Feng (蔡銘峰); Lu, Jia-Yu (盧佳妤)
Keywords:	Transformer; Attention mechanism; Joint training; Named-entity recognition; Natural language processing
Date:	2021
Uploaded:	2021-10-01 10:06:33 (UTC+8)
Abstract:	This thesis uses Transformer-based models to detect risk events in financial articles and to extract the names of potential financial criminals. Automating these tasks reduces the cost of manual labeling and speeds up prediction. We analyze the strengths and weaknesses of different model architectures and training methods, and compare traditional neural network approaches with Transformer-based models. The proposed method has two stages: the first determines whether a target article contains a financial risk event, and the second extracts high-risk names from the articles that do. We propose a joint-training method that trains both stages at the same time; experiments show that it speeds up training and prediction without loss of accuracy and improves model stability. We also visualize the attention weights inside the attention-based model, showing that the model automatically attends to financial risk vocabulary without being given annotations. For training data that lacks risk-name annotations, we use this attention-weight analysis to design special rules, which yield some improvement. Finally, additional experiments on an English dataset from Wikipedia indicate that the approach can also be applied to other domains and languages.
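The two-stage structure and the joint-training idea described in the abstract can be illustrated with a small sketch. The abstract does not state the loss function, the label scheme, or any thresholds, so everything here is an assumption made for illustration: the joint objective is taken to be a weighted sum of a document-level cross-entropy and a mean token-level cross-entropy, both tasks are treated as binary, and the names `joint_loss`, `two_stage_predict`, `ner_weight`, and `threshold` are hypothetical. Plain Python stands in for the Transformer model, which would normally produce the logits.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target class."""
    return -math.log(softmax(logits)[target])


def joint_loss(
    doc_logits: Sequence[float],
    doc_label: int,
    token_logits: Sequence[Sequence[float]],
    token_labels: Sequence[int],
    ner_weight: float = 1.0,  # hypothetical weighting, not from the thesis
) -> float:
    """Assumed joint objective: stage-1 loss plus weighted stage-2 loss."""
    doc_loss = cross_entropy(doc_logits, doc_label)
    token_losses = [
        cross_entropy(lg, y) for lg, y in zip(token_logits, token_labels)
    ]
    ner_loss = sum(token_losses) / len(token_losses) if token_losses else 0.0
    return doc_loss + ner_weight * ner_loss


def two_stage_predict(
    doc_logits: Sequence[float],
    tokens: Sequence[str],
    token_logits: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[str]:
    """Stage 1 gates stage 2: extract names only from risk-flagged articles."""
    if softmax(doc_logits)[1] < threshold:
        return []  # article judged not to contain a risk event
    return [
        tok for tok, lg in zip(tokens, token_logits)
        if softmax(lg)[1] >= threshold
    ]


if __name__ == "__main__":
    tokens = ["Alice", "met", "Bob"]
    tok_logits = [[-3.0, 3.0], [3.0, -3.0], [-1.0, 1.0]]
    print(two_stage_predict([-2.0, 2.0], tokens, tok_logits))
    print(joint_loss([-2.0, 2.0], 1, tok_logits, [1, 0, 1], ner_weight=0.5))
```

In this sketch a single backward pass over `joint_loss` would update both stages together, which is one plausible reading of "training both stages at the same time"; the thesis itself should be consulted for the actual formulation.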
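Description:	Master's thesis; National Chengchi University; Department of Computer Science; 108753120
Source:	http://thesis.lib.nccu.edu.tw/record/#G0108753120
Type:	thesis
DOI:	10.6814/NCCU202101564
Appears in collections:	[Department of Computer Science] Theses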

Files in this item:

File	Description	Size	Format	Views
312001.pdf		1806Kb	Adobe PDF2	148

All items in the NCCU Institutional Repository are protected by their original copyright.
