政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/101083


Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/101083

Title:	應用網路新聞文字探勘於預測台灣股價趨勢之研究 (A study of forecasting Taiwan stock price trends by applying news text mining technique)
Authors:	陳人華 Chen, Ren Hua
Contributors:	廖四郎; 陳人華 (Chen, Ren Hua)
Keywords:	text mining (文字探勘); SVM; news (新聞); stock market (股市)
Date:	2016
Issue Date:	2016-09-01 23:47:06 (UTC+8)
Abstract:	Stock market news is an important source of information for individual investors. Although individual investors' share of trading on Taiwan's centralized market has declined in recent years, it still exceeds 50%. Prior literature has repeatedly shown that news coverage does affect stock returns. If the information in news can be extracted and used to build a trading strategy, that strategy can give investors additional help, whether used alone or combined with other strategies. This study uses the support vector machine (SVM) algorithm to automatically classify news and predict the stock price trend after a news item is published. Applying the improved TF-IDF method proposed by Chang et al. (2006) is intended to make the selection of news feature terms more accurate. The study collects several thousand news articles from each of two sources, cnYES (鉅亨網) and Taiwan Economic Journal (TEJ, 台灣經濟新報), so that analyzing a large volume of news makes the results more representative and stable. However, the empirical results show that the precision of the prediction model remains insufficient. The study therefore could not demonstrate, through the model, a relationship between news content and Taiwan stock prices.
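As a rough illustration of the feature-selection step the abstract mentions, the sketch below computes standard TF-IDF weights and keeps the highest-weighted terms. It is not the improved TF-IDF variant of Chang et al. (2006), whose details are not given in this record, and it stops before any SVM training. The function names `tfidf` and `top_terms` and the toy English tokens are assumptions made for the example; the thesis works with Chinese news, which would first need word segmentation.

```python
import math
from collections import Counter


def tfidf(docs):
    """Standard TF-IDF. docs is a list of token lists; returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        total = len(tokens)
        # Relative term frequency times log inverse document frequency.
        weights.append({t: (c / total) * math.log(n / df[t]) for t, c in tf.items()})
    return weights


def top_terms(weights, k):
    """Rank terms by their highest weight in any document; ties are broken alphabetically."""
    best = {}
    for w in weights:
        for term, value in w.items():
            best[term] = max(best.get(term, 0.0), value)
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Hypothetical, already-tokenized headlines (illustrative data only).
docs = [
    ["revenue", "record", "high", "shares"],
    ["revenue", "decline", "loss", "shares"],
    ["shares", "buyback", "announced"],
]
weights = tfidf(docs)
features = top_terms(weights, 3)
print(features)
```

A term such as "shares" that occurs in every document gets weight zero, so it is never selected as a feature; the selected terms would then form the input vector for a classifier such as an SVM.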
Reference:
1. Barber, B. M., & Odean, T. (2008). All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. Review of Financial Studies, 21(2), 785-818.
2. Chen, K. J., & Liu, S. H. (1992, August). Word identification for Mandarin Chinese sentences. In Proceedings of the 14th Conference on Computational Linguistics, Volume 1 (pp. 101-107). Association for Computational Linguistics.
3. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.
4. Forman, G. (2003). An extensive empirical study of feature selection metrics for text classification. Journal of Machine Learning Research, 3(Mar), 1289-1305.
5. Gidofalvi, G., & Elkan, C. (2001). Using news articles to predict stock price movements. Department of Computer Science and Engineering, University of California, San Diego.
6. Hsu, C. W., Chang, C. C., & Lin, C. J. (2003). A practical guide to support vector classification.
7. Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., & Allan, J. (2000, November). Language models for financial news recommendation. In Proceedings of the Ninth International Conference on Information and Knowledge Management (pp. 389-396). ACM.
8. Merton, R. C. (1987). A simple model of capital market equilibrium with incomplete information. The Journal of Finance, 42(3), 483-510.
9. Mittermayer, M. A. (2004). Forecasting intraday stock price trends with text mining techniques. In System Sciences, 2004. Proceedings of the 37th Annual Hawaii International Conference on (pp. 10-pp). IEEE.
10. Nie, J. Y., Brisebois, M., & Ren, X. (1996). On Chinese text retrieval. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 225-233). ACM.
11. Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing & Management, 24(5), 513-523.
12. Salton, G., & McGill, M. J. (1986). Introduction to modern information retrieval.
13. Sproat, R. (1990). A statistical method for finding word boundaries in Chinese text.
14. Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.
15. Witten, I. H. (2005). Text mining. Practical Handbook of Internet Computing, 14-1.
16. Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information Retrieval, 1(1-2), 69-90.
17. 池祥萱, 林煜恩, 陳韋如 & 周賓凰. (2009). Does CEO Media Coverage Affect Firm Performance?. 交大管理學報, 1, 139-173.
18. 張玉芳, 彭時名 & 呂佳. (2006). 基於文本分類 TFIDF 方法的改進與應用. 電腦工程, 32(19), 76-78.
19. 鍾任明, 李維平, & 吳澤民. (2005). 運用文字探勘於日內股價
Description:	Master's thesis (碩士); National Chengchi University (國立政治大學); Graduate Institute of Finance (金融研究所); 103352019
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0103352019
Data Type:	thesis
Appears in Collections:	[金融學系] Theses (學位論文)

Files in This Item:

File	Description	Size	Format
201901.pdf		1271Kb	Adobe PDF

All items in 政大典藏 are protected by copyright, with all rights reserved.
