National Chengchi University Institutional Repository (NCCUR): Item 140.119/101083
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/101083


    Title: 應用網路新聞文字探勘於預測台灣股價趨勢之研究
    A study of forecasting Taiwan stock price trends by applying news text mining technique
    Author: 陳人華 (Chen, Ren Hua)
    Contributors: 廖四郎
    陳人華 (Chen, Ren Hua)
    Keywords: text mining
    SVM
    news
    stock market
    Date: 2016
    Upload time: 2016-09-01 23:47:06 (UTC+8)
    Abstract: Stock market news is an important source of information for individual investors. In Taiwan's centralized exchange market, the share of trading by individual investors has declined in recent years but still exceeds 50%. Prior studies have repeatedly shown that media coverage does affect stock returns. If the information in news can be extracted and used to build a trading strategy, investors stand to benefit, whether the strategy is used alone or combined with other strategies.
    This study uses the Support Vector Machine (SVM) algorithm to classify news automatically and to predict Taiwan stock price trends after a news item is published. The improved TF-IDF method proposed by Chang et al. (2006) is applied to make the selection of feature terms more accurate. The study draws several thousand news articles from each of two sources, cnYES and Taiwan Economic Journal (TEJ), so that the results are more representative and stable. However, the empirical results show that the precision of the prediction model is still insufficient, so the study ultimately finds no evidence, through the model, of a relationship between news content and Taiwan stock returns.
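As a rough illustration of the term-weighting step mentioned in the abstract, the sketch below computes standard TF-IDF weights and keeps the highest-weighted terms as candidate features. It is not the improved variant of Chang et al. (2006), whose details this record does not give, and it stops before the SVM classification step. The function names `tf_idf` and `top_terms` and the toy headlines are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document (standard TF-IDF)."""
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(weights, k):
    """Rank terms by their highest weight in any document and keep the first k."""
    best = {}
    for doc_weights in weights:
        for term, w in doc_weights.items():
            best[term] = max(best.get(term, 0.0), w)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


# Hypothetical, already-segmented toy headlines (not data from the thesis).
docs = [
    ["revenue", "record", "high", "shares"],
    ["profit", "warning", "shares"],
    ["revenue", "decline", "shares"],
]
weights = tf_idf(docs)
features = top_terms(weights, 3)
```

A term that occurs in every document, such as "shares" here, gets an inverse document frequency of zero and so never ranks as a feature. Chinese news text would first need word segmentation, which this sketch assumes has already been done.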
    References: 1. Barber, B. M., & Odean, T. (2008). All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. Review of Financial Studies, 21(2), 785-818.
    2. Chen, K. J., & Liu, S. H. (1992, August). Word identification for Mandarin Chinese sentences. In Proceedings of the 14th conference on Computational linguistics-Volume 1 (pp. 101-107). Association for Computational Linguistics.
    3. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297.
    4. Forman, G. (2003). An extensive empirical study of feature selection metrics for text classification. Journal of machine learning research, 3(Mar), 1289-1305.
    5. Gidofalvi, G., & Elkan, C. (2001). Using news articles to predict stock price movements. Department of Computer Science and Engineering, University of California, San Diego.
    6. Hsu, C. W., Chang, C. C., & Lin, C. J. (2003). A practical guide to support vector classification.
    7. Lavrenko, V., Schmill, M., Lawrie, D., Ogilvie, P., Jensen, D., & Allan, J. (2000, November). Language models for financial news recommendation. In Proceedings of the ninth international conference on Information and knowledge management (pp. 389-396). ACM.
    8. Merton, R. C. (1987). A simple model of capital market equilibrium with incomplete information. The journal of finance, 42(3), 483-510.
    9. Mittermayer, M. A. (2004). Forecasting intraday stock price trends with text mining techniques. In System Sciences, 2004. Proceedings of the 37th Annual Hawaii International Conference on (pp. 10-pp). IEEE.
    10. Nie, J. Y., Brisebois, M., & Ren, X. (1996). On Chinese text retrieval. In Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 225-233). ACM.
    11. Salton, G., & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information processing & management, 24(5), 513-523.
    12. Salton, G., & McGill, M. J. (1986). Introduction to modern information retrieval.
    13. Sproat, R. (1990). A statistical method for finding word boundaries in Chinese text.
    14. Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.
    15. Witten, I. H. (2005). Text mining. Practical handbook of Internet computing, 14-1.
    16. Yang, Y. (1999). An evaluation of statistical approaches to text categorization. Information retrieval, 1(1-2), 69-90.
    17. 池祥萱, 林煜恩, 陳韋如, & 周賓凰. (2009). Does CEO Media Coverage Affect Firm Performance? Chiao Da Management Review, 1, 139-173.
    18. 張玉芳, 彭時名, & 呂佳. (2006). Improvement and application of the TFIDF method based on text classification. Computer Engineering, 32(19), 76-78.
    19. 鍾任明, 李維平, & 吳澤民. (2005). Applying text mining to intraday stock price
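    Description: Master's
    National Chengchi University
    Graduate Institute of Money and Banking
    103352019
    Source: http://thesis.lib.nccu.edu.tw/record/#G0103352019
    Type: thesis
    Appears in collections: [Department of Money and Banking] Theses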

    Files in this item:

    File         Description   Size     Format      Views
    201901.pdf                 1271Kb   Adobe PDF   2307


    All items in NCCUR are protected by the original copyright.

