National Chengchi University Institutional Repository (NCCUR): Item 140.119/110797

Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/110797

Title:	應用情感型態分析於指數股票型基金趨勢研究-以台灣卓越50基金為例 (A study on the trend of exchange traded funds by sentiment pattern analysis in Yuanta Taiwan Top 50 ETF)
Author:	林詠翔 Lin, Yong-Xiang
Contributors:	姜國輝 Chiang, Kuo-Huie; 林詠翔 Lin, Yong-Xiang
Keywords:	Sentiment analysis; LDA topic model; Pattern model; Exchange-traded fund (ETF)
Date:	2017
Upload time:	2017-07-11 11:29:01 (UTC+8)
Abstract:	Prior research indicates that ETF assets have grown rapidly in recent years, and the Yuanta Taiwan Top 50 ETF (0050) is favored by investors for advantages such as its large market size. The development of big data has made text mining techniques mature, so this study proposes a sentiment-analysis-based price prediction model intended to improve investors' returns. Earlier work used individual terms in a document as the unit of analysis for text mining, which often runs into problems with synonyms and polysemous words; this study therefore builds its model with a supervised learning method, sentiment pattern analysis. To overcome the difficulty of obtaining training data for supervised learning, it also incorporates unsupervised learning for topic clustering and sentiment-orientation labeling. The study builds a dataset of Taiwan stock market news, filters it with a lexicon of popular topic terms, and applies an unsupervised LDA topic model. The results show that during the 2016 presidential election, media attention to company-related topics declined and the number of related news texts dropped sharply. In the sentiment-labeling stage, a sentiment lexicon that combines NTUSD, HowNet, and a self-developed expansion algorithm assigns a polarity to 10% of neutral terms and labels the sentiment orientation of 96% of the texts.
Visual analysis shows that DIF-MACD can predict the long-term trend of the Taiwan Top 50 ETF, while the news sentiment index performs well on short-term price fluctuations. Among the topic-model clusters, the news sentiment indices for the macroeconomy and company operations categories act as leading indicators by about 1 to 2 days, which helps the subsequent price prediction model. For supervised sentiment analysis, to address the synonym and polysemy problems noted above, the study applies the Pattern Taxonomy Model (PTM) to Chinese text and compares it with the vector space model (VSM) and support vector machines (SVM). The experiments show that the optimized PTM, combined with the Taiwan Weighted Stock Index, performs comparatively well, with an F1-measure of up to 85%. A further examination of the importance of news sentiment for price prediction finds that news sentiment during non-trading hours can influence price fluctuations of 0050.
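For readers unfamiliar with the DIF-MACD indicator mentioned in the abstract, the following is a minimal sketch of how it is commonly computed from closing prices. It is not taken from the thesis: the function names `ema` and `dif_macd` are hypothetical, and the 12/26/9 periods are the conventional defaults, assumed here because the abstract does not state which parameters were used.

```python
def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1).

    The first output equals the first input (a common seeding choice).
    """
    alpha = 2.0 / (span + 1)
    out = []
    for v in values:
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def dif_macd(closes, fast=12, slow=26, signal=9):
    """Return (dif, macd, dif_minus_macd) for a list of closing prices.

    dif  = fast EMA - slow EMA of the closing price
    macd = EMA of dif over the signal period
    The 12/26/9 defaults are an assumption, not values from the thesis.
    """
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    dif = [f - s for f, s in zip(fast_ema, slow_ema)]
    macd = ema(dif, signal)
    hist = [d - m for d, m in zip(dif, macd)]
    return dif, macd, hist


if __name__ == "__main__":
    # Illustrative prices only; not real 0050 data.
    closes = [float(p) for p in range(1, 41)]
    dif, macd, hist = dif_macd(closes)
    print(round(dif[-1], 3), round(macd[-1], 3), round(hist[-1], 3))
```

Under this convention, a positive DIF-MACD value means the short-term average is pulling away from the long-term one faster than its own recent trend, which is why the indicator is usually read as a trend signal rather than a day-to-day one.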
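Description:	Master's thesis; National Chengchi University, Department of Management Information Systems; 104356013
Source:	http://thesis.lib.nccu.edu.tw/record/#G0104356013
Type:	thesis
Appears in collections:	[Department of Management Information Systems] Theses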

Files in this item:

File	Size	Format	Views
601301.pdf	1878Kb	Adobe PDF2	43

All items in NCCUR are protected by their original copyright.
