|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/67096
|
Title: | 應用kNN文字探勘技術於分析新聞評論 影響股價漲跌趨勢之研究 The Study of Analyzing Comments of News for Influence of Stock Price Trends Prediction by Using Knn Text Mining |
Authors: | 詹智勝 Chan, Chih Sheng |
Contributors: | 楊建民 Yang, Chien Ming 詹智勝 Chan, Chih Sheng |
Keywords: | Internet Word-of-Mouth; Stock Trend Prediction; Text Mining; kNN; Cluster Analysis |
Date: | 2013 |
Issue Date: | 2014-07-01 12:06:17 (UTC+8) |
Abstract: | With the rapid development of the Internet, many users have shifted from traditional media to the web as their channel for knowledge and news. The messages users leave behind when they interact online, known as Internet word-of-mouth, are drawing growing attention. Meanwhile, as the economy develops, people on fixed salaries struggle to afford high housing prices and a high cost of living, so building wealth through investment has become common, and the stock market is the route the public values most.
Online news is published with the immediacy of the Internet. The comments readers leave after reading and internalizing a story should carry more information than the news text alone, and investors can mine them for the market news and signals they contain.
To help users uncover the meaning behind this large volume of data and to support investment forecasting, this study collects 1,068 online news articles together with their reader comments and splits them into training and test data. Text mining and related techniques are used for preprocessing. kNN clustering then computes the similarity between training documents and groups the large body of unlabeled data by similarity. Historical stock price information is used to analyze and interpret the features of the resulting clusters and to build a prediction model. Finally, the model's clustering results are evaluated on the test data and used to predict stock price trends. In that evaluation, with k = 15 and a similarity threshold of 0.05, the F-measure for the rising-price cluster reaches 56%. The value of k and the similarity threshold can be adjusted to obtain the most favorable model results. |
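The abstract names the main steps (document similarity, kNN grouping under a similarity threshold, and F-measure evaluation) without giving the exact procedure. The following is only a minimal sketch of one way those steps could fit together. TF-IDF weighting, cosine similarity, and merging linked documents into connected components are assumptions, not details confirmed by the record, and the function names `tfidf_vectors`, `cosine`, `knn_clusters`, and `f_measure` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (dicts).

    Assumption: TF-IDF weighting; the record does not state the weighting used.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: c * (math.log(n / df[t]) + 1.0) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_clusters(vectors, k, threshold):
    """Link each document to its k most similar documents whose similarity
    is at least `threshold`, then return the connected components.

    Assumption: components of the kNN graph stand in for the clusters.
    """
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        sims = sorted(
            ((cosine(vectors[i], vectors[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for sim, j in sims[:k]:
            if sim >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def f_measure(predicted, actual, positive="rise"):
    """Balanced F-measure (harmonic mean of precision and recall) for one label."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, `k` and `threshold` play the roles of the k value and similarity threshold reported in the abstract (k = 15, threshold = 0.05). Raising the threshold splits clusters apart, while raising `k` lets more neighbors link together.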
Description: | Master's thesis; National Chengchi University; Graduate Institute of Management Information Systems; 100356044; 102 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0100356044 |
Data Type: | thesis |
Appears in Collections: | [Department of Management Information Systems] Theses
|
Files in This Item:
File | Size | Format |
604401.pdf | 544Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|