Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/69195
|
Title: | 網路評價搜尋結果的正負意見分類系統 A sentiment classification system on search results of web opinions |
Authors: | 黃泓彰 Huang, Hung Chang |
Contributors: | 楊亨利 Yang, Heng Li 黃泓彰 Huang, Hung Chang |
Keywords: | 意見探勘 情感分析 情感分類 網路評價 Opinion mining Sentiment analysis Sentiment classification Web opinion |
Date: | 2013 |
Issue Date: | 2014-08-25 15:16:17 (UTC+8) |
Abstract: | This research builds a system with two main functions: web opinion search and sentiment classification. For opinion search, the system uses Google to retrieve and collect web opinion search results about a portable smart device (smartphones, tablets, and notebooks). For sentiment classification, it classifies each search result by its opinion of the product under four schemes: (1) positive, negative, or neutral; (2) positive vs. negative; (3) positive vs. non-positive; (4) negative vs. non-negative. To build the system, we first collected articles and product names related to portable smart devices from two popular web forums, Mobile01 (mobile01.com) and PTT (ptt.cc), and then manually labeled the sentiment of every article and of the sentences in a subset of the articles. We designed a two-level supervised sentiment classification experiment. At the sentence level, we trained classifiers that label sentences as positive, negative, or neutral. The best sentence classifier was then used at the document level, where the sentence-level opinions are aggregated and supervised classifiers are trained for the four schemes above. The best model from each scheme was used to build the system. The positive vs. negative classifier performed best, with an average F-measure of 0.87, followed by the negative vs. non-negative classifier (F-measure of 0.83 on the negative class) and the positive vs. non-positive classifier (F-measure of 0.81 on the positive class). The three-way positive/negative/neutral classifier performed worst, with an average F-measure of 0.77. On positive vs. negative classification, our accuracy is no worse than that reported in earlier studies on English text. Finally, we repeated the experiment with a document-level-only method that skips the sentence level; the results show that the two-level approach (sentence level first, then document level) outperforms document-level-only classification. |
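The two-level idea and the per-class F-measure reported in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the feature choice (proportions of sentence labels per document) and the helper names `sentence_label_features`, `to_binary`, and `f_measure` are assumptions made for illustration, and the sentence labels are assumed to come from some already-trained sentence-level classifier.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")


def sentence_label_features(sentence_labels):
    """Aggregate a document's predicted sentence labels into label proportions.

    Assumption: proportions are one plausible document-level feature;
    the thesis may aggregate sentence opinions differently.
    """
    total = len(sentence_labels)
    if total == 0:
        return [0.0, 0.0, 0.0]
    counts = Counter(sentence_labels)
    return [counts[label] / total for label in LABELS]


def to_binary(label, target):
    """Collapse a three-way label into target vs. non-target."""
    return target if label == target else "non-" + target


def f_measure(gold, predicted, target):
    """F-measure for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == target and p == target)
    fp = sum(1 for g, p in pairs if g != target and p == target)
    fn = sum(1 for g, p in pairs if g == target and p != target)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the thesis.
features = sentence_label_features(["positive", "positive", "negative", "neutral"])
gold = ["positive", "negative", "positive", "negative"]
predicted = ["positive", "positive", "positive", "negative"]
score = f_measure(gold, predicted, "positive")
```

The feature vector produced by `sentence_label_features` would then be passed to a document-level classifier; `to_binary` shows how the positive vs. non-positive and negative vs. non-negative schemes can be derived from three-way labels.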
Description: | Master's thesis; National Chengchi University; Graduate Institute of Information Management (資訊管理研究所); 101356016; 102 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0101356016 |
Data Type: | thesis |
Appears in Collections: | [Department of Management Information Systems (資訊管理學系)] Theses |
Files in This Item:
File | Size | Format | |
601601.pdf | 1860Kb | Adobe PDF2 | 1711 | View/Open |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|