Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/97113
|
Title: | 針對臉書粉絲專頁貼文之政治傾向預測 Predicting Political Affiliation for Posts on Facebook Fan Pages |
Authors: | 張哲嘉 Chang, Che Chia |
Contributors: | 徐國偉 Hsu, Kuo Wei 張哲嘉 Chang, Che Chia |
Keywords: | 政治傾向; 分類; 臉書; 文字探勘; political affiliation; classification; Facebook; text mining |
Date: | 2016 |
Issue Date: | 2016-06-01 13:53:37 (UTC+8) |
Abstract: | Social media has grown rapidly in recent years, with Facebook the most prominent platform. Taiwan has more than 15 million Facebook users, ranging from public figures to the general public, and for many people reading Facebook every day has become part of daily life. These platforms carry a great deal of meaningful information: each post reflects its author's emotions and political stance. Using social media data to predict election results and users' political affiliation has become a trend, and in Taiwan political parties and politicians have set up fan pages to campaign and to gauge the polls online. This thesis predicts the political affiliation of posts on Facebook fan pages. We collect posts from the fan pages of Taiwan's two largest parties, the KMT and the DPP, and build two prediction models: one based on distinct-word features and one based on combined textual and interaction features. Using data mining techniques, we train classifiers on the blue (KMT) and green (DPP) characteristics of the posts, design and examine several feature combinations, compare their predictive performance and the factors that influence it, and test whether class imbalance in the data affects the classification results. The best results come from combining party-typical words (a textual feature) with log-transformed interaction features in a KNN classifier, which reaches an accuracy of 0.908 and an F1-score of 0.827. |
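The abstract's best-performing setup (party-typical word features plus log-transformed interaction features, classified with KNN and scored by accuracy and F1) can be illustrated with a minimal sketch. This record does not give the thesis's actual feature definitions, value of k, or data, so everything below is an assumption made for illustration: the feature layout (counts of KMT-typical and DPP-typical words, then likes, comments, and shares), the function names, `k=3`, and the toy posts, which are invented and not taken from the thesis.

```python
import math
from collections import Counter


def log_features(typical_kmt, typical_dpp, likes, comments, shares):
    """Hypothetical feature vector: party-typical word counts kept raw,
    interaction counts compressed with log(1 + x)."""
    return [
        float(typical_kmt),
        float(typical_dpp),
        math.log1p(likes),
        math.log1p(comments),
        math.log1p(shares),
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to `query`
    under Euclidean distance."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy posts: (KMT-typical words, DPP-typical words, likes, comments, shares)
train_posts = [
    ((5, 0, 1200, 80, 30), "KMT"),
    ((4, 1, 300, 20, 5), "KMT"),
    ((6, 0, 50, 3, 1), "KMT"),
    ((0, 5, 2000, 150, 60), "DPP"),
    ((1, 4, 400, 25, 10), "DPP"),
    ((0, 6, 70, 5, 2), "DPP"),
]
test_posts = [
    ((5, 1, 500, 30, 8), "KMT"),
    ((0, 4, 900, 60, 20), "DPP"),
]

train_x = [log_features(*raw) for raw, _ in train_posts]
train_y = [label for _, label in train_posts]
test_x = [log_features(*raw) for raw, _ in test_posts]
test_y = [label for _, label in test_posts]

predictions = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
```

Taking the logarithm keeps a post with thousands of likes from dominating the Euclidean distance over the small word-count features, which is one plausible reason a log-scaled range would help a distance-based classifier such as KNN.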
Description: | Master's thesis, National Chengchi University, Department of Computer Science, 103753002 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0103753002 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
|
Files in This Item:
File | Size | Format |
300201.pdf | 3365Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|