|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/56328
|
Title: | 基於概念飄移探勘的社群多媒體之熱門程度預測 Popularity prediction of social multimedia based on concept drift mining |
Authors: | 鄭世宏 Jheng, Shih Hong |
Contributors: | 沈錳坤 Shan, Man Kwan 鄭世宏 Jheng, Shih Hong |
Keywords: | Social Multimedia; Social Media; Popularity Prediction; Concept Drift; Local Concept Drift; Classification |
Date: | 2012 |
Issue Date: | 2012-12-03 11:27:18 (UTC+8) |
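Abstract: | In recent years, the rise of social media has given people a simple and fast way to exchange all kinds of content with one another. Social multimedia refers to the multimedia content that users exchange on social media platforms. Compared with plain multimedia content, social multimedia also carries a large and valuable body of records of sharing and interaction among platform users, together with information about those users within the social network. This gives multimedia content additional dimensions of data and opens up more possible applications for social multimedia than for plain multimedia content.
A microblog is a platform on which users can freely share text messages in real time: their current mood, what they are seeing and hearing, conversations with friends, and so on. Compared with social platforms used purely for sharing multimedia content (for example YouTube or Flickr), multimedia content on microblog platforms shows pronounced sharing and propagation behavior. The goal of this study is to use the characteristics and data of how this multimedia content is shared and propagated on microblog platforms to predict the popularity of social multimedia content.
As time passes, predicting popularity with a single, fixed rule may cause prediction accuracy to decline. Moreover, even at the same point in time, different multimedia content follows its own popularity trend over time, so different rules may still be needed for prediction; this is the phenomenon known as local concept drift. We cast popularity prediction as a classification problem in data mining, take local concept drift into account, and propose a popularity prediction method for multimedia content on microblog platforms. Experimental results show that the method that accounts for local concept drift is clearly more accurate than the GCD method (an average improvement of 4%) and the Baseline method (an average improvement of 10%). This indicates that our method is better suited to multimedia content on microblog platforms, and that both concept drift and local concept drift do occur.
In recent years, the rise of social media has offered an easy and fast way to exchange information. Social multimedia refers to the multimedia content that users share on social media. Unlike traditional multimedia, social multimedia contains both the multimedia itself and user behavior information on social media.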
Microblogs are one type of social media. Compared to other social media such as YouTube and Flickr, microblogs provide a friendlier environment for users to propagate social multimedia. The goal of this thesis is to use the characteristics and information of propagation on microblogs to predict the popularity of social multimedia.
A popularity prediction method based on concept drift mining is proposed. In particular, a local concept drift mechanism is employed to capture the local characteristics of social multimedia. By taking local concept drift into consideration, the task of popularity prediction is transformed into an ensemble classification problem. Experiments on social multimedia collected from Plurk show that the proposed approach performs well. |
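The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea rather than the thesis's actual algorithm. It assumes one simple classifier is trained per time chunk and that each classifier's vote for a new item is weighted by its accuracy on that item's nearest labelled neighbours in the most recent chunk, which is one common way to handle local concept drift. The nearest-centroid base classifier, the single toy feature, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter
from math import dist


def train_centroids(chunk):
    """Fit a nearest-centroid classifier on one time chunk of (features, label) pairs."""
    sums, counts = {}, Counter()
    for x, y in chunk:
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}


def predict_one(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


def predict_local(models, recent_chunk, x, k=3):
    """Weighted vote over chunk models; each weight is the model's accuracy
    on the k labelled neighbours of x in the most recent chunk."""
    neighbours = sorted(recent_chunk, key=lambda item: dist(item[0], x))[:k]
    votes = Counter()
    for model in models:
        hits = sum(predict_one(model, nx) == ny for nx, ny in neighbours)
        votes[predict_one(model, x)] += hits / len(neighbours)
    # Fall back to the newest model if every model scored zero locally.
    if max(votes.values()) == 0:
        return predict_one(models[-1], x)
    return votes.most_common(1)[0][0]


# Toy data (invented for illustration): one feature, e.g. an early share count.
# In the newer chunk, items become popular at a lower feature value.
old_chunk = [((1.0,), "unpopular"), ((2.0,), "unpopular"),
             ((8.0,), "popular"), ((9.0,), "popular")]
new_chunk = [((1.0,), "unpopular"), ((2.0,), "unpopular"),
             ((4.0,), "popular"), ((6.0,), "popular")]

old_model = train_centroids(old_chunk)
new_model = train_centroids(new_chunk)
query = (4.5,)
stale_label = predict_one(old_model, query)
local_label = predict_local([old_model, new_model], new_chunk, query)
```

On this toy data the model trained on the older chunk alone labels the query as unpopular, while the locally weighted ensemble labels it as popular, because the newer model is more accurate in the query's neighbourhood.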
Description: | Master's degree; National Chengchi University; Department of Computer Science; 98753010; 101 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0098753010 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|