|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/111445
|
Title: | 多重插補法在線上使用者評分之應用 Managing online user-generated product reviews using multiple imputation methods |
Authors: | 李岑志 Li, Cen Jhih |
Contributors: | Tang, Kwei (唐揆); Cheng, Tsung Chi (鄭宗記); Li, Cen Jhih (李岑志) |
Keywords: | Opinion mining (意見探勘); Missing data (遺漏值); Multiple imputation (多重插補) |
Date: | 2017 |
Issue Date: | 2017-07-31 10:57:07 (UTC+8) |
Abstract: | As internet use has spread, people increasingly shop online and review products there, generating a strong word-of-mouth effect. Online user-generated product reviews have become a rich source of product quality information for both producers and customers: customers can judge a product from other buyers' experience, and manufacturers can use the feedback to improve product quality. As a result, many e-commerce websites let customers rate a product with an overall score, and some also accept a text comment. However, customers usually comment only on the features they care about, and later buyers in particular may omit features that previous customers have already mentioned. Consequently, when the text comments are quantified as numeric feature scores, the unmentioned features appear as missing values. In addition, customers may comment on features that influence neither their satisfaction nor sales volume, so it is important to identify the significant features so that manufacturers can address the most important defects. Our research focuses on modeling how the features mentioned in customer reviews influence the overall rating, and on whether, once the missing values are filled in, the critical features can be identified and the feature ratings faithfully reflect customer opinion. Many previous imputation methods fill in the whole dataset at once and do not consider that a review may be influenced by earlier reviews. We propose a method based on multiple imputation, verify it through simulation, and apply it to customer reviews of three generations of Canon digital cameras (SX210, SX230, and SX260) on Amazon. The results indicate that the method accurately estimates the influence of each feature on the overall rating. |
Description: | Master's thesis; National Chengchi University (國立政治大學); Department of Statistics; 104354014 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0104354014 |
Data Type: | thesis |
Appears in Collections: | [Department of Statistics] Theses
|
Files in This Item:
File | Size | Format |
401401.pdf | 1570Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|