Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/141029
|
Title: | 基於增強學習的直播電商推薦系統 Reinforcement learning based live streaming e-commerce recommender system |
Authors: | 唐思琪 Tang, Szu-Chi |
Contributors: | 林怡伶 Lin, Yi-Ling 唐思琪 Tang, Szu-Chi |
Keywords: | Live commerce; Live streaming; E-commerce; Recommender system; Recommendation system; Multi-armed bandit; Reinforcement learning; User context; Uncertainty; Exploitation-exploration trade-off; Neural networks; Gated Recurrent Unit; Variational Autoencoder; Bayesian neural networks |
Date: | 2022 |
Issue Date: | 2022-08-01 17:20:44 (UTC+8) |
Abstract: | In recent years, live stream e-commerce has received extensive attention from e-commerce businesses and streaming platforms. Unlike traditional e-commerce and one-way TV shopping, live stream e-commerce emphasizes real-time interaction between users and streamers. Because hosting a stream is cheap, streamers broadcast frequently and new products roll out continuously, which creates a complex and fast-changing user context. A recommender system helps users find information and make decisions quickly under information overload. Previous recommender system research has mainly focused on optimizing accuracy, which creates filter bubbles and, because similar products are recommended again and again, leads to user dissatisfaction and high churn rates in the long run. To balance the exploration-exploitation (EE) trade-off in this dynamic recommendation context, this research formulates the problem as a contextual multi-armed bandit problem. The study provides a reinforcement learning (RL)-based solution for a new business scenario, live stream e-commerce, that models the relationships among customers, streamers, and products using both static and temporal user contexts. We use a Gated Recurrent Unit (GRU), a type of recurrent neural network, to model how users' preferences for streamers and products change over time while maintaining their long-term engagement. By encoding uncertainty in neural networks, with a Variational Autoencoder (VAE) for user modeling and a Bayesian Neural Network (BNN) for product recommendation, the proposed Live E-commerce Recommender System (LERS) can control the balance of the EE trade-off. To the best of our knowledge, our study is the first to propose a neural network-based contextual bandit algorithm for the recommendation problem on live streaming e-commerce platforms. We compared our algorithm with classic multi-armed bandit algorithms, including UCB1, LinUCB, Exp3, and NeuralUCB. Preliminary experimental results on real-world data corroborate our theory and shed light on potential applications of our algorithm to real-world business problems. |
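For readers unfamiliar with the baselines named in the abstract, the following is a minimal Python sketch of UCB1, the simplest of them. It is not the thesis's LERS algorithm and uses no thesis data: the function name `ucb1`, the Bernoulli reward probabilities, and the number of rounds are assumptions chosen only to illustrate how an exploration bonus is added to an exploitation estimate.

```python
import math
import random


def ucb1(reward_probs, rounds, seed=0):
    """Run UCB1 on simulated Bernoulli arms (hypothetical setup).

    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms    # how many times each arm was pulled
    means = [0.0] * n_arms   # running mean reward of each arm
    total = 0.0
    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Initialization: pull every arm once.
            arm = t - 1
        else:
            # Score = mean reward (exploitation) + confidence bonus (exploration).
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, total


# Three simulated "products" with assumed click probabilities.
counts, total = ucb1([0.2, 0.5, 0.8], rounds=2000)
print(counts, total)
```

The bonus term shrinks as an arm is pulled more often, so rarely tried arms keep getting sampled until their estimates are reliable. Contextual methods such as LinUCB and NeuralUCB keep the same structure but predict the mean and the bonus from user and item features.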
Description: | Master's thesis; National Chengchi University; Department of Management Information Systems; 109356002 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0109356002 |
Data Type: | thesis |
DOI: | 10.6814/NCCU202201098 |
Appears in Collections: | [Department of Management Information Systems] Theses
|
Files in This Item:
File | Description | Size | Format | |
600201.pdf | | 2420Kb | Adobe PDF | 2 | 0 |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|