National Chengchi University Institutional Repository (NCCUR): Item 140.119/146897


Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/146897

Title:	Exploratory Actor-Critic Recommender for Online Streaming Retailing (應用於直播電商的探索性演員評論家推薦系統)
Authors:	簡巧恩
Contributors:	林怡伶; 蕭舜文; 簡巧恩
Keywords:	recommender system; exploration-exploitation balance; reinforcement learning; deep learning; streaming retailing; actor-critic; recommendation system; deep reinforcement learning; exploration
Date:	2023
Issue Date:	2023-09-01 14:55:39 (UTC+8)
Abstract:	Interactive recommender systems have received considerable attention. In live-streaming retail, the products on offer differ from stream to stream, so products can be modeled in a continuous action space. We therefore use an actor-critic architecture to recommend products in the large item space of online streaming environments, learning users' preferences while they watch live streams. The actor generates an item embedding, and the few items closest to that embedding are selected as the basis for the recommendation. To ensure that the information users receive is sufficiently diverse, we propose two exploration strategies that are applied before the actor generates the resulting embeddings. We plan to conduct corresponding experiments to examine whether the proposed exploration strategies outperform the baseline model or general recommenders.
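The selection step described in the abstract (pick the few items closest to the embedding produced by the actor) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `nearest_items`, the toy two-dimensional embeddings, the item names, and the use of Euclidean distance are not taken from the thesis, and the two exploration strategies are not shown because the abstract does not specify them.

```python
import math

def nearest_items(proto_action, item_embeddings, k=2):
    """Return the ids of the k items whose embeddings lie closest to proto_action.

    Euclidean distance is an assumption; the abstract only says "closest".
    """
    def distance(item_id):
        return math.dist(proto_action, item_embeddings[item_id])
    # sorted() over a dict iterates its keys (the item ids), nearest first.
    return sorted(item_embeddings, key=distance)[:k]

# Hypothetical 2-D embeddings for the products shown in one stream.
items = {
    "tea": [0.9, 0.1],
    "mug": [0.8, 0.2],
    "lamp": [0.1, 0.9],
    "desk": [0.2, 0.8],
}

# Stand-in for the embedding an actor network would output for the current user state.
proto_action = [0.9, 0.1]

recommended = nearest_items(proto_action, items, k=2)
print(recommended)
```

Mapping a continuous actor output to its nearest catalogue items is one common way to apply continuous-action methods to a large discrete item set; whether the thesis uses this exact distance or value of k is not stated in the abstract.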
Description:	Master's thesis; National Chengchi University; Department of Management Information Systems; 110356045
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0110356045
Data Type:	thesis
Appears in Collections:	[Department of Management Information Systems] Theses

Files in This Item:

File	Description	Size	Format
604501.pdf		1649Kb	Adobe PDF
