政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/158704

English | 正體中文 | 简体中文 | Post-Print筆數 : 27 | Items with full text/Total items : 118628/149684 (79%)
Visitors : 79902132 Online Users : 317

RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.

Scope

please add "double quotation mark" for query phrases to get precise results

please goto advance search for comprehansive author search

Adv. Search

Home ‧ Login ‧ Upload ‧ Help ‧ About ‧ Administer

Goto mobile version

政大機構典藏 > 資訊學院 > 資訊科學系碩士在職專班 > 學位論文 > Item 140.119/158704

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/158704

Title:	深度強化學習於二維裝箱問題的規劃策略 Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems
Authors:	詹凱傑 Chan, Kai-Jie
Contributors:	李蔡彥 Li, Tsai-Yen 詹凱傑 Chan, Kai-Jie
Keywords:	強化學習深度強化學習二維裝箱問題 Deep Reinforcement Learning Deep Reinforcement Learning Two-dimensional Bin Packing Problem 2DBP
Date:	2025
Issue Date:	2025-08-04 15:09:17 (UTC+8)
Abstract:	二維裝箱問題是典型的NP-hard組合最佳化問題，廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性，深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法，將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型，將複雜動作空間分解為物品選擇和位置決策兩個子問題，採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示，DRL + First-Fit混合策略在測試的資料集上，基於物品放置率和空間利用率兩個重要指標上，比傳統最佳演算法皆有更好的綜合表現，且明顯改善純深度強化學習的密集空間放置不穩定問題。本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性，也為相關問題提供方法論的參考。 The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields.
Reference:	[1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20.
Description:	碩士國立政治大學資訊科學系碩士在職專班 106971025
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0106971025
Data Type:	thesis
Appears in Collections:	[資訊科學系碩士在職專班] 學位論文

Files in This Item:

File	Description	Size	Format
102501.pdf		5623Kb	Adobe PDF	0	View/Open

All items in 政大典藏 are protected by copyright, with all rights reserved.

社群 sharing

著作權政策宣告 Copyright Announcement

1.本網站之數位內容為國立政治大學所收錄之機構典藏，無償提供學術研究與公眾教育等公益性使用，惟仍請適度，合理使用本網站之內容，以尊重著作權人之權益。商業上之利用，則請先取得著作權人之授權。
The digital content of this website is part of National Chengchi University Institutional Repository. It provides free access to academic research and public education for non-commercial use. Please utilize it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

2.本網站之製作，已盡力防止侵害著作權人之權益，如仍發現本網站之數位內容有侵害著作權人權益情事者，請權利人通知本網站維護人員(nccur@nccu.edu.tw)，維護人員將立即採取移除該數位著作等補救措施。
NCCU Institutional Repository is made to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff(nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.

DSpace Software Copyright © 2002-2004 MIT & Hewlett-Packard / Enhanced by NTU Library IR team Copyright © - Feedback