Loading...
|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/158704
|
Title: | 深度強化學習於二維裝箱問題的規劃策略 Deep Reinforcement Learning Strategies for Planning in Two-Dimensional Bin Packing Problems |
Authors: | 詹凱傑 Chan, Kai-Jie |
Contributors: | 李蔡彥 Li, Tsai-Yen 詹凱傑 Chan, Kai-Jie |
Keywords: | 強化學習 深度強化學習 二維裝箱問題 Deep Reinforcement Learning Deep Reinforcement Learning Two-dimensional Bin Packing Problem 2DBP |
Date: | 2025 |
Issue Date: | 2025-08-04 15:09:17 (UTC+8) |
Abstract: | 二維裝箱問題是典型的NP-hard組合最佳化問題,廣泛應用於製造業和物流業。傳統啟發式演算法運算效率高但缺乏適應性,深度強化學習具備學習能力但容易陷入決策困境。本研究提出創新的混合策略方法,將深度強化學習與First-Fit演算法相結合。研究設計基於雙流網路架構的深度強化學習模型,將複雜動作空間分解為物品選擇和位置決策兩個子問題,採用多層次獎勵機制和專家引導策略提升學習效果。實驗採用三種代表性資料集評估六種演算法。結果顯示,DRL + First-Fit混合策略在測試的資料集上,基於物品放置率和空間利用率兩個重要指標上,比傳統最佳演算法皆有更好的綜合表現,且明顯改善純深度強化學習的密集空間放置不穩定問題。 本研究的貢獻包括提出深度強化學習與傳統演算法的創新混合策略、設計完整的二維裝箱問題解決系統及建立標準化評估框架。實驗結果驗證了混合策略的有效性,也為相關問題提供方法論的參考。 The two-dimensional bin-packing problem represents a typical NP-hard combinatorial op-timization challenge with extensive applications in manufacturing and logistics industries. While traditional heuristic algorithms demonstrate high computational efficiency, they lack adaptability. On the contrary, deep reinforcement learning has strong learning capabilities but tends to en-counter decision-making dilemmas. This research proposes an innovative hybrid strategy that combines deep reinforcement learning with the First-Fit algorithm. The study designs a deep reinforcement learning model based on a dual-stream network architecture, decomposing the complex action space into two sub-problems: item selection and position decision-making. The system employs multi-level reward mechanisms and expert guidance strategies to enhance learning effectiveness. The experiments utilize three representative datasets to evaluate six algorithms. The results demonstrate that the DRL+First-Fit hybrid strategy achieves superior comprehensive perfor-mance compared to the best traditional algorithms in the tested datasets, based on two critical metrics: item placement rate and space utilization rate. Furthermore, the approach significantly addresses the instability issues of pure deep reinforcement learning in dense spatial arrangement scenarios. Our research contributions include proposing an innovative hybrid strategy that combines deep reinforcement learning with traditional algorithms, designing a comprehensive two-dimensional bin packing solution system, and establishing a standardized evaluation framework. Experimental results validate the effectiveness of the hybrid strategy, providing methodological references for related fields. |
Reference: | [1] Lodi, A., Martello, S., & Vigo, D. (2002). Recent advances on two-dimensional bin packing problems. Discrete Applied Mathematics, 123(1-3), 379-396. [2] Martello, S., & Vigo, D. (1998). Exact solution of the two-dimensional finite bin packing problem. Management science, 44(3), 388-399. [3] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533. [4] Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., ... & Hassabis, D. (2017). Mastering the game of go without human knowledge. Nature, 550(7676), 354-359. [5] Berkey, J. O., & Wang, P. Y. (1987). Two-dimensional finite bin-packing algorithms. Journal of the operational research society, 38(5), 423-429. [6] Burke, E. K., Kendall, G., & Whitwell, G. (2006). A new placement heuristic for the or-thogonal stock-cutting problem. Operations Research, 54(4), 655-671. [7] Hopper, E., & Turton, B. C. (2001). A review of the application of meta-heuristic algo-rithms to 2D strip packing problems. Artificial Intelligence Review, 16(4), 257-300. [8] Terashima-Marín, H., Ross, P., Farías-Zárate, C., López-Camacho, E., & Valenzue-la-Rendón, M. (2008). Generalized hyper-heuristics for solving 2D Regular and Irregular Packing Problems. Annals of Operations Research, 179(1), 369-392. [9] Bennell, J. A., Lee, L. S., & Potts, C. N. (2013). A genetic algorithm for two-dimensional bin packing with due dates. International Journal of Production Economics, 145(2), 547-560. [10] Gomez, J. C., & Terashima-Marín, H. (2017). Evolutionary hyper-heuristics for tackling bi-objective 2D bin packing problems. Genetic Programming and Evolvable Machines, 18(3), 221-259. [11] Bouganis, A., & Shanahan, M. (2007). A vision-based intelligent system for packing 2-D irregular shapes. IEEE Transactions on Automation Science and Engineering, 4(3), 382-394. [12] Wang, L., Cai, J., & Zhao, X. (2024). Bin Packing Optimization via Deep Reinforcement Learning. arXiv:2403.12420. [13] López-Camacho, E., Terashima-Marín, H., Ross, P., & Ochoa, G. (2013). A unified hy-per-heuristic framework for solving bin packing problems. Expert Systems with Applica-tions, 40(13), 5516-5531. [14] Guo, H., Li, H., Shen, Y., & Wang, X. (2022). Two-dimensional irregular packing prob-lems: A review of mathematical models, solution approaches and applications. European Journal of Operational Research, 298(1), 1-20. |
Description: | 碩士 國立政治大學 資訊科學系碩士在職專班 106971025 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0106971025 |
Data Type: | thesis |
Appears in Collections: | [資訊科學系碩士在職專班] 學位論文
|
Files in This Item:
File |
Description |
Size | Format | |
102501.pdf | | 5623Kb | Adobe PDF | 0 | View/Open |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|