Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/158710
Title: | 基於多代理深度強化學習之城市動態調整交通號誌控制方法 Adaptive Urban Traffic Signal Control Based on Multi-Agent Deep Reinforcement Learning
Authors: | 陳文遠 Chen, Wen-Yuan |
Contributors: | 蔡子傑 Tsai, Tzu-Chieh; 陳文遠 Chen, Wen-Yuan
Keywords: | Intelligent traffic signal control; Deep reinforcement learning; Dynamic signal timing; Multi-Agent Deep Reinforcement Learning; Multi-Agent Deep Q-Learning; Traffic congestion; SUMO simulation; Q-Learning
Date: | 2025 |
Issue Date: | 2025-08-04 15:10:28 (UTC+8) |
Abstract: | Traffic systems are an indispensable part of modern life, shortening the distances between people and improving the efficiency and convenience of society as a whole. Traffic lights, as a core component of urban traffic control, coordinate vehicle flows from different directions, prevent conflicts, and guide drivers safely and orderly through intersections. However, most existing traffic signal systems still rely on traditional fixed-time control, in which each phase switches after a preset duration: when the green interval ends, the signal passes through a yellow safety transition and then turns red, while the opposing approach turns green. Although this method maintains order under stable traffic, it becomes inefficient when flows fluctuate, are unevenly distributed, or enter peak periods, causing severe congestion on some approaches while green time is wasted on others and overall signal performance degrades.

To address these issues, this study proposes a dynamic traffic signal control method based on Deep Reinforcement Learning (DRL). The design emphasizes low cost and practical deployability: it relies solely on entry and exit sensors at each lane to obtain vehicle timestamps, from which it estimates queue length, average vehicle speed, and the current signal phase as model input, without requiring expensive real-time cameras or V2I equipment. By learning the relationship between traffic volume and signal phases, the model derives an optimal switching strategy: when flow on a lane rises, that lane receives more green time to release more vehicles and reduce congestion risk, while low-traffic lanes receive correspondingly less, maximizing overall throughput.

The experimental results were compared quantitatively using total delay time and queue length as metrics. The proposed method reduces average waiting time relative to fixed-time control by about 74% in the single-intersection experiment and by about 50% in the multi-intersection experiment, the best overall performance among the methods tested. For average queue length, the proposed method's maximum in the multi-intersection simulation was 99 vehicles, far below the 195 vehicles of fixed-time control and lower than all other baseline methods, demonstrating clear advantages in both congestion control and traffic efficiency. Overall, this study confirms that deep reinforcement learning driven by simple sensor data can effectively improve traffic signal control and has strong potential for large-scale urban traffic management.
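To make the sensing pipeline described in the abstract concrete, the following is a minimal Python sketch of estimating queue length and average speed from entry/exit sensor timestamps alone. The class name, the FIFO pairing of entry and exit events, and the moving-average window are illustrative assumptions, not the thesis's actual implementation.

```python
from collections import deque

class LaneEstimator:
    """Per-lane state estimation from entry/exit sensor timestamps only.

    A sketch of the abstract's low-cost sensing idea; the names, the
    FIFO pairing, and the window size are assumptions, not the thesis's code.
    """

    def __init__(self, lane_length_m: float, window: int = 20):
        self.lane_length_m = lane_length_m        # distance between the two sensors
        self.entry_times = deque()                # vehicles currently between the sensors
        self.travel_times = deque(maxlen=window)  # recent sensor-to-sensor travel times

    def on_entry(self, t: float) -> None:
        """Upstream sensor fired: a vehicle entered the lane at time t (seconds)."""
        self.entry_times.append(t)

    def on_exit(self, t: float) -> None:
        """Downstream (stop-line) sensor fired: a vehicle left the lane at time t."""
        if self.entry_times:
            # FIFO pairing: assumes vehicles exit in the order they entered.
            self.travel_times.append(t - self.entry_times.popleft())

    def queue_length(self) -> int:
        """Vehicles that have entered but not yet exited the lane."""
        return len(self.entry_times)

    def avg_speed_mps(self) -> float:
        """Mean speed over the recent window, in m/s; 0.0 with no data yet."""
        if not self.travel_times:
            return 0.0
        return self.lane_length_m / (sum(self.travel_times) / len(self.travel_times))
```

Feeding each approach's `queue_length()` and `avg_speed_mps()` values, together with the current signal phase, into the agent yields exactly the three input quantities the abstract lists.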
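The control loop itself, a DQN that maps these observations to a phase decision, could look roughly like the sketch below. The network architecture, the two-action (keep vs. switch phase) design, the epsilon-greedy exploration, and the reward noted in the comments are assumptions for illustration; the abstract does not specify them.

```python
import random
import torch
import torch.nn as nn

class PhaseDQN(nn.Module):
    """Q-network mapping an intersection observation to phase-action values.

    Hidden sizes and the keep/switch action space are illustrative assumptions.
    """

    def __init__(self, n_obs: int, n_actions: int = 2, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_obs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),  # e.g. action 0 = keep phase, 1 = switch
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def select_action(q_net: PhaseDQN, obs, epsilon: float, n_actions: int = 2) -> int:
    """Epsilon-greedy phase decision from the current observation vector."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        q = q_net(torch.as_tensor(obs, dtype=torch.float32).unsqueeze(0))
    return int(q.argmax(dim=1).item())

# Observation: per-approach queue lengths and average speeds plus a
# one-hot encoding of the current phase, as the abstract describes.
# A plausible training reward (an assumption, not the thesis's stated
# choice) is the negative sum of queue lengths, so that clearing
# congested approaches is rewarded.
```

In the multi-intersection setting, one such agent per intersection, each observing only its own sensors, gives the multi-agent configuration the abstract evaluates.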
Description: | Master's thesis, In-service Master's Program in Computer Science, National Chengchi University. Student ID: 112971012
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0112971012 |
Data Type: | thesis |
Appears in Collections: | [In-service Master's Program in Computer Science] Theses
Files in This Item: | 101201.pdf (1671 KB, Adobe PDF)