Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/159300
|
Title: | 基於人物屬性特徵之多視角監控影片檢索管理系統設計 Design of a Multi-view Surveillance Video Retrieval and Management System Based on Pedestrian Attributes |
Authors: | 王懷憶 Wang, Huai-Yi |
Contributors: | 廖峻鋒 Liao, Chun-Feng 王懷憶 Wang, Huai-Yi |
Keywords: | Surveillance Video Retrieval and Management; Pedestrian Attribute Recognition; YOLOv8n; PP-Human; Weighted Cosine Similarity |
Date: | 2025 |
Issue Date: | 2025-09-01 16:20:19 (UTC+8) |
Abstract: | In recent years, the rapid development of surveillance camera technology and of artificial intelligence models has made intelligent surveillance systems essential tools for urban safety and site management. These cameras not only record scenes in real time but also recognize content automatically, generating structured data such as human attributes, object categories, detected behaviors, and scene semantics. However, most existing surveillance systems still rely on conventional retrieval by timeline and camera and fail to make full use of this data. As a result, locating a specific target in a large video database remains inefficient and time-consuming, depends heavily on manual review, and is prone to misjudgment.

To address these problems, this study proposes a video retrieval and management method built on attribute tag indexing. The method integrates object detection and pedestrian attribute recognition to annotate the attributes of people in surveillance footage and converts the annotations into structured index data, so that videos can be retrieved more accurately and efficiently by content characteristics. The study also designs and implements a complete system architecture: the backend module receives and processes the attribute data produced by the AI models and analyzes semantic relationships across multiple videos, while the frontend interface visualizes retrieval results and a map of inter-video relationships.

Through empirical experiments and case testing, the study reports significant improvements in video search efficiency and management performance. Compared with traditional search methods, users can locate target segments more quickly and accurately, which reduces unnecessary browsing and manual effort and makes the system more intuitive and usable. The results are expected to serve as a reference for data management in future intelligent surveillance systems and to extend the value of video data in public safety, traffic, commercial, and other applications. |
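The keywords of this record name weighted cosine similarity, but the record does not give the formula, the attribute set, or the weights used in the thesis. As a minimal sketch, assuming the textbook weighted form of cosine similarity applied to binary attribute vectors, ranking indexed person tracks against an attribute query could look like the following. The attribute names, weights, track IDs, and function names are hypothetical illustrations and are not taken from the thesis.

```python
import math
from typing import Dict, List, Sequence, Tuple


def weighted_cosine(a: Sequence[float], b: Sequence[float], w: Sequence[float]) -> float:
    """Weighted cosine similarity: sum(w*a*b) / (sqrt(sum(w*a*a)) * sqrt(sum(w*b*b)))."""
    if not (len(a) == len(b) == len(w)):
        raise ValueError("a, b and w must have the same length")
    dot = sum(wi * ai * bi for wi, ai, bi in zip(w, a, b))
    norm_a = math.sqrt(sum(wi * ai * ai for wi, ai in zip(w, a)))
    norm_b = math.sqrt(sum(wi * bi * bi for wi, bi in zip(w, b)))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def rank_tracks(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    weights: Sequence[float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (track_id, score) pairs, highest similarity first."""
    scored = [(track_id, weighted_cosine(query, vec, weights)) for track_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical attribute layout and weights (assumptions for illustration only).
ATTRIBUTES = ("male", "backpack", "hat", "long_sleeve")
WEIGHTS = (1.0, 2.0, 1.0, 1.0)  # "backpack" counts double in this example

# Hypothetical index: one binary attribute vector per tracked person per camera.
INDEX = {
    "cam1/track7": (1, 1, 0, 1),
    "cam2/track3": (1, 0, 1, 0),
    "cam3/track12": (0, 1, 0, 1),
}

QUERY = (1, 1, 0, 1)  # male, with backpack, no hat, long sleeves

if __name__ == "__main__":
    for track_id, score in rank_tracks(QUERY, INDEX, WEIGHTS):
        print(f"{track_id}: {score:.3f}")
```

Raising the weight of an attribute makes agreement or disagreement on that attribute move the score more, which is one plausible way for a user to emphasize distinctive features such as a backpack over common ones.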
Description: | Master's thesis; National Chengchi University; In-service Master's Program, Department of Computer Science; 112971017 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0112971017 |
Data Type: | thesis |
Appears in Collections: | [In-service Master's Program, Department of Computer Science] Theses
|
Files in This Item:
File | Description | Size | Format |
101701.pdf | | 11120Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|