政大機構典藏 - National Chengchi University Institutional Repository (NCCUR): Item 140.119/135981


Please use this persistent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/135981


Title: 基於語意分割之電纜線偵測
Power Line Detection Based on Semantic Segmentation
Authors: 游晉瑞 (YOU, CHIN-JUI)
Contributors: 廖文宏 (Liao, Wen-Hung); 游晉瑞 (YOU, CHIN-JUI)
Keywords: Computer vision; Semantic segmentation; Deep learning
Date: 2021
Date uploaded: 2021-07-01 19:54:29 (UTC+8)
Abstract: In the field of computer vision, semantic segmentation is a technique widely employed in tasks such as autonomous driving, scene understanding, and obstacle avoidance. With the advancement of deep learning, the performance of semantic segmentation has progressed rapidly, yet there is still much room for improvement in power line detection. Open-source data on transmission towers and power lines are currently quite limited. This research conducts semantic segmentation experiments on power line inspection using the two major open-source power line datasets, namely the Ground Truth of Powerline Dataset and the PLD-UVA dataset, and re-labels the ground truth of both so that models can learn from accurate annotations.
In recent years, researchers have reported the benefits of fusing data from different sensors, such as thermal or depth sensors, to improve the accuracy of optical image models. Among these approaches, RTFNet uses two encoders to fuse thermal image features into the optical image, but its architecture does not consider fusing optical image features into the thermal branch so that the two modalities can assist each other. Building on RTFNet, this research proposes the Dual Segmentation Model (DSM), which strengthens the optical image branch for power lines through edge enhancement, allowing semantic-level information from the two branches to complement each other and improving segmentation accuracy beyond LS-Net, a model with strong recent results in power line detection. Experimental results show that the DSM's precision of 0.7919 is comparable to LS-Net's 0.8004, while its recall of 0.7710 surpasses LS-Net's 0.5368. The resulting F-score of 0.7753 exceeds LS-Net's 0.5940 by nearly 0.2, validating the reliability of the proposed approach.
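The precision, recall, and F-score figures above follow the standard definitions for binary segmentation. As a minimal sketch (assuming per-pixel counting of true/false positives against the re-labeled ground-truth masks; the exact averaging scheme used in the thesis is not stated on this page), the metrics can be computed as:

```python
def segmentation_metrics(pred, gt):
    """Precision, recall, and F-score for binary masks given as 2D 0/1 lists."""
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p_px, g_px in zip(pred_row, gt_row):
            if p_px and g_px:
                tp += 1          # power-line pixel correctly detected
            elif p_px:
                fp += 1          # background pixel wrongly marked as line
            elif g_px:
                fn += 1          # line pixel the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

# Toy example: a 1-pixel-wide "power line" along row 1 of a 4x4 image.
gt = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
pred = [[0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
p, r, f = segmentation_metrics(pred, gt)
print(p, r, f)  # precision = 1.0, recall = 0.75, F-score ≈ 0.857
```

Note that the F-score equals the harmonic mean of precision and recall only when computed from aggregate pixel counts; taking the harmonic mean of the reported figures (0.7919, 0.7710) gives roughly 0.781 rather than the reported 0.7753, which suggests the thesis averages scores in some other way, e.g. per image.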
References: [1] Yann LeCun, Corinna Cortes, Christopher J.C. Burges. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, last visited Dec. 2018.
[2] ImageNet. http://www.image-net.org/
[3] Yuanqing Lin, Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, Liangliang Cao, Thomas Huang. "Large-scale image classification: Fast feature extraction and SVM training." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1689-1696, 2011.
[4] Deng, Jia, et al. "ImageNet large scale visual recognition competition 2012 (ILSVRC2012)." http://www.image-net.org/challenges/LSVRC (2012).
[5] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. "Going deeper with convolutions." arXiv:1409.4842, 2014.
[6] Park, E., et al. "ILSVRC-2017." http://www.image-net.org/challenges/LSVRC/2017 (2017).
[7] Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." arXiv preprint arXiv:1709.01507 (2017).
[8] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE (1998).
[9] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[10] He, Kaiming, et al. "Deep residual learning for image recognition." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
[11] Huang, Gao, et al. "Densely connected convolutional networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
[12] Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. "Highway networks." arXiv:1505.00387 [cs.LG], 2015.
[13] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." arXiv:1505.04597 [cs.CV], 18 May 2015.
[14] Nguyen, Van Nhan, Robert Jenssen, and Davide Roverso. "LS-Net: Fast single-shot line-segment detector." arXiv:1912.09532 [cs.CV], 24 Jan 2020.
[15] Hough, P.V.C. "Machine analysis of bubble chamber pictures." Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959.
[16] Duda, R. O., and P. E. Hart. "Use of the Hough transformation to detect lines and curves in pictures." Comm. ACM, Vol. 15, pp. 11-15, January 1972.
[17] Chen, Y., Y. Li, and H. Zhang. "Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform." Pattern Recognition, 49, 174-186, 2016.
[18] Chen, Yunping, Yang Li, Huixiong Zhang, Ling Tong, Yongxing Cao, and Zhihang Xue. "Automatic power line extraction from high resolution remote sensing imagery based on an improved Radon transform." https://doi.org/10.1016/j.patcog.2015.07.004
[19] Arbeláez, P., M. Maire, C. Fowlkes, and J. Malik. "From contours to regions: An empirical evaluation." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 20-25 June 2009, pp. 2294-2301.
[20] Arbeláez, P., M. Maire, C. Fowlkes, and J. Malik. "Contour detection and hierarchical image segmentation." IEEE Trans. Pattern Anal. Mach. Intell., 33, 898-916, 2011.
[21] Shen, W., X. Wang, Y. Wang, and B. Xiang. "DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7-12 June 2015, pp. 3982-3991.
[22] Bertasius, G., J. Shi, and L. Torresani. "DeepEdge: A multi-scale bifurcated deep network for top-down contour detection." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7-12 June 2015, pp. 4380-4389.
[23] Yang, J., B. Price, and S. Cohen. "Object contour detection with a fully convolutional encoder-decoder network." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 26 June-1 July 2016, pp. 193-202.
[24] Maninis, K. K., J. Pont-Tuset, and P. Arbeláez. "Convolutional oriented boundaries: From image segmentation to high-level tasks." IEEE Trans. Pattern Anal. Mach. Intell., 40, 819-833, 2018.
[25] Madaan, R., D. Maturana, and S. Scherer. "Wire detection using synthetic data and dilated convolutional networks for unmanned aerial vehicles." In Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24-28 September 2017, pp. 3487-3494.
[26] Zhang, Xingchen, Gang Xiao, Ke Gong, Ping Ye, and Junhao Zhao. "Power line detection for aircraft safety based on image processing techniques: Advances and recommendations."
[27] Long, J., E. Shelhamer, and T. Darrell. "Fully convolutional networks for semantic segmentation." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[28] Canny, J. "A computational approach to edge detection." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-8, no. 6, pp. 679-698, Nov. 1986. doi:10.1109/TPAMI.1986.4767851.
[29] Abdelfattah, Rabab, Xiaofeng Wang, and Song Wang. "TTPLA: An aerial-image dataset for detection and segmentation of transmission towers and power lines."
[30] Yetgin, Ömer Emre, and Ömer Nezih Gerek (2019). "Ground Truth of Powerline Dataset (Infrared-IR and Visible Light-VL)." Mendeley Data, v9. http://dx.doi.org/10.17632/twxp8xccsw.9
[31] PLD-UVA dataset: https://github.com/SnorkerHeng/PLD-UAV
[32] Sun, Y., W. Zuo, and M. Liu (2019). "RTFNet: RGB-thermal fusion network for semantic segmentation of urban scenes." IEEE Robotics and Automation Letters. doi:10.1109/LRA.2019.2904733.
[33] Hu, J., L. Shen, and G. Sun. "Squeeze-and-excitation networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[34] Huang, X., and S. Belongie. "Arbitrary style transfer in real-time with adaptive instance normalization." In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.
[35] Hou, Q., et al. "Strip Pooling: Rethinking spatial pooling for scene parsing." arXiv:2003.13328, 2020.
Description: Master's thesis
National Chengchi University
Department of Computer Science
107753043
Source: http://thesis.lib.nccu.edu.tw/record/#G0107753043
Data type: thesis
DOI: 10.6814/NCCU202100504
Appears in Collections: [Department of Computer Science] Theses

Files in This Item:

File: 304301.pdf | Size: 4802Kb | Format: Adobe PDF | Views: 20 | View/Open


All items in NCCUR are protected by copyright, with all rights reserved.



Copyright Announcement
1. The digital content of this website is part of the National Chengchi University Institutional Repository. It provides free access to academic research and public education for non-commercial use. Please utilize it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

2. NCCU Institutional Repository is made to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.