National Chengchi University Institutional Repository (NCCUR): Item 140.119/142123
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/142123


    Title: Optimization of Deep Learning-based Change Detection in Satellite Images
    (基於深度學習之衛星圖像變遷偵測優化)
    Author: Chen, Hsiang-Chi (陳湘淇)
    Contributors: Liao, Wen-Hung (廖文宏)
    Chen, Hsiang-Chi (陳湘淇)
    Keywords: Deep learning
    Convolutional neural networks
    Transformer
    Satellite images
    Change detection
    Date: 2022
    Upload time: 2022-10-05 09:14:53 (UTC+8)
    Abstract: Change detection (CD), a fundamental application of remote sensing (RS) image analysis, identifies surface changes from bitemporal satellite images of the same area. It is widely used in environmental monitoring, disaster assessment, and land resource planning. Applying deep learning to change detection can help geographic data annotators speed up their workflow. Alongside convolutional neural networks (CNNs), which have matured in computer vision, transformer-based architectures have achieved remarkable performance on vision tasks in recent years. This research investigates three change detection models, SNUNet, ChangeFormer, and BIT, whose encoders are CNN-based, purely transformer-based, and a CNN-transformer hybrid, respectively. It evaluates how different conditions affect each model and uses the findings to improve detection performance.
    To maintain a CD model's adaptability to input pairs that show different types of change or come from other datasets, we first adjust the training data: we either double the number of training pairs by adding the same bitemporal images in reverse temporal order, or merge CD datasets into a larger training set. On the objective-function side, we propose a bidirectional loss that lets the model learn both "appearing" (newly built) and "disappearing" (demolished) changes without modifying the dataset. Experimental results show that these training approaches substantially improve generalization. On the LEVIR-CD test set, IoU-1 rises from below 0.1 to above 0.7, close to the baseline (0.7631). On the S2Looking test set, IoU-1 rises from below 0.1 to 0.4422, surpassing the baseline (0.4184).
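The abstract names three ideas that a small sketch can make concrete: reverse-order augmentation, a bidirectional loss, and the IoU-1 metric. This record does not give the thesis's formulas, so everything below is an assumption for illustration only: the bidirectional loss is read as the average of a binary cross-entropy over both temporal orders of one pair, IoU-1 is read as the intersection over union of the "changed" class (label 1), and images are flattened lists of floats. All function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Image = List[float]  # hypothetical stand-in: one flattened single-band image
Mask = List[int]     # flattened binary change mask, 1 = changed pixel
Sample = Tuple[Image, Image, Mask]
Model = Callable[[Image, Image], List[float]]  # returns per-pixel change probabilities


def add_reversed_pairs(samples: Sequence[Sample]) -> List[Sample]:
    """Double the training set by appending every pair in reverse temporal order.

    The change mask is reused unchanged: a pixel that differs between the two
    dates is "changed" regardless of which image comes first.
    """
    return list(samples) + [(t2, t1, mask) for t1, t2, mask in samples]


def bce(probs: Sequence[float], mask: Sequence[int], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and the mask."""
    total = 0.0
    for p, y in zip(probs, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(mask)


def bidirectional_loss(model: Model, t1: Image, t2: Image, mask: Mask) -> float:
    """Assumed form: average the loss over both temporal orders of the same pair,
    so the dataset itself does not need to be enlarged."""
    return 0.5 * (bce(model(t1, t2), mask) + bce(model(t2, t1), mask))


def iou_1(pred: Sequence[int], mask: Sequence[int]) -> float:
    """IoU of the changed class: TP / (TP + FP + FN)."""
    tp = sum(1 for p, y in zip(pred, mask) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, mask) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, mask) if p == 0 and y == 1)
    denom = tp + fp + fn
    # IoU is undefined when neither prediction nor mask has changed pixels;
    # this sketch returns 0.0 in that case.
    return tp / denom if denom else 0.0


# Two toy "models" used only to show how the bidirectional loss behaves.
def appearance_only(t1: Image, t2: Image) -> List[float]:
    """Flags a pixel only when its value increases from t1 to t2."""
    return [0.9 if b - a > 0.5 else 0.1 for a, b in zip(t1, t2)]


def symmetric(t1: Image, t2: Image) -> List[float]:
    """Flags a pixel when its value changes in either direction."""
    return [0.9 if abs(b - a) > 0.5 else 0.1 for a, b in zip(t1, t2)]
```

Under these assumptions, a model that only recognizes "appearing" changes is penalized on the reversed order, while an order-invariant model is not. This mirrors the motivation stated in the abstract without claiming to reproduce the thesis's implementation.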
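    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    109753114
    Source: http://thesis.lib.nccu.edu.tw/record/#G0109753114
    Type: thesis
    DOI: 10.6814/NCCU202201612
    Appears in Collections: [Department of Computer Science] Theses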

    Files in this item:

    File          Description   Size     Format      Views
    311401.pdf                  3964Kb   Adobe PDF   259


    All items in NCCUR are protected by their original copyright.

