    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/155971


Title: Image Restoration for Multi-Weather Degradation Using Dual-Modal Prompts and Histogram Specification
(基於雙模態提示與直方圖指定的多天氣影像修復)
Authors: Wu, Kun-Tai (巫昆泰)
Contributors: Peng, Yan-Tsung (彭彥璁)
Chen, Jun-Cheng (陳駿丞)
Wu, Kun-Tai (巫昆泰)
Keywords: Text prompts
Image restoration
Multi-modal
Prompt-based
All-in-one
    Date: 2025
    Issue Date: 2025-03-03 14:03:38 (UTC+8)
Abstract: This thesis proposes DualPromptIR, an image restoration model designed to address the unpredictable and varied degradations that arise in multi-weather scenarios. Early methods relied mainly on specially designed encoders and decoders, leaving room for performance improvement. Recent research has explored learning visual prompts from the data distribution to identify different types of image degradation.
However, existing restoration methods that utilize Vision-Language Models (VLMs) do not fully exploit the rich information embedded in pre-trained visual encoders, which limits the interaction between textual information and image features.
    Addressing these limitations, DualPromptIR leverages large-scale pre-trained vision-language models, incorporating both visual and textual dual-modal prompts. The model is built on a multi-level encoder-decoder architecture, featuring a Feature Interaction Block (FIB) that includes the Spatial Interaction Module (SIM) and the Textual Aggregation Module (TAM). These modules enable effective interaction between input features and prompts, allowing precise identification and restoration of degraded regions in corrupted images.
Additionally, a Histogram Matching Module (HMM) is introduced at the initial restoration stage to correct color deviations by learning color histograms, which makes the restoration more accurate and consistent. Extensive experiments demonstrate that DualPromptIR performs favorably against state-of-the-art methods in “all-in-one” image restoration tasks.
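
The record gives no implementation details, but the interaction between image features and dual-modal prompts described above can be illustrated with a minimal, hypothetical PyTorch sketch. Everything below is an assumption for illustration: the module name, the tensor dimensions, and the use of cross-attention and a depthwise convolution as stand-ins for the textual aggregation and spatial interaction steps are not taken from the thesis.

```python
# Hypothetical sketch of a feature-interaction block: image features attend to
# text-prompt embeddings from a pre-trained VLM (e.g., CLIP). Names and sizes
# are illustrative assumptions, not the authors' actual FIB/SIM/TAM code.
import torch
import torch.nn as nn

class FeatureInteractionBlock(nn.Module):
    def __init__(self, dim: int = 64, text_dim: int = 512, heads: int = 4):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, dim)      # project text embeddings into the image-feature space
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spatial = nn.Sequential(                  # simple spatial mixing (stand-in for a spatial interaction module)
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim),
            nn.GELU(),
            nn.Conv2d(dim, dim, 1),
        )

    def forward(self, feat: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) image features; text_emb: (B, T, text_dim) prompt tokens
        b, c, h, w = feat.shape
        q = feat.flatten(2).transpose(1, 2)            # (B, H*W, C) queries from image features
        kv = self.text_proj(text_emb)                  # (B, T, C) keys/values from text prompts
        fused, _ = self.cross_attn(q, kv, kv)          # aggregate textual information per spatial position
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return feat + self.spatial(fused)              # residual connection plus spatial mixing

# Quick check with dummy tensors:
# block = FeatureInteractionBlock()
# out = block(torch.randn(2, 64, 32, 32), torch.randn(2, 8, 512))  # -> (2, 64, 32, 32)
```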
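Histogram specification itself is a classical operation. The following NumPy sketch performs per-channel histogram matching against a reference image and is only an illustration of the underlying technique: the thesis's HMM learns the target color histogram during training rather than taking it from a reference image, and the function name and array shapes here are assumptions.

```python
# Classical per-channel histogram specification (illustrative sketch only).
import numpy as np

def match_histogram(source: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Match each color channel of `source` (uint8, HxWx3) to `reference`."""
    out = np.empty_like(source)
    for ch in range(3):
        src_vals, src_counts = np.unique(source[..., ch], return_counts=True)
        ref_vals, ref_counts = np.unique(reference[..., ch], return_counts=True)
        src_cdf = np.cumsum(src_counts) / source[..., ch].size   # empirical CDF of the source channel
        ref_cdf = np.cumsum(ref_counts) / reference[..., ch].size
        # Map each source intensity to the reference intensity with the matching CDF value.
        mapped = np.interp(src_cdf, ref_cdf, ref_vals)
        lut = np.zeros(256)
        lut[src_vals] = mapped                                   # lookup table: old intensity -> new intensity
        out[..., ch] = np.round(lut[source[..., ch]]).astype(source.dtype)
    return out

# Example with dummy images:
# degraded  = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
# reference = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
# color_corrected = match_histogram(degraded, reference)
```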
Description: Master's thesis
National Chengchi University
Department of Computer Science
111753207
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0111753207
    Data Type: thesis
Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

File: 320701.pdf (15,918 KB, Adobe PDF)

