Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/155970
Title: | ABCDFormer: Auxiliary Bimodal Cross-Domain Transformer for White Balance Correction |
Authors: | Chiu, Yu-Cheng (邱鈺臻) |
Contributors: | Peng, Yan-Tsung (彭彥璁); Chiu, Yu-Cheng (邱鈺臻) |
Keywords: | White balance; Multiple modalities; Transformer |
Date: | 2025 |
Issue Date: | 2025-03-03 14:03:26 (UTC+8) |
Abstract: | Achieving accurate white balance (WB) for sRGB images is a challenging task, requiring the correction of color temperature variations from diverse light sources and the elimination of color casts to produce natural, neutral colors. Existing WB methods often struggle due to the limitations of global color adjustments applied during post-sRGB processing and the restricted color diversity in current datasets, resulting in suboptimal performance, especially for images with significant color shifts. To address these limitations, we propose an Auxiliary Bimodal Cross-Domain Transformer (ABCDFormer), which enhances WB correction by leveraging complementary knowledge from multiple modalities and domains. ABCDFormer integrates two auxiliary models to extract global color and chromaticity histograms, enriching the target model's sRGB input processing. Additionally, an Interactive Channel Attention (ICA) module is introduced to facilitate cross-modality knowledge transfer, embedding refined color features into image representations for more precise corrections. Extensive experiments on benchmark WB datasets demonstrate that ABCDFormer outperforms state-of-the-art methods. |
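The abstract describes an Interactive Channel Attention (ICA) module that injects refined color features from auxiliary histogram branches into the image representation. The thesis itself is not reproduced in this record, so the following is only an illustrative sketch, assuming an SE-style gating (cf. squeeze-and-excitation, ref. [19]) in which the channel weights are computed jointly from the pooled image features and an auxiliary (e.g. chromaticity-histogram) feature; all names, shapes, and the fusion-by-addition choice are hypothetical, not taken from the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def interactive_channel_attention(img_feat, aux_feat, w1, w2):
    """Hypothetical sketch: gate image channels using auxiliary color features.

    img_feat: (C, H, W) image-branch feature map
    aux_feat: (C,)      pooled auxiliary feature (e.g. from a histogram branch)
    w1: (C//r, C), w2: (C, C//r) squeeze/excite weights with reduction ratio r
    """
    squeezed = img_feat.mean(axis=(1, 2))           # global average pool -> (C,)
    mixed = squeezed + aux_feat                     # inject auxiliary knowledge
    gate = sigmoid(w2 @ np.maximum(w1 @ mixed, 0))  # FC -> ReLU -> FC -> sigmoid
    return img_feat * gate[:, None, None]           # rescale channels in (0, 1)

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
out = interactive_channel_attention(
    rng.standard_normal((C, H, W)),
    rng.standard_normal(C),
    rng.standard_normal((C // r, C)),
    rng.standard_normal((C, C // r)),
)
print(out.shape)  # (8, 4, 4)
```

Because the gate lies in (0, 1), the module can only attenuate channels, steering the image branch toward the color statistics supplied by the auxiliary models; the actual ICA design in the thesis may differ.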
Reference: |
[1] Mahmoud Afifi and Michael S Brown. Deep white-balance editing. In CVPR, 2020.
[2] Mahmoud Afifi, Marcus A Brubaker, and Michael S Brown. Auto white-balance correction for mixed-illuminant scenes. In WACV, 2022.
[3] Mahmoud Afifi, Brian Price, Scott Cohen, and Michael S Brown. When color constancy goes wrong: Correcting improperly white-balanced images. In CVPR, 2019.
[4] Nikola Banić, Karlo Koščević, and Sven Lončarić. Unsupervised learning for color constancy. arXiv preprint arXiv:1712.00436, 2017.
[5] Jonathan T Barron. Convolutional color constancy. In ICCV, 2015.
[6] Jonathan T Barron and Ben Poole. The fast bilateral solver. In ECCV, 2016.
[7] Jonathan T Barron and Yun-Ta Tsai. Fast fourier color constancy. In CVPR, 2017.
[8] Simone Bianco and Claudio Cusano. Quasi-unsupervised color constancy. In CVPR, 2019.
[9] Simone Bianco and Raimondo Schettini. Adaptive color constancy using faces. IEEE TPAMI, 2014.
[10] David H Brainard and Brian A Wandell. Analysis of the retinex theory of color vision. JOSA A, 1986.
[11] Gershon Buchsbaum. A spatial processor model for object colour perception. Journal of the Franklin Institute, 1980.
[12] Vladimir Bychkovsky, Sylvain Paris, Eric Chan, and Frédo Durand. Learning photographic global tonal adjustment with a database of input/output image pairs. In CVPR, 2011.
[13] Jonathan Cepeda-Negrete and Raul E Sanchez-Yanez. Gray-world assumption on perceptual color spaces. In PSIVT, 2014.
[14] Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, and Wen Gao. Pre-trained image processing transformer. In CVPR, 2021.
[15] Dongliang Cheng, Dilip K Prasad, and Michael S Brown. Illuminant estimation for color constancy: Why spatial-domain methods work and the role of the color distribution. JOSA A, 2014.
[16] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
[17] Gaurav Sharma. The CIEDE2000 color-difference formula: Implementation notes, supplementary test data, and mathematical observations. Color Research and Application, 30(1):21–30, 2005.
[18] Peter Vincent Gehler, Carsten Rother, Andrew Blake, Tom Minka, and Toby Sharp. Bayesian color constancy revisited. In CVPR, 2008.
[19] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In CVPR, 2018.
[20] Yuanming Hu, Baoyuan Wang, and Stephen Lin. FC4: Fully convolutional color constancy with confidence-weighted pooling. In CVPR, 2017.
[21] Xun Huang and Serge Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In ICCV, 2017.
[22] Thomas Kailath. The divergence and Bhattacharyya distance measures in signal selection. IEEE Transactions on Communication Technology, 1967.
[23] Hakki Can Karaimer and Michael S Brown. Improving color reproduction accuracy on cameras. In CVPR, 2018.
[24] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[25] Furkan Kınlı, Doğa Yılmaz, Barış Özcan, and Furkan Kıraç. Modeling the lighting in scenes as style for auto white-balance correction. In WACV, 2023.
[26] Chunxiao Li, Xuejing Kang, and Anlong Ming. WBFlow: Few-shot white balance for sRGB images via reversible neural flows. In IJCAI, 2023.
[27] Chunxiao Li, Xuejing Kang, Zhifeng Zhang, and Anlong Ming. SWBNet: A stable white balance network for sRGB images. In AAAI, 2023.
[28] Yi-Chen Lo, Chia-Che Chang, Hsuan-Chao Chiu, Yu-Hao Huang, Chia-Ping Chen, Yu-Lin Chang, and Kevin Jou. CLCC: Contrastive learning for color constancy. In CVPR, 2021.
[29] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In ICML, 2021.
[30] Wu Shi, Chen Change Loy, and Xiaoou Tang. Deep specialized network for illuminant estimation. In ECCV, 2016.
[31] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[32] Joost Van De Weijer, Theo Gevers, and Arjan Gijsenij. Edge-based color constancy. IEEE TIP, 2007.
[33] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In NeurIPS, 2017.
[34] Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, and Houqiang Li. Uformer: A general U-shaped transformer for image restoration. In CVPR, 2022.
[35] Yiyuan Zhang, Xiaohan Ding, Kaixiong Gong, Yixiao Ge, Ying Shan, and Xiangyu Yue. Multimodal pathway: Improve transformers with irrelevant data from other modalities. In CVPR, 2024.
[36] 陳冠融. White balance correction based on a histogram-visual dual-transformer architecture. Master's thesis, National Chengchi University, Taiwan, 2024. |
Description: | Master's thesis, Department of Computer Science, National Chengchi University. 111753202 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0111753202 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses |
Files in This Item:
File | Description | Size | Format
320201.pdf | | 16433 KB | Adobe PDF
All items in 政大典藏 (the NCCU Institutional Repository) are protected by copyright, with all rights reserved.