Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/125643
Title: | Defense mechanism against adversarial attacks using density-based representation of images (基於半色調轉換的影像對抗例防禦機制)
Authors: | Huang, Chen-Wei (黃辰瑋)
Contributors: | Liao, Wen-Hung (廖文宏); Huang, Chen-Wei (黃辰瑋)
Keywords: | deep learning; adversarial defense; halftoning; input recharacterization
Date: | 2019 |
Issue Date: | 2019-09-05 16:15:04 (UTC+8) |
Abstract: | Adversarial examples are slightly modified inputs devised to cause erroneous inference in deep learning models: adding a small perturbation to the input can change the output drastically. Many methods have been proposed to protect neural networks against such attacks, but new ways of generating attacks have surfaced accordingly, and most existing defenses have been shown to be ineffective. Protection against the intervention of adversarial examples is a fundamental issue that must be addressed before wide adoption of deep-learning-based intelligent systems. In this research, we utilize a method known as input recharacterization to effectively remove the perturbations found in adversarial examples and maintain the performance of the original model. Input recharacterization consists of two stages: a forward transform and a backward reconstruction. Our hope is that, by going through this lossy two-way transformation, the purposely added "noise" or "perturbation" will become ineffective. In this work, we employ digital halftoning and inverse halftoning for input recharacterization, although many other choices exist. We also apply convolution-layer visualization to better understand the network architecture and its characteristics.
The data set used in this study is Tiny ImageNet, consisting of about 260 thousand 128x128 grayscale and halftone images belonging to 200 classes. Most existing defense mechanisms rely on gradient masking, input transformation, or adversarial training. Among these strategies, adversarial training is widely regarded as the most effective; however, it requires adversarial examples to be generated and included in the training set, which is impractical in most applications. The proposed approach is closer to input transformation: we convert an image from an intensity-based representation to a density-based representation using the halftone operation, which hopefully invalidates the attack by changing the image representation, and we also investigate whether inverse halftoning can eliminate the adversarial perturbation. The proposed method requires no costly training on adversarial samples; only low-cost input pre-processing is needed. On the VGG-16 architecture, the top-5 accuracy is 76.5% for the grayscale model, 80.4% for the halftone model, and 85.14% for the hybrid model (trained with both grayscale and halftone images). Under adversarial attacks generated with FGSM, I-FGSM, and PGD, the hybrid model still maintains top-5 accuracies of 80.97%, 78.77%, and 81.56%, respectively. Although accuracy is affected, the effect of the adversarial examples is substantially diminished: compared with existing input-transformation defenses, our method improves accuracy by approximately 10% on average. |
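The FGSM attack mentioned in the abstract perturbs each pixel by a fixed step in the sign direction of the loss gradient (I-FGSM and PGD repeat this step iteratively). A minimal sketch of the update, using a stand-in gradient array since computing the real gradient requires the attacked model:

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """One FGSM step: move each pixel by eps in the sign direction
    of the loss gradient, then clip back to the valid range [0, 1]."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

x = np.array([0.20, 0.50, 0.99])    # three example pixel values
grad = np.array([1.3, -0.7, 2.0])   # stand-in loss gradient w.r.t. x
x_adv = fgsm_perturb(x, grad, eps=0.03)
```

Because only the sign of the gradient is used, the perturbation magnitude per pixel is exactly eps, which is what makes it small enough to be visually imperceptible yet effective against an undefended model.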
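The forward transform of input recharacterization can be illustrated with classic Floyd-Steinberg error diffusion, which converts intensity-based pixels into a binary, density-based pattern. For the backward reconstruction the thesis uses a learned inverse-halftoning network; the simple box-filter smoothing below is only a hypothetical stand-in to show the two-stage idea:

```python
import numpy as np

def floyd_steinberg_halftone(img):
    """Forward transform: grayscale (floats in [0, 1]) to a binary
    halftone via Floyd-Steinberg error diffusion."""
    h, w = img.shape
    buf = img.astype(np.float64).copy()
    for y in range(h):
        for x in range(w):
            old = buf[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            buf[y, x] = new
            err = old - new
            # Push the quantization error onto unvisited neighbours.
            if x + 1 < w:
                buf[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    buf[y + 1, x - 1] += err * 3 / 16
                buf[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    buf[y + 1, x + 1] += err * 1 / 16
    return buf.astype(np.uint8)

def box_inverse_halftone(ht, k=3):
    """Backward reconstruction (crude stand-in): low-pass filter the
    binary halftone to recover approximate continuous tones."""
    h, w = ht.shape
    padded = np.pad(ht.astype(np.float64), k // 2, mode="edge")
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + k, x:x + k].mean()
    return out

gray = np.full((8, 8), 0.5)          # mid-gray test patch
ht = floyd_steinberg_halftone(gray)  # binary 0/1 dither pattern
rec = box_inverse_halftone(ht)       # roughly mid-gray again
```

Both stages are lossy, which is the point: a carefully crafted perturbation riding on the continuous-tone pixels does not survive quantization to a binary pattern followed by reconstruction.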
Description: | Master's thesis, Department of Computer Science, National Chengchi University, 106753021
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0106753021 |
Data Type: | thesis |
DOI: | 10.6814/NCCU201900958 |
Appears in Collections: | [Department of Computer Science] Theses
Files in This Item:
File | Size | Format
302101.pdf | 3554 KB | Adobe PDF
All items in the NCCU institutional repository (政大典藏) are protected by copyright, with all rights reserved.