Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/153280
Title: | 在自然語言處理中同時抵禦內部雜訊與外部攻擊的雙向防禦框架 A Dual Defense Framework Against Internal Noise and External Attacks in Natural Language Processing |
Authors: | 莊格維 Chuang, Ko-Wei |
Contributors: | 李蔡彥 黃瀚萱 Tsai-Yen Li Hen-Hsen Huang 莊格維 Chuang, Ko-Wei |
Keywords: | 雜訊標籤學習 群眾學習 對抗意識攻擊 對抗訓練 Noisy label learning Learning from Crowdsourcing Adversarial-awareness-attack Adversarial training |
Date: | 2024 |
Issue Date: | 2024-09-04 14:35:42 (UTC+8) |
Abstract: | In recent years, large language models such as ChatGPT, along with Artificial General Intelligence (AGI) and Generative Artificial Intelligence (GAI) models, have attracted widespread attention and research. Concurrently, concerns that AI may replace human jobs have grown, leading some practitioners to deliberately slow AI development and thus posing challenges to its advancement. In a real-world workplace investigation, we observed two noteworthy phenomena: first, external competitors or advertisers attempt to exploit a model's recognition weaknesses to conduct adversarial attacks; second, internal personnel intentionally upload erroneous training data, causing a decline in language model accuracy. This study explores how to defend simultaneously against threats from both internal and external sources, namely adversarial annotators providing incorrect labels and external users conducting adversarial attacks, in order to ensure model robustness and enhance the resilience of language models under dual attacks. Through experiments and the collection and analysis of real-world datasets, we validated a series of algorithms that effectively counteract erroneous data uploads by adversaries and recognize adversarial attacks. Additionally, our approach adapts advanced noisy label learning methods, originally developed for image classification tasks, to natural language processing (NLP), demonstrating that these methods remain effective whether the sample features are image arrays or text vectors, which further underscores the value and reliability of our framework. Despite some limitations that still need to be addressed, we believe our results provide more robust defense strategies and insights for future scenarios that may face even more human-induced interference. |
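To make the noisy-label-learning side of the framework concrete, the following Python snippet is a minimal sketch of a co-teaching-style small-loss selection step, one representative of the noisy label learning methods the abstract describes porting from image classification to text. It assumes the text has already been encoded into fixed-size embeddings (e.g., by a frozen sentence encoder); the names EMB_DIM, NUM_CLASSES, and FORGET_RATE, the classifier architecture, and the hyperparameters are illustrative assumptions, not the thesis' actual implementation.

```python
# Illustrative sketch: co-teaching-style noisy-label training on text feature vectors.
# Assumes pre-computed sentence embeddings; all hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, NUM_CLASSES, FORGET_RATE = 768, 2, 0.2  # assumed settings

def make_classifier():
    # Simple classifier head over pre-computed text embeddings.
    return nn.Sequential(nn.Linear(EMB_DIM, 256), nn.ReLU(), nn.Linear(256, NUM_CLASSES))

net_a, net_b = make_classifier(), make_classifier()
opt_a = torch.optim.Adam(net_a.parameters(), lr=1e-3)
opt_b = torch.optim.Adam(net_b.parameters(), lr=1e-3)

def coteaching_step(x, y):
    """One co-teaching update: each network trains on the peer's small-loss samples,
    on the assumption that adversarially mislabeled samples tend to have large loss."""
    loss_a = F.cross_entropy(net_a(x), y, reduction="none")
    loss_b = F.cross_entropy(net_b(x), y, reduction="none")
    keep = int((1.0 - FORGET_RATE) * x.size(0))   # number of samples treated as clean
    idx_a = torch.argsort(loss_a)[:keep]          # clean-looking samples according to net A
    idx_b = torch.argsort(loss_b)[:keep]          # clean-looking samples according to net B

    opt_a.zero_grad()                             # A learns from B's selection
    F.cross_entropy(net_a(x[idx_b]), y[idx_b]).backward()
    opt_a.step()

    opt_b.zero_grad()                             # B learns from A's selection
    F.cross_entropy(net_b(x[idx_a]), y[idx_a]).backward()
    opt_b.step()

# Example call with stand-in embeddings and (possibly noisy) labels.
x = torch.randn(32, EMB_DIM)
y = torch.randint(0, NUM_CLASSES, (32,))
coteaching_step(x, y)
```

The same selection logic applies regardless of whether the features are flattened image arrays or text vectors, which is the point the abstract makes about transferring these methods to NLP; the external-attack side (adversarial training against text attacks) would wrap a similar loop around adversarially perturbed or rewritten inputs.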
Description: | Master's thesis, National Chengchi University, In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班), 109971002 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0109971002 |
Data Type: | thesis |
Appears in Collections: | [In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班)] Degree Theses |
Files in This Item: | 100201.pdf | 1584 KB | Adobe PDF |