政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/155993
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/155993


    Title: 探討聯盟式學習架構中補救學習的效能恢復
    Exploring the Performance Recovery of Remedial Learning Within the Federated Learning Framework
    Authors: 黃哲偉
    Huang, Che-Wei
    Contributors: 廖文宏
    Liao, Wen-Hung
    黃哲偉
    Huang, Che-Wei
    Keywords: Federated Learning
    Image Classification
    Deep Learning
    Data Poisoning Attack
    Remedial Learning
    Date: 2025
    Issue Date: 2025-03-03 14:29:27 (UTC+8)
    Abstract:
    Federated learning is a decentralized machine learning technique designed to facilitate model collaboration among multiple entities without transferring sensitive data to a central location, thus ensuring user privacy while effectively handling large-scale and distributed datasets. This study investigates the effects of remedial learning training within a federated learning framework on image classification tasks. Remedial learning refers to corrective measures taken when a model underperforms on specific tasks or datasets, with the core goal of analyzing and understanding the model's weaknesses to design targeted strategies that enhance its performance. This ensures the model can achieve higher accuracy and stability across various tasks and data. In a federated learning context, if the participating units (clients) have uneven data distributions, or their data contains noise or biases, the model may perform poorly on some units, degrading overall performance. This necessitates remedial learning to address these issues. Performance is evaluated using Top-1 and Top-5 accuracy and the loss value.
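The Top-1 and Top-5 accuracy metrics mentioned above can be computed as sketched below. This is an illustrative, self-contained Python example with toy scores and labels, not code from the thesis; the score values and class count are assumptions.

```python
# Hedged sketch: Top-1 / Top-5 accuracy as used in the evaluation.
# The toy scores, labels, and 4-class setup below are illustrative
# assumptions, not taken from the thesis itself.

def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-sample score lists (one score per class)
    labels: list of true class indices
    """
    hits = 0
    for row, label in zip(scores, labels):
        # indices of the k largest scores for this sample
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)

# Toy example with 3 samples and 4 classes
scores = [
    [0.1, 0.6, 0.2, 0.1],   # highest score is class 1
    [0.5, 0.1, 0.3, 0.1],   # highest score is class 0
    [0.2, 0.2, 0.3, 0.3],   # two highest scores are classes 2 and 3
]
labels = [1, 2, 3]
print(top_k_accuracy(scores, labels, k=1))  # only the first sample is correct
print(top_k_accuracy(scores, labels, k=2))  # all three labels fall in the top 2
```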
    The study shows that when remedial learning is applied to a contaminated model, retraining with all participating units under the federated learning framework is more effective than applying remedial learning to only the contaminated unit. Specifically, when all units collaborate to retrain the model through remedial learning, the recovery of model performance is significantly better than when only the contaminated unit undergoes remedial training. To further verify this finding, the study also conducted federated retraining after excluding the contaminated unit. The results indicate that remedial learning conducted after excluding the contaminated unit still outperforms training solely on the contaminated unit. Furthermore, as the number of training rounds increases, the effects of remedial learning leveraging other uncontaminated units almost match the results of normal training.
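The three remedial settings compared above (retraining only the contaminated unit, retraining all units collaboratively, and retraining the uncontaminated units only) can be expressed as participant selection for FedAvg-style aggregation. The sketch below is a hedged illustration under assumed names and flat parameter vectors; the thesis itself is built on the Flower framework, not this code.

```python
# Hedged sketch: FedAvg-style weighted averaging plus participant
# selection for the three remedial settings. Client names, weights,
# and helper names are illustrative assumptions.

def fedavg(client_updates):
    """Sample-weighted average of client parameters (flat lists of floats).

    client_updates: list of (params, num_samples) pairs
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    avg = [0.0] * dim
    for params, n in client_updates:
        for i, p in enumerate(params):
            avg[i] += p * (n / total)
    return avg

def remedial_participants(clients, poisoned, strategy):
    """Select which clients join the remedial rounds under each strategy."""
    if strategy == "self":           # only the poisoned client retrains
        return [poisoned]
    if strategy == "collaborative":  # all clients retrain together
        return list(clients)
    if strategy == "others":         # rely on the uncontaminated clients
        return [c for c in clients if c != poisoned]
    raise ValueError(f"unknown strategy: {strategy}")

clients = ["c0", "c1", "c2", "c3"]
print(remedial_participants(clients, "c2", "others"))        # excludes c2
print(fedavg([([1.0, 2.0], 1), ([3.0, 4.0], 3)]))            # weighted mean
```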
    In addition, the study further explored the effects of using pretrained models in remedial learning. The results show that although pretrained models can significantly improve performance in normal training scenarios and maintain stability in cases of data contamination, the overall improvement through remedial learning is less pronounced than expected.
    Finally, the study examined the effectiveness of remedial learning when the contaminated unit is changed. The results show that whether the contaminated unit is changed or kept the same, the trends in remedial learning effectiveness remain consistent: changing which unit suffers the data poisoning attack does not significantly affect the outcomes of remedial learning.
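One concrete way a "contaminated unit" arises is a label-flipping data-poisoning attack on that client's local dataset. The sketch below is an assumed illustration of such an attack; the flip rate, seed, and 100-class label space (matching CIFAR-100) are assumptions, not the attack parameters used in the thesis.

```python
# Hedged sketch: label-flipping poisoning of one client's labels.
# Flip rate, seed, and class count are illustrative assumptions.
import random

def flip_labels(labels, num_classes, flip_rate, seed=0):
    """Return a copy of `labels` with a fraction replaced by wrong classes."""
    rng = random.Random(seed)
    poisoned = list(labels)
    n_flip = int(len(labels) * flip_rate)
    for idx in rng.sample(range(len(labels)), n_flip):
        # pick any class other than the true one
        wrong = [c for c in range(num_classes) if c != poisoned[idx]]
        poisoned[idx] = rng.choice(wrong)
    return poisoned

clean = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
dirty = flip_labels(clean, num_classes=100, flip_rate=0.5)
changed = sum(a != b for a, b in zip(clean, dirty))
print(changed)  # exactly half of the 10 labels are flipped
```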
    In conclusion, we infer that in a federated learning context, when there are contaminated units, remedial learning through collaborative retraining of all units is more effective in restoring model performance. Another viable strategy is to exclude the contaminated unit and conduct remedial learning, which similarly brings the model performance close to normal levels. This finding suggests that federated learning demonstrates strong robustness, effectively mitigating the issue of model contamination in certain units, thereby improving the overall system's fault tolerance.
    By comparing the effects of different training methods, this study provides new insights into restoring model performance when a single unit's data is contaminated in federated learning. These findings deepen our understanding of how remedial learning can be used within federated learning systems to maintain model performance and accuracy, and offer a viable solution to these challenges.
    Description: Master's thesis
    National Chengchi University
    Executive Master Program of Computer Science
    111971030
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0111971030
    Data Type: thesis
    Appears in Collections:[Executive Master Program of Computer Science of NCCU] Theses

    Files in This Item:

    103001.pdf (6233 KB, Adobe PDF)


    All items in 政大典藏 are protected by copyright, with all rights reserved.

