Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/157810
Title: | An Optimization Approach for Siamese Neural Networks Using Principal Component Analysis-Based Neuron Pruning
Authors: | Wang, Chi-Kang
Contributors: | Chou, Pei-Ting; Wang, Chi-Kang
Keywords: | Siamese Neural Network; Principal Component Analysis; Neuron Pruning; Unstructured Data; Model Simplification; Classification
Date: | 2025 |
Issue Date: | 2025-07-01 15:03:45 (UTC+8) |
Abstract: | Neural network models have demonstrated strong predictive capabilities across a wide range of applications, yet hyperparameter tuning remains a critical challenge for model performance, particularly the choice of the number of neurons. With too few neurons, a model often fails to capture the complex patterns in the data, reducing predictive accuracy; with too many, the parameter count and computational cost grow substantially and the model may overfit. To address this trade-off, this study proposes a neuron pruning strategy based on Principal Component Analysis (PCA) that analyzes the neuron weights of a pre-trained neural network and selects a subset of representative neurons. To evaluate the applicability and generalizability of the proposed method, a series of experiments was conducted using Siamese Neural Networks (SNN), which are well suited to low-data settings; models were trained and tested on both structured and unstructured datasets under various neuron configurations, and their predictive results were recorded and compared. The results show that the selection method not only reduces the number of model parameters effectively, but also enables the simplified models to surpass the predictive performance of the original pre-trained models, particularly when a high cumulative explained variance ratio is retained.
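The abstract describes applying PCA to the neuron weights of a pre-trained layer and keeping enough structure to reach a cumulative explained variance ratio threshold. The sketch below illustrates one plausible reading of that selection step; it is not taken from the thesis. The function name `select_neurons`, the 0.95 threshold, and the rule of keeping the neuron that loads most strongly on each retained component are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def select_neurons(weights: np.ndarray, evr_threshold: float = 0.95) -> np.ndarray:
    """Pick representative neurons from one dense layer.

    weights: (n_neurons, n_inputs) array; each row is a neuron's incoming
    weight vector. evr_threshold is the cumulative explained variance ratio
    to retain (an illustrative choice, not the thesis value).
    Returns the indices of the neurons to keep.
    """
    pca = PCA().fit(weights)
    # Smallest number of components whose cumulative EVR reaches the threshold.
    cum_evr = np.cumsum(pca.explained_variance_ratio_)
    k = int(np.searchsorted(cum_evr, evr_threshold)) + 1
    # Project each neuron's weight vector onto the retained components and,
    # for each component, keep the neuron that loads on it most strongly
    # (one plausible notion of a "representative" neuron).
    scores = np.abs(pca.transform(weights)[:, :k])   # shape (n_neurons, k)
    return np.unique(scores.argmax(axis=0))

# Hypothetical usage on a 128-neuron hidden layer with 64 inputs:
rng = np.random.default_rng(0)
W = rng.normal(size=(128, 64))
kept = select_neurons(W)
print(f"kept {kept.size} of {W.shape[0]} neurons")
```

With a random matrix as above the selection is of course meaningless; the snippet only demonstrates the mechanics. In the study, the pruned layer of a pre-trained Siamese network would then be re-evaluated against the original model under different threshold settings.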
Description: | Master's thesis, Department of Statistics, National Chengchi University (112354032)
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0112354032 |
Data Type: | thesis |
Appears in Collections: | [Department of Statistics] Theses
Files in This Item: | 403201.pdf (1926 KB, Adobe PDF)