政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/125535

政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/125535

English | 正體中文 | 简体中文 | Post-Print筆數 : 27 | 全文筆數/總筆數 : 114504/145531 (79%)
造訪人次 : 53444120 線上人數 : 1262

RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.

搜尋範圍

查詢小技巧：

您可在西文檢索詞彙前後加上"雙引號"，以獲取較精準的檢索結果

若欲以作者姓名搜尋，建議至進階搜尋限定作者欄位，可獲得較完整資料

進階搜尋

主頁 ‧ 登入 ‧ 上傳 ‧ 說明 ‧ 關於政大典藏 ‧ 管理

到手機版

政大機構典藏 > 商學院 > 資訊管理學系 > 學位論文 > Item 140.119/125535

請使用永久網址來引用或連結此文件: https://nccur.lib.nccu.edu.tw/handle/140.119/125535

題名:	filterNN: 基於神經網路之序列資料特徵選取方法 filterNN: NN-based Feature Selection from Sequential Data
作者:	吳皓銘 Wu, Hao-Ming
貢獻者:	蕭舜文 Hsiao, Shun-Wen 吳皓銘 Wu, Hao-Ming
關鍵詞:	神經網路特徵擷取序列資料 Neural Network Feature extraction Feature selection Sequential data
日期:	2019
上傳時間:	2019-09-05 15:45:56 (UTC+8)
摘要:	我們設計了一個新的神經網絡架構，它由兩部分組成，過濾器和分類器。我們實現了三種過濾器，可以過濾掉不必要的輸入數據。過濾後的數據將被輸入後一種分類器，以實現最高的訓練精度。由於過濾器和分類器一起訓練，因此，過濾器將保持輸入，這有助於分類器執行分類。因此，剩餘的輸入數據可以被視為該類的特徵。我們還設計了三個成本函數來實現不同的目的，1）濾波輸入可以盡可能少，2）濾波輸入可以更連續，3）分類器可以實現最高的訓練精度。這三個學習目標相互衝突，因此我們在本研究中展示了調整過程以實現最佳性能。我們使用基於文本的順序數據來測試所提出的神經網路架構的有用性。使用基於文本的順序數據是從現實世界收集的惡意軟件執行API調用。研究表明，所提出的神經網路架構有助於處理基於文本的序列數據，並將過濾域專家的特徵以進行進一步分析。 We design a new Neural Network architecture which consists of two parts, filter, and classifier. We implement three kinds of filters, which can filter out unnecessary input data. The filtered data will be fed into the latter classifier to achieve the highest training accuracy. Because of the filter and classifier are trained together, thus, the filter will keep the inputs which help classifier to perform the classification. Therefore, the remaining input data can be viewed as the characteristic of the class. We also design three cost function to achieve different purpose, 1) the filtered inputs could be as less as possible, 2) the filtered inputs could be more consecutive as possible, 3) the classifier could achieve the highest training accuracy as possible. The three learning goals are in conflict with each other, so we demonstrate the tuning process in this research to achieve the best performance. We use text-based sequential data to test the usefulness of the proposed NN architecture. The use of text-based sequential data is malware execution API calls which are collected from the real world. The research shows that the proposed NN architecture is helpful for dealing with text-based sequential data and will filter the characteristic for domain experts to perform further analyze.
參考文獻:	[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” in Proc. of the IEEE, 1998, pp. 2278-2324. [2] S. Forrest, S. A. Hofmeyr, A. Somayaji, and T. A. Longstaff, “A sense of self for unix processes,” in Proc 1996 IEEE Symp. on Security and Privacy (S&P) , 1996, pp. 120-128. [3] S. Hofmeyr, S. Forrest and A. Somayaji, “Intrusion detection using sequences of system calls,” Journal of Computer Security, vol. 6, no. 3, pp. 151-180, 1998. [4] W. Lee and S. J. Stolfo, “Data Mining Approaches for Intrusion Detection,” in Proc. USENIX Security Symp., 1998, pp. 79-93. [5] C. Kruegel, D. Mutz, F. Valeur, and G. Vigna,“On the Detection of Anomalous System Call Arguments,” in Proc. European Symp. on Research in Computer Security (ESORICS), 2003, pp. 101-118. [6] U. Bayer, C. Kruegel, and E. Kirda, “TTAnalyze: A Tool for Analyzing Malware,” in Proc. European Institute for Computer Antivirus Research Annual Conference (EICAR), 2006, pp. 180-192. [7] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel, and E. Kirda, “Scalable, Behavior-Based Malware Clustering,” in Proc. Network and Distributed System Security Symp. (NDSS) , 2009, pp. 8-11. [8] C. Willems, T. Holz, and F. Freiling, “Toward Automated Dynamic Malware Analysis Using CWSandbox,” in IEEE Security and Privacy Magazine, vol. 5, no. 2, pp. 32-39, 2007. [9] S. W. Hsiao, Y. N. Chen, Y. S. Sun, and M. C. Chen, “A Cooperative Botnet Profiling and Detection in Virtualized Environment,” in Proc. IEEE Conf. on Communications and Network Security (IEEE CNS), 2013, pp. 154-162. [10] S. Hou, Y. Ye, Y. Song, and M. Abdulhayoglu, “Hindroid: An intelligent android malware detection system based on structured heterogeneous information network,” in Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1507-1515. [11] K. Grosse, N. Papernot, P. Manoharan, M. Backes, and P. McDaniel, “Adversarial perturbations against deep neural networks for malware classification,” arXiv: 1606.04435, 2016. [12] Q. Wang, W. Guo, K. Zhang, A. G. Ororbia II, X. Xing, X. Liu, and C. L. Giles, “Adversary resistant deep neural networks with an application to malware detection,” in Proc. of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 1145-1153. [13] G. E. Dahl, J. W. Stokes, L. Deng, and D. Yu, “Large-scale Malware Classification Using Random Projections and Neural Networks,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013, pp. 3422-3426. [14] D. Gibert, “Convolutional neural networks for malware classification,” Master’s thesis, Universitat Politcnica de Catalunya, 2016. [Online]. Available: https://pdfs.semanticscholar.org/b692/8c4317c295f884fc70385fa9177a0b9fe1fb.pdf [15] G. J. Tesauro, J. O. Kephart and G.B. Sorkin, “Neural networks for computer virus recognition,” in IEEE Expert , vol. 11, no. 4, pp. 5-6, 1996. [16] O. Sornil and C. Liangboonprakong, “Malware Classification Using N-grams Sequential Pattern Feature,” International Journal of Information Processing and Management, vol. 4, no. 5, pp. 59-67, 2013. [17] C. Ravi and R. Manoharan, “Malware Detection using Windows API Sequence and Machine Learning,” International Journal of Computer Applications, vol. 43, no. 17, pp. 12-16, 2012. [18] R. Veeramani and N. Rai, “Windows api based malware detection and framework analysis,” in Int. Conf. on networks and cyber security, 2012, pp. 25-29. [19] J. Saxe and K. Berlin, “Deep neural network based malware detection using two dimensional binary program features,” in Proc. Int. Conf. on Malicious and Unwanted Software (MALCON), 2015, pp. 11-20. [20] B. Athiwaratkun and J. Stokes, “Malware classification with LSTM and GRU language models and a character-level CNN”, in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 2482-2486. [21] S. Tobiyama, Y. Yamaguchi, H. Shimada, T. Ikuse, and T. Yagi, “Malware detection with deep neural network using process behavior,” in 2016 IEEE 40th Annual Computer Software and Applications Conf. (COMPSAC), 2016, pp. 577-582. [22] W. Huang, and J. W. Stokes, “MtNet: a multi-task neural network for dynamic malware classification,” in Int. Conf. on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), 2016, pp. 399-418. [23] S. Seok and H. Kim, “Visualized Malware Classification Based-on Convolutional Neural Network,” Journal of the Korea Institute of Information Security and Cryptology, vol. 26, no. 1, pp. 197-208, 2016. [24] J. Vaughan, A. Sudjianto, E. Brahimi, J. Chen and V. N. Nair, “Explainable neural networks based on additive index models,”arXiv preprint arXiv:1806.01933, 2018. [25] J. Masci, U. Meier, D. Cirean and J. Schmidhuber, “Stacked convolutional auto-encoders for hierarchical feature extraction,” in International Conference on Artificial Neural Networks, Springer, Berlin, Heidelberg, 2011, pp. 52-59. [26] “Malware Knowledge Base”, Owl.nchc.org.tw, 2018. [Online]. Available: https://owl.nchc.org.tw/. [Accessed: 14- Apr- 2018]. [27] “VirusTotal”, Virustotal.com, 2018. [Online]. Available: https://www.virustotal.com/zh-tw/. [Accessed: 14- Apr- 2018]. [28] “Build a Convolutional Neural Network using Estimators — TensorFlow”, TensorFlow, 2019. [Online]. Available: https://www.tensorflow.org/tutorials/layers. [Accessed: 24- Feb- 2019]. https://www.tensorflow.org/tutorials/layers [29] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097-1105. [30] U. Bioinformatics Laboratory, “Orange Data Mining Fruitful & Fun”, Orange.biolab.si, 2019. [Online]. Available: https://orange.biolab.si/. [Accessed: 24- Feb- 2019]. [31] T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” in Journal of Molecular Biology, vol. 147, no. 1, pp. 195-197, 1981. [32] W. J. Chiu, “Automated Malware Family Signature Generation based on Runtime API Call Sequence,”M.S. thesis, Dept. of Info. Mngmt, National Taiwan Univ., Taipei, 2018. Accessed on: Jul. 10, 2019. [33] Saul B. Needleman and Christian D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, 48 (3): 44353., doi:10.1016/0022-2836(70)90057-4., PMID 5420325, 1970. [34] R. W. Hamming, “Error detecting and error correcting codes,” The Bell System Technical Journal, 29 (2): 147160., doi:10.1002/j.1538- 7305.1950.tb00463.x., ISSN 0005-8580, April 1950. [35] S. Hochreiter, and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735-1780, 1997.
描述:	碩士國立政治大學資訊管理學系 106356005
資料來源:	http://thesis.lib.nccu.edu.tw/record/#G1063560051
資料類型:	thesis
DOI:	10.6814/NCCU201900745
顯示於類別:	[資訊管理學系] 學位論文

文件中的檔案:

檔案	大小	格式	瀏覽次數
005101.pdf	1957Kb	Adobe PDF2	99	檢視/開啟

在政大典藏中所有的資料項目都受到原著作權保護.

社群 sharing

著作權政策宣告 Copyright Announcement

1.本網站之數位內容為國立政治大學所收錄之機構典藏，無償提供學術研究與公眾教育等公益性使用，惟仍請適度，合理使用本網站之內容，以尊重著作權人之權益。商業上之利用，則請先取得著作權人之授權。
The digital content of this website is part of National Chengchi University Institutional Repository. It provides free access to academic research and public education for non-commercial use. Please utilize it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

2.本網站之製作，已盡力防止侵害著作權人之權益，如仍發現本網站之數位內容有侵害著作權人權益情事者，請權利人通知本網站維護人員(nccur@nccu.edu.tw)，維護人員將立即採取移除該數位著作等補救措施。
NCCU Institutional Repository is made to protect the interests of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff(nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.

DSpace Software Copyright © 2002-2004 MIT & Hewlett-Packard / Enhanced by NTU Library IR team Copyright © - 回饋