政大機構典藏-National Chengchi University Institutional Repository(NCCUR):Item 140.119/119881
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/119881


    Title: 歸納惡意軟體特徵
    Malware Family Characterization
    Authors: 劉其峰
    Liu, Chi-Feng
    Contributors: 郁方
    Yu, Fang
    劉其峰
    Liu, Chi-Feng
    Keywords: Recurrent neural network (RNN)
    Growing hierarchical self-organizing map (GHSOM)
    Long short-term memory (LSTM)
    Malware
    Dynamic analysis
    Sequence encoding
    Date: 2018
    Issue Date: 2018-09-03 15:47:50 (UTC+8)
    Abstract: Massive amounts of sensitive data are now accessible and connected through personal computers and cloud services, which attracts hackers to develop malicious software (malware) to steal them. Following the success of deep learning in image and language recognition, researchers have begun applying deep learning approaches to malware analysis and identification in security systems. This paper addresses the problem of analyzing and identifying complex, unstructured malware behaviors by proposing a framework that combines unsupervised and supervised learning algorithms with a novel sequence-aware encoding method. Specifically, we adopt a hybrid Growing Hierarchical Self-Organizing Map (GHSOM) algorithm to cluster similar malware behavior sequences and to encode system call sequences as clustering feature vectors. A Recurrent Neural Network (RNN) is then trained on the sequence of behavior vectors to detect malware and predict the corresponding malware family. Our experiments show an accuracy of up to 0.98 in malware detection and 0.719 in malware classification on an 18-category malware dataset.
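    The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the window size, the vector representation, or the map structure, so the sliding windows, the bag-of-calls count vectors, the five-call vocabulary, and the two hand-picked prototypes (standing in for the units of a trained GHSOM) are all assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical vocabulary of system calls; a real trace would use many more.
VOCAB = ["open", "read", "write", "connect", "send"]


def windows(calls, size, step):
    """Split a call trace into overlapping fixed-size windows."""
    return [calls[i:i + size] for i in range(0, len(calls) - size + 1, step)]


def to_vector(window):
    """Bag-of-calls count vector over VOCAB (order inside the window is ignored)."""
    counts = Counter(window)
    return [counts[name] for name in VOCAB]


def nearest_cluster(vector, prototypes):
    """Index of the closest prototype by Euclidean distance."""
    return min(range(len(prototypes)), key=lambda k: dist(vector, prototypes[k]))


def encode(calls, prototypes, size=3, step=1):
    """Map a call trace to a sequence of cluster IDs, one per window."""
    return [nearest_cluster(to_vector(w), prototypes)
            for w in windows(calls, size, step)]


# Hand-picked prototypes standing in for trained map units (assumption).
PROTOTYPES = [
    [1, 2, 0, 0, 0],  # file-reading behavior
    [0, 0, 0, 1, 2],  # network-sending behavior
]

trace = ["open", "read", "read", "connect", "send", "send"]
codes = encode(trace, PROTOTYPES)
print(codes)  # [0, 0, 1, 1]
```

    The resulting sequence of cluster IDs is the kind of input a recurrent classifier would consume, one step per window. In a trained system the prototypes would come from the clustering algorithm rather than being written by hand.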
    Description: Master's
    National Chengchi University
    Department of Management Information Systems
    105356019
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0105356019
    Data Type: thesis
    DOI: 10.6814/THE.NCCU.MIS.025.2018.A05
    Appears in Collections:[Department of MIS] Theses

    Files in This Item:

    File: 601901.pdf
    Size: 826Kb
    Format: Adobe PDF


    All items in 政大典藏 are protected by copyright, with all rights reserved.

