Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/37122
Title: | Feature Selection in Hierarchical Classification of Human Sounds and Acoustic Analysis of Snoring Signals (階層式的人聲分類與鼾聲聲學特性分析中的特徵篩選)
Authors: | 林裕凱 |
Contributors: | 廖文宏 林裕凱 |
Keywords: | Human sound classification; acoustic feature selection
Date: | 2007 |
Issue Date: | 2009-09-19 12:12:02 (UTC+8) |
Abstract: | Human sounds can be roughly divided into two categories: speech and non-speech. Traditional audio scene analysis research puts more emphasis on classifying audio signals into human speech, music, and environmental sounds. This thesis takes a different perspective: we are mainly interested in the analysis of non-speech human sounds, including laughs, screams, sneezes, and snores. Toward this goal, we investigate many commonly used acoustic features and select useful ones for classification using multivariate adaptive regression splines (MARS) and support vector machines (SVM). To evaluate the robustness of the selected features, we also perform extensive simulations to observe the effect of noise on classification accuracy.

The second part of this thesis concerns the analysis of snoring signals. We use an ordinary microphone as our snoring recorder and compare its sensitivity with that of a snoring microphone and a piezo sensor, both of which are often used in clinical settings. In addition, we classify simple snores and obstructive snores using two distance measures: KL divergence and earth mover's distance (EMD). Similarly, we add noise of varying intensity to the snoring signals to examine the robustness of these two measures. Both methods perform satisfactorily, with EMD giving slightly better results in most cases.
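The abstract's snore-comparison step measures the dissimilarity between two sound distributions with KL divergence and earth mover's distance. A minimal sketch of both measures, assuming two hypothetical 32-bin spectral-energy histograms (the synthetic data and bin count are illustrative assumptions, not the thesis's actual features), using SciPy:

```python
import numpy as np
from scipy.stats import entropy, wasserstein_distance

# Hypothetical spectral-energy histograms for two snore episodes;
# real inputs would come from recorded snoring signals.
rng = np.random.default_rng(0)
simple_snore = rng.random(32)
obstructive_snore = rng.random(32)

def normalize(h):
    """Turn a non-negative histogram into a probability distribution."""
    return h / h.sum()

p = normalize(simple_snore)
q = normalize(obstructive_snore)

# Symmetrized KL divergence: KL(p||q) + KL(q||p).
# scipy.stats.entropy(p, q) computes KL(p||q) when q is given.
kl_sym = entropy(p, q) + entropy(q, p)

# Earth mover's distance over the 1-D bin positions; for 1-D
# histograms this is the Wasserstein-1 distance between p and q.
bins = np.arange(len(p))
emd = wasserstein_distance(bins, bins, u_weights=p, v_weights=q)

print(kl_sym, emd)
```

Either value can then serve as the distance in a nearest-neighbour or clustering step; EMD stays finite even when some bins of one histogram are empty, which is one reason it can behave more robustly under added noise.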
Reference: | [1] Y. Su, "Analysis and Classification of Human Sounds," Master's thesis, Department of Computer Science, National Chengchi University, 2006.
[2] W. Stoltzman, "Toward a Social Signaling Framework: Activity and Emphasis in Speech," Master's thesis, Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 2006.
[3] 陳若涵, 許肇凌, 張智星, 羅鳳珠, "Content-Based Music Emotion Analysis and Recognition," 2nd Workshop on Computer Music and Audio Technology, Taipei, Taiwan, 2006.
[4] M. Pantic and L. J. M. Rothkrantz, "Toward an affect-sensitive multimodal human-computer interaction," Proceedings of the IEEE, Vol. 91, No. 9, pp. 1370-1390, 2003.
[5] Z. Xin and Z. Ras, "Analysis of Sound Features for Music Timbre Recognition," International Conference on Multimedia and Ubiquitous Engineering, 2007.
[6] J. Wang, J. Wang, K. He and C. Hsu, "Environmental Sound Classification using Hybrid SVM/KNN Classifier and MPEG-7 Audio Low-Level Descriptor," International Joint Conference on Neural Networks, 2006.
[7] D. Deng, C. Simmermacher and S. Cranefield, "Finding the Right Features for Instrument Classification of Classical Music," Integrating AI and Data Mining, pp. 34-41, 2006.
[8] R. Jarina and J. Olajec, "Discriminative Feature Selection for Applause Sounds Detection," Image Analysis for Multimedia Interactive Services, pp. 13-16, 2007.
[9] V. A. Petrushin, "Emotion Recognition in Speech Signal: Experimental Study, Development, and Application," Proceedings of the Sixth International Conference on Spoken Language Processing, 2000.
[10] J. Rong, Y. Chen, M. Chowdhury and G. Li, "Acoustic Features Extraction for Emotion Recognition," 6th IEEE/ACIS International Conference on Computer and Information Science, pp. 419-424, 2007.
[11] J. J. Lien et al., "Automated Facial Expression Recognition," Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, pp. 390-395, 1998.
[12] K. Mase, "Recognition of Facial Expression from Optical Flow," IEICE Transactions, Vol. E74, No. 10, pp. 3474-3483, 1991.
[13] C. Cheng and Y. Hung, "Visual/Acoustic Emotion Recognition," IEEE International Conference on Multimedia and Expo, 2005.
[14] Y. Hsu, M. Chen, C. Cheng and C. Wu, "Development of a portable device for home monitoring of snoring," Journal of Biomedical Engineering - Applications, Basis & Communications, Vol. 17, No. 4, pp. 176-180, 2005.
[15] J. Sola-Soler, R. Jane, J. A. Fiz and J. Morera, "Automatic classification of subjects with and without Sleep Apnea through snoring analysis," Engineering in Medicine and Biology Society, pp. 6093-6096, 2007.
[16] M. Cavusoglu, M. Kamasak, O. Erogul, T. Ciloglu, Y. Serinagaoglu and T. Akcam, "An efficient method for snore/nonsnore classification of sleep sounds," Physiological Measurement, Vol. 28, No. 8, pp. 841-853, 2007.
[17] R. J. Baken, Clinical Measurement of Speech and Voice, London: Taylor and Francis, 1987.
[18] X. Huang, A. Acero and H. Hon, "Phonetics and Phonology," in Spoken Language Processing: A Guide to Theory, Algorithm and System Development, p. 39, 2001.
[19] J. H. Friedman, "Multivariate Adaptive Regression Splines," Department of Statistics, Stanford University, Technical Report 102 Rev, 1990.
[20] 李天行, 唐筱菁, "Integrating Financial Ratios and Intellectual Capital into a Corporate Crisis Diagnosis Model: Applications of Artificial Neural Networks and Multivariate Adaptive Regression Splines," Journal of Information Management, Vol. 11, No. 2, April 2004.
[21] C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, 2:121-167, 1998.
[22] 王小川, Speech Signal Processing (語音訊號處理), Chuan Hwa Book Co., April 2007.
[23] 張智星, Audio Signal Processing and Recognition (音訊處理與辨識), http://neural.cs.nthu.edu.tw/jang/books/audioSignalProcessing/ [retrieved July 2008].
[24] X. Lin, H. Peng and B. Liu, "Support Vector Machines for Text Categorization in Chinese Question Classification," IEEE/WIC/ACM International Conference on Web Intelligence, pp. 334-337, 2006.
[25] B. Ma, N. Nguyen and J. Rajapakse, "Gene classification using codon usage analysis and support vector machines," IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007.
[26] Y. Yang, R. Wang, Y. Liu, S. Li and X. Zhou, "Solving P2P Traffic Identification Problems Via Optimized Support Vector Machines," IEEE/ACS International Conference on Computer Systems and Applications, pp. 165-171, 2007.
[27] H. T. Lin and C. J. Lin, "A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods," Technical report, Department of Computer Science & Information Engineering, National Taiwan University, 2003.
[28] 譚慶鼎, "Thunderous Snoring: Who Gets Hurt? On Snoring and Obstructive Sleep Apnea Syndrome" (鼾聲如雷,傷的是誰?談打鼾與阻塞型睡眠呼吸中止症候群), http://w3.mc.ntu.edu.tw/department/ent/tan/tan93-1.doc [retrieved July 2008].
[29] 陳濘宏, "Obstructive Sleep Apnea Syndrome" (阻塞性睡眠呼吸中止症候群), http://www.cgmh.org.tw/sleepcenterlnk/scolumn/20070101-4.html [retrieved July 2008].
[30] 劉勝義, Clinical Polysomnography (臨床睡眠檢查學), Ho-Chi Book Publishing, October 2004.
[31] Zepelin, "Aging in Sleep," Roche Seminars on Aging, 1982.
[32] Y. Rubner, C. Tomasi and L. J. Guibas, "A Metric for Distributions with Applications to Image Databases," Proceedings of the IEEE International Conference on Computer Vision, Bombay, India, pp. 59-66, 1998.
Description: | Master's thesis, Department of Computer Science, National Chengchi University (student ID 95753024, academic year 96)
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0957530241 |
Data Type: | thesis |
Appears in Collections: | [Department of Computer Science] Theses
All items in 政大典藏 are protected by copyright, with all rights reserved.