Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/111305
|
Title: | 非監督式學習下高風險行為青少年探究 Unsupervised learning of adolescent risk-taking study |
Authors: | 李承軒 |
Contributors: | 周珮婷 李承軒 |
Keywords: | Unsupervised learning; Agglomerative hierarchical clustering; Data cloud geometry tree; Risk-taking |
Date: | 2017 |
Issue Date: | 2017-07-24 11:59:12 (UTC+8) |
Abstract: | This study applies two unsupervised clustering algorithms to cross-examine the characteristics of high-risk behavior among adolescents. The first algorithm, the data cloud geometry tree, works along two dimensions, temperature and time: filtering by temperature and automatically detecting change along the time axis increases the differences between clusters. The second, agglomerative hierarchical clustering, is a simple, fast, and practical method. The risk-taking data were split into a continuous part and a categorical part, both parts were clustered, and hypothesis tests were used to verify whether the differences between clusters were large. Comparing the number of significant variables, hierarchical clustering performed better, which suggests larger between-cluster differences. Comparing variance ratios, however, the data cloud geometry tree produced larger ratios in its special clusters, that is, larger between-cluster differences, whereas hierarchical clustering showed large between-cluster differences only at its first split. Finally, the difference between the special and non-special clusters was calculated: the special clusters had higher risk values and were therefore presumed to contain high-risk adolescents. Observations that appeared in the special clusters of both algorithms were selected as the target group of high-risk adolescents, and the demographic data of this target group were summarized. |
Reference: |
Abbas, O. A. (2008). Comparisons between data clustering algorithms. Int. Arab J. Inf. Technol., 5(3), 320-325.
Ahmad, A., & Dey, L. (2007). A method to compute distance between two categorical values of same attribute in unsupervised learning for categorical data set. Pattern Recognition Letters, 28(1), 110-118.
Fushing, H., & McAssey, M. P. (2010). Time, temperature, and data cloud geometry. Physical Review E, 82(6), 061110.
Fushing, H., Wang, H., VanderWaal, K., McCowan, B., & Koehl, P. (2013). Multi-scale clustering by building a robust and self correcting ultrametric topology on data points. PLoS ONE, 8(2), e56259.
Hamming, R. W. (1950). Error detecting and error correcting codes. Bell System Technical Journal, 29, 147-160. doi: 10.1002/j.1538-7305.1950.tb00463.x
Jia, H., Cheung, Y.-m., & Liu, J. (2016). A new distance metric for unsupervised learning of categorical data. IEEE Transactions on Neural Networks and Learning Systems, 27(5), 1065-1079.
Murtagh, F., & Legendre, P. (2011). Ward's hierarchical clustering method: clustering criterion and agglomerative algorithm. arXiv preprint arXiv:1111.6285.
|
Description: | Master's thesis; National Chengchi University; Department of Statistics; 104354017 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0104354017 |
Data Type: | thesis |
Appears in Collections: | [Department of Statistics] Theses
|
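The abstract describes agglomerative hierarchical clustering applied to the categorical part of the data, and the reference list cites Hamming (1950). As a rough illustration only, the sketch below clusters categorical records bottom-up using Hamming distance with average linkage. The record does not state which distance or linkage the thesis actually used, so both choices are assumptions, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same length")
    return sum(x != y for x, y in zip(a, b))


def average_linkage(c1, c2, data):
    """Mean pairwise Hamming distance between two clusters of row indices."""
    total = sum(hamming(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(data, n_clusters):
    """Naive agglomerative clustering: start from singletons and repeatedly
    merge the closest pair of clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        # combinations yields index pairs with a < b, so deleting b is safe.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], data),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical toy answers (not data from the thesis): one row per respondent.
toy = [
    ("no", "no", "never"),
    ("no", "no", "rarely"),
    ("no", "yes", "never"),
    ("yes", "yes", "often"),
    ("yes", "yes", "weekly"),
]
clusters = agglomerative(toy, n_clusters=2)
print(clusters)
```

This naive version recomputes every pairwise linkage at each merge, which is fine for a handful of rows but slow for a real survey; a library implementation would be the practical choice at scale.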
Files in This Item:
File | Size | Format | |
401701.pdf | 5922Kb | Adobe PDF2 | 389 |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|