Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/98846
|
Title: | 數據幾何特徵的機器學習 A study of Data Geometry-based Learning |
Authors: | 劉憲忠 Liu, Hsien Chung |
Contributors: | 周珮婷 Chou, Pei Ting 劉憲忠 Liu, Hsien Chung |
Keywords: | machine learning; geometric patterns; data-geometry |
Date: | 2016 |
Issue Date: | 2016-07-11 16:54:50 (UTC+8) |
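Abstract: | This study focuses on the geometric patterns of data in order to understand the relationships among variables, and asks whether weighting the distance matrix by coefficients obtained from fitted statistical models effectively improves classification accuracy. It mainly uses the data cloud geometry tree and cosine similarity, combined with majority voting over samples, to predict class labels, and compares the results with hierarchical clustering, support vector machines, and the Hybrid method on three datasets. Two of these, a biobehavioral assessment project dataset and the Wisconsin diagnostic breast cancer dataset, are used with supervised learning to validate classification results; the third, a simulated moon-shaped dataset, is used with semi-supervised learning to predict the classes of new data. Finally, the strengths and weaknesses of each method, and the reasons for them, are discussed and summarized, showing that data with different geometry do require trying different formulas and algorithms to obtain good machine learning results. The study focuses on computed data-geometry based learning to discover the inter-dependence patterns among covariate vectors. To discover these patterns and improve classification accuracy, the distance functions are modified to better capture the geometric patterns and to measure the association between variables. The performance of the proposed learning rule is compared with that of other machine learning techniques on three datasets. In the end, the study demonstrates why the concept of geometric patterns is essential. |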
Reference: | Bauer, E., & Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Machine Learning, 36(1-2), 105-139.
Baldi, P., & Brunak, S. (2001). Bioinformatics: the machine learning approach. MIT Press.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273. doi:10.1007/BF00994018.
Chou, E. P. (2015, July). Data Driven Geometry for Learning. In International Workshop on Machine Learning and Data Mining in Pattern Recognition (pp. 395-402). Springer International Publishing.
Chou, E. P., Hsieh, F., & Capitanio, J. (2013, December). Computed Data-Geometry Based Supervised and Semi-supervised Learning in High Dimensional Data. In Machine Learning and Applications (ICMLA), 2013 12th International Conference on (Vol. 1, pp. 277-282).
Chang, Y. C. I. (2003). Boosting SVM classifiers with logistic regression. See www.stat.sinica.edu.tw/library/c_tec_rep/2003-03.pdf.
Culp, M. (2011). spa: A Semi-Supervised R Package for Semi-Parametric Graph-Based Estimation. Journal of Statistical Software, 40(10), 1-29.
Fushing, H., Wang, H., VanderWaal, K., McCowan, B., & Koehl, P. (2013). Multi-scale clustering by building a robust and self correcting ultrametric topology on data points. PloS one, 8(2), e56259.
Grozavu, N., Bennani, Y., & Lebbah, M. (2009, June). From variable weighting to cluster characterization in topographic unsupervised learning. In Neural Networks, 2009. IJCNN 2009. International Joint Conference on (pp. 1005-1010). IEEE.
Hastie, T., Tibshirani, R., Friedman, J., & Franklin, J. (2005). The elements of statistical learning: data mining, inference and prediction. The Mathematical Intelligencer, 27(2).
Tan, A. C., & Gilbert, D. (2003, January). An empirical comparison of supervised machine learning techniques in bioinformatics. In Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003-Volume 19 (pp. 219-222). Australian Computer Society, Inc. |
Description: | Master's thesis; National Chengchi University; Department of Statistics; 103354025 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0103354025 |
Data Type: | thesis |
Appears in Collections: | [Department of Statistics] Theses
|
Files in This Item:
File | Size | Format |
402501.pdf | 760Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|