National Chengchi University Institutional Repository (NCCUR): Item 140.119/66311


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/66311


    Title: 高維度資料特徵選取之探討–應用於分類蛋白質質譜儀資料
    Other title: On Feature Selection of High Dimensional Data - Application on Classifying Proteomic Spectra Data
    Authors: 郭訓志; 黃仁澤; 薛慧敏
    Kuo, Hsun-Chih; Huang, Jen-Tse; Hsueh, Huey-Miin
    Contributor: Department of Statistics
    Keywords: feature selection, protein mass spectrometry data, support vector machine, cross-validation
    Date: 2011.06
    Upload time: 2014-05-27 15:13:43 (UTC+8)
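    Abstract: The tumor markers used in routine health examinations have low sensitivity and specificity and cannot detect small tumors, so tumors usually cannot be diagnosed early. The data in this study are serum protein mass spectra obtained with protein chips and surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI); the serum samples come from healthy individuals and from three groups of prostate cancer patients at different stages. The aim is to select protein features that help distinguish the different stages of prostate cancer. Using repeated random subsampling cross-validation and the support vector machine (SVM), all protein features are first ranked by the average p-value of the t test, the average p-value of the Kruskal-Wallis test, or the average misclassification rate; forward selection is then used to find the features of the model with the minimum misclassification rate. To simplify the model, the study also considers the classification results obtained after further feature extraction based on the correlation coefficient and the coefficient of determination. In the comparison of methods, feature selection by the minimum p-value of the Kruskal-Wallis test gives better classification, and among the auxiliary extraction methods, maximum-correlation extraction most effectively reduces the number of features while maintaining classification performance.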
    Tumor markers used in routine health examinations are often low in sensitivity and specificity, so they cannot detect small tumors in time. This research aims to develop a classification tool for early tumor diagnosis by studying proteomic mass spectra of prostate cancer at different stages. The data are Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (SELDI-TOF-MS) spectra generated from 327 serum samples: 81 from unaffected healthy men (HM), 78 from patients diagnosed with benign prostatic hyperplasia (BPH), 84 from patients with organ-confined prostate cancer (PCA, stages T1/T2), and 84 from patients with non-organ-confined PCA (T3/T4). The goal is to select features (peaks) of the mass spectra that are useful for classifying the different stages of prostate cancer, using repeated random subsampling cross-validation. Two approaches combined with a support vector machine (SVM) are proposed: the forward minimum-p-value method (with p-values derived from the t test or the Kruskal-Wallis test) and the forward minimum-classification-error method. In addition, the maximum-correlation method and the maximum-R² method are considered for further feature selection. In the comparison, the forward minimum-p-value method based on the Kruskal-Wallis test often outperforms the other methods in classification rate. Moreover, the maximum-correlation method effectively reduces the number of features while preserving the classification rate.
    Relation: Journal of Data Analysis, 6(3), 67-80
    Data type: article
    Appears in collections: [Department of Statistics] Journal Articles
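
    The procedure outlined in the abstract (rank features by their average Kruskal-Wallis p-value over repeated random subsampling splits, then add features in that order and keep the model with the lowest average misclassification rate) can be illustrated with a minimal sketch. It is not the authors' implementation. Several parts are assumptions made for illustration: all function names are hypothetical, the data are synthetic, the p-value uses the chi-square approximation for exactly four classes, forward selection is read as growing a prefix of the ranked list, and a nearest-centroid classifier stands in for the SVM so that the example needs only the Python standard library.

```python
import math
import random

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n, rank_sums, tie_term, i = len(pooled), [0.0] * len(groups), 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        i = j
    h = 12.0 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups)) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def stratified_split(y, train_frac, rng):
    """One random train/test split per class (assumes at least two samples per class)."""
    train, test = [], []
    for c in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == c]
        rng.shuffle(idx)
        cut = max(1, min(len(idx) - 1, round(train_frac * len(idx))))
        train += idx[:cut]
        test += idx[cut:]
    return train, test

def rank_by_mean_p(X, y, n_splits=20, train_frac=0.7, seed=0):
    """Rank features by their p-value averaged over the training parts of the splits."""
    rng = random.Random(seed)
    classes = sorted(set(y))
    if len(classes) != 4:
        raise ValueError("chi2_sf_3df assumes exactly four classes")
    p_sum = [0.0] * len(X[0])
    for _ in range(n_splits):
        train, _ = stratified_split(y, train_frac, rng)
        for j in range(len(p_sum)):
            groups = [[X[i][j] for i in train if y[i] == c] for c in classes]
            p_sum[j] += chi2_sf_3df(kruskal_wallis_h(groups))
    mean_p = [s / n_splits for s in p_sum]
    return sorted(range(len(mean_p)), key=lambda j: mean_p[j]), mean_p

def centroid_error(X, y, features, train, test):
    """Misclassification rate of a nearest-centroid classifier (stand-in for the SVM)."""
    centroids = {}
    for c in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in features]
    wrong = 0
    for i in test:
        x = [X[i][j] for j in features]
        pred = min(centroids, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))
        wrong += pred != y[i]
    return wrong / len(test)

def forward_select(X, y, order, max_features=10, n_splits=20, train_frac=0.7, seed=1):
    """Grow the ranked feature list one feature at a time; keep the lowest mean error."""
    rng = random.Random(seed)
    splits = [stratified_split(y, train_frac, rng) for _ in range(n_splits)]
    best_k, best_err = 0, float("inf")
    for k in range(1, min(max_features, len(order)) + 1):
        err = sum(centroid_error(X, y, order[:k], tr, te) for tr, te in splits) / n_splits
        if err < best_err:
            best_k, best_err = k, err
    return order[:best_k], best_err

def make_toy_data(seed=42, per_class=30, n_features=6):
    """Synthetic data: features 0 and 1 shift with the class label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(4):
        for _ in range(per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 2.0 * c
            row[1] += 1.0 * c
            X.append(row)
            y.append(c)
    return X, y

X, y = make_toy_data()
order, mean_p = rank_by_mean_p(X, y)
selected, err = forward_select(X, y, order)
print("ranking:", order, "selected:", selected, "error: %.3f" % err)
```

    In practice the nearest-centroid stand-in would be replaced by an SVM from a library, and the ranking and the error estimate would be computed on real spectra rather than on the toy data generated here.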

    Files in this item:

    File        Description   Size     Format      Views
    72-83.pdf                 1081 Kb  Adobe PDF   2853


    All items in the NCCU repository are protected by the original copyright.

