    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/30954


    Title: 兩階段特徵選取法在蛋白質質譜儀資料之應用
    A Two-Stage Approach of Feature Selection on Proteomic Spectra Data
    Authors: Wang, Chien-yuan (王健源)
    Contributors: 張源俊
    郭訓志
    Keywords: Feature Selection (特徵選取)
    Genetic Algorithm (GA) (基因演算法)
    SELDI (表面增強雷射脫附游離/飛行時間質譜)
    Support Vector Machines (SVM) (支援向量機)
    Date: 2005
    Issue Date: 2009-09-14
    Abstract: Early detection and diagnosis can effectively reduce cancer mortality, so discovering biomarkers that support early detection and diagnosis is an important task. In this study, a real proteomic spectra data set from prostate cancer patients and normal individuals was analyzed. The data were collected from a Surface-Enhanced Laser Desorption/Ionization Time-Of-Flight Mass Spectrometry (SELDI-TOF MS) protein chip experiment. SELDI-TOF MS technology captures the protein features of a biological sample. Without suitable pre-processing steps to remove experimental noise, a single mass spectrum may contain hundreds or even thousands of peaks. To narrow down the search for possible protein biomarkers, only those features that can distinguish cancer patients from normal individuals are selected.
    A Genetic Algorithm (GA) is a global optimization procedure modeled on the genetic evolution of biological organisms, and it has been shown to be effective in searching complex high-dimensional spaces. In this study, we consider a GA-like algorithm (GAL) for feature selection on proteomic spectra data, with the goal of distinguishing prostate cancer patients from normal individuals. In addition, we propose two types of two-stage GAL (TSGAL) that attempt to address the shortcomings of GAL.
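The abstract describes GA-based feature selection only at a high level, so the following is a minimal, generic sketch of the idea rather than the thesis's GAL or TSGAL, whose operators, fitness function, and two stages are not given in this record. Everything here is an assumption made for illustration: the synthetic data generator, the leave-one-out nearest-centroid classifier used as the fitness function (the keywords mention SVM, which is not used so the example needs only the Python standard library), the size penalty, and all function and parameter names.

```python
import random


def make_synthetic_spectra(n_per_class=30, n_features=40,
                           informative=(3, 17, 25), shift=2.0, seed=0):
    """Toy stand-in for peak intensities; only `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def centroid_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on the selected
    columns. Assumes labels are 0/1 and each class has at least two samples."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
    counts = {0: 0, 1: 0}
    for row, label in zip(X, y):
        counts[label] += 1
        for k, j in enumerate(cols):
            sums[label][k] += row[j]
    correct = 0
    for row, label in zip(X, y):
        best_class, best_dist = None, float("inf")
        for c in (0, 1):
            own = c == label  # exclude the held-out sample from its own centroid
            n = counts[c] - (1 if own else 0)
            dist = 0.0
            for k, j in enumerate(cols):
                total = sums[c][k] - (row[j] if own else 0.0)
                dist += (row[j] - total / n) ** 2
            if dist < best_dist:
                best_class, best_dist = c, dist
        correct += best_class == label
    return correct / len(y)


def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small penalty per selected feature (an assumed trade-off)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)


def ga_select(X, y, pop_size=30, generations=40, p_mut=0.02, seed=1):
    """Generic GA over 0/1 feature masks: tournament selection, one-point
    crossover, bit-flip mutation, and elitism. Returns (best_mask, history)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[1 if rng.random() < 0.1 else 0 for _ in range(n)]
           for _ in range(pop_size)]
    history = []
    for gen in range(generations + 1):
        scored = [(fitness(X, y, m), m) for m in pop]
        best_fit, best = max(scored, key=lambda t: t[0])
        history.append(best_fit)
        if gen == generations:
            break

        def tournament():
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        new_pop = [best[:]]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            new_pop.append(child)
        pop = new_pop
    return best, history


if __name__ == "__main__":
    X, y = make_synthetic_spectra()
    mask, history = ga_select(X, y)
    print("selected columns:", [j for j, bit in enumerate(mask) if bit])
    print("leave-one-out accuracy:", centroid_accuracy(X, y, mask))
```

Because the best mask is copied unchanged into each new generation, the best fitness recorded in `history` can never decrease. The size penalty is one simple way to favor small feature subsets, which matters when the selected peaks are meant to be candidate biomarkers.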
    Reference: Alpaydin, E. (2004). Introduction to Machine Learning. The MIT Press.
    Adam, B.L., Qu, Y., Davis, J.W., Ward, M.D., Clements, M.A., Cazares, L.H., Semmes, O.J., Schellhammer, P.F., Yasui, Y., Feng, Z., and Wright, G.L. (2002). Serum Protein Fingerprinting Coupled with a Pattern-Matching Algorithm Distinguishes Prostate Cancer from Benign Prostate Hyperplasia and Healthy Men. Cancer Research 62(13), 3609-3614.
    Baggerly, K.A., Morris, J.S., and Coombes, K.R. (2004). Reproducibility of SELDI-TOF Protein Patterns in Serum: Comparing Data Sets from Different Experiments. Bioinformatics 20(5), 777-785.
    Breiman, L., Friedman, J.H., Olshen, R.A., and Stone, C.J. (1984). Classification and Regression Trees. Belmont, CA: Wadsworth International Group, 203-215.
    Crosby, J.L. (1967). Computers in the Study of Evolution. Science Progress 55, 279-292.
    Fraser, A.S. (1957). Simulation of Genetic Systems by Automatic Digital Computers. I. Introduction. Australian Journal of Biological Sciences 10, 484-491.
    Fogel, D.B. (1998). Evolutionary Computation: The Fossil Record. New York: IEEE Press.
    Freund, Y., and Schapire, R. (1997). A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. Journal of Computer and System Sciences 55, 119-139.
    Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, New York.
    Gretzer, M.B., Chan, D.W., van Rootselaar, C.L., Rosenzweig, J.M., Dalrymple, S., Mangold, L.A., Partin, A.W., and Veltri, R.W. (2004). Proteomic Analysis of Dunning Prostate Cancer Cell Lines With Variable Metastatic Potential Using SELDI-TOF. Prostate 60, 325-331.
    Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, MI.
    Honkela, T. (1998). Self-Organizing Maps in Natural Language Processing. Helsinki University of Technology, Neural Networks Research Centre.
    Hutchens, T.W., and Yip, T.T. (1993). New Desorption Strategies for the Mass Spectrometric Analysis of Macromolecules. Rapid Communications in Mass Spectrometry 7, 576-580.
    Lilien, R., Farid, H., and Donald, B. (2003). Probabilistic Disease Classification of Expression-Dependent Proteomic Data from Mass Spectrometry of Human Serum. Journal of Computational Biology 10(6), 925-946.
    Petricoin, E.F., Ardekani, A.M., Hitt, B.A., Levine, P.J., Fusaro, V.A., Steinberg, S.M., Mills, G.B., Simone, C., Fishman, D.A., Kohn, E.C., and Liotta, L.A. (2002a). Use of Proteomic Patterns in Serum to Identify Ovarian Cancer. Lancet 359(9306), 572-577.
    Petricoin, E.F., Ornstein, D.K., Paweletz, C.P., Ardekani, A., Hackett, P.S., Velassco, A., Trucco, C., Wiegand, L., Wood, K., Simone, C.B., Levine, P.J., Linehan, W.M., Emmert-Buck, M.R., Steinberg, S.M., Kohn, E.C., and Liotta, L.A. (2002b). Serum Proteomic Patterns for Detection of Prostate Cancer. Journal of the National Cancer Institute 94(20), 1576-1578.
    Peng, S., Xu, Q., Ling, X.B., Peng, X., Du, W., and Chen, L. (2003). Molecular Classification of Cancer Types from Microarray Data Using the Combination of Genetic Algorithms and Support Vector Machines. FEBS Letters 555, 358-362.
    Qu, Y., Adam, B.L., Yasui, Y., Ward, M.D., Cazares, L.H., Schellhammer, P.F., Feng, Z., Semmes, O.J., and Wright, G.L. (2002). Boosted Decision Tree Analysis of Surface-Enhanced Laser Desorption/Ionization Mass Spectral Serum Profiles Discriminates Prostate Cancer from Noncancer Patients. Clinical Chemistry 48(10), 1835-1843.
    Tuszynski, J. (2006). Processing & Classification of Protein Mass Spectra (SELDI) Data. The caMassClass Package of R software.
    Tong, W., Xie, Q., Hong, H., Fang, H., Shi, L., Perkins, R., and Petricoin, E.F. (2004). Using Decision Forest to Classify Prostate Cancer Samples on the Basis of SELDI-TOF MS Data: Assessing Chance Correlation and Prediction Confidence. Environmental Health Perspectives 112(16),
    Description: Master's thesis
    National Chengchi University (國立政治大學)
    Graduate Institute of Statistics (統計研究所)
    93354025
    94
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0933540252
    Data Type: thesis
    Appears in Collections: [Department of Statistics] Theses and Dissertations


