National Chengchi University Institutional Repository (NCCUR): Item 140.119/131472
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/131472


    Title: Variable Selection of High Dimension Varying Coefficient Model Under B-Spline Approximation (多維度變異係數模型-基於B-Spline 近似之選模)
    Authors: Yang, Po-An (楊博安)
    Contributors: Huang, Tzee-Ming (黃子銘)
    Yang, Po-An (楊博安)
    Keywords: Varying coefficient model
    B-spline
    Forward selection
    Group lasso
    Date: 2020
    Issue Date: 2020-09-02 11:42:02 (UTC+8)
    Abstract: The varying coefficient model is a nonlinear regression model with applications in many fields. Compared with the linear model, its defining feature is that the coefficients are allowed to vary systematically and smoothly with an effect-modifying variable, while the model remains easy to interpret. In the era of big data, data collection has become relatively easy. When the number of candidate variables is very large but only a few of them truly contribute, selecting the relevant variables is both important and challenging. Existing work mostly follows two approaches: forward selection methods and regularization methods. In this thesis, we use simulation experiments to compare groupwise forward selection and the group lasso under different settings, and we make two suggestions. First, to keep forward selection from stopping too early, we suggest continuing the group selection procedure for several more steps after the BIC stops improving. Second, under some conditions the group lasso tends to select too many irrelevant variables or too few true variables, so we suggest applying groupwise backward selection after running the group lasso with several different penalty terms in order to determine the final model.
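The first suggestion in the abstract (keep selecting for a few steps after the BIC stops improving) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the function names (`bspline_basis`, `bic`, `groupwise_forward`, `simulate`), the BIC form n*log(RSS/n) + k*log(n), the clamped cubic basis with three interior knots, and the simulated data. Each candidate variable x_j contributes one group of columns x_j * B_k(u), so a whole coefficient function enters or leaves the model at once.

```python
import numpy as np


def bspline_basis(u, n_interior=3, degree=3):
    """Evaluate a clamped B-spline basis on [0, 1] via the Cox-de Boor recursion."""
    u = np.asarray(u, dtype=float)
    interior = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    # Degree-0 basis: indicators of the half-open knot intervals.
    B = np.zeros((u.size, knots.size - 1))
    for k in range(knots.size - 1):
        B[:, k] = (knots[k] <= u) & (u < knots[k + 1])
    B[u == 1.0, degree + n_interior] = 1.0  # close the last interval at u = 1
    for d in range(1, degree + 1):
        B_next = np.zeros((u.size, knots.size - 1 - d))
        for k in range(knots.size - 1 - d):
            left = knots[k + d] - knots[k]
            right = knots[k + d + 1] - knots[k + 1]
            if left > 0:
                B_next[:, k] += (u - knots[k]) / left * B[:, k]
            if right > 0:
                B_next[:, k] += (knots[k + d + 1] - u) / right * B[:, k + 1]
        B = B_next
    return B  # shape (n, n_interior + degree + 1)


def bic(y, design):
    """Assumed BIC of a least-squares fit: n*log(RSS/n) + (#coefficients)*log(n)."""
    n = y.size
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = np.sum((y - design @ coef) ** 2)
    return n * np.log(rss / n) + design.shape[1] * np.log(n)


def groupwise_forward(y, X, B, extra_steps=2):
    """Forward selection over groups; continue extra_steps past the first BIC miss."""
    n, p = X.shape
    groups = [X[:, [j]] * B for j in range(p)]  # group j holds x_j * B_k(u)
    max_groups = min(p, (n - 1) // B.shape[1])  # keep the fit overdetermined
    selected, best_set = [], []
    best_bic = n * np.log(np.mean(y ** 2))  # BIC of the empty model
    misses = 0  # consecutive steps without a BIC improvement
    while len(selected) < max_groups and misses <= extra_steps:
        scores = {
            j: bic(y, np.hstack([groups[k] for k in selected + [j]]))
            for j in range(p) if j not in selected
        }
        j_best = min(scores, key=scores.get)
        selected.append(j_best)
        if scores[j_best] < best_bic:
            best_bic, best_set, misses = scores[j_best], list(selected), 0
        else:
            misses += 1
    return sorted(best_set)


def simulate(seed=0, n=200, p=10, noise_sd=0.5):
    """Hypothetical data: only x_0 and x_3 have nonzero coefficient functions."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    X = rng.standard_normal((n, p))
    y = (2 * np.sin(2 * np.pi * u) * X[:, 0] + (1 + u ** 2) * X[:, 3]
         + noise_sd * rng.standard_normal(n))
    return y, X, u


if __name__ == "__main__":
    y, X, u = simulate()
    print(groupwise_forward(y, X, bspline_basis(u), extra_steps=2))
```

Setting `extra_steps=0` gives ordinary forward selection that stops at the first step where the BIC fails to improve. A positive value lets the search look past a temporary plateau, while the returned set is still the one with the lowest BIC seen along the path.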
    Description: Master's thesis
    National Chengchi University
    Department of Statistics
    107354003
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0107354003
    Data Type: thesis
    DOI: 10.6814/NCCU202001217
    Appears in Collections: [Department of Statistics] Theses

    Files in This Item:

    File: 400301.pdf
    Size: 433Kb
    Format: Adobe PDF


    All items in the NCCU Institutional Repository are protected by copyright, with all rights reserved.

