    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/159044


    Title: 應用地理加權隨機森林與SHAP探討房價之空間異質性
    Examining Spatial Heterogeneity of Housing Prices Using Geographically Weighted Random Forest and SHAP
    Author: 陳宏丞
    Chen, Hong-Cheng
    Contributors: 陳怡如
    Chen, Yi-Ju
    陳宏丞
    Chen, Hong-Cheng
    Keywords: Actual price registration
    COVID-19 epidemic
    Geographically weighted regression
    Random forest
    SHAP
    Date: 2025
    Uploaded: 2025-09-01 14:50:40 (UTC+8)
    Abstract: Early research on Taiwan's real estate market focused predominantly on macroeconomic and policy-level factors such as interest rates and tax systems. With the widespread availability of actual price registration data, the research perspective has shifted to a micro-level examination of individual property characteristics such as building age and floor area. However, traditional econometric models often overlook the mutual influence of nearby locations, that is, spatial autocorrelation, which leads to biased estimates; they also fail to capture how the influence of a variable changes across geographical locations. To address this, the Geographically Weighted Regression (GWR) model has been widely adopted to capture such spatial heterogeneity. In parallel, machine learning methods have been used extensively in housing price prediction for their strong predictive capability, yet they face a dual limitation: most are "black-box" models that are difficult to interpret, and they likewise fail to adequately account for spatial characteristics. These deficiencies in existing models have spurred a new research direction that integrates the advantages of both spatial analysis and machine learning.
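
The idea behind GWR described above can be sketched as a weighted least-squares fit repeated at each location, with weights that decay with distance. The following is a minimal illustration, not the thesis's implementation: the Gaussian kernel, the single predictor, the function names, and the toy data are all assumptions made for the example.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: nearby observations get weights close to 1."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares of y on x (with intercept), centred on target."""
    w = [gaussian_weight(math.dist(c, target), bandwidth) for c in coords]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Hypothetical toy data: two districts with different price-per-area slopes.
coords = [(0, 0), (0.5, 0), (0, 0.5), (0.5, 0.5),
          (10, 0), (10.5, 0), (10, 0.5), (10.5, 0.5)]
area = [20, 30, 40, 50, 20, 30, 40, 50]
price = [1 + 2 * a for a in area[:4]] + [1 + 5 * a for a in area[4:]]

_, west_slope = gwr_local_fit(coords, area, price, (0.25, 0.25), bandwidth=1.0)
_, east_slope = gwr_local_fit(coords, area, price, (10.25, 0.25), bandwidth=1.0)
_, global_slope = gwr_local_fit(coords, area, price, (5.0, 0.25), bandwidth=1e6)
```

With a narrow bandwidth the local slopes recover the two district-specific values in this toy data (2 and 5), while a very wide bandwidth approaches the single pooled slope of a global regression (3.5), which describes neither district well.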

    The primary objective of this thesis is to apply a cutting-edge model, the Geographically Random Forest (GRF), which can simultaneously explore spatial heterogeneity and capture non-linear information. This study conducts an empirical analysis using actual price registration data for residential transactions in New Taipei City from January to June 2022. The SHAP (SHapley Additive exPlanations) method is introduced to concretely quantify and visualize the contribution of each variable to the housing price prediction results at different geographical locations, thereby exploring the potential variations in the determinants of housing prices across different districts of New Taipei City. Furthermore, due to the substantial computational load of the GRF model, this research proposes two two-stage hybrid models with lower computational costs and verifies whether they can serve as more cost-effective alternatives while maintaining predictive performance.
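
The local-model idea behind GRF can be sketched as follows: for each target location, a separate forest is trained only on the nearest observations, so the learned relationship can differ from place to place. This is a simplified illustration under stated assumptions, not the thesis's model: depth-one trees stand in for full regression trees, the neighbourhood is the k nearest points, and all function names and data are hypothetical.

```python
import math
import random

def fit_stump(rows, targets, feature):
    """Fit a depth-one regression tree on one feature (least-squares split)."""
    values = sorted({r[feature] for r in rows})
    mean = sum(targets) / len(targets)
    best = (math.inf, math.inf, mean, mean)  # (sse, threshold, left mean, right mean)
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= thr]
        right = [t for r, t in zip(rows, targets) if r[feature] > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if sse < best[0]:
            best = (sse, thr, lm, rm)
    return feature, best[1], best[2], best[3]

def fit_forest(rows, targets, n_trees, rng):
    """Bagged stumps, each on a bootstrap sample and one randomly chosen feature."""
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx],
                                rng.randrange(p)))
    return forest

def predict_forest(forest, row):
    """Average the stump predictions for one feature row."""
    preds = [lm if row[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(preds) / len(preds)

def grf_predict(coords, rows, targets, target_coord, target_row, k,
                n_trees=50, seed=0):
    """Train a forest on the k observations nearest to target_coord, then predict."""
    order = sorted(range(len(coords)),
                   key=lambda i: math.dist(coords[i], target_coord))
    local = order[:k]
    forest = fit_forest([rows[i] for i in local], [targets[i] for i in local],
                        n_trees, random.Random(seed))
    return predict_forest(forest, target_row)

# Hypothetical toy data: features are [floor area, building age].
features = [[20, 30], [25, 10], [30, 25], [35, 5], [40, 20], [45, 15]]
west_xy = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (1, 0.5)]
east_xy = [(10, 0), (10, 1), (11, 0), (11, 1), (10.5, 0.5), (11, 0.5)]
west_price = [200, 280, 290, 380, 400, 460]
east_price = [500, 640, 700, 850, 900, 1000]

coords = west_xy + east_xy
rows = features + features
prices = west_price + east_price

west_pred = grf_predict(coords, rows, prices, (0.5, 0.4), [30, 20], k=6)
east_pred = grf_predict(coords, rows, prices, (10.5, 0.4), [30, 20], k=6)
```

The same dwelling (floor area 30, age 20) receives a different prediction in each district because each local forest only sees its own neighbourhood. Fitting one forest per location is also what makes the approach computationally heavy, which is the motivation the abstract gives for cheaper two-stage alternatives.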

    The results indicate the presence of significant spatial heterogeneity in the factors influencing residential housing prices in New Taipei City, revealing that traditional global models are insufficient for capturing regional differences. Among the important factors affecting housing prices, floor area has the strongest positive impact, while building age is the primary negative factor. The extent of the influence of both of these variables differs by location; for instance, the contribution of floor area is higher in the Zhonghe and Yonghe districts, whereas the negative effect of building age is more pronounced in mature urban areas. Proximity to an MRT station significantly increases housing prices, but this benefit is highly localized, primarily concentrated along the MRT lines. Additionally, through cross-validation and model comparison, the results show that the Geographically Random Forest (GRF) model effectively handles spatial variation and demonstrates superior predictive accuracy. Meanwhile, the RF-to-GWR two-stage model offers a viable alternative that significantly reduces computational costs while achieving a predictive performance close to that of the GRF model.
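
The per-variable contributions discussed above are SHAP values, which build on Shapley values from cooperative game theory. As a minimal sketch of that underlying calculation, the block below computes exact Shapley attributions by enumerating every feature coalition. The toy price function, its coefficients, and the baseline are hypothetical; practical SHAP implementations rely on faster model-specific or approximate algorithms rather than brute force.

```python
from itertools import combinations
from math import factorial

def toy_price(z):
    """Hypothetical price function of (floor area, building age, near MRT)."""
    area, age, near_mrt = z
    return 0.8 * area - 0.3 * age + 5.0 * near_mrt + 0.1 * area * near_mrt

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions of predict(x) relative to predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features outside the coalition are set to their baseline value.
        return predict([x[j] if j in subset else baseline[j] for j in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                with_i = value(set(coalition) | {i})
                without_i = value(set(coalition))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi

x = [30.0, 10.0, 1.0]         # dwelling being explained
baseline = [20.0, 20.0, 0.0]  # reference dwelling
phi = shapley_values(toy_price, x, baseline)
```

The attributions always sum to the difference between the prediction for `x` and the prediction for the baseline, which is what allows a predicted price to be decomposed into per-variable contributions. Computing such attributions for observations at different locations and mapping them is one way to visualize how a variable's contribution varies over space.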
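    Description: Master's degree
    National Chengchi University
    Department of Statistics
    112354030
    Source: http://thesis.lib.nccu.edu.tw/record/#G0112354030
    Type: thesis
    Appears in collections: [Department of Statistics] Theses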

    Files in this item:

    File          Size      Format       Views
    403001.pdf    7426Kb    Adobe PDF    0

