Title: | 比較交叉適配與p值合併對於特徵重要性檢定之影響 Comparing Cross-Fitting and P-Value Combination Methods in Testing Feature Importances |
Authors: | 顏立平 Yen, Li-Ping |
Contributors: | 黃柏僩 Huang, Po-Hsien 顏立平 Yen, Li-Ping |
Keywords: | machine learning; feature importance; feature importance tests |
Date: | 2025 |
Issue Date: | 2025-02-04 16:13:39 (UTC+8) |
Abstract: | Models built with machine learning (ML) algorithms have long been considered difficult to interpret. With the development of interpretable ML, researchers can now use feature importance tests, such as the residual permutation test (RPT), conditional predictive impact (CPI), and leave-one-covariate-out (LOCO), to identify which features have statistically significant predictive power. Traditional feature importance tests rely on data splitting: the data are divided into a training set, used to fit the prediction function, and a test set, used to carry out the test. Splitting reduces the sample size available for testing and therefore costs statistical power. It also lets researchers pick a favorable result from among several splits, a form of data snooping that inflates the type I error rate. To address the problems of a single split, researchers can split the data repeatedly to obtain multiple sets of results and then integrate them with p-value combination or cross-fitting. This study uses simulation experiments to evaluate the empirical performance of several p-value combination methods, each with and without cross-fitting, when applied to RPT, CPI, and LOCO. The simulations show that data snooping does inflate the type I error rate, whereas every strategy combination keeps the type I error rate below the significance level (α = 0.05), with one exception: RPT paired with the Cauchy method shows an inflated type I error rate. In terms of power, two strategies, the Bonferroni method with cross-fitting and the Cauchy method used alone, show relatively better power and outperform a single data split. The remaining p-value combination methods control the type I error rate but show lower power than a single data split. |
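The abstract names two p-value combination rules without stating them. The sketch below shows one common form of each and is not taken from the thesis. It assumes that "Bonferroni" means multiplying the smallest p-value by the number of splits, and that "Cauchy" means the equal-weight Cauchy combination rule. The function names and the example p-values are hypothetical.

```python
import math


def bonferroni_combine(p_values):
    """Bonferroni combination: smallest p-value times the number of splits, capped at 1."""
    k = len(p_values)
    return min(1.0, k * min(p_values))


def cauchy_combine(p_values):
    """Equal-weight Cauchy combination (assumed form, not confirmed by the abstract).

    Each p-value is mapped to a standard Cauchy variate, the variates are
    averaged, and the average is mapped back to a p-value.
    Inputs are assumed to lie strictly between 0 and 1.
    """
    k = len(p_values)
    statistic = sum(math.tan((0.5 - p) * math.pi) for p in p_values) / k
    return 0.5 - math.atan(statistic) / math.pi


# Made-up p-values from five repeated data splits, for illustration only.
split_p_values = [0.04, 0.20, 0.01, 0.35, 0.08]

combined_bonferroni = bonferroni_combine(split_p_values)
combined_cauchy = cauchy_combine(split_p_values)
print(combined_bonferroni, combined_cauchy)
```

Both functions take the p-values from the individual splits and return a single p-value to compare against the significance level. Under these assumed definitions, the Bonferroni rule depends only on the smallest p-value, while the Cauchy rule uses all of them.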
Description: | Master's thesis; National Chengchi University (國立政治大學); Department of Psychology; 111752001 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0111752001 |
Data Type: | thesis |
