Title: | Analysis of Policy Loans in the Greater Taipei Area Using Data Mining Techniques |
Authors: | 李珮榕 |
Contributors: | 鄭宇庭 李珮榕 |
Keywords: | data mining; policy loan; neural network; CART; C4.5 |
Date: | 2002 |
Issue Date: | 2009-09-14 |
Abstract: |
The model results show which variables influence whether a policy carries a loan. In the sensitivity analysis of the neural network model, the variables with the greatest influence are underwriting (health) class, the insured's occupation class, insurance type, and region. In the CART model, the most influential variables are premium payment frequency, policy year, policy cash value, payment method, and face amount. In the C4.5 model, the most influential variables are the assumed interest rate of the main policy, annualized premium, policy year, and premium payment frequency. For the CART and C4.5 models, the rules with higher accuracy are selected to provide decision guidelines for the insurance company.
In this study, data mining is applied to data from a life insurance company in Taipei. The techniques used are neural networks, CART, and C4.5, all widely used data mining models. To build the samples, we formed groups using different sampling methods, different sample sizes, and different ratios of loaned to un-loaned policies. Additional groups of samples were created according to whether the continuous variables had been transformed. We then applied the three models to each sample combination to see which combination best describes policyholders' borrowing behavior against their policies, and how the sample combination affects the different data mining models.
The results of our study are summarized as follows:
1. The assigned ratio of loaned to un-loaned policies has a great influence on the models. The magnitude of the influence of sampling method and sample size, however, depends largely on the sample combination.
2. Sample combinations with transformed continuous variables significantly affect and improve the results of the neural network model. For the CART model, whether the continuous variables have been transformed makes no significant difference. The effect of transforming the continuous variables on C4.5 is limited.
3. The variables that describe consumers' behavior in taking loans against their insurance policies differ across the three models. |
