National Chengchi University Institutional Repository (NCCUR): Item 140.119/30885
    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/30885


    Title: 應用資料採礦技術於資料庫加值中的抽樣方法
    THE SAMPLING METHODS FOR VALUE-ADDED DATABASE IN DATA-MINING
    Author: 陳惠雯
    Contributors: 鄭宇庭
    謝邦昌
    Keywords: Database
    Data Mining
    Sampling
    Value-added database
    Date: 2003
    Upload date: 2009-09-14
    Abstract: Databases keep growing in today's business environment, and the trend will continue for the foreseeable future. Extracting quality information by reviewing all of the data residing on a corporation's or organization's network (sales figures, manufacturing statistics, financial data, experimental data) is costly, time-consuming, and ineffective. We therefore need a sound and effective method for drawing only a portion of the data that is representative of the population, so that a reliable model can be built from the sampled data. Sometimes, however, the database is of limited size. In that case we apply a relatively new idea: adding attributes or values to the database to enhance the quality of the data. Under such a procedure, a good sampling method is the essential groundwork for the final goal, which is a reliable predictive model. The goal of this research is to obtain an effective and representative value-added sample, by means of a sampling method, for building an accurate predictive model. The concept is straightforward: to get good predictive samples, we need the correct sampling method. The sampling methods under study are simple random sampling, systematic sampling, stratified sampling, and uniform design. The models used are C5.0, logistic regression, and a neural network for a categorical predicted (target) variable, and stepwise regression for a continuous predicted (target) variable. The results are discussed in the conclusion section.

    Keywords: Database, Data Mining, Sampling, Value-added database
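    Three of the sampling methods named in the abstract (simple random, systematic, and stratified sampling) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The record layout, the field names "id" and "segment", and the population sizes are hypothetical and chosen only for illustration. Uniform design is left out because the abstract does not describe how it was constructed.

```python
import random
from collections import defaultdict


def simple_random_sample(records, n, rng):
    """Draw n records without replacement; every subset of size n is equally likely."""
    return rng.sample(records, n)


def systematic_sample(records, n, rng):
    """Pick a random start, then take every k-th record, with k = len(records) // n."""
    k = len(records) // n
    if k == 0:
        raise ValueError("sample size exceeds population size")
    start = rng.randrange(k)
    return [records[start + i * k] for i in range(n)]


def stratified_sample(records, n, key, rng):
    """Proportional allocation: sample each stratum in proportion to its size."""
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for members in strata.values():
        # Rounding means the total can differ slightly from n for uneven strata.
        size = max(1, round(n * len(members) / len(records)))
        sample.extend(rng.sample(members, min(size, len(members))))
    return sample


# Hypothetical population: 1000 records in three segments of 600, 300, and 100.
rng = random.Random(2003)
population = [
    {"id": i, "segment": "A" if i < 600 else "B" if i < 900 else "C"}
    for i in range(1000)
]

srs = simple_random_sample(population, 100, rng)
sys_sample = systematic_sample(population, 100, rng)
strat = stratified_sample(population, 100, lambda r: r["segment"], rng)

for name, sample in [("simple random", srs), ("systematic", sys_sample), ("stratified", strat)]:
    share_c = sum(r["segment"] == "C" for r in sample) / len(sample)
    print(f"{name}: n={len(sample)}, share of segment C={share_c:.2f}")
```

    In this sketch, stratified sampling with proportional allocation reproduces the segment shares of the population by construction, while the simple random sample only matches them on average. That difference is one way to read the abstract's concern with representativeness.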
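    Description: Master's thesis
    National Chengchi University
    Graduate Institute of Statistics
    91354016
    92
    Source: http://thesis.lib.nccu.edu.tw/record/#G0091354016
    Type: thesis
    Appears in collections: [Department of Statistics] Theses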

    Files in this item:

    File        Size  Format  Views
    index.html  0Kb   HTML    2248

