National Chengchi University Institutional Repository (NCCUR): Item 140.119/30885
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/30885


    Title: 應用資料採礦技術於資料庫加值中的抽樣方法
    THE SAMPLING METHODS FOR VALUE-ADDED DATABASE IN DATA-MINING
    Authors: 陳惠雯
    Contributors: 鄭宇庭
    謝邦昌



    Keywords: Database
    Data Mining
    Sampling
    Value-added database
    Date: 2003
    Issue Date: 2009-09-14
    Abstract: Databases keep growing in today's business environment, and that trend is expected to continue. Reviewing all of the data held on a corporation's or organization's network, such as sales figures, manufacturing statistics, financial data, and experimental data, in order to extract quality information is costly, time-consuming, and ineffective. We therefore need a sound and effective method for obtaining only a portion of the data that is representative of the population and that allows us to build a reliable model from the sampled data. Sometimes, however, the database is limited in size. In that case we adopt a relatively new idea: adding attributes or values to the database to enhance the quality of the data. Under such a procedure, a good sampling method is important groundwork for the final goal of obtaining a reliable predictive model. The goal of this research is to obtain an effective and representative value-added sample by means of a sampling method, in order to build an accurate predictive model. The underlying idea is straightforward: good predictive samples require the correct sampling method. The sampling methods under study are simple random sampling, systematic sampling, stratified sampling, and uniform design. The models used are C5.0, logistic regression, and a neural network when the predicted variable is categorical, and stepwise regression when the predicted variable is continuous. The results are discussed in the conclusion section.
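
    As a rough illustration of three of the sampling methods named in the abstract, the sketch below draws a simple random sample, a systematic sample, and a proportionally allocated stratified sample from a list of records. It is not the thesis's implementation: the function names, the `segment` field, the fixed seeds, and the toy data are all assumptions made for this example, and uniform design is not covered.

```python
import random
from collections import defaultdict


def simple_random_sample(records, n, seed=0):
    """Draw n records without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(records, n)


def systematic_sample(records, n, seed=0):
    """Pick a random start in the first interval, then take every k-th record."""
    k = len(records) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)
    return [records[start + i * k] for i in range(n)]


def stratified_sample(records, n, key, seed=0):
    """Sample each stratum in proportion to its share of the population.

    Quotas are rounded, so the total can differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    total = len(records)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        quota = round(n * len(members) / total)
        quota = min(max(1, quota), len(members))  # at least one, at most all
        sample.extend(rng.sample(members, quota))
    return sample


# Hypothetical toy population: 100 records, 80 in segment "A" and 20 in "B".
records = [{"id": i, "segment": "A" if i < 80 else "B"} for i in range(100)]

srs = simple_random_sample(records, 10)
systematic = systematic_sample(records, 10)
stratified = stratified_sample(records, 10, key=lambda r: r["segment"])
```

    With this toy population, the stratified sample keeps the 80/20 split between segments, whereas the simple random sample only matches it on average. That difference in representativeness is the kind of property the abstract says matters for building a reliable model from sampled data.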

    Description: Master's
    National Chengchi University
    Graduate Institute of Statistics
    91354016
    92
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0091354016
    Data Type: thesis
    Appears in Collections:[Department of Statistics] Theses



    All items in the NCCU Institutional Repository are protected by copyright, with all rights reserved.

