    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/77885


    Title: 運用記憶體內運算於智慧型健保院所異常查核之研究
    A Research into In-Memory Computing Techniques for Intelligent Check of Health-Insurance Fraud
    Authors: 湯家哲
    Tang, Jia Jhe
    Contributors: 姜國輝
    Chiang, Johannes K.
    湯家哲
    Tang, Jia Jhe
    Keywords: Illegal Medical Institutions
    In-Memory Computing
    Apache Spark
    Benford's Law
    Machine Learning Algorithms
    Date: 2015
    Issue Date: 2015-08-24 10:10:27 (UTC+8)
    Abstract: The finances of Taiwan's National Health Insurance (NHI) have been poor in recent years; in 2009 the program ran a deficit of NT$58.2 billion. According to data from the National Health Insurance Administration (NHIA), contracted medical institutions have accumulated 13,722 violations to date, and most of the major violations involve fraud.
    The NHI review mechanism relies mainly on computerized random sampling followed by manual investigation. However, random sampling rarely selects the institutions that are actually in violation, so the review is ineffective.
    Benford's Law, also called the First-Digit Law, states that smaller leading digits occur more frequently and larger leading digits occur less frequently. It has been applied in accounting, finance, auditing, and economics. Yang (2012) applied Benford's Law indicators to NHI data and combined them with machine learning algorithms to detect anomalies.
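The first-digit rule described above can be made concrete with a short calculation. The sketch below is a minimal Python illustration, not the thesis's implementation. It uses the standard Benford formula P(d) = log10(1 + 1/d) and, as an assumption, summarizes how far a set of amounts departs from it with the mean absolute deviation between observed and expected first-digit frequencies. The abstract does not say which Benford indicators were actually computed, and the claim amounts are made up.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Return the leading non-zero digit of a non-zero number."""
    x = abs(value)
    if x == 0:
        raise ValueError("zero has no leading digit")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def mean_absolute_deviation(amounts):
    """Average absolute gap between observed and expected first-digit frequencies."""
    values = [a for a in amounts if a != 0]
    if not values:
        raise ValueError("need at least one non-zero amount")
    counts = Counter(first_digit(a) for a in values)
    expected = benford_expected()
    return sum(abs(counts.get(d, 0) / len(values) - p) for d, p in expected.items()) / 9


# Hypothetical claim amounts for one institution (made-up numbers, not NHI data).
claims = [120, 1350, 182, 2400, 310, 1999, 150, 47, 1120, 860]
print(round(benford_expected()[1], 3))  # 0.301
print(mean_absolute_deviation(claims))
```

A larger deviation means the first digits look less like the Benford distribution. Any threshold for flagging an institution would have to be chosen from data and is not given in the abstract.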
    Zaharia et al. (2012) proposed Apache Spark, a fault-tolerant model for in-memory cluster computing. With the same computing nodes and resources, Spark can process data more than 20 times faster than Hadoop MapReduce.
    To make NHI anomaly review more effective, this study applies Benford's Law. Using NHI data released by the National Health Research Institutes, we computed Benford's Law indicators and practical indicators, and then built anomaly-detection models with a support vector machine and logistic regression. Because the NHI data set is very large, we used Apache Spark as the computing environment to shorten computation time, and used Hadoop MapReduce as the benchmark for comparing computing performance.
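The modelling step described above, in which per-institution indicators feed a classifier, can be sketched in a few lines. The example below is a plain-Python stand-in for illustration only: the study ran its models on Apache Spark, and the abstract does not describe the features, library calls, or training settings. The two feature columns (a Benford deviation score and a generic practical indicator), the labels, and the hyperparameters are all hypothetical. Only logistic regression is shown; the support vector machine is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, x):
    """Linear score: bias (weights[0]) plus the weighted sum of the features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], x))


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=5000):
    """Fit weights (bias first) by batch gradient descent on the mean log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = sigmoid(score(weights, x)) - y
            gradient[0] += error
            for j, v in enumerate(x):
                gradient[j + 1] += error * v
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


def predict(weights, x):
    """Return 1 (flag for manual review) when the predicted probability is at least 0.5."""
    return 1 if sigmoid(score(weights, x)) >= 0.5 else 0


# Hypothetical rows: [Benford deviation, practical indicator]; label 1 = known violator.
rows = [[0.02, 0.10], [0.03, 0.20], [0.04, 0.15],
        [0.12, 0.70], [0.15, 0.80], [0.11, 0.90]]
labels = [0, 0, 0, 1, 1, 1]

weights = train_logistic_regression(rows, labels)
print([predict(weights, x) for x in rows])
```

On real data the institutions flagged with 1 would be the candidates passed on for manual investigation. A distributed implementation would replace the inner loop with a cluster-wide aggregation.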
    The results show that our Spark program runs about twice as fast as the MapReduce version. For the classification models, both the support vector machine and logistic regression achieved a sensitivity above 80% on inpatient data. On outpatient data both models were less accurate than on inpatient data, but logistic regression still achieved 75% sensitivity and 73% overall accuracy.
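For readers unfamiliar with the two measures reported above, the sketch below shows how sensitivity and overall accuracy are derived from a confusion matrix. It is an illustration under stated assumptions: the counts are made up and chosen only so that the results match the reported outpatient figures for logistic regression (75% sensitivity, 73% accuracy). The thesis's actual confusion matrix is not given in the abstract.

```python
def confusion_counts(actual, predicted):
    """Count (tp, tn, fp, fn); label 1 means 'violating institution'."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Share of actual violators that the model flags."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Share of all institutions that the model classifies correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts (not from the thesis): 100 violators, 100 compliant institutions.
tp, fn, tn, fp = 75, 25, 71, 29
print(sensitivity(tp, fn))       # 0.75
print(accuracy(tp, tn, fp, fn))  # 0.73
```

Sensitivity matters most for a screening model of this kind, because a missed violator never reaches the manual investigation stage.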
    Using Apache Spark reduced the time needed to process the large volume of NHI data. In addition, the intelligent anomaly-detection model built in this study can identify medical institutions that violate their contracts. Institutions flagged for possible fraud or abuse of the NHI can then be passed on for manual investigation, ultimately making NHI review more effective.
    Reference: Apache Spark, https://spark.apache.org, 2015
    Bhattacharya, Sukanto, Dongming Xu, and Kuldeep Kumar. "An ANN-based auditor decision support system using Benford's law." Decision support systems 50.3 (2011): 576-584.
    Busta, B., & Weinberg, R. "Using Benford’s law and neural networks as a review procedure," Managerial Auditing Journal (13:6) 1998, pp 356-366.
    Carlini, Emanuele, et al. "Balanced Graph Partitioning with Apache Spark." Euro-Par 2014: Parallel Processing Workshops. Springer International Publishing, 2014.
    Carslaw, Charles APN. "Anomalies in income numbers: Evidence of goal oriented behavior." Accounting Review (1988): 321-327.
    Christian, C., and Gupta, S. “New evidence on secondary evasion,” The Journal of the American Taxation Association, 1993, pp 72-92
    Coulouris, George F., Jean Dollimore, and Tim Kindberg. Distributed systems: concepts and design. Pearson Education, 2005.
    Dean, Jeffrey, and Sanjay Ghemawat. "MapReduce: simplified data processing on large clusters." Communications of the ACM 51.1 (2008): 107-113.
    Dimiduk, Nick, et al. HBase in action. Shelter Island: Manning, 2013.
    Glaser, W. A. Paying the doctor: systems of remuneration and their effects. Johns Hopkins Press, Baltimore, 1970.
    Harnie, D., Vapirev, A., Wegner, J. K., Gedich, A., Steijaert, M., & Wuyts, R. (2015). Scaling Machine Learning for Target Prediction in Drug Discovery using Apache Spark. In Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing: Workshop on Clusters, Clouds and Grids for Life Sciences.
    Hill, Theodore P. "A statistical derivation of the significant-digit law." Statistical Science (1995): 354-363.
    Hill, Theodore P. "The First Digit Phenomenon: A century-old observation about an unexpected pattern in many numerical tables applies to the stock market, census statistics and accounting data." American Scientist 86.4 (1998): 358-363.
    Kvam, Paul H., and Brani Vidakovic. Nonparametric statistics with applications to science and engineering. Vol. 653. John Wiley & Sons, 2007.
    Lin, Chieh-Yen, et al. "Large-scale logistic regression and linear support vector machines using Spark." Big Data (Big Data), 2014 IEEE International Conference on. IEEE, 2014.
    Lu, Fletcher, and J. Efrim Boritz. "Detecting fraud in health insurance data: Learning to model incomplete Benford’s law distributions." Machine Learning: ECML 2005. Springer Berlin Heidelberg, 2005. 633-640.
    Lu, Fletcher, J. Efrim Boritz, and Dominic Covvey. "Adaptive fraud detection using Benford’s law." Advances in Artificial Intelligence. Springer Berlin Heidelberg, 2006. 347-358.
    Mell, Peter, and Tim Grance. "The NIST definition of cloud computing." (2011).
    Nigrini, M. J. “A taxpayer compliance application of Benford's Law,” The Journal of the American Taxation Association, 1996, pp 72-91
    Nigrini, M. J. & W. Wood. 1996. Assessing the integrity of tabulated demographic data. Working paper, Saint Mary's University, Halifax, N.S.
    Nigrini, M. J. “Using digital frequencies to detect fraud.” The White Paper (April): 3-6. 1996.
    Nigrini, Mark J., and Linda J. Mittermaier. "The use of Benford's law as an aid in analytical procedures." Auditing: A Journal of Practice & Theory 16.2 (1997): 52.
    Nigrini, M. J. “Digital Analysis Using Benford’s Law.” Global Audit Publications, Vancouver, B.C., Canada, 2000.
    Shvachko, Konstantin, et al. "The hadoop distributed file system." Mass Storage Systems and Technologies (MSST), 2010 IEEE 26th Symposium on. IEEE, 2010.
    Solaimani, Mohiuddin, et al. "Statistical technique for online anomaly detection using Spark over heterogeneous data from multi-source VMware performance data." Big Data (Big Data), 2014 IEEE International Conference on. IEEE, 2014.
    Sparrow, Malcolm K. Fraud Control in the Health Care Industry: Assessing the State of the Art. US Department of Justice, Office of Justice Programs, National Institute of Justice, 1998.
    Thomas, Kurt, et al. "Design and evaluation of a real-time url spam filtering service." Security and Privacy (SP), 2011 IEEE Symposium on. IEEE, 2011.
    White, Tom. Hadoop: The definitive guide. O'Reilly Media, Inc., 2012.
    Wikipedia "Support vector machine," 2015, http://en.wikipedia.org/wiki/Support_vector_machine
    Wikipedia "Distributed computing," 2015, http://en.wikipedia.org/wiki/Distributed_computing
    Zaharia, Matei, et al. "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing." Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association, 2012
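    中央健康保險署 [National Health Insurance Administration] "衛生福利部中央健康保險署業務執行報告 [Operations Report of the National Health Insurance Administration, Ministry of Health and Welfare]," 2015, http://www.nhi.gov.tw/webdata/webdata.aspx?menu=17&menu_id=1023&WD_ID=1023&webdata_id=4719
    中央健康保險署 [National Health Insurance Administration] "全民健康保險特約醫事服務機構查處統計表 [Statistics on Investigations and Sanctions of NHI-Contracted Medical Institutions]," 2015, http://www.nhi.gov.tw/webdata/webdata.aspx?menu=17&menu_id=1023&WD_ID=1023&webdata_id=2401
    中央健康保險署 [National Health Insurance Administration] "醫療費用執行報告 [Medical Expenditure Implementation Report]," 2015, http://www.nhi.gov.tw/webdata/webdata.aspx?menu=17&menu_id=1023&WD_ID=1023&webdata_id=3601
    中央健康保險署 [National Health Insurance Administration] "重要統計資料 [Key Statistics]," 2015, http://www.nhi.gov.tw/webdata/webdata.aspx?menu=17&menu_id=1023&WD_ID=1043&webdata_id=805
    中央健康保險署 [National Health Insurance Administration] "全民健康保險統計 [National Health Insurance Statistics]," 2015, http://www.nhi.gov.tw/webdata/webdata.aspx?menu=17&menu_id=1023&WD_ID=1043&webdata_id=3351
    郭芷余, 黃馨儀 & 洪振生 "馬光中醫詐健保數百萬 [Ma Kuang Chinese Medicine Clinic Defrauds NHI of Millions]," 2014, http://www.appledaily.com.tw/appledaily/article/headline/20140719/35968249/
    全民健康保險醫療費用協定委員會 [NHI Medical Expenditure Negotiation Committee] "全民健康保險醫療費用總額支付制度 [The NHI Global Budget Payment System]," 2005, http://www.nhi.gov.tw/Resource/webdata/Attach_13636_2_8.2:總額QA手冊第六版含94年.pdf
    章殷超 "全民健康保險醫療服務審查問題之探討 [A Discussion of Problems in NHI Medical Service Review]," 臺灣醫學 [Formosan Journal of Medicine] (7:1) 2003, pp 104-114.
    湯玲郎, & 林信忠 "資料萃取法在健保費用稽核之研究 [A Study of Data Extraction Methods for Auditing Health-Insurance Claims]," 醫療資訊雜誌 [Journal of Medical Informatics] (11) 2000, pp 85-104.
    黃煌雄, 沈美真, & 劉興善 "我國全民健康保險總體檢 [A Comprehensive Review of Taiwan's National Health Insurance]," 監察院 [Control Yuan], 2011.
    中央健康保險署 [National Health Insurance Administration] "2014-2015 全民健康保險年報 [2014-2015 National Health Insurance Annual Report]," 2015
    楊喻翔 [Yang] "運用Benford定律的智慧型健保費用異常偵測模型之研究 [A Study of an Intelligent Model for Detecting Anomalous Health-Insurance Claims Using Benford's Law]," 國立政治大學資訊管理系博士論文 [doctoral dissertation, Department of Management Information Systems, National Chengchi University], 2012.
    趙孟捷 "健保五大花招及最新違規名單 [Five Major NHI Scams and the Latest List of Violators]," 2014, http://www.thrf.org.tw/Page_Show.asp?Page_ID=1937
    蔡明樺 "太扯自診做大腸鏡醫亂掰詐健保 [Outrageous: Doctor Claims Self-Performed Colonoscopy in Fabricated NHI Claims]," 2014, http://www.appledaily.com.tw/appledaily/article/headline/20140603/35868423/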
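    Description: Master's thesis
    National Chengchi University
    Graduate Institute of Management Information Systems
    102356041
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0102356041
    Data Type: thesis
    Appears in Collections: [Department of Management Information Systems] Theses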
