|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/142099
|
Title: | 聯邦學習與區塊鏈隱私保護在信用風險預測中的應用 Application of Federated Learning and Blockchain Privacy Protection in Credit Risk Prediction |
Authors: | 林和勲 Lin, Ho-Hsun |
Contributors: | 陳恭 林和勲 Lin, Ho-Hsun |
Keywords: | 區塊鏈 機器學習 聯邦學習 分散式身分識別 可驗證憑證 Blockchain Machine learning Federated learning Decentralized identifiers Verifiable credentials |
Date: | 2022 |
Issue Date: | 2022-10-05 09:09:03 (UTC+8) |
Abstract: | Federated learning is a decentralized approach to machine learning in which participants who hold datasets train models locally and share only model training parameters, which addresses both insufficient training data and data privacy concerns. Blockchain is a decentralized, distributed database technology that enables value transfer; through a decentralized identity mechanism, it allows participants to protect their privacy and retain autonomy over their data. As global market supply shifts, financial institutions are accelerating the digital transformation of banking, strengthening the integration of cross-unit, multi-source heterogeneous data, and reducing data silos within existing organizations. However, given domestic personal data protection law and the emphasis that supervisory agencies in various countries place on privacy protection, how to use data properly while remaining compliant and secure has become a key factor in the adoption of emerging technologies.
Banks are relatively conservative financial institutions, and the datasets they hold are sensitive. They cannot readily use these data for data mining unless the legality, security, and compliance of that use are guaranteed. To carry out know-your-customer (KYC) checks, customer due diligence (CDD), and anti-money laundering (AML) more accurately, banks need large volumes of external, multi-dimensional "open data" to optimize their models for goals such as risk warning and customer management. In many cases, financial institutions have only the credit data of the Joint Credit Information Center (JCIC). Its sources include banks and the government, including the financing service platform of the Small and Medium Enterprise Administration (SMEA) of the Ministry of Economic Affairs and the information center of the Ministry of Finance; the data are mainly credit-granting records, including credit card data and customers' personal data. Through open banking and APIs, combined with data shared by third-party service providers, more diverse value-added financial services can be provided.
This thesis presents a proof of concept (PoC) set in a corporate credit-granting scenario. It uses the blockchain framework Hyperledger Aries and OpenMined, an open-source project for privacy-preserving federated learning (FL), and builds on the emerging Decentralized Identifier (DID) so that participating organizations can mutually verify digital identity proofs and credentials issued by supervisory agencies. The proposed decentralized identity verification mechanism can be applied to supervise any workflow, data collection, and model training, and is not limited to financial credit. |
Reference: | Abramson, W.; Hall, A.J.; Papadopoulos, P.; Pitropakis, N.; Buchanan, W.J. (2020). A Distributed Trust Framework for Privacy-Preserving Machine Learning. In Trust, Privacy and Security in Digital Business, 205–220.
Abramson, W.; van Deursen, N.E.; Buchanan, W.J. (2020). Trust-by-Design: Evaluating Issues and Perceptions within Clinical Passporting. arXiv.
Al-Rubaie, M.; Chang, J.M. (2019). Privacy-Preserving Machine Learning: Threats and Solutions. IEEE Security & Privacy, 49–58.
Angelou, N.; Benaissa, A.; Cebere, B.; Clark, W.; Hall, A.J.; Hoeh, M.A.; Liu, D.; Papadopoulos, P.; Roehm, R.; Sandmann, R. (2020). Asymmetric Private Set Intersection with Applications to Contact Tracing and Private Vertical Federated Machine Learning. arXiv.
Au, M.H.; Tsang, P.P.; Susilo, W.; Mu, Y. (2009). Dynamic Universal Accumulators for DDH Groups and Their Application to Attribute-Based Anonymous Credential Systems. Topics in Cryptology, 295–308.
Boettiger, C. (2015). An introduction to Docker for reproducible research. ACM SIGOPS.
Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. (2016). Practical Secure Aggregation for Federated Learning. arXiv.
Buchanan, W.J.; Imran, M.A.; Rehman, M.U.; Zhang, L.; Abbasi, Q.H.; Chrysoulas, C.; Haynes, D.; Pitropakis, N.; Papadopoulos, P. (2020). Review and Critical Analysis of Privacy-Preserving Infection Tracking and Contact Tracing. Frontiers in Communication.
Camenisch, J.; Lysyanskaya, A. (2003). A Signature Scheme with Efficient Protocols. In Security in Communication Networks, 268–289.
Camenisch, J.; Dubovitskaya, M.; Lehmann, A.; Neven, G.; Paquin, C.; Preiss, F.S. (2013). Concepts and Languages for Privacy-Preserving Attribute-Based Authentication. IFIP Advances in Information and Communication Technology (pp. 34–52). Springer, Berlin, Heidelberg.
Chaum, D. (1985). Security without identification: Transaction systems to make big brother obsolete. Communications of the ACM, 1030–1044.
Chaum, D.L. (1981). Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM, 84–90.
Dachman-Soled, D.; Malkin, T.; Raykova, M.; Yung, M. (2009). Efficient Robust Private Set Intersection. International Conference on Applied Cryptography and Network Security, 125–142.
Das, D.; Avancha, S.; Mudigere, D.; Vaidynathan, K.; Sridharan, S.; Kalamkar, D.; Kaul, B.; Dubey, P. (2016). Distributed Deep Learning Using Synchronous Stochastic Gradient Descent. arXiv.
Davie, M.; Gisolfi, D.; Hardman, D.; Jordan, J.; O’Donnell, D.; Reed, D. (2019). The Trust Over IP Stack. Retrieved from RFC Editor: https://github.com/hyperledger/aries-rfcs/tree/master/concepts/0289-toip-stack
Denham, E. (2017). Royal Free – Google DeepMind trial failed to comply with data protection law. Information Commissioner's Office.
Dunphy, P. (2018). A First Look at Identity Management Schemes on the Blockchain. IEEE.
Dwork, C. (2011). Differential Privacy. Encyclopedia of Cryptography and Security, 338–340.
ElGamal, T. (1984). A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. Advances in Cryptology. Workshop on the Theory and Application of Cryptographic Techniques.
Bagdasaryan, E.; Veit, A.; Hua, Y.; Estrin, D.; Shmatikov, V. (2018). How To Backdoor Federated Learning. arXiv.
Goodfellow, I.J.; Shlens, J.; Szegedy, C. (2014). Explaining and Harnessing Adversarial Examples. arXiv.
Fontaine, C.; Galand, F. (2007). A Survey of Homomorphic Encryption for Nonspecialists. EURASIP Journal on Information Security.
Fredrikson, M.; Jha, S.; Ristenpart, T. (2015). Model inversion attacks that exploit confidence information and basic countermeasures. Computer and Communications Security (pp. 1322–1333). ACM.
Goyal, P.; Goyal, A. (2017). Comparative Study of two Most Popular Packet Sniffing Tools. Computational Intelligence and Communication Networks, 77–81.
W3C Credentials Community Group. (2019). DID Method Registry. Retrieved from https://w3c-ccg.github.io/didmethod-registry/
Hall, P. (2021). Proposals for Model Vulnerability and Security. Retrieved from O’Reilly: https://www.oreilly.com/content/proposals-for-model-vulnerability-and-security/
Hardman, D. (2019). DID Communication. Retrieved from GitHub: https://github.com/hyperledger/ariesrfcs/tree/master/concepts/0005-didcomm
Hardman, D. (2019). Peer DID Method Specification. Retrieved from GitHub: https://openssi.github.io/peer-did-methodspec/index.html
Hoffman, A.M. (2002). A Conceptualization of Trust in International Relations. European Journal of International Relations, 375–401.
IEEE Guide for Architectural Framework and Application of Federated Machine Learning. (2021). Retrieved from IEEE Xplore: https://ieeexplore.ieee.org/document/9382202
Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. (2016). Federated Learning: Strategies for Improving Communication Efficiency. arXiv.
Camenisch, J.; Krontiris, I.; Lehmann, A.; Neven, G.; Paquin, C.; Rannenberg, K.; Zwingelberg, H. (2014). D2.1 Architecture for Attribute-based Credential Technologies. Retrieved from ABC4Trust: https://abc4trust.eu/index.php/pub
Jones, M.; Bradley, J.; Sakimura, N. (2015). JSON Web Signatures. Retrieved from RFC Editor: https://tools.ietf.org/html/rfc7515
Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. (2019). Advances and Open Problems in Federated Learning. arXiv.
Keymolen, E. (2016). Trust on the Line: A Philosophical Exploration of Trust in the Networked Era. Wolf Legal Publishers.
Kholod, I.; Yanaki, E.; Fomichev, D.; Shalugin, E.; Novikova, E.; Filippov, E.; Nordlund, M. (2021). Open-Source Federated Learning Frameworks for IoT: A Comparative Review and Analysis. Sensors, 167.
Kluyver, T.; Ragan-Kelley, B.; Pérez, F.; Granger, B.E.; Bussonnier, M.; Frederic, J.; Kelley, K.; Hamrick, J.B.; Grout, J.; Corlay, S. (2016). Jupyter Notebooks-a Publishing Format for Reproducible Computational Workflows. International Conference on Electronic Publishing (pp. 87–90). IOS Press.
Longley, D.; Sporny, M.; Allen, C. (2019). Linked Data Signatures 1.0. Retrieved from W3C Digital Verification Community Group: https://w3c-dvcg.github.io/ld-signatures/
Muñoz-González, L.; Biggio, B.; Demontis, A.; Paudice, A.; Wongrassamee, V.; Lupu, E.C. (2017). Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization. arXiv.
Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. (2016). Deep Learning with Differential Privacy. arXiv.
Martin, A.; Raponi, S.; Combe, T.; Di Pietro, R. (2018). Docker ecosystem – Vulnerability Analysis. Computer Communications, 30–43.
Nelson, S.L. (1995). A Field Guide to Internet Trust. Microsoft Press.
Nuding, F.; Mayer, R. (2020). Poisoning attacks in federated learning: An evaluation on traffic sign classification. ACM CODASPY, 168–170.
OWASP. (2018). The Ten Most Critical Web Application Security Risks. Retrieved from https://owasp.org/wwwproject-top-ten/
Voigt, P.; von dem Bussche, A. (2017). The EU General Data Protection Regulation (GDPR). Springer Cham.
Reed, D.; Sporny, M.; Longley, D.; Allen, C.; Sabadello, M.; Grant, R. (2020). Decentralized Identifiers (DIDs) v1.0. Retrieved from https://w3c.github.io/did-core/
Rivest, R.L.; Shamir, A.; Adleman, L. (1978). A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 120–126.
Ryffel, T.; Trask, A.; Dahl, M.; Wagner, B.; Mancuso, J.; Rueckert, D.; Passerat-Palmbach, J. (2018). A generic framework for privacy preserving deep learning. arXiv.
Salem, A.; Zhang, Y.; Humbert, M.; Berrang, P.; Fritz, M.; Backes, M. (2018). ML-Leaks: Model and Data Independent Membership Inference Attacks and Defenses on Machine Learning Models. arXiv.
Shokri, R.; Stronati, M.; Song, C.; Shmatikov, V. (2017). Membership Inference Attacks against Machine Learning Models. IEEE Symposium on Security and Privacy, 3–18.
Smith, J.E.; Nair, R. (2005). The architecture of virtual machines. Computer, 32–38.
Song, C.; Ristenpart, T.; Shmatikov, V. (2017). Machine Learning Models that Remember Too Much. Computer and Communications Security, 587–601.
Sporny, M.; Longley, D.; Chadwick, D. (2019). Verifiable Credentials Data Model 1.1. Technical Report. W3C, https://www.w3.org/TR/vc-data-model/.
Terbu, O. (2020). DIF Starts DIDComm Working Group. Retrieved from https://medium.com/decentralized-identity/dif-startsdidcomm-working-group-9c114d9308dc
Tharwat, A. (2020). Classification assessment methods. Applied Computing and Informatics, 168–192.
Tramèr, F.; Zhang, F.; Juels, A.; Reiter, M.K.; Ristenpart, T. (2016). Stealing Machine Learning Models via Prediction APIs. USENIX Security, 601–618.
Wang, F.; De Filippi, P. (2020). Self-Sovereign Identity in a Globalized World: Credentials-Based Identity Systems as a Driver for Economic Inclusion. Frontiers in Communication.
Wohlwend, J. (2016). Elliptic Curve Cryptography: Pre and Post Quantum. Retrieved from Technical Reports - LibGuides at MIT Libraries: https://math.mit.edu/~apost/courses/18.204-2016/18.204_Jeremy_Wohlwend_final_paper.pdf
Yeh, C.L. (2018). Pursuing consumer empowerment in the age of big data: A comprehensive regulatory framework for data brokers. Telecommunications Policy, 282–292.
Zhang, Y.; Jia, R.; Pei, H.; Wang, W.; Li, B.; Song, D. (2019). The Secret Revealer: Generative Model-Inversion Attacks against Deep Neural Networks. arXiv. |
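Description: | Master's thesis; National Chengchi University (國立政治大學); In-service Master's Program, Department of Computer Science (資訊科學系碩士在職專班); 104971007 |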
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0104971007 |
Data Type: | thesis |
DOI: | 10.6814/NCCU202201609 |
Appears in Collections: | [資訊科學系碩士在職專班 (In-service Master's Program, Department of Computer Science)] Theses
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|