Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/125646
|
Title: | 深度學習應用在偵測拓撲結構域 Topology Association Domain Identification using Deep Learning |
Authors: | 楊鎮遠 Yang, Jhen-Yuan |
Contributors: | 張家銘 Chang, Jia-Ming 楊鎮遠 Yang, Jhen-Yuan |
Keywords: | Topology Association Domain; TAD; Hi-C; Chromosome organization; Deep learning |
Date: | 2019 |
Issue Date: | 2019-09-05 16:15:31 (UTC+8) |
Abstract: | ● Background: In recent years, increasing evidence has indicated that three-dimensional chromosome structure plays an important role in genomic function. A topologically associating domain (TAD), a self-interacting region, has been shown to be a structural unit of the chromosome. However, identifying TADs in high-throughput chromosome conformation capture (Hi-C) maps is a computational challenge.
● Results: We propose a new problem, TAD classification, in place of the original TAD identification problem. Specifically, we treat a Hi-C map as an image, so that TAD classification becomes an image classification problem, which we solve with two deep learning models: a convolutional neural network and a residual neural network. In addition, we designed an elegant way to generate non-TAD data for the binary classification problem. Under cross-species and cross-cell-type validation, the deep learning models perform well, with AUC > 0.80.
● Conclusions: TADs have been shown to be conserved during evolution. Interestingly, our results confirm that a TAD classification model is practical across species. This indicates that, from the point of view of image classification, human and mouse TADs share common patterns. Our approach could be a new way to test variation or conservation of TADs among Hi-C maps. For example, the TADs of two Hi-C maps are conserved if the two classification models are exchangeable. |
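The idea of treating a Hi-C map as an image and scoring a binary classifier by AUC can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `crop_domain` and `auc`, the nearest-neighbour resampling, the 32-pixel image size, and the toy contact map are all assumptions made for illustration, and the abstract does not specify how the non-TAD examples were generated.

```python
import numpy as np


def crop_domain(contact_map, start, end, size=32):
    """Cut the diagonal block [start:end, start:end] out of a square
    contact matrix and resample it to size x size (nearest neighbour),
    so that domains of different lengths become same-sized images.
    Hypothetical helper; the resampling scheme is an assumption."""
    block = np.asarray(contact_map)[start:end, start:end]
    # Map each output pixel to the nearest bin of the original block.
    idx = np.rint(np.linspace(0, block.shape[0] - 1, size)).astype(int)
    return block[np.ix_(idx, idx)]


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count one half."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Toy contact map with two self-interacting blocks on the diagonal.
toy_map = np.zeros((8, 8))
toy_map[:4, :4] = 1.0
toy_map[4:, 4:] = 1.0

tad_image = crop_domain(toy_map, 0, 4, size=6)   # window covering one block
print(tad_image.shape)
print(auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))   # scores from any classifier
```

In practice the fixed-size images would be fed to a classifier such as a CNN, and `auc` would be applied to its predicted scores on held-out species or cell types.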
Description: | Master's thesis, National Chengchi University, Department of Computer Science, 105753033 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G1057530331 |
Data Type: | thesis |
DOI: | 10.6814/NCCU201901133 |
Appears in Collections: | [Department of Computer Science] Theses
|
Files in This Item:
File | Size | Format |
033101.pdf | 1630Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|