|
Please use this identifier to cite or link to this item:
https://nccur.lib.nccu.edu.tw/handle/140.119/136962
|
Title: | HiCSeg: an interactive genome segmentation cross samples and species |
Authors: | 吳映函 Wu, Yin-Han |
Contributors: | 張家銘 Chang, Jia-Ming 吳映函 Wu, Yin-Han |
Keywords: | Genome segmentation; Hi-C; ChIP-Seq |
Date: | 2021 |
Issue Date: | 2021-09-02 16:54:31 (UTC+8) |
Abstract: | Genome-wide chromosomal contacts measured by Hi-C can be used to investigate the higher-order organization of chromosomes, such as compartments and topologically associating domains (TADs). Principal component analysis (PCA) of mammalian Hi-C maps reveals two compartments, A and B. TADs and compartments can each be regarded as a segmentation of the genome. Genome segmentation is generally used to compress data and to sort out the different modifications present in different cell types. We compared PCA results at various resolutions to identify their differences, then introduced ChIP-Seq data for further analysis. We also introduce two other clustering methods, Louvain and Leiden. Their results can be compared with those of PCA, and they can also be used to compute the correlation between networks. Furthermore, we segment the genome using integrated ChIP-Seq and Hi-C information, combining the two either by addition or by network fusion. |
Description: | Master's thesis; National Chengchi University; Department of Computer Science; 108753102 |
Source URI: | http://thesis.lib.nccu.edu.tw/record/#G0108753102 |
Data Type: | thesis |
DOI: | 10.6814/NCCU202101389 |
Appears in Collections: | [Department of Computer Science] Theses
|
Files in This Item:
File | Size | Format |
310201.pdf | 3161Kb | Adobe PDF |
|
All items in 政大典藏 are protected by copyright, with all rights reserved.
|