1
|
Jardanowska-Kotuniak M, Dramiński M, Własnowolski M, Łapiński M, Sengupta K, Agarwal A, Filip A, Ghosh N, Pancaldi V, Grynberg M, Saha I, Plewczynski D, Dąbrowski MJ. Unveiling epigenetic regulatory elements associated with breast cancer development. bioRxiv 2024:2024.11.12.623187. [PMID: 39605637 PMCID: PMC11601335 DOI: 10.1101/2024.11.12.623187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Breast cancer is the most common cancer in women and the second most common cancer worldwide, affecting over 2 million women and causing 650 thousand deaths each year. It has been widely studied, but its epigenetic variation is not fully understood. We aimed to identify epigenetic mechanisms that affect the expression of breast cancer-related genes in order to detect new potential biomarkers and therapeutic targets. We used The Cancer Genome Atlas database, with over 800 samples and several omics datasets (mRNA, miRNA, DNA methylation). From an initial total of 417,486 features, we selected 2701 that differed significantly between cancer and control samples using the Monte Carlo Feature Selection and Interdependency Discovery algorithm. Their biological impact on carcinogenesis was confirmed using statistical analysis, natural language processing, linear and machine learning models, transcription factor identification, and drug and 3D chromatin structure analyses. Classification of cancer vs control samples on the selected features yielded weighted accuracy from 0.91 to 0.98, depending on the feature type (mRNA, miRNA, DNA methylation) and the classification algorithm. In general, cancer samples showed lower expression of differentially expressed genes and increased β-values of differentially methylated sites. We identified mRNAs whose expression is well explained by miRNA expression and by the β-values of differentially methylated sites. We recognized differentially methylated sites possibly affecting the binding of the NRF1 and MXI1 transcription factors, causing a disturbance in NKAPL and PITX1 expression, respectively. Our 3D models showed more loosely packed chromatin in cancer. This study points out numerous possible regulatory dependencies.
Affiliation(s)
- Marta Jardanowska-Kotuniak
- Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, Warsaw, Poland
- Institute of Biochemistry and Biophysics of the Polish Academy of Sciences, Warsaw, Poland
| | - Michał Dramiński
- Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, Warsaw, Poland
| | - Michał Własnowolski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | - Marcin Łapiński
- Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, Warsaw, Poland
| | - Kaustav Sengupta
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | - Abhishek Agarwal
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland
| | - Adam Filip
- Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, Warsaw, Poland
| | - Nimisha Ghosh
- Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha O Anusandhan University, Bhubaneswar, Odisha, 751030, India
| | - Vera Pancaldi
- CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
| | - Marcin Grynberg
- Institute of Biochemistry and Biophysics of the Polish Academy of Sciences, Warsaw, Poland
| | - Indrajit Saha
- Department of Computer Science and Engineering, National Institute of Technical Teachers’ Training and Research, Kolkata 700106, India
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland
| | - Michał J. Dąbrowski
- Computational Biology Group, Institute of Computer Science of the Polish Academy of Sciences, Warsaw, Poland
|
2
|
Mergener R, Nunes MR, Böttcher AK, Siqueira MB, Peruzzo HF, Merola MC, Riegel M, Zen PRG. invdup(8)(8q24.13q24.3)-A Complex Alteration and Its Clinical Consequences. Genes (Basel) 2024; 15:910. [PMID: 39062689 PMCID: PMC11276216 DOI: 10.3390/genes15070910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 06/12/2024] [Accepted: 07/10/2024] [Indexed: 07/28/2024]
Abstract
Structural variation is a source of genetic variation that, in some cases, may trigger pathogenicity. Here, we describe two cases, a mother and son, with the same partial inverted duplication of the long arm of chromosome 8 [invdup(8)(q24.21q24.21)] of 17.18 Mb, showing different clinical manifestations: microcephaly, dorsal hypertrichosis, seizures and neuropsychomotor development delay in the child, and a cleft lip/palate, down-slanted palpebral fissures and learning disabilities in the mother. The deleterious outcome, in general, is reflected by the gain or loss of genetic material. However, discrepancies among the clinical manifestations raise some concerns about the genomic configuration within the chromosome and other genetic modifiers. With that in mind, we also performed a literature review of research published in the last 20 years about the duplication of the same, or close, chromosome region, seeking the elucidation of at least some relevant clinical features.
Affiliation(s)
- Rafaella Mergener
- Post-Graduate Program in Pathology, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
| | - Marcela Rodrigues Nunes
- Post-Graduate Program in Pathology, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
- Medical Genetics Resident, Irmandade da Santa Casa de Misericórdia de Porto Alegre (ISCMPA), Porto Alegre 90020-090, RS, Brazil
| | - Ana Kalise Böttcher
- Undergraduate Program in Biomedical Science, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
| | - Monique Banik Siqueira
- Undergraduate Program in Biomedical Sciences, Universidade do Vale do Rio dos Sinos (UNISINOS), São Leopoldo 93022-750, RS, Brazil
| | - Helena Froener Peruzzo
- Undergraduate Program in Biomedical Science, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
| | - Milene Carvalho Merola
- Undergraduate Program in Biomedical Science, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
| | - Mariluce Riegel
- Casa dos Raros, Center for Comprehensive Care and Training in Rare Diseases, Porto Alegre 90610-261, RS, Brazil
- National Institute of Population Medical Genetics (INAGEMP), Porto Alegre 90035-903, RS, Brazil
| | - Paulo Ricardo Gazzola Zen
- Post-Graduate Program in Pathology, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90050-170, RS, Brazil
- Medical Genetics, Department of Clinical Medicine, Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre 90020-090, RS, Brazil
- Irmandade da Santa Casa de Misericórdia de Porto Alegre (ISCMPA), Porto Alegre 90050-170, RS, Brazil
|
3
|
Kadlof M, Banecki K, Chiliński M, Plewczynski D. Chromatin image-driven modelling. Methods 2024; 226:54-60. [PMID: 38636797 DOI: 10.1016/j.ymeth.2024.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 03/13/2024] [Accepted: 04/05/2024] [Indexed: 04/20/2024]
Abstract
The challenge of modelling the spatial conformation of chromatin remains an open problem. While multiple data-driven approaches have been proposed, each has limitations. This work introduces two image-driven modelling methods based on the Molecular Dynamics Flexible Fitting (MDFF) approach: the force method and the correlational method. Both methods have already been used successfully in protein modelling. We propose a novel way to employ them for building chromatin models directly from 3D images. This approach is termed image-driven modelling. Additionally, we introduce the initial structure generator, a tool designed to generate optimal starting structures for the proposed algorithms. The methods are versatile and can be applied to various data types, with minor modifications to accommodate new generation imaging techniques.
Affiliation(s)
- Michał Kadlof
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland.
| | - Krzysztof Banecki
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
| | - Mateusz Chiliński
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland; Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Dariusz Plewczynski
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Centre of New Technologies, University of Warsaw, Warsaw, Poland
|
4
|
Wlasnowolski M, Grabowski P, Roszczyk D, Kaczmarski K, Plewczynski D. cudaMMC: GPU-enhanced multiscale Monte Carlo chromatin 3D modelling. Bioinformatics 2023; 39:btad588. [PMID: 37774005 PMCID: PMC10568367 DOI: 10.1093/bioinformatics/btad588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 08/14/2023] [Accepted: 09/28/2023] [Indexed: 10/01/2023]
Abstract
MOTIVATION: Investigating the 3D structure of chromatin provides new insights into transcriptional regulation. With the evolution of 3C next-generation sequencing methods like ChIA-PET and Hi-C, the surge in data volume has highlighted the need for more efficient chromatin spatial modelling algorithms. This study introduces the cudaMMC method, based on the Simulated Annealing Monte Carlo approach and enhanced by GPU-accelerated computing, to efficiently generate ensembles of chromatin 3D structures. RESULTS: The cudaMMC calculations demonstrate significantly faster performance with better stability compared to our previous method on the same workstation. cudaMMC also substantially reduces the computation time required for generating ensembles of large chromatin models, making it an invaluable tool for studying chromatin spatial conformation. AVAILABILITY AND IMPLEMENTATION: Open-source software, a manual, and sample data are freely available at https://github.com/SFGLab/cudaMMC.
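The simulated annealing Monte Carlo idea named in this abstract can be illustrated with a small CPU-only sketch. It is not the cudaMMC algorithm and uses no GPU code: the energy function, the pairwise target distances, the geometric cooling schedule, and all names (`energy`, `anneal`, `targets`) are assumptions chosen for illustration only.

```python
import math
import random


def energy(coords, targets):
    """Sum of squared differences between actual and target pair distances."""
    total = 0.0
    for (i, j), d in targets.items():
        total += (math.dist(coords[i], coords[j]) - d) ** 2
    return total


def anneal(coords, targets, steps=5000, t_start=1.0, t_end=1e-3,
           step_size=0.2, seed=0):
    """Minimise `energy` by simulated annealing with Metropolis acceptance."""
    rng = random.Random(seed)
    current = [list(p) for p in coords]
    e_cur = energy(current, targets)
    best, e_best = [p[:] for p in current], e_cur
    for k in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Propose a small random displacement of one randomly chosen bead.
        i = rng.randrange(len(current))
        old = current[i][:]
        current[i] = [c + rng.uniform(-step_size, step_size) for c in old]
        e_new = energy(current, targets)
        # Metropolis criterion: always accept downhill moves, accept uphill
        # moves with probability exp(-(e_new - e_cur) / t).
        if e_new <= e_cur or rng.random() < math.exp((e_cur - e_new) / t):
            e_cur = e_new
            if e_cur < e_best:
                best, e_best = [p[:] for p in current], e_cur
        else:
            current[i] = old  # reject the move and restore the bead
    return best, e_best


# Toy input: three beads that should sit at mutual distance 1.
start = [(0.0, 0.0, 0.0)] * 3
targets = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}
model, final_energy = anneal(start, targets)
```

Running `anneal` with different `seed` values yields different structures for the same targets, which is one simple way to obtain an ensemble of models rather than a single solution.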
Affiliation(s)
- Michal Wlasnowolski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw 02-097, Poland
| | - Pawel Grabowski
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Damian Roszczyk
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Krzysztof Kaczmarski
- Department of Information Processing Systems, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw 00-662, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw 02-097, Poland
|
5
|
Li N, Meng G, Yang C, Li H, Liu L, Wu Y, Liu B. Changes in epigenetic information during the occurrence and development of gastric cancer. Int J Biochem Cell Biol 2022; 153:106315. [DOI: 10.1016/j.biocel.2022.106315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 09/22/2022] [Accepted: 10/18/2022] [Indexed: 11/24/2022]
|
6
|
Orozco G, Schoenfelder S, Walker N, Eyre S, Fraser P. 3D genome organization links non-coding disease-associated variants to genes. Front Cell Dev Biol 2022; 10:995388. [PMID: 36340032 PMCID: PMC9631826 DOI: 10.3389/fcell.2022.995388] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/15/2022] [Accepted: 09/27/2022] [Indexed: 11/13/2022]
Abstract
Genome sequencing has revealed over 300 million genetic variations in human populations. Over 90% of variants are single nucleotide polymorphisms (SNPs); the remainder include short deletions or insertions, and small numbers of structural variants. Hundreds of thousands of these variants have been associated with specific phenotypic traits and diseases through genome-wide association studies, which link significant differences in variant frequencies with specific phenotypes among large groups of individuals. Only 5% of disease-associated SNPs are located in gene coding sequences, with the potential to disrupt gene expression or alter the function of encoded proteins. The remaining 95% of disease-associated SNPs are located in non-coding DNA sequences, which make up 98% of the genome. The role of non-coding, disease-associated SNPs, many of which are located at considerable distances from any gene, was at first a mystery until the discovery that gene promoters regularly interact with distal regulatory elements to control gene expression. Disease-associated SNPs are enriched at the millions of gene regulatory elements that are dispersed throughout the non-coding sequences of the genome, suggesting they function as gene regulation variants. Assigning specific regulatory elements to the genes they control is not straightforward since they can be millions of base pairs apart. In this review we describe how understanding 3D genome organization can identify specific interactions between gene promoters and distal regulatory elements and how 3D genomics can link disease-associated SNPs to their target genes. Understanding which gene or genes contribute to a specific disease is the first step in designing rational therapeutic interventions.
Affiliation(s)
- Gisela Orozco
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
| | - Stefan Schoenfelder
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom
- Epigenetics Programme, The Babraham Institute, Babraham Research Campus, CB22 3AT Cambridge, Cambridge, United Kingdom
| | | | - Stephan Eyre
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom
- NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
| | - Peter Fraser
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom
- Department of Biological Science, Florida State University, Tallahassee, FL, United States
|
7
|
Das P, Shen T, McCord RP. Characterizing the variation in chromosome structure ensembles in the context of the nuclear microenvironment. PLoS Comput Biol 2022; 18:e1010392. [PMID: 35969616 PMCID: PMC9410561 DOI: 10.1371/journal.pcbi.1010392] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/18/2022] [Revised: 08/25/2022] [Accepted: 07/15/2022] [Indexed: 11/23/2022]
Abstract
Inside the nucleus, chromosomes are subjected to direct physical interaction between different components, active forces, and thermal noise, leading to the formation of an ensemble of three-dimensional structures. However, it is still not well understood to what extent and how the structural ensemble varies from one chromosome region or cell-type to another. We designed a statistical analysis technique and applied it to single-cell chromosome imaging data to reveal the heterogeneity of individual chromosome structures. By analyzing the resulting structural landscape, we find that the largest dynamic variation is the overall radius of gyration of the chromatin region, followed by domain reorganization within the region. By comparing different human cell-lines and experimental perturbation data using this statistical analysis technique and a network-based similarity quantification approach, we identify both cell-type and condition-specific features of the structural landscapes. We identify a relationship between epigenetic state and the properties of chromosome structure fluctuation and validate this relationship through polymer simulations. Overall, our study suggests that the types of variation in a chromosome structure ensemble are cell-type as well as region-specific and can be attributed to constraints placed on the structure by factors such as variation in epigenetic state. Recent work has revealed principles of how chromosomes are folded and structured inside the human nucleus. It is now even possible to microscopically trace the path of chromosomes in 3D in individual cells. With this data, we can start to examine how much variation exists in chromosome structure and what biological factors may restrict or enhance this variation. Are chromosomes stuck in just a few possible positions or do they move around more freely, sampling many configurations? 
Here, we use a mathematical approach to compare chromosome structure variation in different cell types, at different locations along the genome, and when key structural proteins are removed. Through these comparisons and dynamic simulations of chromosome behavior, we identify factors that may constrain or promote variation in chromosome structure.
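The radius of gyration mentioned in this abstract has a standard definition: the root-mean-square distance of the points of a structure from their centroid. The sketch below illustrates that definition on made-up coordinates; it does not reproduce the authors' analysis pipeline, and the function name and the example point sets are assumptions.

```python
import math


def radius_of_gyration(points):
    """Return the root-mean-square distance of 3D points from their centroid."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    mean_sq = sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in points
    ) / n
    return math.sqrt(mean_sq)


# Two hypothetical 8-bead traces: a straight line and the corners of a unit cube.
extended = [(float(i), 0.0, 0.0) for i in range(8)]
compact = [(float(x), float(y), float(z))
           for x in (0, 1) for y in (0, 1) for z in (0, 1)]

rg_extended = radius_of_gyration(extended)
rg_compact = radius_of_gyration(compact)
```

With the same number of beads, the straight trace has the larger radius of gyration, which is why this single number is a convenient summary of how compact a traced chromatin region is.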
Affiliation(s)
- Priyojit Das
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
| | - Tongye Shen
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
- Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America
| | - Rachel Patton McCord
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, Tennessee, United States of America
- Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee, United States of America
|
8
|
Madsen-Østerbye J, Bellanger A, Galigniana NM, Collas P. Biology and Model Predictions of the Dynamics and Heterogeneity of Chromatin-Nuclear Lamina Interactions. Front Cell Dev Biol 2022; 10:913458. [PMID: 35693945 PMCID: PMC9178083 DOI: 10.3389/fcell.2022.913458] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/06/2022] [Accepted: 05/12/2022] [Indexed: 11/13/2022]
Abstract
Associations of chromatin with the nuclear lamina, at the nuclear periphery, help shape the genome in 3 dimensions. The genomic landscape of lamina-associated domains (LADs) is well characterized, but much remains unknown about the physical and mechanistic properties of chromatin conformation at the nuclear lamina. Computational models of chromatin folding at, and interactions with, a surface representing the nuclear lamina are emerging in attempts to characterize these properties and predict chromatin behavior at the lamina in health and disease. Here, we highlight the heterogeneous nature of the nuclear lamina and LADs, outline the main 3-dimensional chromatin structural modeling methods, review applications of modeling chromatin-lamina interactions and discuss biological insights inferred from these models in normal and disease states. Lastly, we address perspectives on future developments in modeling chromatin interactions with the nuclear lamina.
Affiliation(s)
- Julia Madsen-Østerbye
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Aurélie Bellanger
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Natalia M. Galigniana
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, Oslo, Norway
| | - Philippe Collas
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, Oslo, Norway
|
9
|
3DGenBench: a web-server to benchmark computational models for 3D Genomics. Nucleic Acids Res 2022; 50:W4-W12. [PMID: 35639501 PMCID: PMC9252746 DOI: 10.1093/nar/gkac396] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/31/2022] [Revised: 04/26/2022] [Accepted: 05/23/2022] [Indexed: 11/13/2022]
Abstract
Modeling 3D genome organisation has been booming in recent years thanks to the availability of experimental datasets of genomic contacts. However, the field currently lacks standardised methods and metrics to compare predictions and experiments. We present 3DGenBench, a web server available at https://inc-cost.eu/benchmarking/, that allows benchmarking computational models of 3D Genomics. The benchmark is performed using a manually curated dataset of 39 capture Hi-C profiles in wild type and genome-edited mouse cells, and five genome-wide Hi-C profiles in human, mouse, and Drosophila cells. 3DGenBench performs two kinds of analysis, each supplied with a specific scoring module that compares predictions of a computational method to experimental data using several metrics. With 3DGenBench, the user obtains model performance scores, allowing an unbiased comparison with other models. 3DGenBench aims to become a reference web server to test new 3D genomics models and is conceived as an evolving platform where new types of analysis will be implemented in the future.
|
10
|
Quan C, Ping J, Lu H, Zhou G, Lu Y. 3DSNP 2.0: update and expansion of the noncoding genomic variant annotation database. Nucleic Acids Res 2022; 50:D950-D955. [PMID: 34723317 PMCID: PMC8728236 DOI: 10.1093/nar/gkab1008] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 09/15/2021] [Revised: 10/11/2021] [Accepted: 10/12/2021] [Indexed: 12/30/2022]
Abstract
The rapid development of single-molecule long-read sequencing (LRS) and single-cell assay for transposase accessible chromatin sequencing (scATAC-seq) technologies presents both challenges and opportunities for the annotation of noncoding variants. Here, we updated 3DSNP, a comprehensive database for human noncoding variant annotation, to expand its applications to structural variation (SV) and to implement variant annotation down to single-cell resolution. The updates of 3DSNP include (i) annotation of 108 317 SVs from a full spectrum of functions, especially their potential effects on three-dimensional chromatin structures, (ii) evaluation of the accessible chromatin peaks flanking the variants across 126 cell types/subtypes in 15 human fetal tissues and 54 cell types/subtypes in 25 human adult tissues by integrating scATAC-seq data and (iii) expansion of Hi-C data to 49 human cell types. In summary, this version is a significant and comprehensive improvement over the previous version. The 3DSNP v2.0 database is freely available at https://omic.tech/3dsnpv2/.
Affiliation(s)
- Cheng Quan
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Jie Ping
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Hao Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Gangqiao Zhou
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
| | - Yiming Lu
- Beijing Institute of Radiation Medicine, State Key Laboratory of Proteomics, Beijing 100850, China
|
11
|
Chiliński M, Sengupta K, Plewczynski D. From DNA human sequence to the chromatin higher order organisation and its biological meaning: Using biomolecular interaction networks to understand the influence of structural variation on spatial genome organisation and its functional effect. Semin Cell Dev Biol 2021; 121:171-185. [PMID: 34429265 DOI: 10.1016/j.semcdb.2021.08.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/19/2021] [Revised: 08/06/2021] [Accepted: 08/12/2021] [Indexed: 12/30/2022]
Abstract
The three-dimensional structure of the human genome has been proven to have a significant functional impact on gene expression. The high-order spatial chromatin is organised first by looping mediated by multiple protein factors, and then it is further formed into larger structures of topologically associated domains (TADs) or chromatin contact domains (CCDs), followed by A/B compartments and finally the chromosomal territories (CTs). The genetic variation observed in the human population influences the multi-scale structures, posing a question regarding the functional impact of structural variants reflected by the variability of gene expression patterns. The current methods of evaluating the functional effect include eQTL analysis, which uses statistical testing of the influence of variants on spatially close genes. Rarely, non-coding DNA sequence changes are evaluated by their impact on the biomolecular interaction network (BIN) reflecting the cellular interactome that can be analysed by the classical graph-theoretic algorithms. Therefore, in the second part of the review, we introduce the concept of BIN, i.e. a meta-network model of the complete molecular interactome developed by integrating various biological networks. The BIN meta-network model includes DNA-protein binding by the plethora of protein factors as well as chromatin interactions, therefore allowing connection of genomics with the downstream biomolecular processes present in a cell. As an illustration, we scrutinise the chromatin interactions mediated by the CTCF protein detected in a ChIA-PET experiment in the human lymphoblastoid cell line GM12878. In the corresponding BIN meta-network the DNA spatial proximity is represented as a graph model, combined with the Proteins-Interaction Network (PIN) of human proteome using the Gene Association Network (GAN). Furthermore, we enriched the BIN with the signalling and metabolic pathways and Gene Ontology (GO) terms to assert its functional context.
Finally, we mapped the Single Nucleotide Polymorphisms (SNPs) from the GWAS studies and identified the chromatin mutational hot-spots associated with a significant enrichment of SNPs related to autoimmune diseases. Afterwards, we mapped Structural Variants (SVs) from healthy individuals of the 1000 Genomes Project and identified an interesting example of the missing protein complex associated with protein Q6GYQ0 due to a deletion on chromosome 14. Such an analysis using the meta-network BIN model is therefore helpful in evaluating the influence of genetic variation on spatial organisation of the genome and its functional effect in a cell.
Affiliation(s)
- Mateusz Chiliński
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Kaustav Sengupta
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland.
|
12
|
Zha M, Wang N, Zhang C, Wang Z. Inferring Single-Cell 3D Chromosomal Structures Based on the Lennard-Jones Potential. Int J Mol Sci 2021; 22:5914. [PMID: 34072879 PMCID: PMC8199262 DOI: 10.3390/ijms22115914] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/22/2021] [Revised: 05/23/2021] [Accepted: 05/28/2021] [Indexed: 11/16/2022]
Abstract
Reconstructing three-dimensional (3D) chromosomal structures based on single-cell Hi-C data is a challenging scientific problem due to the extreme sparseness of the single-cell Hi-C data. In this research, we used the Lennard-Jones potential to reconstruct both 500 kb and high-resolution 50 kb chromosomal structures based on single-cell Hi-C data. A chromosome was represented by a string of 500 kb or 50 kb DNA beads and put into a 3D cubic lattice for simulations. A 2D Gaussian function was used to impute the sparse single-cell Hi-C contact matrices. We designed a novel loss function based on the Lennard-Jones potential, in which the ε value, i.e., the well depth, was used to indicate how stable the binding of every pair of beads is. For the bead pairs that have single-cell Hi-C contacts and their neighboring bead pairs, the loss function assigns them stronger binding stability. The Metropolis-Hastings algorithm was used to try different locations for the DNA beads, and simulated annealing was used to optimize the loss function. We proved the correctness and validity of the reconstructed 3D structures by evaluating the models according to multiple criteria and comparing the models with 3D-FISH data.
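For readers unfamiliar with the role of the well depth ε, the standard 12-6 Lennard-Jones form is V(r) = 4ε[(σ/r)^12 − (σ/r)^6], which is zero at r = σ and reaches its minimum value of −ε at r = 2^(1/6)·σ. The sketch below evaluates only this textbook form; the paper's actual loss function is not given in the abstract and is not reproduced here, and the function and parameter names are assumptions.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Standard 12-6 Lennard-Jones potential for a pair at distance r > 0."""
    if r <= 0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Distance at which the potential is lowest, for sigma = 1.
r_min = 2 ** (1 / 6)

# A deeper well (larger epsilon) means a more strongly bound pair.
weak_pair = lennard_jones(r_min, epsilon=1.0)
strong_pair = lennard_jones(r_min, epsilon=2.5)
```

In this form, assigning a larger ε to a bead pair lowers the energy of keeping that pair near its preferred distance, which matches the abstract's description of ε as an indicator of binding stability.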
Affiliation(s)
- Mengsheng Zha
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA
- Nan Wang
- Department of Computer Science, New Jersey City University, 2039 Kennedy Blvd, Jersey City, NJ 07305, USA
- Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Dr, Hattiesburg, MS 39406, USA
- Zheng Wang
- Department of Computer Science, University of Miami, 1364 Memorial Drive, Coral Gables, FL 33124, USA
13
Quan C, Li Y, Liu X, Wang Y, Ping J, Lu Y, Zhou G. Characterization of structural variation in Tibetans reveals new evidence of high-altitude adaptation and introgression. Genome Biol 2021; 22:159. [PMID: 34034800 PMCID: PMC8146648 DOI: 10.1186/s13059-021-02382-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 12/23/2020] [Accepted: 05/14/2021] [Indexed: 01/09/2023]
Abstract
BACKGROUND Structural variation (SV) acts as an essential mutational force shaping the evolution and function of the human genome. However, few studies have examined the role of SVs in high-altitude adaptation and little is known of adaptive introgressed SVs in Tibetans so far. RESULTS Here, we generate a comprehensive catalog of SVs in a Chinese Tibetan (n = 15) and Han (n = 10) population using nanopore sequencing technology. Among a total of 38,216 unique SVs in the catalog, 27% are sequence-resolved for the first time. We systematically assess the distribution of these SVs across repeat sequences and functional genomic regions. Through genotyping in additional 276 genomes, we identify 69 Tibetan-Han stratified SVs and 80 candidate adaptive genes. We also discover a few adaptive introgressed SV candidates and provide evidence for a deletion of 335 base pairs at 1p36.32. CONCLUSIONS Overall, our results highlight the important role of SVs in the evolutionary processes of Tibetans' adaptation to the Qinghai-Tibet Plateau and provide a valuable resource for future high-altitude adaptation studies.
Affiliation(s)
- Cheng Quan
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yuanfeng Li
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Xinyi Liu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yahui Wang
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Jie Ping
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Yiming Lu
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Gangqiao Zhou
- Department of Genetics & Integrative Omics, State Key Laboratory of Proteomics, National Center for Protein Sciences, Beijing Institute of Radiation Medicine, 27 Taiping Road, Beijing, 100850 People’s Republic of China
- Hebei University, Baoding, Hebei Province 071002 People’s Republic of China
- Collaborative Innovation Center for Personalized Cancer Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu Province 211166 People’s Republic of China
- Medical College of Guizhou University, Guiyang, Guizhou Province 550025 People’s Republic of China
14
Gong H, Yang Y, Zhang S, Li M, Zhang X. Application of Hi-C and other omics data analysis in human cancer and cell differentiation research. Comput Struct Biotechnol J 2021; 19:2070-2083. [PMID: 33995903 PMCID: PMC8086027 DOI: 10.1016/j.csbj.2021.04.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/10/2020] [Revised: 04/04/2021] [Accepted: 04/04/2021] [Indexed: 02/07/2023]
Abstract
With the development of 3C (chromosome conformation capture) and its derivative technology Hi-C (High-throughput chromosome conformation capture) research, the study of the spatial structure of the genomic sequence in the nucleus helps researchers understand the functions of biological processes such as gene transcription, replication, repair, and regulation. In this paper, we first introduce the research background and purpose of Hi-C data visualization analysis. After that, we discuss the Hi-C data analysis methods from genome 3D structure, A/B compartment, TADs (topologically associated domain), and loop detection. We also discuss how to apply genome visualization technologies to the identification of chromosome feature structures. We continue with a review of correlation analysis differences among multi-omics data, and how to apply Hi-C and other omics data analysis into cancer and cell differentiation research. Finally, we summarize the various problems in joint analyses based on Hi-C and other multi-omics data. We believe this review can help researchers better understand the progress and applications of 3D genome technology.
Affiliation(s)
- Haiyan Gong
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
- Yi Yang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Sichen Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Minghong Li
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Xiaotong Zhang
- Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Advanced Innovation Center for Materials Genome Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Shunde Graduate School of University of Science and Technology Beijing, Foshan 528000, China
15
Belokopytova P, Fishman V. Predicting Genome Architecture: Challenges and Solutions. Front Genet 2021; 11:617202. [PMID: 33552135 PMCID: PMC7862721 DOI: 10.3389/fgene.2020.617202] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/14/2020] [Accepted: 12/15/2020] [Indexed: 12/22/2022]
Abstract
Genome architecture plays a pivotal role in gene regulation. The use of high-throughput methods for chromatin profiling and 3-D interaction mapping provide rich experimental data sets describing genome organization and dynamics. These data challenge development of new models and algorithms connecting genome architecture with epigenetic marks. In this review, we describe how chromatin architecture could be reconstructed from epigenetic data using biophysical or statistical approaches. We discuss the applicability and limitations of these methods for understanding the mechanisms of chromatin organization. We also highlight the emergence of new predictive approaches for scoring effects of structural variations in human cells.
Affiliation(s)
- Polina Belokopytova
- Natural Sciences Department, Novosibirsk State University, Novosibirsk, Russia
- Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia
- Veniamin Fishman
- Natural Sciences Department, Novosibirsk State University, Novosibirsk, Russia
- Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia