1
Zhang S, Xu N, Fu L, Yang X, Ma K, Li Y, Yang Z, Li Z, Feng Y, Jiang X, Han J, Hu R, Zhang L, Lian D, de Gennaro L, Paparella A, Ryabov F, Meng D, He Y, Wu D, Yang C, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Integrated analysis of the complete sequence of a macaque genome. Nature 2025; 640:714-721. [PMID: 40011769 PMCID: PMC12003069 DOI: 10.1038/s41586-025-08596-w] [Received: 04/07/2024] [Accepted: 01/03/2025] [Indexed: 02/28/2025]
Abstract
The crab-eating macaque (Macaca fascicularis) and the rhesus macaque (Macaca mulatta) are pivotal in biomedical and evolutionary research [1-3]. However, their genomic complexity and interspecies genetic differences remain unclear [4]. Here, we present a complete genome assembly of a crab-eating macaque, revealing 46% fewer segmental duplications and 3.83 times longer centromeres than those of humans [5,6]. We also characterize 93 large-scale genomic differences between macaques and humans at single-base-pair resolution, highlighting their impact on gene regulation in primate evolution. Using ten long-read macaque genomes, hundreds of short-read macaque genomes and full-length transcriptome data, we identified roughly 2 Mbp of fixed genetic variants, roughly 240 Mbp of complex loci, 16.76 Mbp of genetic differentiation regions and 110 alternative splice events, potentially associated with various phenotypic differences between the two macaque species. In summary, the integrated genetic analysis enhances understanding of lineage-specific phenotypes, adaptation and primate evolution, thereby improving the biomedical application of macaques in human disease research.
Affiliation(s)
- Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Ning Xu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
- Lianting Fu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Kaiyue Ma
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yamei Li
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Zikun Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Zhengtong Li
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yu Feng
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu, China
- Xinrui Jiang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Junmin Han
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Ruixing Hu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Lu Zhang
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
- Da Lian
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Luciana de Gennaro
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Annalisa Paparella
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Dan Meng
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yaoxi He
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Dongya Wu
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
- Chentao Yang
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Yuxiang Mao
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- University of Chinese Academy of Sciences, Beijing, China
- Xinyan Bian
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yong Lu
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Francesca Antonacci
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Mario Ventura
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Valery A Shepelev
- Institute of Molecular Genetics, Russian Academy of Sciences, Moscow, Russia
- Karen H Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
- Ivan A Alexandrov
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Glennis A Logsdon
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Adam M Phillippy
- Center for Genomics and Data Science Research, Genome Informatics Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Bing Su
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Yunnan Key Laboratory of Integrative Anthropology, Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
- Guojie Zhang
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China
- Center of Evolutionary and Organismal Biology, and Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- School of Medicine, Zhejiang University, Hangzhou, China
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China
- Qiang Sun
- Institute of Neuroscience, Center for Excellence in Brain Science and Intelligence Technology, State Key Laboratory of Neuroscience, Chinese Academy of Sciences, Shanghai, China.
- Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai, China.
- National Key Laboratory of Genetic Evolution and Animal Model, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
- University of Chinese Academy of Sciences, Beijing, China.
- Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
- Center for Genomic Research, International Institutes of Medicine, Fourth Affiliated Hospital, Zhejiang University, Yiwu, China.
- Shanghai Key Laboratory of Embryo Original Diseases, International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.
2
Yan H, Han J, Jin S, Han Z, Si Z, Yan S, Xuan L, Yu G, Guan X, Fang L, Wang K, Zhang T. Post-polyploidization centromere evolution in cotton. Nat Genet 2025; 57:1021-1030. [PMID: 40033059 DOI: 10.1038/s41588-025-02115-3] [Received: 09/29/2023] [Accepted: 02/03/2025] [Indexed: 03/05/2025]
Abstract
Upland cotton (Gossypium hirsutum) accounts for more than 90% of the world's cotton production and, as an allotetraploid, is a model plant for polyploid crop domestication. In the present study, we reported a complete telomere-to-telomere (T2T) genome assembly of Upland cotton accession Texas Marker-1 (T2T-TM-1), which has a total size of 2,299.6 Mb, and annotated 79,642 genes. Based on T2T-TM-1, interspecific centromere divergence was detected between the A- and D-subgenomes and their corresponding diploid progenitors. Centromere-associated repetitive sequences (CRCs) were found to be enriched for Gypsy-like retroelements. Centromere size expansion, repositioning and structure variations occurred post-polyploidization. It is interesting that CRC homologs were transferred from the diploid D-genome progenitor to the D-subgenome, invaded the A-subgenome and then underwent post-tetraploidization proliferation. This suggests an evolutionary advantage for the CRCs of the D-genome progenitor, presents a D-genome-adopted inheritance of centromere repeats after polyploidization and shapes the dynamic centromeric landscape during polyploidization in polyploid species.
Affiliation(s)
- Hu Yan
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Jinlei Han
- School of Life Sciences, Nantong University, Nantong, China
- Shangkun Jin
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Zegang Han
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Zhanfeng Si
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Sunyi Yan
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Lisha Xuan
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Guangrun Yu
- School of Life Sciences, Nantong University, Nantong, China
- Xueying Guan
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
- Lei Fang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China.
- Hainan Institute of Zhejiang University, Sanya, China.
- Kai Wang
- School of Life Sciences, Nantong University, Nantong, China.
- Tianzhen Zhang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, the Advanced Seed Institute, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Key Laboratory of Plant Factory Generation-adding Breeding, Ministry of Agriculture and Rural Affairs, Zhejiang University, Hangzhou, China.
- Hainan Institute of Zhejiang University, Sanya, China.
3
Ouyang J. Transcription as a double-edged sword in genome maintenance. FEBS Lett 2025; 599:147-156. [PMID: 39704019 DOI: 10.1002/1873-3468.15080] [Received: 02/16/2024] [Revised: 10/29/2024] [Accepted: 10/31/2024] [Indexed: 12/21/2024]
Abstract
Genome maintenance is essential for the integrity of the genetic blueprint, of which only a small fraction is transcribed in higher eukaryotes. DNA lesions occurring in the transcribed genome trigger transcription pausing and transcription-coupled DNA repair. There are two major transcription-coupled DNA repair pathways. The transcription-coupled nucleotide excision repair (TC-NER) pathway has been well studied for decades, while the transcription-coupled homologous recombination repair (TC-HR) pathway has recently gained attention. Importantly, recent studies have uncovered crucial roles of RNA transcripts in TC-HR, opening exciting directions for future research. Transcription also plays pivotal roles in regulating the stability of highly specialized genomic structures such as telomeres, centromeres, and fragile sites. Despite their positive function in genome maintenance, transcription and RNA transcripts can also be the sources of genomic instability, especially when colliding with DNA replication and forming unscheduled pathological RNA:DNA hybrids (R-loops), respectively. Pathological R-loops can result from transcriptional stress, which may be induced by transcription dysregulation. Future investigation into the interplay between transcription and DNA repair will reveal novel molecular bases for genome maintenance and transcriptional stress-associated genomic instability, providing therapeutic targets for human disease intervention.
Affiliation(s)
- Jian Ouyang
- Department of Biochemistry and Molecular Biology
- Hollings Cancer Center, Medical University of South Carolina, Charleston, SC, USA
4
Glunčić M, Vlahović I, Rosandić M, Paar V. Neuroblastoma Breakpoint Family 3mer Higher Order Repeats/Olduvai Triplet Pattern in the Complete Genome of Human and Nonhuman Primates and Relation to Cognitive Capacity. Genes (Basel) 2024; 15:1598. [PMID: 39766865 PMCID: PMC11675761 DOI: 10.3390/genes15121598] [Received: 11/18/2024] [Revised: 12/03/2024] [Accepted: 12/12/2024] [Indexed: 01/11/2025] [Open access]
Abstract
BACKGROUND/OBJECTIVES: The ~1.6 kb NBPF repeat units in neuroblastoma breakpoint family (NBPF) genes are specific to humans and are associated with cognitive capacity in higher primates. While the number of NBPF monomers/Olduvai sequences in humans is approximately 2-3 times greater than in great apes, the difference in copy number values of canonical NBPF 3mer higher-order repeats (HORs)/Olduvai triplets between humans and great apes is substantially larger. This study aims to analyze the organization and evolutionary significance of NBPF 3mer HORs/Olduvai triplets in fully sequenced primate genomes. METHODS: We applied the global repeat map (GRM) algorithm to identify canonical and variant NBPF 3mer HORs/Olduvai triplets in the complete genomes of humans, chimpanzees, gorillas, and orangutans. The resulting monomer arrays were analyzed using the GRMhor algorithm to generate detailed schematic representations of NBPF HOR organization. RESULTS: The analysis reveals a distinct difference in NBPF-related patterns among these primates, particularly in the number of tandemly organized canonical 3mer HORs/Olduvai triplets: 61 tandemly organized canonical NBPF 3mer HORs/Olduvai triplets in humans, compared to 0 in chimpanzees and orangutans, and 9 in gorillas. When considering only tandemly organized 3mer HORs/Olduvai triplets with more than three copies, the numbers adjust to 36 in humans and 0 in great apes. Furthermore, the divergence between individual NBPF monomers in humans and great apes is twice as high as that observed within great apes. CONCLUSIONS: These findings support the hypothesis that the tandem organization of NBPF 3mer HORs/Olduvai triplets plays a crucial role in enhancing cognitive capacity in humans compared to great apes, potentially providing a significant evolutionary advantage. This effect complements the impact of the increased number of individual NBPF monomers/Olduvai sequences, together contributing to a synergistic amplification effect.
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Ines Vlahović
- Department of Interdisciplinary Sciences, Algebra University College, 10000 Zagreb, Croatia
- Marija Rosandić
- Department of Internal Medicine, University Hospital Centre Zagreb, 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Vladimir Paar
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
5
Glunčić M, Barić D, Paar V. Efficient genome monomer higher-order structure annotation and identification using the GRMhor algorithm. Bioinform Adv 2024; 4:vbae191. [PMID: 39659587 PMCID: PMC11630843 DOI: 10.1093/bioadv/vbae191] [Received: 08/20/2024] [Revised: 11/02/2024] [Accepted: 11/26/2024] [Indexed: 12/12/2024]
Abstract
Motivation: Tandem monomeric units, integral components of eukaryotic genomes, form higher-order repeat (HOR) structures that play crucial roles in maintaining chromosome integrity and regulating gene expression and protein abundance. Given their significant influence on processes such as evolution, chromosome segregation, and disease, developing a sensitive and automated tool for identifying HORs across diverse genomic sequences is essential. Results: In this study, we applied the GRMhor (Global Repeat Map hor) algorithm to analyse the centromeric region of chromosome 20 in three individual human genomes, as well as in the centromeric regions of three higher primates. In all three human genomes, we identified six distinct HOR arrays, which revealed significantly greater differences in the number of canonical and variant copies, as well as in their overall structure, than would be expected given the 99.9% genetic similarity among humans. Furthermore, our analysis of higher primate genomes, which revealed entirely different HOR sequences, indicates a much larger genomic divergence between humans and higher primates than previously recognized. These results underscore the suitability of the GRMhor algorithm for studying specificities in individual genomes, particularly those involving repetitive monomers in centromere structure, which is essential for proper chromosome segregation during cell division, while also highlighting its utility in exploring centromere evolution and other repetitive genomic regions. Availability and implementation: Source code and example binaries are freely available for download at github.com/gluncic/GRM2023.
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, Zagreb 10000, Croatia
- Domjan Barić
- Faculty of Science, University of Zagreb, Zagreb 10000, Croatia
- Vladimir Paar
- Faculty of Science, University of Zagreb, Zagreb 10000, Croatia
- Department of Mathematical, Physical and Chemical Sciences, Croatian Academy of Sciences and Arts, Zagreb 10000, Croatia
6
Jolma A, Hernandez-Corchado A, Yang AW, Fathi A, Laverty KU, Brechalov A, Razavi R, Albu M, Zheng H, Kulakovskiy IV, Najafabadi HS, Hughes TR. GHT-SELEX demonstrates unexpectedly high intrinsic sequence specificity and complex DNA binding of many human transcription factors. bioRxiv 2024:2024.11.11.618478. [PMID: 39605368 PMCID: PMC11601218 DOI: 10.1101/2024.11.11.618478] [Indexed: 11/29/2024]
Abstract
A long-standing challenge in human regulatory genomics is that transcription factor (TF) DNA-binding motifs are short and degenerate, while the genome is large. Motif scans therefore produce many false-positive binding site predictions. By surveying 179 TFs across 25 families using >1,500 cyclic in vitro selection experiments with fragmented, naked, and unmodified genomic DNA - a method we term GHT-SELEX (Genomic HT-SELEX) - we find that many human TFs possess much higher sequence specificity than anticipated. Moreover, genomic binding regions from GHT-SELEX are often surprisingly similar to those obtained in vivo (i.e. ChIP-seq peaks). We find that comparable specificity can also be obtained from motif scans, but performance is highly dependent on derivation and use of the motifs, including accounting for multiple local matches in the scans. We also observe alternative engagement of multiple DNA-binding domains within the same protein: long C2H2 zinc finger proteins often utilize modular DNA recognition, engaging different subsets of their DNA binding domain (DBD) arrays to recognize multiple types of distinct target sites, frequently evolving via internal duplication and divergence of one or more DBDs. Thus, contrary to conventional wisdom, it is common for TFs to possess sufficient intrinsic specificity to independently delineate cellular targets.
Affiliation(s)
- Arttu Jolma
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Aldo Hernandez-Corchado
- Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
- Victor P. Dahdaleh Institute of Genomic Medicine, Montréal, QC H3A 0G1, Canada
- Ally W.H. Yang
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Ali Fathi
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- Kaitlin U. Laverty
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Rozita Razavi
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Mihai Albu
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Hong Zheng
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Ivan V. Kulakovskiy
- Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, Moscow, Russia
- Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Russia
- Hamed S. Najafabadi
- Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
- Victor P. Dahdaleh Institute of Genomic Medicine, Montréal, QC H3A 0G1, Canada
- Timothy R. Hughes
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
7
Glunčić M, Vlahović I, Rosandić M, Paar V. Novel Cascade Alpha Satellite HORs in Orangutan Chromosome 13 Assembly: Discovery of the 59mer HOR-The largest Unit in Primates-And the Missing Triplet 45/27/18 HOR in Human T2T-CHM13v2.0 Assembly. Int J Mol Sci 2024; 25:7596. [PMID: 39062839 PMCID: PMC11276891 DOI: 10.3390/ijms25147596] [Received: 06/20/2024] [Revised: 07/05/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] [Open access]
Abstract
From the recent genome assembly NHGRI_mPonAbe1-v2.0_NCBI (GCF_028885655.2) of orangutan chromosome 13, we computed the precise alpha satellite higher-order repeat (HOR) structure using the novel high-precision GRM2023 algorithm with Global Repeat Map (GRM) and Monomer Distance (MD) diagrams. This study rigorously identified alpha satellite HORs in the centromere of orangutan chromosome 13, discovering a novel 59mer HOR, the longest HOR unit identified in any primate to date. Additionally, it revealed the first intertwined sequence of three HORs, 18mer/27mer/45mer HORs, with a common aligned "backbone" across all HOR copies. The major 7mer HOR exhibits a Willard's-type canonical copy, although some segments of the array display significant irregularities. In contrast, the 14mer HOR forms a regular Willard's-type HOR array. Surprisingly, the GRM2023 high-precision analysis of chromosome 13 of human genome assembly T2T-CHM13v2.0 reveals the presence of only a 7mer HOR, despite both the orangutan and human genome assemblies being derived from whole genome shotgun sequences.
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Ines Vlahović
- Department of Interdisciplinary Sciences, Algebra University College, 10000 Zagreb, Croatia
- Marija Rosandić
- University Hospital Centre Zagreb (Ret.), 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Vladimir Paar
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
8
Filliaux S, Bertelsen C, Baughman H, Komives E, Lyubchenko Y. The Interaction of NF-κB Transcription Factor with Centromeric Chromatin. J Phys Chem B 2024; 128:5803-5813. [PMID: 38860885 DOI: 10.1021/acs.jpcb.3c08388] [Indexed: 06/12/2024]
Abstract
Centromeric chromatin is a subset of chromatin structure and governs chromosome segregation. The centromere is composed of both CENP-A nucleosomes (CENP-Anuc) and H3 nucleosomes (H3nuc) and is enriched with alpha-satellite (α-sat) DNA repeats. CENP-Anuc have a different structure than H3nuc, with the length of wrapped DNA decreasing from 147 base pairs (bp) for H3nuc to 121 bp for CENP-Anuc. All these factors can contribute to centromere function. We investigated the interaction of H3nuc and CENP-Anuc with NF-κB, a crucial transcription factor in regulating immune response and inflammation. We utilized atomic force microscopy (AFM) to characterize complexes of both types of nucleosomes with NF-κB. We found that NF-κB unravels H3nuc, removing more than 20 bp of DNA, and that NF-κB binds to the nucleosomal core. Similar results were obtained for the truncated variant of NF-κB comprising only the Rel homology domain and missing the transcription activation domain (TAD), suggesting that the RelA TAD is not critical in unraveling H3nuc. By contrast, NF-κB did not bind to or unravel CENP-Anuc. These findings, which show that the two types of nucleosomes differ in their affinity for NF-κB, may have implications for understanding the mechanisms of gene expression in bulk and centromere chromatin.
Affiliation(s)
- Shaun Filliaux
- Department of Pharmaceutical Sciences, University of Nebraska Medical Center, Omaha, Nebraska 68198-6025, United States
| | - Chloe Bertelsen
- Department of Pharmaceutical Sciences, University of Nebraska Medical Center, Omaha, Nebraska 68198-6025, United States
| | - Hannah Baughman
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California 92093-0378, United States
| | - Elizabeth Komives
- Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California 92093-0378, United States
| | - Yuri Lyubchenko
- Department of Pharmaceutical Sciences, University of Nebraska Medical Center, Omaha, Nebraska 68198-6025, United States
| |
|
9
|
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bornberg-Bauer E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJC, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PGS, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Pond SLK, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McGarvey KM, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O'Neill RJ, Eichler EE, Phillippy AM. The complete sequence and comparative analysis of ape sex chromosomes. Nature 2024; 630:401-411. [PMID: 38811727 PMCID: PMC11168930 DOI: 10.1038/s41586-024-07473-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 04/26/2024] [Indexed: 05/31/2024]
Abstract
Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility1. The X chromosome is vital for reproduction and cognition2. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.
Affiliation(s)
| | - Brandon D Pickett
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | - Monika Cechova
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Karol Pal
- Penn State University, University Park, PA, USA
| | - Sergey Nurk
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - DongAhn Yoo
- University of Washington School of Medicine, Seattle, WA, USA
| | - Qiuhui Li
- Johns Hopkins University, Baltimore, MD, USA
| | - Prajna Hebbar
- University of California Santa Cruz, Santa Cruz, CA, USA
| | | | | | | | | | | | - Erich Bornberg-Bauer
- University of Münster, Münster, Germany
- MPI for Developmental Biology, Tübingen, Germany
| | - Gerard G Bouffard
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Shelise Y Brooks
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Lucia Carbone
- Oregon Health and Science University, Portland, OR, USA
- Oregon National Primate Research Center, Hillsboro, OR, USA
| | - Laura Carrel
- Penn State University School of Medicine, Hershey, PA, USA
| | | | | | - Chen-Shan Chin
- Foundation of Biological Data Sciences, Belmont, CA, USA
| | | | | | | | - Mark Diekhans
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Amalia Dutra
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gage H Garcia
- University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Diana Haddad
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Glenn Hickey
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - David A Hillis
- University of California Santa Barbara, Santa Barbara, CA, USA
| | | | - Hyeonsoo Jeong
- University of Washington School of Medicine, Seattle, WA, USA
| | | | | | | | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Yong-Hwee E Loh
- University of California Santa Barbara, Santa Barbara, CA, USA
| | - Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Kelly M McGarvey
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | | | | | - Karen H Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
| | | | - Evgenia Pak
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Benedict Paten
- University of California Santa Cruz, Santa Cruz, CA, USA
| | | | | | - Arang Rhie
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Joana L Rocha
- University of California Berkeley, Berkeley, CA, USA
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | | | - Samuel Sacco
- University of California Santa Cruz, Santa Cruz, CA, USA
| | | | | | | | - Steven J Solar
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | - Sweetalana
- Penn State University, University Park, PA, USA
| | - Alex Sweeten
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Johns Hopkins University, Baltimore, MD, USA
| | | | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Mario Ventura
- Università degli Studi di Bari Aldo Moro, Bari, Italy
| | | | - Alice C Young
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Xinru Zhang
- Penn State University, University Park, PA, USA
| | | | | | | | - Soojin V Yi
- University of California Santa Barbara, Santa Barbara, CA, USA
| | | | | | - Sergey Koren
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | - Evan E Eichler
- University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Adam M Phillippy
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| |
|
10
|
Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Catacchio CR, Porubsky D, Mao Y, Yoo D, Rautiainen M, Koren S, Nurk S, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Ventura M, Alexandrov IA, Eichler EE. The variation and evolution of complete human centromeres. Nature 2024; 629:136-145. [PMID: 38570684 PMCID: PMC11062924 DOI: 10.1038/s41586-024-07278-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 03/07/2024] [Indexed: 04/05/2024]
Abstract
Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size1. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions2,3. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome4,5. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Affiliation(s)
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Allison N Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | | | - Claudia R Catacchio
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Oxford Nanopore Technologies, Oxford, United Kingdom
| | - Julian K Lucas
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mario Ventura
- Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
| | - Ivan A Alexandrov
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
- Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| |
|
11
|
Glunčić M, Vlahović I, Rosandić M, Paar V. Novel Concept of Alpha Satellite Cascading Higher-Order Repeats (HORs) and Precise Identification of 15mer and 20mer Cascading HORs in Complete T2T-CHM13 Assembly of Human Chromosome 15. Int J Mol Sci 2024; 25:4395. [PMID: 38673983 PMCID: PMC11050224 DOI: 10.3390/ijms25084395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024]
Abstract
Unraveling the intricate centromere structure of human chromosomes holds profound implications, illuminating fundamental genetic mechanisms and potentially advancing our comprehension of genetic disorders and therapeutic interventions. This study rigorously identified and structurally analyzed alpha satellite higher-order repeats (HORs) within the centromere of human chromosome 15 in the complete T2T-CHM13 assembly using the high-precision GRM2023 algorithm. The most extensive alpha satellite HOR array in chromosome 15 reveals a novel cascading HOR, housing 429 15mer HOR copies, containing 4-, 7- and 11-monomer subfragments. Within each row of cascading HORs, all alpha satellite monomers are of distinct types, as in regular Willard's HORs. However, different HOR copies within the same cascading 15mer HOR contain more than one monomer of the same type. Each canonical 15mer HOR copy comprises 15 monomers belonging to only 9 different monomer types. Notably, 65% of the 429 15mer cascading HOR copies exhibit canonical structures, while 35% display variant configurations. Identified as the second most extensive alpha satellite HOR, another novel cascading HOR within human chromosome 15 encompasses 164 20mer HOR copies, each featuring two subfragments. Moreover, a distinct pattern emerges as interspersed 25mer/26mer structures differing from regular Willard's HORs and giving rise to a 34-monomer subfragment. Only a minor 18mer HOR array of 12 HOR copies is of the regular Willard's type. These revelations highlight the complexity within the chromosome 15 centromeric region, accentuating deviations from anticipated highly regular patterns and hinting at profound information encoding and functional potential within the human centromere.
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
| | - Ines Vlahović
- Algebra LAB, Algebra University College, 10000 Zagreb, Croatia
| | - Marija Rosandić
- Department of Internal Medicine, University Hospital Centre Zagreb, 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
| | - Vladimir Paar
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
| |
|
12
|
Di Tommaso E, Giunta S. Dynamic interplay between human alpha-satellite DNA structure and centromere functions. Semin Cell Dev Biol 2024; 156:130-140. [PMID: 37926668 DOI: 10.1016/j.semcdb.2023.10.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2023] [Revised: 10/04/2023] [Accepted: 10/10/2023] [Indexed: 11/07/2023]
Abstract
Maintenance of genome stability relies on functional centromeres for correct chromosome segregation and faithful inheritance of the genetic information. The human centromere is the primary constriction within mitotic chromosomes made up of repetitive alpha-satellite DNA hierarchically organized in megabase-long arrays of near-identical higher order repeats (HORs). Centromeres are epigenetically specified by the presence of the centromere-specific histone H3 variant, CENP-A, which enables the assembly of the kinetochore for microtubule attachment. Notably, centromeric DNA is faithfully inherited as intact haplotypes from the parents to the offspring without intervening recombination, yet, outside of meiosis, centromeres are akin to common fragile sites (CFSs), manifesting crossing-overs and ongoing sequence instability. Consequences of DNA changes within the centromere are just starting to emerge, with unclear effects on intra- and inter-generational inheritance driven by centromere's essential role in kinetochore assembly. Here, we review evidence of meiotic selection operating to mitigate centromere drive, as well as recent reports on centromere damage, recombination and repair during the mitotic cell division. We propose an antagonistic pleiotropy interpretation to reconcile centromere DNA instability as both driver of aneuploidy that underlies degenerative diseases, while also potentially necessary for the maintenance of homogenized HORs for centromere function. We attempt to provide a framework for this conceptual leap taking into consideration the structural interface of centromere-kinetochore interaction and present case scenarios for its malfunctioning. Finally, we offer an integrated working model to connect DNA instability, chromatin, and structural changes with functional consequences on chromosome integrity.
Affiliation(s)
- Elena Di Tommaso
- Laboratory of Genome Evolution, Department of Biology & Biotechnology Charles Darwin, Sapienza University of Rome, Rome 00185, Italy
| | - Simona Giunta
- Laboratory of Genome Evolution, Department of Biology & Biotechnology Charles Darwin, Sapienza University of Rome, Rome 00185, Italy
| |
|
13
|
Packiaraj J, Thakur J. DNA satellite and chromatin organization at mouse centromeres and pericentromeres. Genome Biol 2024; 25:52. [PMID: 38378611 PMCID: PMC10880262 DOI: 10.1186/s13059-024-03184-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 02/12/2024] [Indexed: 02/22/2024]
Abstract
BACKGROUND: Centromeres are essential for faithful chromosome segregation during mitosis and meiosis. However, the organization of satellite DNA and chromatin at mouse centromeres and pericentromeres is poorly understood due to the challenges of assembling repetitive genomic regions. RESULTS: Using recently available PacBio long-read sequencing data from the C57BL/6 strain, we find that contrary to the previous reports of their homogeneous nature, both centromeric minor satellites and pericentromeric major satellites exhibit a high degree of variation in sequence and organization within and between arrays. While most arrays are continuous, a significant fraction is interspersed with non-satellite sequences, including transposable elements. Using chromatin immunoprecipitation sequencing (ChIP-seq), we find that the occupancy of CENP-A and H3K9me3 chromatin at centromeric and pericentric regions, respectively, is associated with increased sequence enrichment and homogeneity at these regions. The transposable elements at centromeric regions are not part of functional centromeres as they lack significant CENP-A enrichment. Furthermore, both CENP-A and H3K9me3 nucleosomes occupy minor and major satellites spanning centromeric-pericentric junctions and a low yet significant amount of CENP-A spreads locally at centromere junctions on both pericentric and telocentric sides. Finally, while H3K9me3 nucleosomes display a well-phased organization on major satellite arrays, CENP-A nucleosomes on minor satellite arrays are poorly phased. Interestingly, the homogeneous class of major satellites also phase CENP-A and H3K27me3 nucleosomes, indicating that the nucleosome phasing is an inherent property of homogeneous major satellites. CONCLUSIONS: Our findings reveal that mouse centromeres and pericentromeres display a high diversity in satellite sequence, organization, and chromatin structure.
Affiliation(s)
- Jenika Packiaraj
- Department of Biology, Emory University, 1510 Clifton Rd, Atlanta, GA, 30322, USA
| | - Jitendra Thakur
- Department of Biology, Emory University, 1510 Clifton Rd, Atlanta, GA, 30322, USA
| |
|
14
|
Filliaux S, Bertelsen C, Baughman H, Komives E, Lyubchenko YL. The Interaction of NF-κB Transcription Factor with Centromeric Chromatin. bioRxiv 2024:2024.02.13.580208. [PMID: 38405937 PMCID: PMC10888803 DOI: 10.1101/2024.02.13.580208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Centromeric chromatin is a subset of chromatin structure and governs chromosome segregation. The centromere is composed of both CENP-A nucleosomes (CENP-Anuc) and H3 nucleosomes (H3nuc) and is enriched with alpha-satellite (α-sat) DNA repeats. These CENP-Anuc have a different structure than H3nuc, decreasing the base pairs (bp) of wrapped DNA from 147 bp for H3nuc to 121 bp for CENP-Anuc. All these factors can contribute to centromere function. We investigated the interaction of H3nuc and CENP-Anuc with NF-κB, a crucial transcription factor in regulating immune response and inflammation. We utilized Atomic Force Microscopy (AFM) to characterize complexes of both types of nucleosomes with NF-κB. We found that NF-κB unravels H3nuc, removing more than 20 bp of DNA, and that NF-κB binds to the nucleosomal core. Similar results were obtained for the truncated variant of NF-κB comprised only of the Rel Homology domain and missing the transcription activation domain (TAD), suggesting the RelA TAD is not critical in unraveling H3nuc. By contrast, NF-κB did not bind to or unravel CENP-Anuc. These findings with different affinities for two types of nucleosomes to NF-κB may have implications for understanding the mechanisms of gene expression in bulk and centromere chromatin.
|
15
|
Brannan EO, Hartley GA, O’Neill RJ. Mechanisms of Rapid Karyotype Evolution in Mammals. Genes (Basel) 2023; 15:62. [PMID: 38254952 PMCID: PMC10815390 DOI: 10.3390/genes15010062] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/12/2023] [Revised: 12/27/2023] [Accepted: 12/28/2023] [Indexed: 01/24/2024]
Abstract
Chromosome reshuffling events are often a foundational mechanism by which speciation can occur, giving rise to highly derivative karyotypes even amongst closely related species. Yet, the features that distinguish lineages prone to such rapid chromosome evolution from those that maintain stable karyotypes across evolutionary time are still to be defined. In this review, we summarize lineages prone to rapid karyotypic evolution in the context of Simpson's rates of evolution (tachytelic, horotelic, and bradytelic) and outline the mechanisms proposed to contribute to chromosome rearrangements, their fixation, and their potential impact on speciation events. Furthermore, we discuss relevant genomic features that underpin chromosome variation, including patterns of fusions/fissions, centromere positioning, and epigenetic marks such as DNA methylation. Finally, in the era of telomere-to-telomere genomics, we discuss the value of gapless genome resources to the future of research focused on the plasticity of highly rearranged karyotypes.
Affiliation(s)
- Emry O. Brannan
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Gabrielle A. Hartley
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Rachel J. O’Neill
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
| |
|
16
|
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bornberg-Bauer E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJ, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PG, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Kosakovsky Pond SL, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O’Neill RJ, Eichler E, Phillippy AM. The Complete Sequence and Comparative Analysis of Ape Sex Chromosomes. bioRxiv 2023:2023.11.30.569198. [PMID: 38077089 PMCID: PMC10705393 DOI: 10.1101/2023.11.30.569198] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/24/2023]
Abstract
Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution. These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.
Affiliation(s)
| | - Brandon D. Pickett
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | - Monika Cechova
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Karol Pal
- Penn State University, University Park, PA, USA
| | - Sergey Nurk
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - DongAhn Yoo
- University of Washington School of Medicine, Seattle, WA, USA
| | - Qiuhui Li
- Johns Hopkins University, Baltimore, MD, USA
| | - Prajna Hebbar
- University of California Santa Cruz, Santa Cruz, CA, USA
| | | | | | | | | | | | - Erich Bornberg-Bauer
- University of Münster, Münster, Germany
- MPI for Developmental Biology, Tübingen, Germany
| | - Gerard G. Bouffard
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Shelise Y. Brooks
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Lucia Carbone
- Oregon Health & Science University, Portland, OR, USA
- Oregon National Primate Research Center, Hillsboro, OR, USA
| | - Laura Carrel
- Penn State University School of Medicine, Hershey, PA, USA
| | | | | | - Chen-Shan Chin
- Foundation of Biological Data Sciences, Belmont, CA, USA
| | | | | | | | - Mark Diekhans
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Amalia Dutra
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gage H. Garcia
- University of Washington School of Medicine, Seattle, WA, USA
| | | | | | - Diana Haddad
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Glenn Hickey
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - David A. Hillis
- University of California Santa Barbara, Santa Barbara, CA, USA
| | | | - Hyeonsoo Jeong
- University of Washington School of Medicine, Seattle, WA, USA
| | | | | | | | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | | | - Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Karen H. Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benedict Paten
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Arang Rhie
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Samuel Sacco
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Steven J. Solar
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sweetalana
  - Penn State University, University Park, PA, USA
- Alex Sweeten
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Alice C. Young
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Xinru Zhang
  - Penn State University, University Park, PA, USA
- Soojin V. Yi
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Sergey Koren
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan Eichler
  - University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M. Phillippy
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA

17
Arora UP, Sullivan BA, Dumont BL. Variation in the CENP-A sequence association landscape across diverse inbred mouse strains. Cell Rep 2023; 42:113178. [PMID: 37742188 PMCID: PMC10873113 DOI: 10.1016/j.celrep.2023.113178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 04/25/2023] [Accepted: 09/08/2023] [Indexed: 09/26/2023] [Open Access]
Abstract
Centromeres are crucial for chromosome segregation, but their underlying sequences evolve rapidly, imposing strong selection for compensatory changes in centromere-associated kinetochore proteins to assure the stability of genome transmission. While this co-evolution is well documented between species, it remains unknown whether population-level centromere diversity leads to functional differences in kinetochore protein association. Mice (Mus musculus) exhibit remarkable variation in centromere size and sequence, but the amino acid sequence of the kinetochore protein CENP-A is conserved. Here, we apply k-mer-based analyses to CENP-A chromatin profiling data from diverse inbred mouse strains to investigate the interplay between centromere variation and kinetochore protein sequence association. We show that centromere sequence diversity is associated with strain-level differences in both CENP-A positioning and sequence preference along the mouse core centromere satellite. Our findings reveal intraspecies sequence-dependent differences in CENP-A/centromere association and open additional perspectives for understanding centromere-mediated variation in genome stability.
Affiliation(s)
- Uma P Arora
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA; Graduate School of Biomedical Sciences, Tufts University, 136 Harrison Avenue, Boston, MA 02111, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, 213 Research Drive, Box 3054, Durham, NC 27710, USA
- Beth L Dumont
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA; Graduate School of Biomedical Sciences, Tufts University, 136 Harrison Avenue, Boston, MA 02111, USA; Graduate School of Biomedical Science and Engineering, University of Maine, 5775 Stodder Hall, Room 46, Orono, ME 04469, USA

18
Takata H, Masuda Y, Ohmido N. CRISPR imaging reveals chromatin fluctuation at the centromere region related to cellular senescence. Sci Rep 2023; 13:14609. [PMID: 37670098 PMCID: PMC10480159 DOI: 10.1038/s41598-023-41770-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 08/31/2023] [Indexed: 09/07/2023] [Open Access]
Abstract
The human genome is spatially and temporally organized in the nucleus as chromatin, and the dynamic structure of chromatin is closely related to genome functions. Cellular senescence characterized by an irreversible arrest of proliferation is accompanied by chromatin reorganisation in the nucleus during senescence. However, chromatin dynamics in chromatin reorganisation is poorly understood. Here, we report chromatin dynamics at the centromere region during senescence in cultured human cell lines using live imaging based on the clustered regularly interspaced short palindromic repeat/dCas9 system. The repetitive sequence at the centromere region, alpha-satellite DNA, was predominantly detected on chromosomes 1, 12, and 19. Centromeric chromatin formed irregular-shaped domains with high fluctuation in cells undergoing 5'-aza-2'-deoxycytidine-induced senescence. Our findings suggest that the increased fluctuation of the chromatin structure facilitates centromere disorganisation during cellular senescence.
Affiliation(s)
- Hideaki Takata
  - Biomedical Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Ikeda, Osaka, 563-8577, Japan
- Yumena Masuda
  - Graduate School of Human Development and Environment, Kobe University, Nada-ku, Kobe, 657-8501, Japan
- Nobuko Ohmido
  - Graduate School of Human Development and Environment, Kobe University, Nada-ku, Kobe, 657-8501, Japan

19
Packiaraj J, Thakur J. DNA satellite and chromatin organization at house mouse centromeres and pericentromeres. bioRxiv 2023:2023.07.18.549612. [PMID: 37503200 PMCID: PMC10370071 DOI: 10.1101/2023.07.18.549612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Centromeres are essential for faithful chromosome segregation during mitosis and meiosis. However, the organization of satellite DNA and chromatin at mouse centromeres and pericentromeres is poorly understood due to the challenges of sequencing and assembling repetitive genomic regions. Using recently available PacBio long-read sequencing data from the C57BL/6 strain and chromatin profiling, we found that, contrary to previous reports of their highly homogeneous nature, centromeric and pericentromeric satellites display varied sequences and organization. We find that both centromeric minor satellites and pericentromeric major satellites exhibit sequence variations within and between arrays. While most arrays are continuous, a significant fraction is interspersed with non-satellite sequences, including transposable elements. Additionally, we investigated CENP-A and H3K9me3 chromatin organization at centromeres and pericentromeres using chromatin immunoprecipitation sequencing (ChIP-seq). We found that the occupancy of CENP-A and H3K9me3 chromatin at centromeric and pericentric regions, respectively, is associated with increased sequence abundance and homogeneity at these regions. Furthermore, the transposable elements at centromeric regions are not part of functional centromeres as they lack CENP-A enrichment. Finally, we found that while H3K9me3 nucleosomes display a well-phased organization on major satellite arrays, CENP-A nucleosomes on minor satellite arrays lack phased organization. Interestingly, the homogeneous class of major satellites phases CENP-A and H3K27me3 nucleosomes as well, indicating that nucleosome phasing is an inherent property of homogeneous major satellites. Overall, our findings reveal that house mouse centromeres and pericentromeres, which were previously thought to be highly homogeneous, display significant diversity in satellite sequence, organization, and chromatin structure.
Affiliation(s)
- Jenika Packiaraj
  - Department of Biology, Emory University, 1510 Clifton Rd, Atlanta, GA 30322
- Jitendra Thakur
  - Department of Biology, Emory University, 1510 Clifton Rd, Atlanta, GA 30322

20
Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Mao Y, Rautiainen M, Koren S, Nurk S, Porubsky D, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Alexandrov IA, Eichler EE. The variation and evolution of complete human centromeres. bioRxiv 2023:2023.05.30.542849. [PMID: 37398417 PMCID: PMC10312506 DOI: 10.1101/2023.05.30.542849] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 07/04/2023]
Abstract
We completely sequenced and assembled all centromeres from a second human genome and used two reference sets to benchmark genetic, epigenetic, and evolutionary variation within centromeres from a diversity panel of humans and apes. We find that centromere single-nucleotide variation can increase by up to 4.1-fold relative to other genomic regions, with the caveat that up to 45.8% of centromeric sequence, on average, cannot be reliably aligned with current methods due to the emergence of new α-satellite higher-order repeat (HOR) structures and two to threefold differences in the length of the centromeres. The extent to which this occurs differs depending on the chromosome and haplotype. Comparing the two sets of complete human centromeres, we find that eight harbor distinctly different α-satellite HOR array structures and four contain novel α-satellite HOR variants in high abundance. DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by at least 500 kbp, a property not readily associated with novel α-satellite HORs. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan, and macaque genomes. Comparative analyses reveal nearly complete turnover of α-satellite HORs, but with idiosyncratic changes in structure characteristic to each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the p- and q-arms of human chromosomes and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Affiliation(s)
- Glennis A. Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Allison N. Rozanski
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Yafei Mao
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Julian K. Lucas
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M. Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Adam M. Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ivan A. Alexandrov
  - Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
  - Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA

21
Gao S, Yang X, Guo H, Zhao X, Wang B, Ye K. HiCAT: a tool for automatic annotation of centromere structure. Genome Biol 2023; 24:58. [PMID: 36978122 PMCID: PMC10053651 DOI: 10.1186/s13059-023-02900-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/12/2022] [Accepted: 03/17/2023] [Indexed: 03/30/2023] [Open Access]
Abstract
Significant improvements in long-read sequencing technologies have unlocked complex genomic areas, such as centromeres, in the genome and introduced the centromere annotation problem. Currently, centromeres are annotated in a semi-manual way. Here, we propose HiCAT, a generalizable automatic centromere annotation tool, based on hierarchical tandem repeat mining to facilitate decoding of centromere architecture. We apply HiCAT to simulated datasets, human CHM13-T2T and gapless Arabidopsis thaliana genomes. Our results are generally consistent with previous inferences but also greatly improve annotation continuity and reveal additional fine structures, demonstrating HiCAT's performance and general applicability.
Affiliation(s)
- Shenghan Gao
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Xiaofei Yang
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Hongtao Guo
  - School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Xixi Zhao
  - Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Bo Wang
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
- Kai Ye
  - School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - MOE Key Lab for Intelligent Networks & Networks Security, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Genome Institute, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China
  - Faculty of Science, Leiden University, Leiden, The Netherlands

22
Saayman X, Graham E, Nathan WJ, Nussenzweig A, Esashi F. Centromeres as universal hotspots of DNA breakage, driving RAD51-mediated recombination during quiescence. Mol Cell 2023; 83:523-538.e7. [PMID: 36702125 PMCID: PMC10009740 DOI: 10.1016/j.molcel.2023.01.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/12/2022] [Revised: 10/07/2022] [Accepted: 01/03/2023] [Indexed: 01/27/2023]
Abstract
Centromeres are essential for chromosome segregation in most animals and plants yet are among the most rapidly evolving genome elements. The mechanisms underlying this paradoxical phenomenon remain enigmatic. Here, we report that human centromeres innately harbor a striking enrichment of DNA breaks within functionally active centromere regions. Establishing a single-cell imaging strategy that enables comparative assessment of DNA breaks at repetitive regions, we show that centromeric DNA breaks are induced not only during active cellular proliferation but also de novo during quiescence. Markedly, centromere DNA breaks in quiescent cells are resolved enzymatically by the evolutionarily conserved RAD51 recombinase, which in turn safeguards the specification of functional centromeres. This study highlights the innate fragility of centromeres, which may have been co-opted over time to reinforce centromere specification while driving rapid evolution. The findings also provide insights into how fragile centromeres are likely to contribute to human disease.
Affiliation(s)
- Xanita Saayman
  - Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK
- Emily Graham
  - Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK
- William J Nathan
  - Laboratory of Genome Integrity, National Cancer Institute, NIH, Bethesda, MD 20892-4254, USA
- Andre Nussenzweig
  - Laboratory of Genome Integrity, National Cancer Institute, NIH, Bethesda, MD 20892-4254, USA
- Fumiko Esashi
  - Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK

23
Logsdon GA, Eichler EE. The Dynamic Structure and Rapid Evolution of Human Centromeric Satellite DNA. Genes (Basel) 2022; 14:92. [PMID: 36672831 PMCID: PMC9859433 DOI: 10.3390/genes14010092] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/15/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022] [Open Access]
Abstract
The complete sequence of a human genome provided our first comprehensive view of the organization of satellite DNA associated with heterochromatin. We review how our understanding of the genetic architecture and epigenetic properties of human centromeric DNA have advanced as a result. Preliminary studies of human and nonhuman ape centromeres reveal complex, saltatory mutational changes organized around distinct evolutionary layers. Pockets of regional hypomethylation within higher-order α-satellite DNA, termed centromere dip regions, appear to define the site of kinetochore attachment in all human chromosomes, although such epigenetic features can vary even within the same chromosome. Sequence resolution of satellite DNA is providing new insights into centromeric function with potential implications for improving our understanding of human biology and health.
Affiliation(s)
- Glennis A. Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA

24
Glunčić M, Vlahović I, Rosandić M, Paar V. Tandemly repeated NBPF HOR copies (Olduvai triplets): Possible impact on human brain evolution. Life Sci Alliance 2022; 6:e202101306. [PMID: 36261226 PMCID: PMC9584774 DOI: 10.26508/lsa.202101306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 09/29/2022] [Accepted: 09/30/2022] [Indexed: 11/24/2022] [Open Access]
Abstract
Previously, it was found that the neuroblastoma breakpoint family (NBPF) gene repeat units of ∼1.6 kb have an important role in human brain evolution and function. The higher-order organization of these repeat units has been discovered by two methods: the higher-order repeat (HOR) searching method and the HLS searching method. Using the HOR searching method with the global repeat map algorithm, we here identified the tandemly organized NBPF HORs in the human and nonhuman primate NCBI reference genomes. We identified 50 tandemly organized canonical 3mer NBPF HOR copies (Olduvai triplets) in human, but none in the nonhuman primates chimpanzee, gorilla, orangutan, and rhesus macaque. This discontinuous jump in tandemly organized HOR copy number is in sharp contrast to the known gradual increase in the number of Olduvai domains (NBPF monomers) from nonhuman primates to human, especially from ∼138 in the chimpanzee genome to ∼300 in the human genome. Using the same global repeat map algorithm, we also determined the 3mer tandems of canonical 3mer HOR copies in 20 randomly chosen human genomes (10 male and 10 female). In all cases, we found the same 3mer HOR copy numbers as in the reference human genome, with no mutation. On the other hand, some point mutations with respect to the reference genome are found for some NBPF monomers that are not tandemly organized in canonical HORs.
Affiliation(s)
- Matko Glunčić
  - Faculty of Science, University of Zagreb, Zagreb, Croatia
- Marija Rosandić
  - University Hospital Centre Zagreb (ret), Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, Zagreb, Croatia
- Vladimir Paar
  - Faculty of Science, University of Zagreb, Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, Zagreb, Croatia

25
Haig D. Paradox lost: Concerted evolution and centromeric instability: Centromeres are hospitable habitats for repeats that evolve adaptations for proliferation within the nucleus sometimes at organismal cost. Bioessays 2022; 44:e2200023. [PMID: 35748194 DOI: 10.1002/bies.202200023] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/28/2022] [Revised: 06/07/2022] [Accepted: 06/09/2022] [Indexed: 11/11/2022]
Abstract
Homologous centromeres compete for segregation to the secondary oocyte nucleus at female meiosis I. Centromeric repeats also compete with each other to populate centromeres in mitotic cells of the germline and have become adapted to use the recombinational machinery present at centromeres to promote their own propagation. Repeats are not needed at centromeres, rather centromeres appear to be hospitable habitats for the colonization and proliferation of repeats. This is probably an indirect consequence of two distinctive features of centromeric DNA. Centromeres are subject to breakage by the mechanical forces exerted by microtubules and meiotic crossing-over is suppressed. Centromeric proteins acting in trans are under selection to mitigate the costs of centromeric repeats acting in cis. Collateral costs of mitotic competition at centromeres may help to explain the high rates of aneuploidy observed in early human embryos.
Affiliation(s)
- David Haig
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA

26
Population Scale Analysis of Centromeric Satellite DNA Reveals Highly Dynamic Evolutionary Patterns and Genomic Organization in Long-Tailed and Rhesus Macaques. Cells 2022; 11:cells11121953. [PMID: 35741082 PMCID: PMC9221937 DOI: 10.3390/cells11121953] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/19/2022] [Revised: 06/12/2022] [Accepted: 06/14/2022] [Indexed: 02/04/2023] [Open Access]
Abstract
Centromeric satellite DNA (cen-satDNA) consists of highly divergent repeat monomers, each approximately 171 base pairs in length. Here, we investigated the genetic diversity in the centromeric region of two primate species: long-tailed (Macaca fascicularis) and rhesus (Macaca mulatta) macaques. Fluorescence in situ hybridization and bioinformatic analysis showed the chromosome-specific organization and dynamic nature of cen-satDNA sequences, and their substantial diversity, with distinct subfamilies across macaque populations, suggesting increased turnovers. Comparative genomics identified high level polymorphisms spanning a 120 bp deletion region and a remarkable interspecific variability in cen-satDNA size and structure. Population structure analysis detected admixture patterns within populations, indicating their high divergence and rapid evolution. However, differences in cen-satDNA profiles appear to not be involved in hybrid incompatibility between the two species. Our study provides a genomic landscape of centromeric repeats in wild macaques and opens new avenues for exploring their impact on the adaptive evolution and speciation of primates.

27
Sundararajan K, Straight AF. Centromere Identity and the Regulation of Chromosome Segregation. Front Cell Dev Biol 2022; 10:914249. [PMID: 35721504 PMCID: PMC9203049 DOI: 10.3389/fcell.2022.914249] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/06/2022] [Accepted: 05/13/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Eukaryotes segregate their chromosomes during mitosis and meiosis by attaching chromosomes to the microtubules of the spindle so that they can be distributed into daughter cells. The complexity of centromeres ranges from the point centromeres of yeast that attach to a single microtubule to the more complex regional centromeres found in many metazoans or holocentric centromeres of some nematodes, arthropods and plants, that bind to dozens of microtubules per kinetochore. In vertebrates, the centromere is defined by a centromere-specific histone variant termed Centromere Protein A (CENP-A) that replaces histone H3 in a subset of centromeric nucleosomes. These CENP-A nucleosomes are distributed on long stretches of highly repetitive DNA and interspersed with histone H3-containing nucleosomes. The mechanisms by which cells control the number and position of CENP-A nucleosomes are unknown but likely important for the organization of centromeric chromatin in mitosis so that the kinetochore is properly oriented for microtubule capture. CENP-A chromatin is epigenetically determined; thus, cells must correct errors in CENP-A organization to prevent centromere dysfunction and chromosome loss. Recent improvements in sequencing complex centromeres have paved the way for defining the organization of CENP-A nucleosomes in centromeres. Here we discuss the importance and challenges in understanding CENP-A organization and highlight new discoveries and advances enabled by recent improvements in the human genome assembly.

28
Kunyavskaya O, Dvorkina T, Bzikadze AV, Alexandrov I, Pevzner PA. Automated annotation of human centromeres with HORmon. Genome Res 2022; 32:1137-1151. [PMID: 35545449 DOI: 10.1101/gr.276362.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/03/2021] [Accepted: 05/06/2022] [Indexed: 11/24/2022]
Abstract
Recent advances in long-read sequencing opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. They also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats (HORs)). Even though there was a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, an automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres, and evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we demonstrate that they should be integrated into a single framework in such a way that HOR (monomer) inference affects monomer (HOR) inference. We thus developed the HORmon algorithm that integrates the monomer/HOR inference and automatically generates the human monomers/HORs that are largely consistent with the previous semi-manual inference.
Affiliation(s)
- Olga Kunyavskaya
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University
- Tatiana Dvorkina
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University
- Ivan Alexandrov
  - Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University

29
Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science 2022; 376:eabl4178. [PMID: 35357911 PMCID: PMC9233505 DOI: 10.1126/science.abl4178] [Citation(s) in RCA: 270] [Impact Index Per Article: 90.0] [Indexed: 12/20/2022]
Abstract
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Affiliation(s)
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
- Pragya Sidhwani
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Sasha A. Langley
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Savannah J. Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Lev Uralsky
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Daniel Olson
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Gabrielle A. Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Fedor Gusev
- Vavilov Institute of General Genetics, Moscow, Russia
- Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
- Evgeny I. Rogaev
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Gary H. Karpen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- BioEngineering and BioMedical Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Abby F. Dernburg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
- Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
- Travis J. Wheeler
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical School, Department of Biochemistry and Molecular Biology and Cancer Center, University of Kansas, Kansas City, KS, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Rachel J. O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, San Diego, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- Ivan A. Alexandrov
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
30
|
Abstract
Centromeres, the chromosomal loci where spindle fibers attach during cell division to segregate chromosomes, are typically found within satellite arrays in plants and animals. Satellite arrays have been difficult to analyze because they comprise megabases of tandem head-to-tail highly repeated DNA sequences. Much evidence suggests that centromeres are epigenetically defined by the location of nucleosomes containing the centromere-specific histone H3 variant cenH3, independently of the DNA sequences where they are located; however, the reason that cenH3 nucleosomes are generally found on rapidly evolving satellite arrays has remained unclear. Recently, long-read sequencing technology has clarified the structures of satellite arrays and sparked rethinking of how they evolve, and new experiments and analyses have helped bring both understanding and further speculation about the role these highly repeated sequences play in centromere identification.
Affiliation(s)
- Paul B Talbert
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
- Steven Henikoff
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
31
|
Abstract
We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Ivan A Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199004, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
32
|
Suzuki Y, Morishita S. The time is ripe to investigate human centromeres by long-read sequencing. DNA Res 2021; 28:6381569. [PMID: 34609504 PMCID: PMC8502840 DOI: 10.1093/dnares/dsab021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/31/2021] [Accepted: 09/28/2021] [Indexed: 01/05/2023]
Abstract
The complete sequencing of human centromeres, which are filled with highly repetitive elements, has long been challenging. In human centromeres, α-satellite monomers of about 171 bp in length are the basic repeating units, but α-satellite monomers constitute the higher-order repeat (HOR) units, and thousands of copies of highly homologous HOR units form large arrays, which have hampered sequence assembly of human centromeres. Because most HOR unit occurrences are covered by long reads of about 10 kb, the recent availability of much longer reads is expected to enable observation of individual HOR occurrences in terms of their single-nucleotide or structural variants. The time has come to examine the complete sequence of human centromeres.
Affiliation(s)
- Yuta Suzuki
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
- Shinichi Morishita
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8568, Japan
33
|
Feliciello I, Pezer Ž, Kordiš D, Bruvo Mađarić B, Ugarković Đ. Evolutionary History of Alpha Satellite DNA Repeats Dispersed within Human Genome Euchromatin. Genome Biol Evol 2021; 12:2125-2138. [PMID: 33078196 PMCID: PMC7719264 DOI: 10.1093/gbe/evaa224] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 10/14/2020] [Indexed: 01/03/2023]
Abstract
Major human alpha satellite DNA repeats are preferentially assembled within (peri)centromeric regions but are also dispersed within euchromatin in the form of clustered or short single repeat arrays. To study the evolutionary history of single euchromatic human alpha satellite repeats (ARs), we analyzed their orthologous loci across the primate genomes. The continuous insertion of euchromatic ARs throughout the evolutionary history of primates starting with the ancestors of Simiformes (45-60 Ma) and continuing up to the ancestors of Homo is revealed. Once inserted, the euchromatic ARs were stably transmitted to the descendant species, some exhibiting copy number variation, whereas their sequence divergence followed the species phylogeny. Many euchromatic ARs have sequence characteristics of (peri)centromeric alpha repeats suggesting heterochromatin as a source of dispersed euchromatic ARs. The majority of euchromatic ARs are inserted in the vicinity of other repetitive elements such as L1, Alu, and ERV or are embedded within them. Irrespective of the insertion context, each AR insertion seems to be unique and once inserted, ARs do not seem to be subsequently spread to new genomic locations. In spite of association with (retro)transposable elements, there is no indication that such elements play a role in ARs proliferation. The presence of short duplications at most of ARs insertion sites suggests site-directed recombination between homologous motifs in ARs and in the target genomic sequence, probably mediated by extrachromosomal circular DNA, as a mechanism of spreading within euchromatin.
Affiliation(s)
- Isidoro Feliciello
- Department of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Dipartimento di Medicina Clinica e Chirurgia, Università degli Studi di Napoli Federico II, Italy
- Željka Pezer
- Department of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
- Dušan Kordiš
- Department of Molecular and Biomedical Sciences, Jožef Stefan Institute, Ljubljana, Slovenia
- Đurđica Ugarković
- Department of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia
34
|
Dvorkina T, Kunyavskaya O, Bzikadze AV, Alexandrov I, Pevzner PA. CentromereArchitect: inference and analysis of the architecture of centromeres. Bioinformatics 2021; 37:i196-i204. [PMID: 34252949 PMCID: PMC8336445 DOI: 10.1093/bioinformatics/btab265] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022]
Abstract
Motivation: Recent advances in long-read sequencing technologies led to rapid progress in centromere assembly in the last year and, for the first time, opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. However, since these advances have not been yet accompanied by the development of the centromere-specific bioinformatics algorithms, even the fundamental questions (e.g. centromere annotation by deriving the complete set of human monomers and high-order repeats), let alone more complex questions (e.g. explaining how monomers and high-order repeats evolved) about human centromeres remain open. Moreover, even though there was a four-decade-long series of studies aimed at cataloging all human monomers and high-order repeats, the rigorous algorithmic definitions of these concepts are still lacking. Thus, the development of a centromere annotation tool is a prerequisite for follow-up personalized biomedical studies of centromeres across the human population and evolutionary studies of centromeres across various species. Results: We describe the CentromereArchitect, the first tool for the centromere annotation in a newly sequenced genome, apply it to the recently generated complete assembly of a human genome by the Telomere-to-Telomere consortium, generate the complete set of human monomers and high-order repeats for ‘live’ centromeres, and reveal a vast set of hybrid monomers that may represent the focal points of centromere evolution. Availability and implementation: CentromereArchitect is publicly available on https://github.com/ablab/stringdecomposer/tree/ismb2021 Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
- Ivan Alexandrov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
35
|
Jernfors T, Danforth J, Kesäniemi J, Lavrinienko A, Tukalenko E, Fajkus J, Dvořáčková M, Mappes T, Watts PC. Expansion of rDNA and pericentromere satellite repeats in the genomes of bank voles Myodes glareolus exposed to environmental radionuclides. Ecol Evol 2021; 11:8754-8767. [PMID: 34257925 PMCID: PMC8258220 DOI: 10.1002/ece3.7684] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/30/2020] [Revised: 04/27/2021] [Accepted: 05/05/2021] [Indexed: 12/21/2022]
Abstract
Altered copy number of certain highly repetitive regions of the genome, such as satellite DNA within heterochromatin and ribosomal RNA loci (rDNA), is hypothesized to help safeguard the genome against damage derived from external stressors. We quantified copy number of the 18S rDNA and a pericentromeric satellite DNA (Msat-160) in bank voles (Myodes glareolus) inhabiting the Chernobyl Exclusion Zone (CEZ), an area that is contaminated by radionuclides and where organisms are exposed to elevated levels of ionizing radiation. We found a significant increase in 18S rDNA and Msat-160 content in the genomes of bank voles from contaminated locations within the CEZ compared with animals from uncontaminated locations. Moreover, 18S rDNA and Msat-160 copy number were positively correlated in the genomes of bank voles from uncontaminated, but not in the genomes of animals inhabiting contaminated, areas. These results show the capacity for local-scale geographic variation in genome architecture and are consistent with the genomic safeguard hypothesis. Disruption of cellular processes related to genomic stability appears to be a hallmark effect in bank voles inhabiting areas contaminated by radionuclides.
Affiliation(s)
- Toni Jernfors
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- John Danforth
- Department of Biochemistry & Molecular Biology, Robson DNA Science Centre, Arnie Charbonneau Cancer Institute, Cumming School of Medicine, University of Calgary, Calgary, Canada
- Jenni Kesäniemi
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Anton Lavrinienko
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Eugene Tukalenko
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- National Research Center for Radiation Medicine of the National Academy of Medical Science, Kyiv, Ukraine
- Jiří Fajkus
- Mendel Centre for Plant Genomics and Proteomics, Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
- Laboratory of Functional Genomics and Proteomics, NCBR, Faculty of Science, Masaryk University, Brno, Czech Republic
- Department of Cell Biology and Radiobiology, Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
- Martina Dvořáčková
- Mendel Centre for Plant Genomics and Proteomics, Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
- Tapio Mappes
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
- Phillip C. Watts
- Department of Biological and Environmental Science, University of Jyväskylä, Jyväskylä, Finland
36
|
Morrison O, Thakur J. Molecular Complexes at Euchromatin, Heterochromatin and Centromeric Chromatin. Int J Mol Sci 2021; 22:6922. [PMID: 34203193 PMCID: PMC8268097 DOI: 10.3390/ijms22136922] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 06/11/2021] [Revised: 06/23/2021] [Accepted: 06/24/2021] [Indexed: 01/19/2023]
Abstract
Chromatin consists of a complex of DNA and histone proteins as its core components and plays an important role in both packaging DNA and regulating DNA metabolic pathways such as DNA replication, transcription, recombination, and chromosome segregation. Proper functioning of chromatin further involves a network of interactions among molecular complexes that modify chromatin structure and organization to affect the accessibility of DNA to transcription factors leading to the activation or repression of the transcription of target DNA loci. Based on its structure and compaction state, chromatin is categorized into euchromatin, heterochromatin, and centromeric chromatin. In this review, we discuss distinct chromatin factors and molecular complexes that constitute euchromatin (open chromatin structure associated with active transcription), heterochromatin (less accessible chromatin associated with silencing), and centromeric chromatin (the site of spindle binding in chromosome segregation).
Affiliation(s)
- Jitendra Thakur
- Department of Biology, Emory University, 1510 Clifton Rd #2006, Atlanta, GA 30322, USA
37
|
Thakur J, Packiaraj J, Henikoff S. Sequence, Chromatin and Evolution of Satellite DNA. Int J Mol Sci 2021; 22:ijms22094309. [PMID: 33919233 PMCID: PMC8122249 DOI: 10.3390/ijms22094309] [Citation(s) in RCA: 115] [Impact Index Per Article: 28.8] [Received: 04/01/2021] [Revised: 04/16/2021] [Accepted: 04/17/2021] [Indexed: 12/15/2022]
Abstract
Satellite DNA consists of abundant tandem repeats that play important roles in cellular processes, including chromosome segregation, genome organization and chromosome end protection. Most satellite DNA repeat units are either of nucleosomal length or 5–10 bp long and occupy centromeric, pericentromeric or telomeric regions. Due to high repetitiveness, satellite DNA sequences have largely been absent from genome assemblies. Although few conserved satellite-specific sequence motifs have been identified, DNA curvature, dyad symmetries and inverted repeats are features of various satellite DNAs in several organisms. Satellite DNA sequences are either embedded in highly compact gene-poor heterochromatin or specialized chromatin that is distinct from euchromatin. Nevertheless, some satellite DNAs are transcribed into non-coding RNAs that may play important roles in satellite DNA function. Intriguingly, satellite DNAs are among the most rapidly evolving genomic elements, such that a large fraction is species-specific in most organisms. Here we describe the different classes of satellite DNA sequences, their satellite-specific chromatin features, and how these features may contribute to satellite DNA biology and evolution. We also discuss how the evolution of functional satellite DNA classes may contribute to speciation in plants and animals.
Affiliation(s)
- Jitendra Thakur
- Department of Biology, Emory University, Atlanta, GA 30322, USA
- Jenika Packiaraj
- Department of Biology, Emory University, Atlanta, GA 30322, USA
- Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Fred Hutchinson Cancer Research Center, Howard Hughes Medical Institute, Seattle, WA 98109, USA
38
|
Arora UP, Charlebois C, Lawal RA, Dumont BL. Population and subspecies diversity at mouse centromere satellites. BMC Genomics 2021; 22:279. [PMID: 33865332 PMCID: PMC8052823 DOI: 10.1186/s12864-021-07591-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/05/2021] [Accepted: 04/08/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND: Mammalian centromeres are satellite-rich chromatin domains that execute conserved roles in kinetochore assembly and chromosome segregation. Centromere satellites evolve rapidly between species, but little is known about population-level diversity across these loci. RESULTS: We developed a k-mer based method to quantify centromere copy number and sequence variation from whole genome sequencing data. We applied this method to diverse inbred and wild house mouse (Mus musculus) genomes to profile diversity across the core centromere (minor) satellite and the pericentromeric (major) satellite repeat. We show that minor satellite copy number varies more than 10-fold among inbred mouse strains, whereas major satellite copy numbers span a 3-fold range. In contrast to widely held assumptions about the homogeneity of mouse centromere repeats, we uncover marked satellite sequence heterogeneity within single genomes, with diversity levels across the minor satellite exceeding those at the major satellite. Analyses in wild-caught mice implicate subspecies and population origin as significant determinants of variation in satellite copy number and satellite heterogeneity. Intriguingly, we also find that wild-caught mice harbor dramatically reduced minor satellite copy number and elevated satellite sequence heterogeneity compared to inbred strains, suggesting that inbreeding may reshape centromere architecture in pronounced ways. CONCLUSION: Taken together, our results highlight the power of k-mer based approaches for probing variation across repetitive regions, provide an initial portrait of centromere variation across Mus musculus, and lay the groundwork for future functional studies on the consequences of natural genetic variation at these essential chromatin domains.
Affiliation(s)
- Uma P Arora
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Tufts University, Graduate School of Biomedical Sciences, 136 Harrison Ave, Boston, MA, 02111, USA
- Beth L Dumont
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Tufts University, Graduate School of Biomedical Sciences, 136 Harrison Ave, Boston, MA, 02111, USA
39
|
The structure, function and evolution of a complete human chromosome 8. Nature 2021; 593:101-107. [PMID: 33828295 PMCID: PMC8099727 DOI: 10.1038/s41586-021-03420-7] [Citation(s) in RCA: 204] [Impact Index Per Article: 51.0] [Received: 09/04/2020] [Accepted: 03/04/2021] [Indexed: 02/07/2023]
Abstract
The complete assembly of each human chromosome is essential for understanding human biology and evolution1,2. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.
40
|
Ahmad SF, Singchat W, Jehangir M, Suntronpong A, Panthum T, Malaivijitnond S, Srikulnath K. Dark Matter of Primate Genomes: Satellite DNA Repeats and Their Evolutionary Dynamics. Cells 2020; 9:E2714. [PMID: 33352976 PMCID: PMC7767330 DOI: 10.3390/cells9122714] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/27/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/12/2022]
Abstract
A substantial portion of the primate genome is composed of non-coding regions, so-called "dark matter", which includes an abundance of tandemly repeated sequences called satellite DNA. Collectively known as the satellitome, this genomic component offers exciting evolutionary insights into aspects of primate genome biology that raise new questions and challenge existing paradigms. A complete human reference genome was recently reported with telomere-to-telomere human X chromosome assembly that resolved hundreds of dark regions, encompassing a 3.1 Mb centromeric satellite array that had not been identified previously. With the recent exponential increase in the availability of primate genomes, and the development of modern genomic and bioinformatics tools, extensive growth in our knowledge concerning the structure, function, and evolution of satellite elements is expected. The current state of knowledge on this topic is summarized, highlighting various types of primate-specific satellite repeats to compare their proportions across diverse lineages. Inter- and intraspecific variation of satellite repeats in the primate genome are reviewed. The functional significance of these sequences is discussed by describing how the transcriptional activity of satellite repeats can affect gene expression during different cellular processes. Sex-linked satellites are outlined, together with their respective genomic organization. Mechanisms are proposed whereby satellite repeats might have emerged as novel sequences during different evolutionary phases. Finally, the main challenges that hinder the detection of satellite DNA are outlined and an overview of the latest methodologies to address technological limitations is presented.
Affiliation(s)
- Syed Farhan Ahmad
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
| | - Worapong Singchat
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
| | - Maryam Jehangir
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Department of Structural and Functional Biology, Institute of Bioscience at Botucatu, São Paulo State University (UNESP), Botucatu, São Paulo 18618-689, Brazil
| | - Aorarat Suntronpong
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
| | - Thitipong Panthum
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
| | - Suchinda Malaivijitnond
- National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand;
- Department of Biology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
| | - Kornsorn Srikulnath
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand; (S.F.A.); (W.S.); (M.J.); (A.S.); (T.P.)
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand;
- Center of Excellence on Agricultural Biotechnology (AG-BIO/PERDO-CHE), Bangkok 10900, Thailand
- Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
| |
|
41
|
Suzuki Y, Myers EW, Morishita S. Rapid and ongoing evolution of repetitive sequence structures in human centromeres. Sci Adv 2020; 6:eabd9230. [PMID: 33310858 PMCID: PMC7732198 DOI: 10.1126/sciadv.abd9230] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 07/20/2020] [Accepted: 10/30/2020] [Indexed: 06/12/2023]
Abstract
Our understanding of centromere sequence variation across human populations is limited by its extremely long nested repeat structures called higher-order repeats that are challenging to sequence. Here, we analyzed chromosomes 11, 17, and X using long-read sequencing data for 36 individuals from diverse populations including a Han Chinese trio and 21 Japanese. We revealed substantial structural diversity with many previously unidentified variant higher-order repeats specific to individuals characterizing rapid, haplotype-specific evolution of human centromeric arrays, while frequent single-nucleotide variants are largely conserved. We found a characteristic pattern shared among prevalent variants in human and chimpanzee. Our findings pave the way for studying sequence evolution in human and primate centromeres.
Affiliation(s)
- Yuta Suzuki
- The University of Tokyo, Graduate School of Frontier Sciences, Department of Computational Biology and Medical Sciences, Kashiwa, Chiba 277-8568, Japan.
| | - Eugene W Myers
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Shinichi Morishita
- The University of Tokyo, Graduate School of Frontier Sciences, Department of Computational Biology and Medical Sciences, Kashiwa, Chiba 277-8568, Japan.
| |
|
42
|
Bury L, Moodie B, Ly J, McKay LS, Miga KH, Cheeseman IM. Alpha-satellite RNA transcripts are repressed by centromere-nucleolus associations. eLife 2020; 9:e59770. [PMID: 33174837 PMCID: PMC7679138 DOI: 10.7554/elife.59770] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 06/08/2020] [Accepted: 11/09/2020] [Indexed: 01/03/2023] Open
Abstract
Although originally thought to be silent chromosomal regions, centromeres are instead actively transcribed. However, the behavior and contributions of centromere-derived RNAs have remained unclear. Here, we used single-molecule fluorescence in-situ hybridization (smFISH) to detect alpha-satellite RNA transcripts in intact human cells. We find that alpha-satellite RNA-smFISH foci levels vary across cell lines and over the cell cycle, but do not remain associated with centromeres, displaying localization consistent with other long non-coding RNAs. Alpha-satellite expression occurs through RNA polymerase II-dependent transcription, but does not require established centromere or cell division components. Instead, our work implicates centromere–nucleolar interactions as repressing alpha-satellite expression. The fraction of nucleolar-localized centromeres inversely correlates with alpha-satellite transcripts levels across cell lines and transcript levels increase substantially when the nucleolus is disrupted. The control of alpha-satellite transcripts by centromere-nucleolar contacts provides a mechanism to modulate centromere transcription and chromatin dynamics across diverse cell states and conditions.
Affiliation(s)
- Leah Bury
- Whitehead Institute for Biomedical Research, Cambridge, United States
| | - Brittania Moodie
- Whitehead Institute for Biomedical Research, Cambridge, United States
| | - Jimmy Ly
- Whitehead Institute for Biomedical Research, Cambridge, United States; Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
| | - Liliana S McKay
- Whitehead Institute for Biomedical Research, Cambridge, United States
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, United States
| | - Iain M Cheeseman
- Whitehead Institute for Biomedical Research, Cambridge, United States; Department of Biology, Massachusetts Institute of Technology, Cambridge, United States
| |
|
43
|
Balzano E, Giunta S. Centromeres under Pressure: Evolutionary Innovation in Conflict with Conserved Function. Genes (Basel) 2020; 11:E912. [PMID: 32784998 PMCID: PMC7463522 DOI: 10.3390/genes11080912] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 07/07/2020] [Revised: 08/04/2020] [Accepted: 08/04/2020] [Indexed: 12/22/2022] Open
Abstract
Centromeres are essential genetic elements that enable spindle microtubule attachment for chromosome segregation during mitosis and meiosis. While this function is preserved across species, centromeres display an array of dynamic features, including: (1) rapidly evolving DNA; (2) wide evolutionary diversity in size, shape and organization; (3) evidence of mutational processes to generate homogenized repetitive arrays that characterize centromeres in several species; (4) tolerance to changes in position, as in the case of neocentromeres; and (5) intrinsic fragility derived by sequence composition and secondary DNA structures. Centromere drive underlies rapid centromere DNA evolution due to the "selfish" pursuit to bias meiotic transmission and promote the propagation of stronger centromeres. Yet, the origins of other dynamic features of centromeres remain unclear. Here, we review our current understanding of centromere evolution and plasticity. We also detail the mutagenic processes proposed to shape the divergent genetic nature of centromeres. Changes to centromeres are not simply evolutionary relics, but ongoing shifts that on one side promote centromere flexibility, but on the other can undermine centromere integrity and function with potential pathological implications such as genome instability.
Affiliation(s)
- Elisa Balzano
- Dipartimento di Biologia e Biotecnologie “Charles Darwin”, Sapienza Università di Roma, 00185 Roma, Italy;
| | - Simona Giunta
- Laboratory of Chromosome and Cell Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA
| |
|
44
|
Mahlke MA, Nechemia-Arbely Y. Guarding the Genome: CENP-A-Chromatin in Health and Cancer. Genes (Basel) 2020; 11:E810. [PMID: 32708729 PMCID: PMC7397030 DOI: 10.3390/genes11070810] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 06/19/2020] [Revised: 07/10/2020] [Accepted: 07/15/2020] [Indexed: 02/07/2023] Open
Abstract
Faithful chromosome segregation is essential for the maintenance of genomic integrity and requires functional centromeres. Centromeres are epigenetically defined by the histone H3 variant, centromere protein A (CENP-A). Here we highlight current knowledge regarding CENP-A-containing chromatin structure, specification of centromere identity, regulation of CENP-A deposition and possible contribution to cancer formation and/or progression. CENP-A overexpression is common among many cancers and predicts poor prognosis. Overexpression of CENP-A increases rates of CENP-A deposition ectopically at sites of high histone turnover, occluding CCCTC-binding factor (CTCF) binding. Ectopic CENP-A deposition leads to mitotic defects, centromere dysfunction and chromosomal instability (CIN), a hallmark of cancer. CENP-A overexpression is often accompanied by overexpression of its chaperone Holliday Junction Recognition Protein (HJURP), leading to epigenetic addiction in which increased levels of HJURP and CENP-A become necessary to support rapidly dividing p53 deficient cancer cells. Alterations in CENP-A posttranslational modifications are also linked to chromosome segregation errors and CIN. Collectively, CENP-A is pivotal to genomic stability through centromere maintenance, perturbation of which can lead to tumorigenesis.
Affiliation(s)
- Megan A. Mahlke
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA;
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Yael Nechemia-Arbely
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA;
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Correspondence: Tel.: +1-412-623-3228; Fax: +1-412-623-7828
| |
|
45
|
Miga KH. Centromere studies in the era of 'telomere-to-telomere' genomics. Exp Cell Res 2020; 394:112127. [PMID: 32504677 DOI: 10.1016/j.yexcr.2020.112127] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 01/08/2020] [Revised: 05/23/2020] [Accepted: 05/30/2020] [Indexed: 12/17/2022]
Abstract
We are entering into an exciting era of genomics where truly complete, high-quality assemblies of human chromosomes are available end-to-end, or from 'telomere-to-telomere' (T2T). This technological advance offers a new opportunity to include endogenous human centromeric regions in high-resolution, sequence-based studies. These emerging reference maps are expected to reveal a new functional landscape in the human genome, where centromere proteins, transcriptional regulation, and spatial organization can be examined with base-level resolution across different stages of development and disease. Such studies will depend on innovative assembly methods of extremely long tandem repeats (ETRs), or satellite DNAs, paired with the development of new, orthogonal validation methods to ensure accuracy and completeness. This review reflects the progress in centromere genomics, credited by recent advancements in long-read sequencing and assembly methods. In doing so, I will discuss the challenges that remain and the promise for a new period of scientific discovery for satellite DNA biology and centromere function.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.
| |
|
46
|
Dumont M, Gamba R, Gestraud P, Klaasen S, Worrall JT, De Vries SG, Boudreau V, Salinas‐Luypaert C, Maddox PS, Lens SMA, Kops GJPL, McClelland SE, Miga KH, Fachinetti D. Human chromosome-specific aneuploidy is influenced by DNA-dependent centromeric features. EMBO J 2020; 39:e102924. [PMID: 31750958 PMCID: PMC6960447 DOI: 10.15252/embj.2019102924] [Citation(s) in RCA: 79] [Impact Index Per Article: 15.8] [Received: 07/12/2019] [Revised: 10/21/2019] [Accepted: 10/29/2019] [Indexed: 12/11/2022] Open
Abstract
Intrinsic genomic features of individual chromosomes can contribute to chromosome-specific aneuploidy. Centromeres are key elements for the maintenance of chromosome segregation fidelity via a specialized chromatin marked by CENP-A wrapped by repetitive DNA. These long stretches of repetitive DNA vary in length among human chromosomes. Using CENP-A genetic inactivation in human cells, we directly interrogate if differences in the centromere length reflect the heterogeneity of centromeric DNA-dependent features and whether this, in turn, affects the genesis of chromosome-specific aneuploidy. Using three distinct approaches, we show that mis-segregation rates vary among different chromosomes under conditions that compromise centromere function. Whole-genome sequencing and centromere mapping combined with cytogenetic analysis, small molecule inhibitors, and genetic manipulation revealed that inter-chromosomal heterogeneity of centromeric features, but not centromere length, influences chromosome segregation fidelity. We conclude that faithful chromosome segregation for most of human chromosomes is biased in favor of centromeres with high abundance of DNA-dependent centromeric components. These inter-chromosomal differences in centromere features can translate into non-random aneuploidy, a hallmark of cancer and genetic diseases.
Affiliation(s)
- Marie Dumont
- Institut Curie, PSL Research University, CNRS, UMR144, Paris, France
| | - Riccardo Gamba
- Institut Curie, PSL Research University, CNRS, UMR144, Paris, France
| | - Pierre Gestraud
- Institut Curie, PSL Research University, CNRS, UMR144, Paris, France
- PSL Research University, Institut Curie Research Center, INSERM U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO‐Centre for Computational Biology, Paris, France
| | - Sjoerd Klaasen
- Oncode Institute, Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences), Utrecht, The Netherlands
| | | | - Sippe G De Vries
- Oncode Institute, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Vincent Boudreau
- Department of Biology, University of North Carolina, Chapel Hill, NC, USA
| | | | - Paul S Maddox
- Department of Biology, University of North Carolina, Chapel Hill, NC, USA
| | - Susanne MA Lens
- Oncode Institute, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Geert JPL Kops
- Oncode Institute, Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences), Utrecht, The Netherlands
| | | | - Karen H Miga
- Center for Biomolecular Science & Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
| | | |
|
47
|
Thongchum R, Nishihara H, Srikulnath K, Hirai H, Koga A. The CENP-B box, a nucleotide motif involved in centromere formation, has multiple origins in New World monkeys. Genes Genet Syst 2019; 94:301-306. [DOI: 10.1266/ggs.19-00042] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022] Open
Affiliation(s)
- Ratchaphol Thongchum
- Primate Research Institute, Kyoto University
- Faculty of Science, Kasetsart University
| | - Hidenori Nishihara
- Department of Life Science and Technology, Tokyo Institute of Technology
| | - Kornsorn Srikulnath
- Faculty of Science, Kasetsart University
- National Primate Research Center of Thailand, Chulalongkorn University
| | | | | |
|
48
|
Discovery of 33mer in chromosome 21 - the largest alpha satellite higher order repeat unit among all human somatic chromosomes. Sci Rep 2019; 9:12629. [PMID: 31477765 PMCID: PMC6718397 DOI: 10.1038/s41598-019-49022-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/25/2019] [Accepted: 08/13/2019] [Indexed: 11/10/2022] Open
Abstract
The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation, aneuploidy, hallmarks of cancers and birth defects. In primate genomes centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of crucial importance, satellites have been understudied because of gaps in sequencing - genomic “black holes”. Bioinformatical studies of genomic sequences open possibilities to revolutionize understanding of repetitive DNA datasets. Here, using robust (Global Repeat Map) algorithm we identified in hg38 sequence of human chromosome 21 complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. Novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes and novel 23mer reverse HOR is distant far from the centromere. Also, we discovered that for hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present; that seems to require an additional more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular base of Down syndrome and of aneuploidies in general. Since the chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, distinctions between alpha satellite HORs in chromosomes 21 and 13 here identified might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding pathogenesis of diseases and development of new therapeutic approaches.
|
49
|
Miga KH. Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population. Genes (Basel) 2019; 10:E352. [PMID: 31072070 PMCID: PMC6562703 DOI: 10.3390/genes10050352] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 04/02/2019] [Revised: 05/03/2019] [Accepted: 05/03/2019] [Indexed: 12/30/2022] Open
Abstract
The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support that these regions present a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.
| |
|
50
|
Centromere Repeats: Hidden Gems of the Genome. Genes (Basel) 2019; 10:E223. [PMID: 30884847 PMCID: PMC6471113 DOI: 10.3390/genes10030223] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Received: 02/08/2019] [Revised: 03/07/2019] [Accepted: 03/11/2019] [Indexed: 01/08/2023] Open
Abstract
Satellite DNAs are now regarded as powerful and active contributors to genomic and chromosomal evolution. Paired with mobile transposable elements, these repetitive sequences provide a dynamic mechanism through which novel karyotypic modifications and chromosomal rearrangements may occur. In this review, we discuss the regulatory activity of satellite DNA and their neighboring transposable elements in a chromosomal context with a particular emphasis on the integral role of both in centromere function. In addition, we discuss the varied mechanisms by which centromeric repeats have endured evolutionary processes, producing a novel, species-specific centromeric landscape despite sharing a ubiquitously conserved function. Finally, we highlight the role these repetitive elements play in the establishment and functionality of de novo centromeres and chromosomal breakpoints that underpin karyotypic variation. By emphasizing these unique activities of satellite DNAs and transposable elements, we hope to disparage the conventional exemplification of repetitive DNA in the historically-associated context of ‘junk’.
|