1
Andrade Ruiz L, Kops GJPL, Sacristan C. Vertebrate centromere architecture: from chromatin threads to functional structures. Chromosoma 2024:10.1007/s00412-024-00823-z. [PMID: 38856923 DOI: 10.1007/s00412-024-00823-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 05/21/2024] [Accepted: 05/27/2024] [Indexed: 06/11/2024]
Abstract
Centromeres are chromatin structures specialized in sister chromatid cohesion, kinetochore assembly, and microtubule attachment during chromosome segregation. The regional centromere of vertebrates consists of long regions of highly repetitive sequences occupied by the Histone H3 variant CENP-A, and which are flanked by pericentromeres. The three-dimensional organization of centromeric chromatin is paramount for its functionality and its ability to withstand spindle forces. Alongside CENP-A, key contributors to the folding of this structure include components of the Constitutive Centromere-Associated Network (CCAN), the protein CENP-B, and condensin and cohesin complexes. Despite its importance, the intricate architecture of the regional centromere of vertebrates remains largely unknown. Recent advancements in long-read sequencing, super-resolution and cryo-electron microscopy, and chromosome conformation capture techniques have significantly improved our understanding of this structure at various levels, from the linear arrangement of centromeric sequences and their epigenetic landscape to their higher-order compaction. In this review, we discuss the latest insights on centromere organization and place them in the context of recent findings describing a bipartite higher-order organization of the centromere.
Affiliation(s)
- Lorena Andrade Ruiz
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, Utrecht, Netherlands
  - University Medical Center Utrecht, Utrecht, Netherlands
  - Oncode Institute, Utrecht, Netherlands
- Geert J P L Kops
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, Utrecht, Netherlands
  - University Medical Center Utrecht, Utrecht, Netherlands
  - Oncode Institute, Utrecht, Netherlands
- Carlos Sacristan
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences, Utrecht, Netherlands
  - University Medical Center Utrecht, Utrecht, Netherlands
  - Oncode Institute, Utrecht, Netherlands
2
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bornberg-Bauer E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJC, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PGS, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Pond SLK, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McGarvey KM, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O'Neill RJ, Eichler EE, Phillippy AM. The complete sequence and comparative analysis of ape sex chromosomes. Nature 2024; 630:401-411. [PMID: 38811727 PMCID: PMC11168930 DOI: 10.1038/s41586-024-07473-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 04/26/2024] [Indexed: 05/31/2024]
Abstract
Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.
Affiliation(s)
- Brandon D Pickett
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Monika Cechova
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Karol Pal
  - Penn State University, University Park, PA, USA
- Sergey Nurk
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- DongAhn Yoo
  - University of Washington School of Medicine, Seattle, WA, USA
- Qiuhui Li
  - Johns Hopkins University, Baltimore, MD, USA
- Prajna Hebbar
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Erich Bornberg-Bauer
  - University of Münster, Münster, Germany
  - MPI for Developmental Biology, Tübingen, Germany
- Gerard G Bouffard
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Shelise Y Brooks
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lucia Carbone
  - Oregon Health and Science University, Portland, OR, USA
  - Oregon National Primate Research Center, Hillsboro, OR, USA
- Laura Carrel
  - Penn State University School of Medicine, Hershey, PA, USA
- Chen-Shan Chin
  - Foundation of Biological Data Sciences, Belmont, CA, USA
- Mark Diekhans
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Amalia Dutra
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Gage H Garcia
  - University of Washington School of Medicine, Seattle, WA, USA
- Diana Haddad
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glenn Hickey
  - University of California Santa Cruz, Santa Cruz, CA, USA
- David A Hillis
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Hyeonsoo Jeong
  - University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Yong-Hwee E Loh
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Patrick Masterson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Kelly M McGarvey
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Karen H Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benedict Paten
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Arang Rhie
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Joana L Rocha
  - University of California Berkeley, Berkeley, CA, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Samuel Sacco
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Steven J Solar
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sweetalana
  - Penn State University, University Park, PA, USA
- Alex Sweeten
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Mario Ventura
  - Università degli Studi di Bari Aldo Moro, Bari, Italy
- Alice C Young
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Xinru Zhang
  - Penn State University, University Park, PA, USA
- Soojin V Yi
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Sergey Koren
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
  - University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M Phillippy
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
3
Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Catacchio CR, Porubsky D, Mao Y, Yoo D, Rautiainen M, Koren S, Nurk S, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Ventura M, Alexandrov IA, Eichler EE. The variation and evolution of complete human centromeres. Nature 2024; 629:136-145. [PMID: 38570684 PMCID: PMC11062924 DOI: 10.1038/s41586-024-07278-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 03/07/2024] [Indexed: 04/05/2024]
Abstract
Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Affiliation(s)
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Allison N Rozanski
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Claudia R Catacchio
  - Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Yafei Mao
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- DongAhn Yoo
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Oxford Nanopore Technologies, Oxford, United Kingdom
- Julian K Lucas
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mario Ventura
  - Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy
- Ivan A Alexandrov
  - Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
  - Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
4
Zhang S, Xu N, Fu L, Yang X, Li Y, Yang Z, Feng Y, Ma K, Jiang X, Han J, Hu R, Zhang L, de Gennaro L, Ryabov F, Meng D, He Y, Wu D, Yang C, Paparella A, Mao Y, Bian X, Lu Y, Antonacci F, Ventura M, Shepelev VA, Miga KH, Alexandrov IA, Logsdon GA, Phillippy AM, Su B, Zhang G, Eichler EE, Lu Q, Shi Y, Sun Q, Mao Y. Comparative genomics of macaques and integrated insights into genetic variation and population history. bioRxiv 2024:2024.04.07.588379. [PMID: 38645259 PMCID: PMC11030432 DOI: 10.1101/2024.04.07.588379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The crab-eating macaques (Macaca fascicularis) and rhesus macaques (M. mulatta) are widely studied nonhuman primates in biomedical and evolutionary research. Despite their significance, the current understanding of the complex genomic structure in macaques and the differences between species requires substantial improvement. Here, we present a complete genome assembly of a crab-eating macaque and 20 haplotype-resolved macaque assemblies to investigate the complex regions and major genomic differences between species. Segmental duplication in macaques is ∼42% lower, while centromeres are ∼3.7 times longer than those in humans. The characterization of ∼2 Mbp fixed genetic variants and ∼240 Mbp complex loci highlights potential associations with metabolic differences between the two macaque species (e.g., CYP2C76 and EHBP1L1). Additionally, hundreds of alternative splicing differences show post-transcriptional regulation divergence between these two species (e.g., PNPO). We also characterize 91 large-scale genomic differences between macaques and humans at a single-base-pair resolution and highlight their impact on gene regulation in primate evolution (e.g., FOLH1 and PIEZO2). Finally, population genetics recapitulates macaque speciation and selective sweeps, highlighting potential genetic basis of reproduction and tail phenotype differences (e.g., STAB1, SEMA3F, and HOXD13). In summary, the integrated analysis of genetic variation and population genetics in macaques greatly enhances our comprehension of lineage-specific phenotypes, adaptation, and primate evolution, thereby improving their biomedical applications in human diseases.
5
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bomberg E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJ, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PG, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Kosakovsky Pond SL, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O’Neill RJ, Eichler E, Phillippy AM. The Complete Sequence and Comparative Analysis of Ape Sex Chromosomes. bioRxiv 2023:2023.11.30.569198. [PMID: 38077089 PMCID: PMC10705393 DOI: 10.1101/2023.11.30.569198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2023]
Abstract
Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution. These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.
Affiliation(s)
- Brandon D. Pickett
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Monika Cechova
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Karol Pal
  - Penn State University, University Park, PA, USA
- Sergey Nurk
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- DongAhn Yoo
  - University of Washington School of Medicine, Seattle, WA, USA
- Qiuhui Li
  - Johns Hopkins University, Baltimore, MD, USA
- Prajna Hebbar
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Erich Bomberg
  - University of Münster, Münster, Germany
  - MPI for Developmental Biology, Tübingen, Germany
- Gerard G. Bouffard
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Shelise Y. Brooks
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lucia Carbone
  - Oregon Health & Science University, Portland, OR, USA
  - Oregon National Primate Research Center, Hillsboro, OR, USA
- Laura Carrel
  - Penn State University School of Medicine, Hershey, PA, USA
- Chen-Shan Chin
  - Foundation of Biological Data Sciences, Belmont, CA, USA
- Mark Diekhans
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Amalia Dutra
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Gage H. Garcia
  - University of Washington School of Medicine, Seattle, WA, USA
- Diana Haddad
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glenn Hickey
  - University of California Santa Cruz, Santa Cruz, CA, USA
- David A. Hillis
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Hyeonsoo Jeong
  - University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Patrick Masterson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Karen H. Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benedict Paten
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Arang Rhie
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Samuel Sacco
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Steven J. Solar
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sweetalana
  - Penn State University, University Park, PA, USA
- Alex Sweeten
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Alice C. Young
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Xinru Zhang
  - Penn State University, University Park, PA, USA
- Soojin V. Yi
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Sergey Koren
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan Eichler
  - University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M. Phillippy
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
6
Bzikadze AV, Pevzner PA. UniAligner: a parameter-free framework for fast sequence alignment. Nat Methods 2023; 20:1346-1354. [PMID: 37580559 DOI: 10.1038/s41592-023-01970-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 07/05/2023] [Indexed: 08/16/2023]
Abstract
Even though the recent advances in 'complete genomics' revealed the previously inaccessible genomic regions, analysis of variations in centromeres and other extra-long tandem repeats (ETRs) faces an algorithmic challenge since there are currently no tools for accurate sequence comparison of ETRs. Counterintuitively, the classical alignment approaches, such as the Smith-Waterman algorithm, fail to construct biologically adequate alignments of ETRs. We present UniAligner, the parameter-free sequence alignment algorithm with sequence-dependent alignment scoring that automatically changes for any pair of compared sequences. UniAligner prioritizes matches of rare substrings that are more likely to be relevant to the evolutionary relationship between two sequences. We apply UniAligner to estimate the mutation rates in human centromeres, and quantify the extremely high rate of large duplications and deletions in centromeres. This high rate suggests that centromeres may represent some of the most rapidly evolving regions of the human genome with respect to their structural organization.
Affiliation(s)
- Andrey V Bzikadze
  - Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA
- Pavel A Pevzner
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA
7
Logsdon GA, Rozanski AN, Ryabov F, Potapova T, Shepelev VA, Mao Y, Rautiainen M, Koren S, Nurk S, Porubsky D, Lucas JK, Hoekzema K, Munson KM, Gerton JL, Phillippy AM, Alexandrov IA, Eichler EE. The variation and evolution of complete human centromeres. bioRxiv 2023:2023.05.30.542849. [PMID: 37398417 PMCID: PMC10312506 DOI: 10.1101/2023.05.30.542849] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 07/04/2023]
Abstract
We completely sequenced and assembled all centromeres from a second human genome and used two reference sets to benchmark genetic, epigenetic, and evolutionary variation within centromeres from a diversity panel of humans and apes. We find that centromere single-nucleotide variation can increase by up to 4.1-fold relative to other genomic regions, with the caveat that up to 45.8% of centromeric sequence, on average, cannot be reliably aligned with current methods due to the emergence of new α-satellite higher-order repeat (HOR) structures and two to threefold differences in the length of the centromeres. The extent to which this occurs differs depending on the chromosome and haplotype. Comparing the two sets of complete human centromeres, we find that eight harbor distinctly different α-satellite HOR array structures and four contain novel α-satellite HOR variants in high abundance. DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by at least 500 kbp, a property not readily associated with novel α-satellite HORs. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan, and macaque genomes. Comparative analyses reveal nearly complete turnover of α-satellite HORs, but with idiosyncratic changes in structure characteristic to each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the p- and q-arms of human chromosomes and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Affiliation(s)
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Allison N. Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | | | - Yafei Mao
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Julian K. Lucas
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | | | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ivan A. Alexandrov
- Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel
- Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
8
|
Wlodzimierz P, Rabanal FA, Burns R, Naish M, Primetis E, Scott A, Mandáková T, Gorringe N, Tock AJ, Holland D, Fritschi K, Habring A, Lanz C, Patel C, Schlegel T, Collenberg M, Mielke M, Nordborg M, Roux F, Shirsekar G, Alonso-Blanco C, Lysak MA, Novikova PY, Bousios A, Weigel D, Henderson IR. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 2023:10.1038/s41586-023-06062-z. [PMID: 37198485 DOI: 10.1038/s41586-023-06062-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 11/17/2022] [Accepted: 04/06/2023] [Indexed: 05/19/2023]
Abstract
Centromeres are critical for cell division, loading CENH3 or CENPA histone variant nucleosomes, directing kinetochore formation and allowing chromosome segregation. Despite their conserved function, centromere size and structure are diverse across species. To understand this centromere paradox, it is necessary to know how centromeric diversity is generated and whether it reflects ancient trans-species variation or, instead, rapid post-speciation divergence. To address these questions, we assembled 346 centromeres from 66 Arabidopsis thaliana and 2 Arabidopsis lyrata accessions, which exhibited a remarkable degree of intra- and inter-species diversity. A. thaliana centromere repeat arrays are embedded in linkage blocks, despite ongoing internal satellite turnover, consistent with roles for unidirectional gene conversion or unequal crossover between sister chromatids in sequence diversification. Additionally, centrophilic ATHILA transposons have recently invaded the satellite arrays. To counter ATHILA invasion, chromosome-specific bursts of satellite homogenization generate higher-order repeats and purge transposons, in line with cycles of repeat evolution. Centromeric sequence changes are even more extreme in comparison between A. thaliana and A. lyrata. Together, our findings identify rapid cycles of transposon invasion and purging through satellite homogenization, which drive centromere evolution and ultimately contribute to speciation.
Affiliation(s)
- Piotr Wlodzimierz
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Fernando A Rabanal
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Robin Burns
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Matthew Naish
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Elias Primetis
- School of Life Sciences, University of Sussex, Brighton, UK
| | - Alison Scott
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | - Terezie Mandáková
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Nicola Gorringe
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Andrew J Tock
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Daniel Holland
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Katrin Fritschi
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Anette Habring
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Christa Lanz
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Christie Patel
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
| | - Theresa Schlegel
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Maximilian Collenberg
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Miriam Mielke
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Magnus Nordborg
- Gregor Mendel Institute, Vienna, Austrian Academy of Sciences, Vienna BioCenter, Vienna, Austria
| | - Fabrice Roux
- LIPME, INRAE, CNRS, Université de Toulouse, Castanet-Tolosan, France
| | - Gautam Shirsekar
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Carlos Alonso-Blanco
- Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Madrid, Spain
| | - Martin A Lysak
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Polina Y Novikova
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
| | | | - Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Ian R Henderson
- Department of Plant Sciences, University of Cambridge, Cambridge, UK
|
9
|
Logsdon GA, Eichler EE. The Dynamic Structure and Rapid Evolution of Human Centromeric Satellite DNA. Genes (Basel) 2022; 14:92. [PMID: 36672831 PMCID: PMC9859433 DOI: 10.3390/genes14010092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/15/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022]
Abstract
The complete sequence of a human genome provided our first comprehensive view of the organization of satellite DNA associated with heterochromatin. We review how our understanding of the genetic architecture and epigenetic properties of human centromeric DNA has advanced as a result. Preliminary studies of human and nonhuman ape centromeres reveal complex, saltatory mutational changes organized around distinct evolutionary layers. Pockets of regional hypomethylation within higher-order α-satellite DNA, termed centromere dip regions, appear to define the site of kinetochore attachment in all human chromosomes, although such epigenetic features can vary even within the same chromosome. Sequence resolution of satellite DNA is providing new insights into centromeric function with potential implications for improving our understanding of human biology and health.
Affiliation(s)
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
10
|
Urban JA, Ranjan R, Chen X. Asymmetric Histone Inheritance: Establishment, Recognition, and Execution. Annu Rev Genet 2022; 56:113-143. [PMID: 35905975 PMCID: PMC10054593 DOI: 10.1146/annurev-genet-072920-125226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The discovery of biased histone inheritance in asymmetrically dividing Drosophila melanogaster male germline stem cells demonstrates one means to produce two distinct daughter cells with identical genetic material. This inspired further studies in different systems, which revealed that this phenomenon may be a widespread mechanism to introduce cellular diversity. While the extent of asymmetric histone inheritance could vary among systems, this phenomenon is proposed to occur in three steps: first, establishment of histone asymmetry between sister chromatids during DNA replication; second, recognition of sister chromatids carrying asymmetric histone information during mitosis; and third, execution of this asymmetry in the resulting daughter cells. By compiling the current knowledge from diverse eukaryotic systems, this review comprehensively details and compares known chromatin factors, mitotic machinery components, and cell cycle regulators that may contribute to each of these three steps. Also discussed are potential mechanisms that introduce and regulate variable histone inheritance modes and how these different modes may contribute to cell fate decisions in multicellular organisms.
Affiliation(s)
- Jennifer A Urban
- Department of Biology, The Johns Hopkins University, Baltimore, Maryland, USA
| | - Rajesh Ranjan
- Department of Biology, The Johns Hopkins University, Baltimore, Maryland, USA
- Howard Hughes Medical Institute, The Johns Hopkins University, Baltimore, Maryland, USA
| | - Xin Chen
- Department of Biology, The Johns Hopkins University, Baltimore, Maryland, USA
- Howard Hughes Medical Institute, The Johns Hopkins University, Baltimore, Maryland, USA
|
11
|
Cechova M, Miga KH. Satellite DNAs and human sex chromosome variation. Semin Cell Dev Biol 2022; 128:15-25. [PMID: 35644878 DOI: 10.1016/j.semcdb.2022.04.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 04/26/2022] [Accepted: 04/27/2022] [Indexed: 11/17/2022]
Abstract
Satellite DNAs are present on every chromosome in the cell and are typically enriched in repetitive, heterochromatic parts of the human genome. Sex chromosomes represent a unique genomic and epigenetic context. In this review, we first report what is known about satellite DNA biology on human X and Y chromosomes, including repeat content and organization, as well as satellite variation in typical euploid individuals. Then, we review sex chromosome aneuploidies that are among the most common types of aneuploidies in the general population, and are better tolerated than autosomal aneuploidies. This is demonstrated also by the fact that aging is associated with the loss of the X, and especially the Y chromosome. In addition, supernumerary sex chromosomes enable us to study general processes in a cell, such as analyzing heterochromatin dosage (i.e. additional Barr bodies and long heterochromatin arrays on Yq) and their downstream consequences. Finally, genomic and epigenetic organization and regulation of satellite DNA could influence chromosome stability and lead to aneuploidy. In this review, we argue that the complete annotation of satellite DNA on sex chromosomes in human, and especially in centromeric regions, will aid in explaining the prevalence and the consequences of sex chromosome aneuploidies.
Affiliation(s)
- Monika Cechova
- Faculty of Informatics, Masaryk University, Czech Republic
| | - Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, University of California Santa Cruz, CA 95064, USA
|
12
|
Haig D. Paradox lost: Concerted evolution and centromeric instability: Centromeres are hospitable habitats for repeats that evolve adaptations for proliferation within the nucleus sometimes at organismal cost. Bioessays 2022; 44:e2200023. [PMID: 35748194 DOI: 10.1002/bies.202200023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 06/07/2022] [Accepted: 06/09/2022] [Indexed: 11/11/2022]
Abstract
Homologous centromeres compete for segregation to the secondary oocyte nucleus at female meiosis I. Centromeric repeats also compete with each other to populate centromeres in mitotic cells of the germline and have become adapted to use the recombinational machinery present at centromeres to promote their own propagation. Repeats are not needed at centromeres, rather centromeres appear to be hospitable habitats for the colonization and proliferation of repeats. This is probably an indirect consequence of two distinctive features of centromeric DNA. Centromeres are subject to breakage by the mechanical forces exerted by microtubules and meiotic crossing-over is suppressed. Centromeric proteins acting in trans are under selection to mitigate the costs of centromeric repeats acting in cis. Collateral costs of mitotic competition at centromeres may help to explain the high rates of aneuploidy observed in early human embryos.
Affiliation(s)
- David Haig
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
|
13
|
Molecular Dynamics and Evolution of Centromeres in the Genus Equus. Int J Mol Sci 2022; 23:4183. [PMID: 35457002 PMCID: PMC9024551 DOI: 10.3390/ijms23084183] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/08/2022] [Revised: 04/05/2022] [Accepted: 04/06/2022] [Indexed: 02/01/2023]
Abstract
The centromere is the chromosomal locus essential for proper chromosome segregation. While the centromeric function is well conserved and epigenetically specified, centromeric DNA sequences are typically composed of satellite DNA and represent the most rapidly evolving sequences in eukaryotic genomes. The presence of satellite sequences at centromeres hampered the comprehensive molecular analysis of these enigmatic loci. The discovery of functional centromeres completely devoid of satellite repetitions and fixed in some animal and plant species represented a turning point in centromere biology, definitively proving the epigenetic nature of the centromere. The first satellite-free centromere, fixed in a vertebrate species, was discovered in the horse. Later, an extraordinary number of satellite-free neocentromeres had been discovered in other species of the genus Equus, which remains the only mammalian genus with numerous satellite-free centromeres described thus far. These neocentromeres arose recently during evolution and are caught in a stage of incomplete maturation. Their presence made the equids a unique model for investigating, at molecular level, the minimal requirements for centromere seeding and evolution. This model system provided new insights on how centromeres are established and transmitted to the progeny and on the role of satellite DNA in different aspects of centromere biology.
|
14
|
Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science 2022; 376:eabl4178. [PMID: 35357911 PMCID: PMC9233505 DOI: 10.1126/science.abl4178] [Citation(s) in RCA: 174] [Impact Index Per Article: 87.0] [Indexed: 12/20/2022]
Abstract
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Affiliation(s)
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
| | - Pragya Sidhwani
- Department of Biochemistry, Stanford University, Stanford, CA, USA
| | - Sasha A. Langley
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Savannah J. Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Lev Uralsky
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
| | | | - Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
| | | | | | - Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | | | - Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | - Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | - Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Daniel Olson
- Department of Computer Science, University of Montana, Missoula, MT, USA
| | | | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Gabrielle A. Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | | | - Fedor Gusev
- Vavilov Institute of General Genetics, Moscow, Russia
| | - Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
| | - Evgeny I. Rogaev
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Gary H. Karpen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- BioEngineering and BioMedical Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Abby F. Dernburg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
| | - Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
| | | | - Travis J. Wheeler
- Department of Computer Science, University of Montana, Missoula, MT, USA
| | - Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical School, Department of Biochemistry and Molecular Biology and Cancer Center, University of Kansas, Kansas City, KS, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
| | - Rachel J. O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, San Diego, CA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
| | - Ivan A. Alexandrov
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
| | - Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
|
15
|
Jeffery D, Lochhead M, Almouzni G. CENP-A: A Histone H3 Variant with Key Roles in Centromere Architecture in Healthy and Diseased States. Results Probl Cell Differ 2022; 70:221-261. [PMID: 36348109 DOI: 10.1007/978-3-031-06573-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Centromeres are key architectural components of chromosomes. Here, we examine their construction, maintenance, and functionality. Focusing on the mammalian centromere-specific histone H3 variant, CENP-A, we highlight its coevolution with both centromeric DNA and its chaperone, HJURP. We then consider CENP-A de novo deposition and the importance of centromeric DNA recently uncovered with the added value from new ultra-long-read sequencing. We next review how to ensure the maintenance of CENP-A at the centromere throughout the cell cycle. Finally, we discuss the impact of disrupting CENP-A regulation on cancer and cell fate.
Affiliation(s)
- Daniel Jeffery
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France
| | - Marina Lochhead
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France
| | - Geneviève Almouzni
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France
|
16
|
Abstract
We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
| | - Ivan A Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199004, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
|
17
|
Dvorkina T, Kunyavskaya O, Bzikadze AV, Alexandrov I, Pevzner PA. CentromereArchitect: inference and analysis of the architecture of centromeres. Bioinformatics 2021; 37:i196-i204. [PMID: 34252949 PMCID: PMC8336445 DOI: 10.1093/bioinformatics/btab265] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Motivation: Recent advances in long-read sequencing technologies led to rapid progress in centromere assembly in the last year and, for the first time, opened a possibility to address the long-standing questions about the architecture and evolution of human centromeres. However, since these advances have not been yet accompanied by the development of the centromere-specific bioinformatics algorithms, even the fundamental questions (e.g. centromere annotation by deriving the complete set of human monomers and high-order repeats), let alone more complex questions (e.g. explaining how monomers and high-order repeats evolved) about human centromeres remain open. Moreover, even though there was a four-decade-long series of studies aimed at cataloging all human monomers and high-order repeats, the rigorous algorithmic definitions of these concepts are still lacking. Thus, the development of a centromere annotation tool is a prerequisite for follow-up personalized biomedical studies of centromeres across the human population and evolutionary studies of centromeres across various species. Results: We describe the CentromereArchitect, the first tool for the centromere annotation in a newly sequenced genome, apply it to the recently generated complete assembly of a human genome by the Telomere-to-Telomere consortium, generate the complete set of human monomers and high-order repeats for ‘live’ centromeres, and reveal a vast set of hybrid monomers that may represent the focal points of centromere evolution. Availability and implementation: CentromereArchitect is publicly available on https://github.com/ablab/stringdecomposer/tree/ismb2021. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
| | - Ivan Alexandrov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
|
18
|
Lopes M, Louzada S, Gama-Carvalho M, Chaves R. Genomic Tackling of Human Satellite DNA: Breaking Barriers through Time. Int J Mol Sci 2021; 22:4707. [PMID: 33946766 PMCID: PMC8125562 DOI: 10.3390/ijms22094707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 04/24/2021] [Accepted: 04/27/2021] [Indexed: 12/12/2022]
Abstract
(Peri)centromeric repetitive sequences and, more specifically, satellite DNA (satDNA) sequences, constitute a major human genomic component. SatDNA sequences can vary on a large number of features, including nucleotide composition, complexity, and abundance. Several satDNA families have been identified and characterized in the human genome through time, albeit at different speeds. Human satDNA families present a high degree of sub-variability, leading to the definition of various subfamilies with different organization and clustered localization. Evolution of satDNA analysis has enabled the progressive characterization of satDNA features. Despite recent advances in the sequencing of centromeric arrays, comprehensive genomic studies to assess their variability are still required to provide accurate and proportional representation of satDNA (peri)centromeric/acrocentric short arm sequences. Approaches combining multiple techniques have been successfully applied and seem to be the path to follow for generating integrated knowledge in the promising field of human satDNA biology.
Affiliation(s)
- Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
- Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
19
Dvorkina T, Bzikadze AV, Pevzner PA. The string decomposition problem and its applications to centromere analysis and assembly. Bioinformatics 2021; 36:i93-i101. [PMID: 32657390 PMCID: PMC7428072 DOI: 10.1093/bioinformatics/btaa454] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/28/2022]
Abstract
Motivation: Recent attempts to assemble extra-long tandem repeats (such as centromeres) faced the challenge of translating long error-prone reads from the nucleotide alphabet into the alphabet of repeat units. Human centromeres represent a particularly complex type of high-order repeats (HORs) formed by chromosome-specific monomers. Given a set of all human monomers, translating a read from a centromere into the monomer alphabet is modeled as the String Decomposition Problem. The accurate translation of reads into the monomer alphabet turns the notoriously difficult problem of assembling centromeres from reads (in the nucleotide alphabet) into a more tractable problem of assembling centromeres from translated reads. Results: We describe a StringDecomposer (SD) algorithm for solving this problem, benchmark it on the set of long error-prone Oxford Nanopore reads generated by the Telomere-to-Telomere consortium and identify a novel (rare) monomer that extends the set of known X-chromosome specific monomers. Our identification of a novel monomer emphasizes the importance of identifying all (even rare) monomers for future centromere assembly efforts and evolutionary studies. To further analyze novel monomers, we applied SD to the set of recently generated long accurate Pacific Biosciences HiFi reads. This analysis revealed that the set of known human monomers and HORs remains incomplete. SD opens a possibility to generate a complete set of human monomers and HORs for use in the ongoing efforts to generate the complete assembly of the human genome. Availability and implementation: StringDecomposer is publicly available at https://github.com/ablab/stringdecomposer. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
- Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
20
The structure, function and evolution of a complete human chromosome 8. Nature 2021; 593:101-107. [PMID: 33828295 PMCID: PMC8099727 DOI: 10.1038/s41586-021-03420-7] [Citation(s) in RCA: 169] [Impact Index Per Article: 56.3] [Received: 09/04/2020] [Accepted: 03/04/2021] [Indexed: 02/07/2023]
Abstract
The complete assembly of each human chromosome is essential for understanding human biology and evolution. Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.
21
Ahmad SF, Singchat W, Jehangir M, Suntronpong A, Panthum T, Malaivijitnond S, Srikulnath K. Dark Matter of Primate Genomes: Satellite DNA Repeats and Their Evolutionary Dynamics. Cells 2020; 9:E2714. [PMID: 33352976 PMCID: PMC7767330 DOI: 10.3390/cells9122714] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 10/27/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/12/2022]
Abstract
A substantial portion of the primate genome is composed of non-coding regions, so-called "dark matter", which includes an abundance of tandemly repeated sequences called satellite DNA. Collectively known as the satellitome, this genomic component offers exciting evolutionary insights into aspects of primate genome biology that raise new questions and challenge existing paradigms. A complete human reference genome was recently reported with telomere-to-telomere human X chromosome assembly that resolved hundreds of dark regions, encompassing a 3.1 Mb centromeric satellite array that had not been identified previously. With the recent exponential increase in the availability of primate genomes, and the development of modern genomic and bioinformatics tools, extensive growth in our knowledge concerning the structure, function, and evolution of satellite elements is expected. The current state of knowledge on this topic is summarized, highlighting various types of primate-specific satellite repeats to compare their proportions across diverse lineages. Inter- and intraspecific variation of satellite repeats in the primate genome are reviewed. The functional significance of these sequences is discussed by describing how the transcriptional activity of satellite repeats can affect gene expression during different cellular processes. Sex-linked satellites are outlined, together with their respective genomic organization. Mechanisms are proposed whereby satellite repeats might have emerged as novel sequences during different evolutionary phases. Finally, the main challenges that hinder the detection of satellite DNA are outlined and an overview of the latest methodologies to address technological limitations is presented.
Affiliation(s)
- Syed Farhan Ahmad
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Worapong Singchat
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Maryam Jehangir
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Department of Structural and Functional Biology, Institute of Bioscience at Botucatu, São Paulo State University (UNESP), Botucatu, São Paulo 18618-689, Brazil
- Aorarat Suntronpong
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Thitipong Panthum
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Suchinda Malaivijitnond
- National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand
- Department of Biology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Kornsorn Srikulnath
- Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand
- Center of Excellence on Agricultural Biotechnology (AG-BIO/PERDO-CHE), Bangkok 10900, Thailand
- Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
22
Bzikadze AV, Pevzner PA. Automated assembly of centromeres from ultra-long error-prone reads. Nat Biotechnol 2020; 38:1309-1316. [PMID: 32665660 PMCID: PMC10718184 DOI: 10.1038/s41587-020-0582-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 08/17/2019] [Accepted: 05/29/2020] [Indexed: 12/12/2022]
Abstract
Centromeric variation has been linked to cancer and infertility, but centromere sequences contain multiple tandem repeats and can only be assembled manually from long error-prone reads. Here we describe the centroFlye algorithm for centromere assembly using long error-prone reads, and apply it to assemble human centromeres on chromosomes 6 and X. Our analyses reveal putative breakpoints in the manual reconstruction of the human X centromere, demonstrate that human X chromosome is partitioned into repeat subfamilies and provide initial insights into centromere evolution. We anticipate that centroFlye could be applied to automatically close remaining multimegabase gaps in the reference human genome.
Affiliation(s)
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
- Pavel A Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
23
Balzano E, Giunta S. Centromeres under Pressure: Evolutionary Innovation in Conflict with Conserved Function. Genes (Basel) 2020; 11:E912. [PMID: 32784998 PMCID: PMC7463522 DOI: 10.3390/genes11080912] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 07/07/2020] [Revised: 08/04/2020] [Accepted: 08/04/2020] [Indexed: 12/22/2022]
Abstract
Centromeres are essential genetic elements that enable spindle microtubule attachment for chromosome segregation during mitosis and meiosis. While this function is preserved across species, centromeres display an array of dynamic features, including: (1) rapidly evolving DNA; (2) wide evolutionary diversity in size, shape and organization; (3) evidence of mutational processes to generate homogenized repetitive arrays that characterize centromeres in several species; (4) tolerance to changes in position, as in the case of neocentromeres; and (5) intrinsic fragility derived by sequence composition and secondary DNA structures. Centromere drive underlies rapid centromere DNA evolution due to the "selfish" pursuit to bias meiotic transmission and promote the propagation of stronger centromeres. Yet, the origins of other dynamic features of centromeres remain unclear. Here, we review our current understanding of centromere evolution and plasticity. We also detail the mutagenic processes proposed to shape the divergent genetic nature of centromeres. Changes to centromeres are not simply evolutionary relics, but ongoing shifts that on one side promote centromere flexibility, but on the other can undermine centromere integrity and function with potential pathological implications such as genome instability.
Affiliation(s)
- Elisa Balzano
- Dipartimento di Biologia e Biotecnologie “Charles Darwin”, Sapienza Università di Roma, 00185 Roma, Italy
- Simona Giunta
- Laboratory of Chromosome and Cell Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA
24
Dutrillaux AM, Dutrillaux B. Different behaviour of C-banded peri-centromeric heterochromatin between sex chromosomes and autosomes in Polyphagan beetles. Comp Cytogenet 2019; 13:179-192. [PMID: 31327988 PMCID: PMC6620206 DOI: 10.3897/compcytogen.v13i2.34746] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/21/2019] [Accepted: 05/21/2019] [Indexed: 06/10/2023]
Abstract
Heterochromatin variation was studied after C-banding of male karyotypes with an XY sex formula from 224 species belonging to most of the main families of Coleoptera. The karyotypes were classified according to the ratio of total heterochromatin to total euchromatin, and the amounts of heterochromatin on autosomes and gonosomes were compared. The C-banded karyotypes of 19 species, representing characteristic profiles, are presented. This analysis shows that there is a strong tendency for the homogenization of the size of the peri-centromeric C-banded heterochromatin on autosomes. The amount of heterochromatin on the X roughly follows the variations of autosomes. In contrast, the C-banded heterochromatin of the Y, most frequently absent or very small and rarely amplified, appears largely independent of that of other chromosomes. We conclude that the Xs and autosomes, but not the Y, possibly share some, but not all, mechanisms of heterochromatin amplification/reduction. The theoretical models of heterochromatin expansion are discussed in the light of these data.
Affiliation(s)
- Anne-Marie Dutrillaux
- UMR7205 MNHN CNRS UMPC EPHE Institut de Systématique, Evolution, Biodiversité. Muséum National d’Histoire Naturelle, Sorbonne Universités, 57, rue Cuvier, CP39, UMR7205 Paris, France
- Bernard Dutrillaux
- UMR7205 MNHN CNRS UMPC EPHE Institut de Systématique, Evolution, Biodiversité. Muséum National d’Histoire Naturelle, Sorbonne Universités, 57, rue Cuvier, CP39, UMR7205 Paris, France
25
Centromere Repeats: Hidden Gems of the Genome. Genes (Basel) 2019; 10:223. [PMID: 30884847 PMCID: PMC6471113 DOI: 10.3390/genes10030223] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Received: 02/08/2019] [Revised: 03/07/2019] [Accepted: 03/11/2019] [Indexed: 01/08/2023]
Abstract
Satellite DNAs are now regarded as powerful and active contributors to genomic and chromosomal evolution. Paired with mobile transposable elements, these repetitive sequences provide a dynamic mechanism through which novel karyotypic modifications and chromosomal rearrangements may occur. In this review, we discuss the regulatory activity of satellite DNA and their neighboring transposable elements in a chromosomal context with a particular emphasis on the integral role of both in centromere function. In addition, we discuss the varied mechanisms by which centromeric repeats have endured evolutionary processes, producing a novel, species-specific centromeric landscape despite sharing a ubiquitously conserved function. Finally, we highlight the role these repetitive elements play in the establishment and functionality of de novo centromeres and chromosomal breakpoints that underpin karyotypic variation. By emphasizing these unique activities of satellite DNAs and transposable elements, we hope to disparage the conventional exemplification of repetitive DNA in the historically-associated context of ‘junk’.
26
Uralsky L, Shepelev V, Alexandrov A, Yurov Y, Rogaev E, Alexandrov I. Classification and monomer-by-monomer annotation dataset of suprachromosomal family 1 alpha satellite higher-order repeats in hg38 human genome assembly. Data Brief 2019; 24:103708. [PMID: 30989093 PMCID: PMC6447721 DOI: 10.1016/j.dib.2019.103708] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 06/03/2018] [Revised: 01/16/2019] [Accepted: 01/22/2019] [Indexed: 01/27/2023]
Abstract
In the latest hg38 human genome assembly, centromeric gaps have been filled in by alpha satellite (AS) reference models (RMs), which are statistical representations of homogeneous higher-order repeat (HOR) arrays that make up the bulk of the centromeric regions. We analyzed these models to compose an atlas of human AS HORs where each monomer of a HOR was represented by a number of its polymorphic sequence variants. We combined these data with the HMMER sequence analysis platform to annotate AS HORs in the assembly. This led to the discovery of a new type of low copy number, highly divergent HORs which were not represented by RMs. These were included in the dataset. The annotation can be viewed as a UCSC Genome Browser custom track (the HOR-track) and used together with our previous annotation of AS suprachromosomal families (SFs) in the same assembly, where each AS monomer can be viewed in its genomic context together with its classification into one of the 5 major SFs (the SF-track). To catalog the diversity of AS HORs in the human genome, we introduced a new naming system. Each HOR received a name which showed its SF, chromosomal location and index number. Here we present the first installment of the HOR-track, covering only the 17 HORs that belong to SF1, which forms live functional centromeres in chromosomes 1, 3, 5, 6, 7, 10, 12, 16 and 19 and also a large number of minor dead HOR domains, both homogeneous and divergent. The monomer-by-monomer HOR annotation used for this dataset, as opposed to annotation of whole HOR repeats, provides for mapping and quantification of various structural variants of AS HORs, which can be used to collect data on inter-individual polymorphism of AS.
Affiliation(s)
- L.I. Uralsky
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov Sq. 2, Moscow 123182, Russia
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- V.A. Shepelev
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov Sq. 2, Moscow 123182, Russia
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- A.A. Alexandrov
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov Sq. 2, Moscow 123182, Russia
- Y.B. Yurov
- Research Center of Mental Health, Zagorodnoe Sh. 2, Moscow 113152, Russia
- E.I. Rogaev
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Department of Psychiatry, Brudnick Neuropsychiatric Research Institute, University of Massachusetts Medical School, Worcester, MA 01604, USA
- Lomonosov Moscow State University, Biological Department, Center for Genetics and Genetic Technologies, Moscow, 119192, Russia
- Corresponding authors. Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia.
- I.A. Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia
- Research Center of Mental Health, Zagorodnoe Sh. 2, Moscow 113152, Russia
- Corresponding authors. Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia.
27
McNulty SM, Sullivan BA. Alpha satellite DNA biology: finding function in the recesses of the genome. Chromosome Res 2018; 26:115-138. [PMID: 29974361 DOI: 10.1007/s10577-018-9582-3] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 05/03/2018] [Accepted: 06/14/2018] [Indexed: 02/05/2023]
Abstract
Repetitive DNA, formerly referred to by the misnomer "junk DNA," comprises a majority of the human genome. One class of this DNA, alpha satellite, comprises up to 10% of the genome. Alpha satellite is enriched at all human centromere regions and is competent for de novo centromere assembly. Because of the highly repetitive nature of alpha satellite, it has been difficult to achieve genome assemblies at centromeres using traditional next-generation sequencing approaches, and thus, centromeres represent gaps in the current human genome assembly. Moreover, alpha satellite DNA is transcribed into repetitive noncoding RNA and contributes to a large portion of the transcriptome. Recent efforts to characterize these transcripts and their function have uncovered pivotal roles for satellite RNA in genome stability, including silencing "selfish" DNA elements and recruiting centromere and kinetochore proteins. This review will describe the genomic and epigenetic features of alpha satellite DNA, discuss recent findings of noncoding transcripts produced from distinct alpha satellite arrays, and address current progress in the functional understanding of this oft-neglected repetitive sequence. We will discuss unique challenges of studying human satellite DNAs and RNAs and point toward new technologies that will continue to advance our understanding of this largely untapped portion of the genome.
Affiliation(s)
- Shannon M McNulty
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
- Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, NC, 27710, USA
28
Cacheux L, Ponger L, Gerbault-Seureau M, Loll F, Gey D, Richard FA, Escudé C. The Targeted Sequencing of Alpha Satellite DNA in Cercopithecus pogonias Provides New Insight Into the Diversity and Dynamics of Centromeric Repeats in Old World Monkeys. Genome Biol Evol 2018; 10:1837-1851. [PMID: 29860303 PMCID: PMC6061836 DOI: 10.1093/gbe/evy109] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Accepted: 05/29/2018] [Indexed: 02/06/2023]
Abstract
Alpha satellite is the major repeated DNA element of primate centromeres. Specific evolutionary mechanisms have led to a great diversity of sequence families with peculiar genomic organization and distribution, which have until now been studied mostly in great apes. Using high-throughput sequencing of alpha satellite monomers obtained by enzymatic digestion, followed by computational and cytogenetic analysis, we compare here the diversity and genomic distribution of alpha satellite DNA in two related Old World monkey species, Cercopithecus pogonias and Cercopithecus solatus, which are known to have diverged about 7 Ma. Two main families of monomers, called C1 and C2, are found in both species. A detailed analysis of our data sets revealed the existence of numerous subfamilies within the centromeric C1 family. Although the most abundant subfamily is conserved between both species, our fluorescence in situ hybridization (FISH) experiments clearly show that some subfamilies are specific for each species and that their distribution is restricted to a subset of chromosomes, thereby pointing to the existence of recurrent amplification/homogenization events. The pericentromeric C2 family is very abundant on the short arm of all acrocentric chromosomes in both species, pointing to specific mechanisms that lead to this distribution. Results obtained using two different restriction enzymes are fully consistent with a predominant monomeric organization of alpha satellite DNA that coexists with higher-order organization patterns in the C. pogonias genome. Our study suggests that alpha satellite DNA is highly dynamic in Cercopithecini, with recurrent appearance of new sequence variants and interchromosomal sequence transfer.
Affiliation(s)
- Lauriane Cacheux
- Département Adaptations du Vivant, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Département Origines et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Loïc Ponger
- Département Adaptations du Vivant, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Michèle Gerbault-Seureau
- Département Origines et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- François Loll
- Département Adaptations du Vivant, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Delphine Gey
- Service de Systématique Moléculaire, UMS 2700 CNRS, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Florence Anne Richard
- Département Origines et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
- Université Versailles St-Quentin, Montigny-le-Bretonneux, France
- Christophe Escudé
- Département Adaptations du Vivant, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum National d’Histoire Naturelle, Paris, France
29
McNulty SM, Sullivan LL, Sullivan BA. Human Centromeres Produce Chromosome-Specific and Array-Specific Alpha Satellite Transcripts that Are Complexed with CENP-A and CENP-C. Dev Cell 2017; 42:226-240.e6. [PMID: 28787590 DOI: 10.1016/j.devcel.2017.07.001] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Received: 01/30/2017] [Revised: 05/24/2017] [Accepted: 07/03/2017] [Indexed: 11/28/2022]
Abstract
Human centromeres are defined by alpha satellite DNA arrays that are distinct and chromosome specific. Most human chromosomes contain multiple alpha satellite arrays that are competent for centromere assembly. Here, we show that human centromeres are defined by chromosome-specific RNAs linked to underlying organization of distinct alpha satellite arrays. Active and inactive arrays on the same chromosome produce discrete sets of transcripts in cis. Non-coding RNAs produced from active arrays are complexed with CENP-A and CENP-C, while inactive-array transcripts associate with CENP-B and are generally less stable. Loss of CENP-A does not affect transcript abundance or stability. However, depletion of array-specific RNAs reduces CENP-A and CENP-C at the targeted centromere via faulty CENP-A loading, arresting cells before mitosis. This work shows that each human alpha satellite array produces a unique set of non-coding transcripts, and RNAs present at active centromeres are necessary for kinetochore assembly and cell-cycle progression.
Affiliation(s)
- Shannon M McNulty
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Lori L Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, NC 27710, USA
30
Garrido-Ramos MA. Satellite DNA: An Evolving Topic. Genes (Basel) 2017; 8:230. [PMID: 28926993 PMCID: PMC5615363 DOI: 10.3390/genes8090230] [Citation(s) in RCA: 222] [Impact Index Per Article: 31.7] [Received: 08/03/2017] [Revised: 09/12/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Satellite DNA represents one of the most fascinating parts of the repetitive fraction of the eukaryotic genome. Since the discovery of highly repetitive tandem DNA in the 1960s, a lot of literature has extensively covered various topics related to the structure, organization, function, and evolution of such sequences. Today, with the advent of genomic tools, the study of satellite DNA has regained a great interest. Thus, Next-Generation Sequencing (NGS), together with high-throughput in silico analysis of the information contained in NGS reads, has revolutionized the analysis of the repetitive fraction of the eukaryotic genomes. The whole of the historical and current approaches to the topic gives us a broad view of the function and evolution of satellite DNA and its role in chromosomal evolution. Currently, we have extensive information on the molecular, chromosomal, biological, and population factors that affect the evolutionary fate of satellite DNA, knowledge that gives rise to a series of hypotheses that get on well with each other about the origin, spreading, and evolution of satellite DNA. In this paper, I review these hypotheses from a methodological, conceptual, and historical perspective and frame them in the context of chromosomal organization and evolution.
Affiliation(s)
- Manuel A Garrido-Ramos
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain.
| |
|
31
|
Abstract
Genomic variation is a source of functional diversity that is typically studied in genic and non-coding regulatory regions. However, the extent of variation within noncoding portions of the human genome, particularly highly repetitive regions, and the functional consequences are not well understood. Satellite DNA, including α satellite DNA found at human centromeres, comprises up to 10% of the genome, but is difficult to study because its repetitive nature hinders contiguous sequence assemblies. We recently described variation within α satellite DNA that affects centromere function. On human chromosome 17 (HSA17), we showed that size and sequence polymorphisms within primary array D17Z1 are associated with chromosome aneuploidy and defective centromere architecture. However, HSA17 can counteract this instability by assembling the centromere at a second, "backup" array lacking variation. Here, we discuss our findings in a broader context of human centromere assembly, and highlight areas of future study to uncover links between genomic and epigenetic features of human centromeres.
Affiliation(s)
- Lori L Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
| | - Kimberline Chew
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
| | - Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, USA
| |
|
32
|
Dumont M, Fachinetti D. DNA Sequences in Centromere Formation and Function. Progress in Molecular and Subcellular Biology 2017; 56:305-336. [PMID: 28840243 DOI: 10.1007/978-3-319-58592-5_13] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 12/23/2022]
Abstract
Faithful chromosome segregation during cell division depends on the centromere, a complex DNA/protein structure that links chromosomes to spindle microtubules. This chromosomal domain has to be marked throughout cell division and its chromosomal localization preserved across cell generations. From fission yeast to human, centromeres are established on a series of repetitive DNA sequences and on specialized centromeric chromatin. This chromatin is enriched with the histone H3 variant, named CENP-A, that was demonstrated to be the epigenetic mark that maintains centromere identity and function indefinitely. Although centromere identity is thought to be exclusively epigenetic, the presence of specific DNA sequences in the majority of eukaryotes and of the centromeric protein CENP-B that binds to these sequences, suggests the existence of a genetic component as well. In this review, we will highlight the importance of centromeric sequences for centromere formation and function, and discuss the centromere DNA sequence/CENP-B paradox.
Affiliation(s)
- M Dumont
- Institut Curie, PSL Research University, CNRS, UMR 144, 26 rue d'Ulm, 75005, Paris, France
| | - D Fachinetti
- Institut Curie, PSL Research University, CNRS, UMR 144, 26 rue d'Ulm, 75005, Paris, France.
| |
|
33
|
Giulotto E, Raimondi E, Sullivan KF. The Unique DNA Sequences Underlying Equine Centromeres. Progress in Molecular and Subcellular Biology 2017; 56:337-354. [PMID: 28840244 DOI: 10.1007/978-3-319-58592-5_14] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/23/2022]
Abstract
Centromeres are highly distinctive genetic loci whose function is specified largely by epigenetic mechanisms. Understanding the role of DNA sequences in centromere function has been a daunting task due to the highly repetitive nature of centromeres in animal chromosomes. The discovery of a centromere devoid of satellite DNA in the domestic horse consolidated observations on the epigenetic nature of centromere identity, showing that entirely natural chromosomes could function without satellite DNA cues. Horses belong to the genus Equus which exhibits a very high degree of evolutionary plasticity in centromere position and DNA sequence composition. Examination of horses has revealed that the position of the satellite-free centromere is variable among individuals. Analysis of centromere location and composition in other Equus species, including domestic donkey and zebras, confirms that the satellite-less configuration of centromeres is common in this group which has undergone particularly rapid karyotype evolution. These features have established the equids as a new mammalian system in which to investigate the molecular organization, dynamics and evolutionary behaviour of centromeres.
Affiliation(s)
- Elena Giulotto
- Dipartimento di Biologia e Biotecnologie, Università di Pavia, Via Ferrata 1, 27100, Pavia, Italy.
| | - Elena Raimondi
- Dipartimento di Biologia e Biotecnologie, Università di Pavia, Via Ferrata 1, 27100, Pavia, Italy
| | - Kevin F Sullivan
- National University of Ireland Galway, University Road, Galway, Ireland
| |
|
34
|
Cacheux L, Ponger L, Gerbault-Seureau M, Richard FA, Escudé C. Diversity and distribution of alpha satellite DNA in the genome of an Old World monkey: Cercopithecus solatus. BMC Genomics 2016; 17:916. [PMID: 27842493 PMCID: PMC5109768 DOI: 10.1186/s12864-016-3246-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/14/2016] [Accepted: 11/02/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Alpha satellite is the major repeated DNA element of primate centromeres. Evolution of these tandemly repeated sequences has led to the existence of numerous families of monomers exhibiting specific organizational patterns. The limited amount of information available in non-human primates is a restriction to the understanding of the evolutionary dynamics of alpha satellite DNA. Results: We carried out the targeted high-throughput sequencing of alpha satellite monomers and dimers from the Cercopithecus solatus genome, an Old World monkey from the Cercopithecini tribe. Computational approaches were used to infer the existence of sequence families and to study how these families are organized with respect to each other. While previous studies had suggested that alpha satellites in Old World monkeys were poorly diversified, our analysis provides evidence for the existence of at least four distinct families of sequences within the studied species and of higher order organizational patterns. Fluorescence in situ hybridization using oligonucleotide probes that are able to target each family in a specific way showed that the different families had distinct distributions on chromosomes and were not homogeneously distributed between chromosomes. Conclusions: Our new approach provides an unprecedented and comprehensive view of the diversity and organization of alpha satellites in a species outside the hominoid group. We consider these data with respect to previously known alpha satellite families and to potential mechanisms for satellite DNA evolution. Applying this approach to other species will open new perspectives regarding the integration of satellite DNA into comparative genomic and cytogenetic studies.
Affiliation(s)
- Lauriane Cacheux
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France; Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Loïc Ponger
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Michèle Gerbault-Seureau
- Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Florence Anne Richard
- Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France; Université Versailles St-Quentin, Montigny-le-Bretonneux, France
| | - Christophe Escudé
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France.
| |
|
35
|
Aldrup-MacDonald ME, Kuo ME, Sullivan LL, Chew K, Sullivan BA. Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles. Genome Res 2016; 26:1301-1311. [PMID: 27510565 PMCID: PMC5052062 DOI: 10.1101/gr.206706.116] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Received: 03/08/2016] [Accepted: 08/08/2016] [Indexed: 01/27/2023]
Abstract
Alpha satellite is a tandemly organized type of repetitive DNA that comprises 5% of the genome and is found at all human centromeres. A defined number of 171-bp monomers are organized into chromosome-specific higher-order repeats (HORs) that are reiterated thousands of times. At least half of all human chromosomes have two or more distinct HOR alpha satellite arrays within their centromere regions. We previously showed that the two alpha satellite arrays of Homo sapiens Chromosome 17 (HSA17), D17Z1 and D17Z1-B, behave as centromeric epialleles, that is, the centromere, defined by chromatin containing the centromeric histone variant CENPA and recruitment of other centromere proteins, can form at either D17Z1 or D17Z1-B. Some individuals in the human population are functional heterozygotes in that D17Z1 is the active centromere on one homolog and D17Z1-B is active on the other. In this study, we aimed to understand the molecular basis for how centromere location is determined on HSA17. Specifically, we focused on D17Z1 genomic variation as a driver of epiallele formation. We found that D17Z1 arrays that are predominantly composed of HOR size and sequence variants were functionally less competent. They either recruited decreased amounts of the centromere-specific histone variant CENPA and the HSA17 was mitotically unstable, or alternatively, the centromere was assembled at D17Z1-B and the HSA17 was stable. Our study demonstrates that genomic variation within highly repetitive, noncoding DNA of human centromere regions has a pronounced impact on genome stability and basic chromosomal function.
Affiliation(s)
- Megan E Aldrup-MacDonald
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Molly E Kuo
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Lori L Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Kimberline Chew
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA
| | - Beth A Sullivan
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, North Carolina 27710, USA
| |
|
36
|
Clusters of alpha satellite on human chromosome 21 are dispersed far onto the short arm and lack ancient layers. Chromosome Res 2016; 24:421-36. [PMID: 27430641 DOI: 10.1007/s10577-016-9530-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/03/2016] [Accepted: 06/03/2016] [Indexed: 10/21/2022]
Abstract
Human alpha satellite (AS) sequence domains that currently function as centromeres are typically flanked by layers of evolutionarily older AS that presumably represent the remnants of earlier primate centromeres. Studies on several human chromosomes reveal that these older AS arrays are arranged in an age gradient, with the oldest arrays farthest from the functional centromere and arrays progressively closer to the centromere being progressively younger. The organization of AS on human chromosome 21 (HC21) has not been well-characterized. We have used newly available HC21 sequence data and an HC21p YAC map to determine the size, organization, and location of the AS arrays, and compared them to AS arrays found on other chromosomes. We find that the majority of the HC21 AS sequences are present on the p-arm of the chromosome and are organized into at least five distinct isolated clusters which are distributed over a larger distance from the functional centromere than that typically seen for AS on other chromosomes. Using both phylogenetic and L1 element age estimations, we found that all of the HC21 AS clusters outside the functional centromere are of a similar relatively recent evolutionary origin. HC21 contains none of the ancient AS layers associated with early primate evolution which is present on other chromosomes, possibly due to the fact that the p-arm of HC21 and the other acrocentric chromosomes underwent substantial reorganization about 20 million years ago.
|
37
|
Cech JN, Peichel CL. Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus). Chromosome Res 2015; 23:767-79. [PMID: 26424612 DOI: 10.1007/s10577-015-9495-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 07/24/2015] [Revised: 09/11/2015] [Accepted: 09/17/2015] [Indexed: 01/09/2023]
Abstract
Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.
Affiliation(s)
- Jennifer N Cech
- Divisions of Human Biology and Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave North, Mailstop C2-023, Seattle, WA, 98109, USA; Graduate Program in Molecular and Cellular Biology, University of Washington, Seattle, WA, 98195, USA
| | - Catherine L Peichel
- Divisions of Human Biology and Basic Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave North, Mailstop C2-023, Seattle, WA, 98109, USA.
| |
|
38
|
Shepelev VA, Uralsky LI, Alexandrov AA, Yurov YB, Rogaev EI, Alexandrov IA. Annotation of suprachromosomal families reveals uncommon types of alpha satellite organization in pericentromeric regions of hg38 human genome assembly. Genomics Data 2015; 5:139-146. [PMID: 26167452 PMCID: PMC4496801 DOI: 10.1016/j.gdata.2015.05.035] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/07/2022]
Affiliation(s)
- V A Shepelev
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov sq. 2, Moscow 123182, Russia; Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; Center for Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - L I Uralsky
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov sq. 2, Moscow 123182, Russia; Center for Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - A A Alexandrov
- Institute of Molecular Genetics, Russian Academy of Sciences, Kurchatov sq. 2, Moscow 123182, Russia
| | - Y B Yurov
- Research Center of Mental Health, Russian Academy of Medical Sciences, Zagorodnoe sh. 2, Moscow 113152, Russia
| | - E I Rogaev
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; Center for Brain Neurobiology and Neurogenetics, Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk 630090, Russia; Department of Psychiatry, Brudnick Neuropsychiatric Research Institute, University of Massachusetts Medical School, Worcester, MA 01604, USA; Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow 119234, Russia
| | - I A Alexandrov
- Research Center of Mental Health, Russian Academy of Medical Sciences, Zagorodnoe sh. 2, Moscow 113152, Russia
| |
|
39
|
Matylla-Kulinska K, Tafer H, Weiss A, Schroeder R. Functional repeat-derived RNAs often originate from retrotransposon-propagated ncRNAs. Wiley Interdisciplinary Reviews-RNA 2014; 5:591-600. [PMID: 25045147 PMCID: PMC4233971 DOI: 10.1002/wrna.1243] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 12/20/2013] [Revised: 04/15/2014] [Accepted: 04/22/2014] [Indexed: 12/19/2022]
Abstract
The human genome is scattered with repetitive sequences, and the ENCODE project revealed that 60–70% of the genomic DNA is transcribed into RNA. As a consequence, the human transcriptome contains a large portion of repeat-derived RNAs (repRNAs). Here, we present a hypothesis for the evolution of novel functional repeat-derived RNAs from non-coding RNAs (ncRNAs) by retrotransposition. Upon amplification, the ncRNAs can diversify in sequence and subsequently evolve new activities, which can result in novel functions. Non-coding transcripts derived from highly repetitive regions can therefore serve as a reservoir for the evolution of novel functional RNAs. We base our hypothetical model on observations reported for short interspersed nuclear elements derived from 7SL RNA and tRNAs, α satellites derived from snoRNAs and SL RNAs derived from U1 small nuclear RNA. Furthermore, we present novel putative human repeat-derived ncRNAs obtained by the comparison of the Dfam and Rfam databases, as well as several examples in other species. We hypothesize that novel functional ncRNAs can derive also from other repetitive regions and propose Genomic SELEX as a tool for their identification.
Affiliation(s)
- Katarzyna Matylla-Kulinska
- Department of Biochemistry and Cell Biology, Max F. Perutz Laboratories, University of Vienna, Vienna, Austria
| | | | | | | |
|
40
|
Satyaki PRV, Cuykendall TN, Wei KHC, Brideau NJ, Kwak H, Aruna S, Ferree PM, Ji S, Barbash DA. The Hmr and Lhr hybrid incompatibility genes suppress a broad range of heterochromatic repeats. PLoS Genet 2014; 10:e1004240. [PMID: 24651406 PMCID: PMC3961192 DOI: 10.1371/journal.pgen.1004240] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.8] [Received: 12/09/2013] [Accepted: 01/30/2014] [Indexed: 11/19/2022] Open
Abstract
Hybrid incompatibilities (HIs) cause reproductive isolation between species and thus contribute to speciation. Several HI genes encode adaptively evolving proteins that localize to or interact with heterochromatin, suggesting that HIs may result from co-evolution with rapidly evolving heterochromatic DNA. Little is known, however, about the intraspecific function of these HI genes, the specific sequences they interact with, or the evolutionary forces that drive their divergence. The genes Hmr and Lhr genetically interact to cause hybrid lethality between Drosophila melanogaster and D. simulans, yet mutations in both genes are viable. Here, we report that Hmr and Lhr encode proteins that form a heterochromatic complex with Heterochromatin Protein 1 (HP1a). Using RNA-Seq analyses we discovered that Hmr and Lhr are required to repress transcripts from satellite DNAs and many families of transposable elements (TEs). By comparing Hmr and Lhr function between D. melanogaster and D. simulans we identify several satellite DNAs and TEs that are differentially regulated between the species. Hmr and Lhr mutations also cause massive overexpression of telomeric TEs and significant telomere lengthening. Hmr and Lhr therefore regulate three types of heterochromatic sequences that are responsible for the significant differences in genome size and structure between D. melanogaster and D. simulans and have high potential to cause genetic conflicts with host fitness. We further find that many TEs are overexpressed in hybrids but that those specifically mis-expressed in lethal hybrids do not closely correlate with Hmr function. Our results therefore argue that adaptive divergence of heterochromatin proteins in response to repetitive DNAs is an important underlying force driving the evolution of hybrid incompatibility genes, but that hybrid lethality likely results from novel epistatic genetic interactions that are distinct to the hybrid background. 
Sister species capable of mating often produce hybrids that are sterile or die during development. This reproductive isolation is caused by incompatibilities between the two sister species' genomes. Some hybrid incompatibilities involve genes that encode rapidly evolving proteins that localize to heterochromatin. Heterochromatin is largely made up of highly repetitive transposable elements and satellite DNAs. It has been hypothesized that rapid changes in heterochromatic DNA drive the changes in these HI genes and thus the evolution of reproductive isolation. In support of this model, we show that two rapidly evolving HI proteins, Lhr and Hmr, which reproductively isolate the fruit fly sister species D. melanogaster and D. simulans, repress transposable elements and satellite DNAs. These proteins also help regulate the length of the atypical Drosophila telomeres, which are themselves made of domesticated transposable elements. Our data suggest that these proteins are part of the adaptive machinery that allows the host to respond to changes and increases in heterochromatin and to maintain the activity of genes located within or adjacent to heterochromatin.
Affiliation(s)
- P. R. V. Satyaki
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Tawny N. Cuykendall
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Kevin H-C. Wei
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Nicholas J. Brideau
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Hojoong Kwak
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - S. Aruna
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Patrick M. Ferree
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Shuqing Ji
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| | - Daniel A. Barbash
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York, United States of America
| |
|
41
|
Abstract
Advances in human genomics have accelerated studies in evolution, disease, and cellular regulation. However, centromere sequences, defining the chromosomal interface with spindle microtubules, remain largely absent from ongoing genomic studies and disconnected from functional, genome-wide analyses. This disparity results from the challenge of predicting the linear order of multi-megabase-sized regions that are composed almost entirely of near-identical satellite DNA. Acknowledging these challenges, the field of human centromere genomics possesses the potential to rapidly advance given the availability of individual, or personalized, genome projects matched with the promise of long-read sequencing technologies. Here I review the current genomic model of human centromeres in consideration of those studies involving functional datasets that examine the role of sequence in centromere identity.
|
42
|
Rosandić M, Glunčić M, Paar V. Start/stop codon like trinucleotides extensions in primate alpha satellites. J Theor Biol 2012; 317:301-9. [PMID: 23026763 DOI: 10.1016/j.jtbi.2012.09.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/03/2012] [Revised: 09/07/2012] [Accepted: 09/19/2012] [Indexed: 11/28/2022]
Abstract
The centromeres remain "the final frontier" in unexplored segments of genome landscape in primate genomes, characterized by 2-5 Mb arrays of rapidly evolving alpha satellite (AS) higher order repeats (HORs). Alpha satellites as specific noncoding sequences may also be significant in light of the regulatory role of noncoding sequences. Using the Global Repeat Map (GRM) algorithm we identify in NCBI assemblies of chromosome 5 the species-specific alpha satellite HORs: 13mer in human, 5mer in chimpanzee, 14mer in orangutan and 3mers in macaque. The suprachromosomal family (SF) classification of alpha satellite HORs and surrounding monomeric alpha satellites is performed and specific segmental structure was found for major alpha satellite arrays in chromosome 5 of primates. In the framework of our novel concept of start/stop Codon Like Trinucleotides (CLTs) as a "new DNA language in noncoding sequences", we find characteristics and differences of these species in CLT extensions, in particular the extensions of stop-TGA CLT. We hypothesize that these are regulators in noncoding sequences, acting at a distance, and that they can amplify or weaken the activity of start/stop codons in coding sequences in protein genesis, increasing the richness of regulatory phenomena.
Affiliation(s)
- Marija Rosandić
- Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia.
| | | | | |
|
43
|
Hayden KE, Willard HF. Composition and organization of active centromere sequences in complex genomes. BMC Genomics 2012; 13:324. [PMID: 22817545 PMCID: PMC3422206 DOI: 10.1186/1471-2164-13-324] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 02/21/2012] [Accepted: 07/20/2012] [Indexed: 01/13/2023] Open
Abstract
Background: Centromeres are sites of chromosomal spindle attachment during mitosis and meiosis. While the sequence basis for centromere identity remains a subject of considerable debate, one approach is to examine the genomic organization at these active sites that are correlated with epigenetic marks of centromere function. Results: We have developed an approach to characterize both satellite and non-satellite centromeric sequences that are missing from current assemblies in complex genomes, using the dog genome as an example. Combining this genomic reference with an epigenetic dataset corresponding to sequences associated with the histone H3 variant centromere protein A (CENP-A), we identify active satellite sequence domains that appear to be both functionally and spatially distinct within the overall definition of satellite families. Conclusions: These findings establish a genomic and epigenetic foundation for exploring the functional role of centromeric sequences in the previously sequenced dog genome and provide a model for similar studies within the context of less-characterized genomes.
Affiliation(s)
- Karen E Hayden
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, NC, USA.
| | | |
|
44
|
Bun C, Ziccardi W, Doering J, Putonti C. MiIP: The Monomer Identification and Isolation Program. Evol Bioinform Online 2012. [PMCID: PMC3382395 DOI: 10.4137/ebo.s9248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Abstract
Repetitive elements within genomic DNA are both functionally and evolutionarily informative. Discovering these sequences ab initio is computationally challenging, compounded by the fact that selection on these repeats is often relaxed; thus sequence identity between repetitive elements can vary significantly. Here we present a new application, the Monomer Identification and Isolation Program (MiIP), which provides functionality to both search for a particular repeat as well as discover repetitive elements within a larger genomic sequence. To compare MiIP’s performance with other repeat detection tools, analysis was conducted for synthetic sequences as well as several α21-II clones and HC21 BAC sequences. The primary benefit of MiIP is the fact that it is a single tool capable of searching for both known monomeric sequences as well as discovering the occurrence of repeats ab initio, per the user’s required sensitivity of the search. Furthermore, the report functionality helps easily facilitate subsequent phylogenetic analysis.
Affiliation(s)
- Christopher Bun
- Department of Computer Science, Loyola University Chicago, 820 N Michigan Avenue, Chicago, IL 60611 USA
- Current Affiliation: Department of Computer Science, University of Chicago, 1100 E 58th Street, Chicago, IL 60637 USA
| | - William Ziccardi
- Department of Biology, Loyola University Chicago, 1032 W Sheridan Road, Chicago, IL 60660 USA
| | - Jeffrey Doering
- Department of Biology, Loyola University Chicago, 1032 W Sheridan Road, Chicago, IL 60660 USA
| | - Catherine Putonti
- Department of Computer Science, Loyola University Chicago, 820 N Michigan Avenue, Chicago, IL 60611 USA
- Department of Biology, Loyola University Chicago, 1032 W Sheridan Road, Chicago, IL 60660 USA
- Bioinformatics Program, Loyola University Chicago, 1032 W Sheridan Road, Chicago, IL 60660 USA
| |
|
45
|
|
46
|
Lee HR, Hayden KE, Willard HF. Organization and molecular evolution of CENP-A–associated satellite DNA families in a basal primate genome. Genome Biol Evol 2011; 3:1136-49. [PMID: 21828373 PMCID: PMC3194837 DOI: 10.1093/gbe/evr083] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022] Open
Abstract
Centromeric regions in many complex eukaryotic species contain highly repetitive satellite DNAs. Despite the diversity of centromeric DNA sequences among species, the functional centromeres in all species studied to date are marked by CENP-A, a centromere-specific histone H3 variant. Although it is well established that families of multimeric higher-order alpha satellite are conserved at the centromeres of human and great ape chromosomes and that diverged monomeric alpha satellite is found in Old and New World monkey genomes, little is known about the organization, function, and evolution of centromeric sequences in more distant primates, including lemurs. Aye-Aye (Daubentonia madagascariensis) is a basal primate and occupies a key position in the evolutionary tree for studying centromeric satellite transitions in primate genomes. Using chromatin immunoprecipitation with antibodies directed to CENP-A, we have identified two satellite families, Daubentonia madagascariensis Aye-Aye 1 (DMA1) and Daubentonia madagascariensis Aye-Aye 2 (DMA2), related to each other but unrelated in sequence to alpha satellite or any other previously described primate or mammalian satellite DNA families. Here, we describe the initial genomic and phylogenetic organization of DMA1 and DMA2 and present evidence of higher-order repeats in Aye-Aye centromeric domains, providing an opportunity to study the emergence of chromosome-specific modes of satellite DNA evolution in primate genomes.
Affiliation(s)
- Hye-Ran Lee
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, USA
|
47
|
Shang WH, Hori T, Toyoda A, Kato J, Popendorf K, Sakakibara Y, Fujiyama A, Fukagawa T. Chickens possess centromeres with both extended tandem repeats and short non-tandem-repetitive sequences. Genome Res 2010; 20:1219-28. [PMID: 20534883 DOI: 10.1101/gr.106245.110] [Citation(s) in RCA: 136] [Impact Index Per Article: 9.7] [Indexed: 01/25/2023]
Abstract
The centromere is essential for faithful chromosome segregation by providing the site for kinetochore assembly. Although the role of the centromere is conserved throughout evolution, the DNA sequences associated with centromere regions are highly divergent among species and it remains to be determined how centromere DNA directs kinetochore formation. Despite the active use of chicken DT40 cells in studies of chromosome segregation, the sequence of the chicken centromere was unclear. Here, we performed a comprehensive analysis of chicken centromere DNA which revealed unique features of chicken centromeres compared with previously studied vertebrates. Centromere DNA sequences from the chicken macrochromosomes, with the exception of chromosome 5, contain chromosome-specific homogeneous tandem repetitive arrays that span several hundred kilobases. In contrast, the centromeres of chromosomes 5, 27, and Z do not contain tandem repetitive sequences and span non-tandem-repetitive sequences of only approximately 30 kb. To test the function of these centromere sequences, we conditionally removed the centromere from the Z chromosome using genetic engineering and have shown that the non-tandem-repeat sequence of chromosome Z is a functional centromere.
Affiliation(s)
- Wei-Hao Shang
- Department of Molecular Genetics, National Institute of Genetics and The Graduate University for Advanced Studies (SOKENDAI), Mishima, Shizuoka 411-8540, Japan
|
48
|
Robertsonian fusions, pericentromeric repeat organization and evolution: a case study within a highly polymorphic rodent species, Gerbillus nigeriae. Chromosome Res 2010; 18:473-86. [DOI: 10.1007/s10577-010-9128-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 01/27/2010] [Accepted: 03/11/2010] [Indexed: 10/19/2022]
|