1
Glunčić M, Vlahović I, Rosandić M, Paar V. Novel Cascade Alpha Satellite HORs in Orangutan Chromosome 13 Assembly: Discovery of the 59mer HOR-The largest Unit in Primates-And the Missing Triplet 45/27/18 HOR in Human T2T-CHM13v2.0 Assembly. Int J Mol Sci 2024; 25:7596. [PMID: 39062839 PMCID: PMC11276891 DOI: 10.3390/ijms25147596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2024] [Revised: 07/05/2024] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open
Abstract
From the recent genome assembly NHGRI_mPonAbe1-v2.0_NCBI (GCF_028885655.2) of orangutan chromosome 13, we computed the precise alpha satellite higher-order repeat (HOR) structure using the novel high-precision GRM2023 algorithm with Global Repeat Map (GRM) and Monomer Distance (MD) diagrams. This study rigorously identified alpha satellite HORs in the centromere of orangutan chromosome 13, discovering a novel 59mer HOR, the longest HOR unit identified in any primate to date. Additionally, it revealed the first intertwined sequence of three HORs, 18mer/27mer/45mer HORs, with a common aligned "backbone" across all HOR copies. The major 7mer HOR exhibits a Willard's-type canonical copy, although some segments of the array display significant irregularities. In contrast, the 14mer HOR forms a regular Willard's-type HOR array. Surprisingly, the GRM2023 high-precision analysis of chromosome 13 of human genome assembly T2T-CHM13v2.0 reveals the presence of only a 7mer HOR, despite both the orangutan and human genome assemblies being derived from whole genome shotgun sequences.
Affiliation(s)
- Matko Glunčić
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Ines Vlahović
  - Department of Interdisciplinary Sciences, Algebra University College, 10000 Zagreb, Croatia
- Marija Rosandić
  - University Hospital Centre Zagreb (Ret.), 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Vladimir Paar
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia

2
Glunčić M, Vlahović I, Rosandić M, Paar V. Precise identification of cascading alpha satellite higher order repeats in T2T-CHM13 assembly of human chromosome 3. Croat Med J 2024; 65:209-219. [PMID: 38868967 PMCID: PMC11157248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Accepted: 05/28/2024] [Indexed: 06/14/2024] Open
Abstract
AIM: To precisely identify and analyze alpha-satellite higher-order repeats (HORs) in T2T-CHM13 assembly of human chromosome 3.
METHODS: From the recently sequenced complete T2T-CHM13 assembly of human chromosome 3, the precise alpha satellite HOR structure was computed by using the novel high-precision GRM2023 algorithm with global repeat map (GRM) and monomer distance (MD) diagrams.
RESULTS: The major alpha satellite HOR array in chromosome 3 revealed a novel cascading HOR, housing 17mer HOR copies with subfragments of periods 15 and 2. Within each row in the cascading HOR, the monomers were of different types, but different rows within the same cascading 17mer HOR contained more than one monomer of the same type. Each canonical 17mer HOR copy comprised 17 monomers belonging to 16 different monomer types. Another pronounced 10mer HOR array was of the regular Willard's type.
CONCLUSION: Our findings emphasize the complexity within the chromosome 3 centromere as well as deviations from expected highly regular patterns.
Affiliation(s)
- Matko Glunčić
  - Department of Physics, Faculty of Science, University of Zagreb, Bijenička cesta 32, 10000 Zagreb, Croatia

3
Glunčić M, Vlahović I, Rosandić M, Paar V. Novel Concept of Alpha Satellite Cascading Higher-Order Repeats (HORs) and Precise Identification of 15mer and 20mer Cascading HORs in Complete T2T-CHM13 Assembly of Human Chromosome 15. Int J Mol Sci 2024; 25:4395. [PMID: 38673983 PMCID: PMC11050224 DOI: 10.3390/ijms25084395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024] Open
Abstract
Unraveling the intricate centromere structure of human chromosomes holds profound implications, illuminating fundamental genetic mechanisms and potentially advancing our comprehension of genetic disorders and therapeutic interventions. This study rigorously identified and structurally analyzed alpha satellite higher-order repeats (HORs) within the centromere of human chromosome 15 in the complete T2T-CHM13 assembly using the high-precision GRM2023 algorithm. The most extensive alpha satellite HOR array in chromosome 15 reveals a novel cascading HOR, housing 429 15mer HOR copies, containing 4-, 7- and 11-monomer subfragments. Within each row of cascading HORs, all alpha satellite monomers are of distinct types, as in regular Willard's HORs. However, different HOR copies within the same cascading 15mer HOR contain more than one monomer of the same type. Each canonical 15mer HOR copy comprises 15 monomers belonging to only 9 different monomer types. Notably, 65% of the 429 15mer cascading HOR copies exhibit canonical structures, while 35% display variant configurations. Identified as the second most extensive alpha satellite HOR, another novel cascading HOR within human chromosome 15 encompasses 164 20mer HOR copies, each featuring two subfragments. Moreover, a distinct pattern emerges as interspersed 25mer/26mer structures differing from regular Willard's HORs and giving rise to a 34-monomer subfragment. Only a minor 18mer HOR array of 12 HOR copies is of the regular Willard's type. These revelations highlight the complexity within the chromosome 15 centromeric region, accentuating deviations from anticipated highly regular patterns and hinting at profound information encoding and functional potential within the human centromere.
Affiliation(s)
- Matko Glunčić
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Ines Vlahović
  - Algebra LAB, Algebra University College, 10000 Zagreb, Croatia
- Marija Rosandić
  - Department of Internal Medicine, University Hospital Centre Zagreb, 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Vladimir Paar
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia

4
Gambogi CW, Birchak GJ, Mer E, Brown DM, Yankson G, Kixmoeller K, Gavade JN, Espinoza JL, Kashyap P, Dupont CL, Logsdon GA, Heun P, Glass JI, Black BE. Efficient formation of single-copy human artificial chromosomes. Science 2024; 383:1344-1349. [PMID: 38513017 PMCID: PMC11059994 DOI: 10.1126/science.adj3566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 01/23/2024] [Indexed: 03/23/2024]
Abstract
Large DNA assembly methodologies underlie milestone achievements in synthetic prokaryotic and budding yeast chromosomes. While budding yeast control chromosome inheritance through ~125-base pair DNA sequence-defined centromeres, mammals and many other eukaryotes use large, epigenetic centromeres. Harnessing centromere epigenetics permits human artificial chromosome (HAC) formation but is not sufficient to avoid rampant multimerization of the initial DNA molecule upon introduction to cells. We describe an approach that efficiently forms single-copy HACs. It employs a ~750-kilobase construct that is sufficiently large to house the distinct chromatin types present at the inner and outer centromere, obviating the need to multimerize. Delivery to mammalian cells is streamlined by employing yeast spheroplast fusion. These developments permit faithful chromosome engineering in the context of metazoan cells.
Affiliation(s)
- Craig W. Gambogi
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- Gabriel J. Birchak
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Graduate Program in Cell and Molecular Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA
- Elie Mer
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- George Yankson
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Kathryn Kixmoeller
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- Janardan N. Gavade
  - Department of Biochemistry and Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- Prakriti Kashyap
  - Department of Biochemistry and Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- Glennis A. Logsdon
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
- Patrick Heun
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Ben E. Black
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute
  - Graduate Program in Cell and Molecular Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA

5
Wu Z, Li T, Jiang Z, Zheng J, Gu Y, Liu Y, Liu Y, Xie Z. Human pangenome analysis of sequences missing from the reference genome reveals their widespread evolutionary, phenotypic, and functional roles. Nucleic Acids Res 2024; 52:2212-2230. [PMID: 38364871 PMCID: PMC10954445 DOI: 10.1093/nar/gkae086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 01/18/2024] [Accepted: 01/27/2024] [Indexed: 02/18/2024] Open
Abstract
Nonreference sequences (NRSs) are DNA sequences present in global populations but absent in the current human reference genome. However, the extent and functional significance of NRSs in the human genomes and populations remains unclear. Here, we de novo assembled 539 genomes from five genetically divergent human populations using long-read sequencing technology, resulting in the identification of 5.1 million NRSs. These were merged into 45284 unique NRSs, with 29.7% being novel discoveries. Among these NRSs, 38.7% were common across the five populations, and 35.6% were population specific. The use of a graph-based pangenome approach allowed for the detection of 565 transcript expression quantitative trait loci on NRSs, with 426 of these being novel findings. Moreover, 26 NRS candidates displayed evidence of adaptive selection within human populations. Genes situated in close proximity to or intersecting with these candidates may be associated with metabolism and type 2 diabetes. Genome-wide association studies revealed 14 NRSs to be significantly associated with eight phenotypes. Additionally, 154 NRSs were found to be in strong linkage disequilibrium with 258 phenotype-associated SNPs in the GWAS catalogue. Our work expands the understanding of human NRSs and provides novel insights into their functions, facilitating evolutionary and biomedical researches.
Affiliation(s)
- Zhikun Wu
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Tong Li
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Zehang Jiang
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Jingjing Zheng
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yizhou Gu
  - Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China
  - University of Wisconsin-Madison, WI, USA
- Yizhi Liu
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yun Liu
  - MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences and Shanghai Xuhui Central Hospital, Fudan University, Shanghai, China
- Zhi Xie
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China

6
Sawada Y, Minei R, Tabata H, Ikemura T, Wada K, Wada Y, Nagata H, Iwasaki Y. Unsupervised AI reveals insect species-specific genome signatures. PeerJ 2024; 12:e17025. [PMID: 38464746 PMCID: PMC10924456 DOI: 10.7717/peerj.17025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 02/07/2024] [Indexed: 03/12/2024] Open
Abstract
Insects are a highly diverse phylogeny and possess a wide variety of traits, including the presence or absence of wings and metamorphosis. These diverse traits are of great interest for studying genome evolution, and numerous comparative genomic studies have examined a wide phylogenetic range of insects. Here, we analyzed 22 insects belonging to a wide phylogenetic range (Endopterygota, Paraneoptera, Polyneoptera, Palaeoptera, and other insects) by using a batch-learning self-organizing map (BLSOM) for oligonucleotide compositions in their genomic fragments (100-kb or 1-Mb sequences), which is an unsupervised machine learning algorithm that can extract species-specific characteristics of the oligonucleotide compositions (genome signatures). The genome signature is of particular interest in terms of the mechanisms and biological significance that have caused the species-specific difference, and can be used as a powerful search needle to explore the various roles of genome sequences other than protein coding, and can be used to unveil mysteries hidden in the genome sequence. Since BLSOM is an unsupervised clustering method, the clustering of sequences was performed based on the oligonucleotide composition alone, without providing information about the species from which each fragment sequence was derived. Therefore, not only the interspecies separation, but also the intraspecies separation can be achieved. Here, we have revealed the specific genomic regions with oligonucleotide compositions distinct from the usual sequences of each insect genome, e.g., Mb-level structures found for a grasshopper Schistocerca americana. One aim of this study was to compare the genome characteristics of insects with those of vertebrates, especially humans, which are phylogenetically distant from insects. Recently, humans seem to be the "model organism" for which a large amount of information has been accumulated using a variety of cutting-edge and high-throughput technologies. Therefore, it is reasonable to use the abundant information from humans to study insect lineages. The specific regions of Mb length with distinct oligonucleotide compositions have also been previously observed in the human genome. These regions were enriched by transcription factor binding motifs (TFBSs) and hypothesized to be involved in the three-dimensional arrangement of chromosomal DNA in interphase nuclei. The present study characterized the species-specific oligonucleotide compositions (i.e., genome signatures) in insect genomes and identified specific genomic regions with distinct oligonucleotide compositions.
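To make the term "oligonucleotide composition" concrete, the following is a minimal sketch of how a genomic fragment can be turned into a k-mer frequency vector, the kind of input a clustering method such as BLSOM operates on. It is not the authors' pipeline: the function names `split_fragments` and `kmer_composition`, the default k = 4, and the handling of ambiguous bases are all assumptions made for illustration, and the BLSOM training step itself is not shown.

```python
from collections import Counter
from itertools import product


def split_fragments(sequence: str, size: int) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping fragments of `size` bases.

    A trailing remainder shorter than `size` is dropped (an assumption of this sketch).
    """
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]


def kmer_composition(fragment: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every A/C/G/T k-mer in one fragment.

    Windows containing other symbols (e.g. N) are ignored, so the
    frequencies are normalized over valid windows only.
    """
    fragment = fragment.upper()
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


if __name__ == "__main__":
    toy_genome = "ACGTACGTACGGTTAACCGG"  # toy input, not real data
    for frag in split_fragments(toy_genome, 10):
        vector = kmer_composition(frag, k=2)
        print(frag, round(vector["CG"], 3))
```

In practice the fragments would be 100 kb or 1 Mb long, as in the abstract, and each fragment's vector (256 values for tetranucleotides) would be one data point for the clustering step.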
Affiliation(s)
- Yui Sawada
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Ryuhei Minei
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Hiromasa Tabata
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Toshimichi Ikemura
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Kennosuke Wada
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Yoshiko Wada
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Hiroshi Nagata
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan
- Yuki Iwasaki
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Nagahama-shi, Tamura-cho, Japan

7
Gambogi CW, Mer E, Brown DM, Yankson G, Gavade JN, Logsdon GA, Heun P, Glass JI, Black BE. Efficient Formation of Single-copy Human Artificial Chromosomes. bioRxiv 2023:2023.06.30.547284. [PMID: 37546784 PMCID: PMC10402137 DOI: 10.1101/2023.06.30.547284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Large DNA assembly methodologies underlie milestone achievements in synthetic prokaryotic and budding yeast chromosomes. While budding yeast control chromosome inheritance through ~125 bp DNA sequence-defined centromeres, mammals and many other eukaryotes use large, epigenetic centromeres. Harnessing centromere epigenetics permits human artificial chromosome (HAC) formation but is not sufficient to avoid rampant multimerization of the initial DNA molecule upon introduction to cells. Here, we describe an approach that efficiently forms single-copy HACs. It employs a ~750 kb construct that is sufficiently large to house the distinct chromatin types present at the inner and outer centromere, obviating the need to multimerize. Delivery to mammalian cells is streamlined by employing yeast spheroplast fusion. These developments permit faithful chromosome engineering in the context of metazoan cells.
Affiliation(s)
- Craig W. Gambogi
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA
- Elie Mer
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA
- George Yankson
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Janardan N. Gavade
  - Department of Biochemistry and Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA
- Glennis A. Logsdon
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA
- Patrick Heun
  - Wellcome Centre for Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Ben E. Black
  - Department of Biochemistry and Biophysics
  - Graduate Program in Biochemistry and Molecular Biophysics
  - Penn Center for Genome Integrity
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104 USA

8
Malik KK, Sridhara SC, Lone KA, Katariya PD, Pulimamidi D, Tyagi S. MLL methyltransferases regulate H3K4 methylation to ensure CENP-A assembly at human centromeres. PLoS Biol 2023; 21:e3002161. [PMID: 37379335 PMCID: PMC10335677 DOI: 10.1371/journal.pbio.3002161] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/25/2022] [Revised: 07/11/2023] [Accepted: 05/12/2023] [Indexed: 06/30/2023] Open
Abstract
The active state of centromeres is epigenetically defined by the presence of CENP-A interspersed with histone H3 nucleosomes. While the importance of dimethylation of H3K4 for centromeric transcription has been highlighted in various studies, the identity of the enzyme(s) depositing these marks on the centromere is still unknown. The MLL (KMT2) family plays a crucial role in RNA polymerase II (Pol II)-mediated gene regulation by methylating H3K4. Here, we report that MLL methyltransferases regulate transcription of human centromeres. CRISPR-mediated down-regulation of MLL causes loss of H3K4me2, resulting in an altered epigenetic chromatin state of the centromeres. Intriguingly, our results reveal that loss of MLL, but not SETD1A, increases co-transcriptional R-loop formation, and Pol II accumulation at the centromeres. Finally, we report that the presence of MLL and SETD1A is crucial for kinetochore maintenance. Altogether, our data reveal a novel molecular framework where both the H3K4 methylation mark and the methyltransferases regulate stability and identity of the centromere.
Affiliation(s)
- Kausika Kumar Malik
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India
  - Graduate Studies, Manipal Academy of Higher Education, Manipal, India
- Sreerama Chaitanya Sridhara
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India
- Kaisar Ahmad Lone
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India
  - Graduate Studies, Regional Centre for Biotechnology, Faridabad, India
- Payal Deepakbhai Katariya
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India
  - Graduate Studies, Manipal Academy of Higher Education, Manipal, India
- Deepshika Pulimamidi
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India
- Shweta Tyagi
  - Laboratory of Cell Cycle Regulation, Centre for DNA Fingerprinting and Diagnostics (CDFD), Uppal, Hyderabad, India

9
Boukaba A, Wu Q, Liu J, Chen C, Liang J, Li J, Strunnikov A. Mapping separase-mediated cleavage in situ. NAR Genom Bioinform 2022; 4:lqac085. [PMID: 36415827 PMCID: PMC9673495 DOI: 10.1093/nargab/lqac085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 10/13/2022] [Accepted: 10/25/2022] [Indexed: 11/21/2022] Open
Abstract
Separase is a protease that performs critical functions in the maintenance of genetic homeostasis. Among them, the cleavage of the meiotic cohesin during meiosis is a key step in producing gametes in eukaryotes. However, the exact chromosomal localization of this proteolytic cleavage was not addressed due to the lack of experimental tools. To this end, we developed a method based on monoclonal antibodies capable of recognizing the predicted neo-epitopes produced by separase-mediated proteolysis in the RAD21 and REC8 cohesin subunits. To validate the epigenomic strategy of mapping cohesin proteolysis, anti-RAD21 neo-epitopes antibodies were used in ChIP-On-ChEPseq analysis of human cells undergoing mitotic anaphase. Second, a similar analysis applied for mapping of REC8 cleavage in germline cells in Macaque showed a correlation with a subset of alpha-satellites and other repeats, directly demonstrating that the site-specific mei-cohesin proteolysis hotspots are coincident but not identical with centromeres. The sequences for the corresponding immunoglobulin genes show a convergence of antibodies with close specificity. This approach could be potentially used to investigate cohesin ring opening events in other chromosomal locations, if applied to single cells.
Affiliation(s)
- Abdelhalim Boukaba
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Qiongfang Wu
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Jian Liu
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Cheng Chen
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Jierong Liang
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Jingjing Li
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China
- Alexander V Strunnikov
  - Molecular Epigenetics Laboratory, Guangzhou Institutes of Biomedicine and Health, Guangzhou, Guangdong, 510530, China

10
Naughton C, Huidobro C, Catacchio CR, Buckle A, Grimes GR, Nozawa RS, Purgato S, Rocchi M, Gilbert N. Human centromere repositioning activates transcription and opens chromatin fibre structure. Nat Commun 2022; 13:5609. [PMID: 36153345 PMCID: PMC9509383 DOI: 10.1038/s41467-022-33426-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/03/2022] [Accepted: 09/14/2022] [Indexed: 11/09/2022] Open
Abstract
Human centromeres appear as constrictions on mitotic chromosomes and form a platform for kinetochore assembly in mitosis. Biophysical experiments led to a suggestion that repetitive DNA at centromeric regions form a compact scaffold necessary for function, but this was revised when neocentromeres were discovered on non-repetitive DNA. To test whether centromeres have a special chromatin structure we have analysed the architecture of a neocentromere. Centromere repositioning is accompanied by RNA polymerase II recruitment and active transcription to form a decompacted, negatively supercoiled domain enriched in ‘open’ chromatin fibres. In contrast, centromerisation causes a spreading of repressive epigenetic marks to surrounding regions, delimited by H3K27me3 polycomb boundaries and divergent genes. This flanking domain is transcriptionally silent and partially remodelled to form ‘compact’ chromatin, similar to satellite-containing DNA sequences, and exhibits genomic instability. We suggest transcription disrupts chromatin to provide a foundation for kinetochore formation whilst compact pericentromeric heterochromatin generates mechanical rigidity.
11
Iwasaki Y, Ikemura T, Wada K, Wada Y, Abe T. Comparative genomic analysis of the human genome and six bat genomes using unsupervised machine learning: Mb-level CpG and TFBS islands. BMC Genomics 2022; 23:497. [PMID: 35804296 PMCID: PMC9264310 DOI: 10.1186/s12864-022-08664-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2022] [Accepted: 05/31/2022] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND: Emerging infectious disease-causing RNA viruses, such as the SARS-CoV-2 and Ebola viruses, are thought to rely on bats as natural reservoir hosts. Since these zoonotic viruses pose a great threat to humans, it is important to characterize the bat genome from multiple perspectives. Unsupervised machine learning methods for extracting novel information from big sequence data without prior knowledge or particular models are highly desirable for obtaining unexpected insights. We previously established a batch-learning self-organizing map (BLSOM) of the oligonucleotide composition that reveals novel genome characteristics from big sequence data.
RESULTS: In this study, using the oligonucleotide BLSOM, we conducted a comparative genomic study of humans and six bat species. BLSOM is an explainable-type machine learning algorithm that reveals the diagnostic oligonucleotides contributing to sequence clustering (self-organization). When unsupervised machine learning reveals unexpected and/or characteristic features, these features can be studied in more detail via the much simpler and more direct standard distribution map method. Based on this combined strategy, we identified the Mb-level enrichment of CG dinucleotide (Mb-level CpG islands) around the termini of bat long-scaffold sequences. In addition, a class of CG-containing oligonucleotides were enriched in the centromeric and pericentromeric regions of human chromosomes. Oligonucleotides longer than tetranucleotides often represent binding motifs for a wide variety of proteins (e.g., transcription factor binding sequences (TFBSs)). By analyzing the penta- and hexanucleotide composition, we observed the evident enrichment of a wide range of hexanucleotide TFBSs in centromeric and pericentromeric heterochromatin regions on all human chromosomes.
CONCLUSION: Function of transcription factors (TFs) beyond their known regulation of gene expression (e.g., TF-mediated looping interactions between two different genomic regions) has received wide attention. The Mb-level TFBS and CpG islands are thought to be involved in the large-scale nuclear organization, such as centromere and telomere clustering. TFBSs, which are enriched in centromeric and pericentromeric heterochromatin regions, are thought to play an important role in the formation of nuclear 3D structures. Our machine learning-based analysis will help us to understand the differential features of nuclear 3D structures in the human and bat genomes.
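As a rough illustration of what "enrichment of CG dinucleotide" in a window can mean numerically, the sketch below computes the widely used observed/expected CpG ratio for successive windows of a sequence. This is a swapped-in, generic measure and not the BLSOM or distribution-map method of the abstract; the function names `cpg_obs_exp` and `window_scan` and the toy window size are assumptions chosen for illustration.

```python
def cpg_obs_exp(window: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Returns 0.0 when the window has no C or no G, where the ratio is undefined.
    """
    window = window.upper()
    c = window.count("C")
    g = window.count("G")
    if c == 0 or g == 0:
        return 0.0
    return window.count("CG") * len(window) / (c * g)


def window_scan(sequence: str, size: int) -> list[tuple[int, float]]:
    """Return (start position, CpG obs/exp) for consecutive non-overlapping windows."""
    return [
        (start, cpg_obs_exp(sequence[start:start + size]))
        for start in range(0, len(sequence) - size + 1, size)
    ]


if __name__ == "__main__":
    toy = "CGCGCGCGAATTAATT"  # toy input, not real data
    for start, ratio in window_scan(toy, 8):
        print(start, ratio)
```

For a Mb-level scan of the kind the abstract describes, `size` would be on the order of 1,000,000 and the input a chromosome or scaffold sequence; windows with a ratio well above the genome-wide value would be candidates for CG enrichment.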
Affiliation(s)
- Yuki Iwasaki
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Tamura-cho 1266, Nagahama-shi, Shiga-ken, 526-0829, Japan
- Toshimichi Ikemura
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Tamura-cho 1266, Nagahama-shi, Shiga-ken, 526-0829, Japan
- Kennosuke Wada
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Tamura-cho 1266, Nagahama-shi, Shiga-ken, 526-0829, Japan
- Yoshiko Wada
  - Department of Bioscience, Nagahama Institute of Bio-Science and Technology, Tamura-cho 1266, Nagahama-shi, Shiga-ken, 526-0829, Japan
- Takashi Abe
  - Smart Information Systems, Faculty of Engineering, Niigata University, Niigata-ken, 950-2181, Japan

12
Altemose N, Maslan A, Smith OK, Sundararajan K, Brown RR, Mishra R, Detweiler AM, Neff N, Miga KH, Straight AF, Streets A. DiMeLo-seq: a long-read, single-molecule method for mapping protein-DNA interactions genome wide. Nat Methods 2022; 19:711-723. [PMID: 35396487 PMCID: PMC9189060 DOI: 10.1038/s41592-022-01475-6] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 08/03/2021] [Accepted: 03/24/2022] [Indexed: 12/13/2022]
Abstract
Studies of genome regulation routinely use high-throughput DNA sequencing approaches to determine where specific proteins interact with DNA, and they rely on DNA amplification and short-read sequencing, limiting their quantitative application in complex genomic regions. To address these limitations, we developed directed methylation with long-read sequencing (DiMeLo-seq), which uses antibody-tethered enzymes to methylate DNA near a target protein's binding sites in situ. These exogenous methylation marks are then detected simultaneously with endogenous CpG methylation on unamplified DNA using long-read, single-molecule sequencing technologies. We optimized and benchmarked DiMeLo-seq by mapping chromatin-binding proteins and histone modifications across the human genome. Furthermore, we identified where centromere protein A localizes within highly repetitive regions that were unmappable with short sequencing reads, and we estimated the density of centromere protein A molecules along single chromatin fibers. DiMeLo-seq is a versatile method that provides multimodal, genome-wide information for investigating protein-DNA interactions.
Affiliation(s)
- Nicolas Altemose
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Department of Molecular & Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Annie Maslan
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Owen K Smith
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
| | | | - Rachel R Brown
- Department of Biochemistry, Stanford University, Stanford, CA, USA
| | - Reet Mishra
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
| | | | - Norma Neff
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Karen H Miga
- Department of Molecular & Cell Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Aaron F Straight
- Department of Biochemistry, Stanford University, Stanford, CA, USA.
| | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA.
- UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, Berkeley, Berkeley, CA, USA.
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
|
13
|
Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science 2022; 376:eabl4178. [PMID: 35357911 PMCID: PMC9233505 DOI: 10.1126/science.abl4178] [Citation(s) in RCA: 193] [Impact Index Per Article: 96.5] [Indexed: 12/20/2022]
Abstract
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Affiliation(s)
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
| | - Pragya Sidhwani
- Department of Biochemistry, Stanford University, Stanford, CA, USA
| | - Sasha A. Langley
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Savannah J. Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Lev Uralsky
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
| | | | - Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
| | | | | | - Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | | | - Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | - Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
| | - Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Daniel Olson
- Department of Computer Science, University of Montana, Missoula, MT, USA
| | | | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Gabrielle A. Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | | | - Fedor Gusev
- Vavilov Institute of General Genetics, Moscow, Russia
| | - Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
| | - Evgeny I. Rogaev
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
| | - Gary H. Karpen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- BioEngineering and BioMedical Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Abby F. Dernburg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
| | - Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
| | | | - Travis J. Wheeler
- Department of Computer Science, University of Montana, Missoula, MT, USA
| | - Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical School, Department of Biochemistry and Molecular Biology and Cancer Center, University of Kansas, Kansas City, KS, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
| | - Rachel J. O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
| | - Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, San Diego, CA, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
| | - Ivan A. Alexandrov
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
| | - Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
|
14
|
Jeffery D, Lochhead M, Almouzni G. CENP-A: A Histone H3 Variant with Key Roles in Centromere Architecture in Healthy and Diseased States. Results Probl Cell Differ 2022; 70:221-261. [PMID: 36348109 DOI: 10.1007/978-3-031-06573-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Centromeres are key architectural components of chromosomes. Here, we examine their construction, maintenance, and functionality. Focusing on the mammalian centromere-specific histone H3 variant, CENP-A, we highlight its coevolution with both centromeric DNA and its chaperone, HJURP. We then consider CENP-A de novo deposition and the importance of centromeric DNA, recently uncovered with the added value of new ultra-long-read sequencing. We next review how the maintenance of CENP-A at the centromere is ensured throughout the cell cycle. Finally, we discuss the impact of disrupting CENP-A regulation on cancer and cell fate.
Affiliation(s)
- Daniel Jeffery
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France
| | - Marina Lochhead
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France
| | - Geneviève Almouzni
- Equipe Labellisée Ligue contre le Cancer, Institut Curie, PSL Research University, CNRS, Sorbonne Université, Nuclear Dynamics Unit, UMR3664, Paris, France.
|
15
|
Ikemura T, Iwasaki Y, Wada K, Wada Y, Abe T. AI for the collective analysis of a massive number of genome sequences: various examples from the small genome of pandemic SARS-CoV-2 to the human genome. Genes Genet Syst 2021; 96:165-176. [PMID: 34565757 DOI: 10.1266/ggs.21-00025] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022] Open
Abstract
In genetics and related fields, huge amounts of data, such as genome sequences, are accumulating, and the use of artificial intelligence (AI) suitable for big data analysis has become increasingly important. Unsupervised AI that can reveal novel knowledge from big data without prior knowledge or particular models is highly desirable for analyses of genome sequences, particularly for obtaining unexpected insights. We have developed a batch-learning self-organizing map (BLSOM) for oligonucleotide compositions that can reveal various novel genome characteristics. Here, we explain the data mining by the BLSOM: an unsupervised AI. As a specific target, we first selected SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) because a large number of viral genome sequences have been accumulated via worldwide efforts. We analyzed more than 0.6 million sequences collected primarily in the first year of the pandemic. BLSOMs for short oligonucleotides (e.g., 4-6-mers) allowed separation into known clades, but longer oligonucleotides further increased the separation ability and revealed subgrouping within known clades. In the case of 15-mers, there is mostly one copy in the genome; thus, 15-mers that appeared after the epidemic started could be connected to mutations, and the BLSOM for 15-mers revealed the mutations that contributed to separation into known clades and their subgroups. After introducing the detailed methodological strategies, we explain BLSOMs for various topics, such as the tetranucleotide BLSOM for over 5 million 5-kb fragment sequences derived from almost all microorganisms currently available and its use in metagenome studies. We also explain BLSOMs for various eukaryotes, including fishes, frogs and Drosophila species, and found a high separation ability among closely related species. When analyzing the human genome, we found enrichments in transcription factor-binding sequences in centromeric and pericentromeric heterochromatin regions. 
The tDNAs (tRNA genes) could be separated according to their corresponding amino acid.
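The oligonucleotide-composition input mentioned in this abstract can be illustrated with a minimal sketch. It only turns fixed-length fragments into tetranucleotide frequency vectors; it does not implement the BLSOM itself, and the function names (`kmer_frequencies`, `fragment_vectors`) and defaults are assumptions made for illustration, not the authors' code.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(fragment: str, k: int = 4) -> list[float]:
    """Return the relative frequency of every A/C/G/T k-mer, in a fixed order."""
    fragment = fragment.upper()
    alphabet = "ACGT"
    counts = Counter(
        fragment[i:i + k]
        for i in range(len(fragment) - k + 1)
        if set(fragment[i:i + k]) <= set(alphabet)  # skip windows containing N etc.
    )
    total = sum(counts.values())
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts[m] / total if total else 0.0 for m in kmers]


def fragment_vectors(sequence: str, fragment_len: int = 5000, k: int = 4):
    """Cut a sequence into non-overlapping fragments and vectorize each one."""
    return [
        kmer_frequencies(sequence[start:start + fragment_len], k)
        for start in range(0, len(sequence) - fragment_len + 1, fragment_len)
    ]
```

Each returned vector has 4^k entries (256 for tetranucleotides) and could serve as one input row for a self-organizing map or any other unsupervised method.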
Affiliation(s)
| | - Yuki Iwasaki
- Faculty of Bioscience, Nagahama Institute of Bio-Science and Technology
| | - Kennosuke Wada
- Faculty of Bioscience, Nagahama Institute of Bio-Science and Technology
| | - Yoshiko Wada
- Faculty of Bioscience, Nagahama Institute of Bio-Science and Technology
| | - Takashi Abe
- Department of Information Engineering, Faculty of Engineering, Niigata University
|
16
|
Joshi A, Musicante MJ, Wheeler BS. Defining the consequences of endogenous genetic variation within a novel family of Schizosaccharomyces pombe heterochromatin nucleating sequences. G3 GENES|GENOMES|GENETICS 2021; 11:6291246. [PMID: 34849813 PMCID: PMC8496282 DOI: 10.1093/g3journal/jkab185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2021] [Accepted: 05/20/2021] [Indexed: 11/13/2022]
Abstract
Centromeres are essential for genetic inheritance: they prevent aneuploidy by providing a physical link between DNA and chromosome segregation machinery. In many organisms, centromeres form at sites of repetitive DNAs that help establish the chromatin architecture required for centromere function. These repeats are often rapidly evolving and subject to homogenization, which causes the expansion of novel repeats and sequence turnover. Thus, centromere sequence varies between individuals and across species. This variation can affect centromere function. We utilized Schizosaccharomyces pombe to assess the relationship between centromere sequence and chromatin structure and determine how sensitive this relationship is to genetic variation. In S. pombe, nucleating sequences within centromere repeats recruit heterochromatin via multiple mechanisms, which include RNA interference (RNAi). Heterochromatin, in turn, contributes to centromere function through its participation in three essential processes: establishment of a kinetochore, cohesion of sister chromatids, and suppression of recombination. Here, we show that a centromere element containing RevCen, a target of the RNAi pathway, establishes heterochromatin and gene silencing when relocated to a chromosome arm. Within this RevCen-containing element (RCE), a highly conserved domain is necessary for full heterochromatin nucleation but cannot establish heterochromatin independently. We characterize the 10 unique RCEs in the S. pombe centromere assembly, which range from 60% to 99.6% identical, and show that all are sufficient to establish heterochromatin. These data affirm the importance of centromere repeats in establishing heterochromatin and suggest there is flexibility within the sequences that mediate this process. Such flexibility may preserve centromere function despite the rapid evolution of centromere repeats.
Affiliation(s)
- Arati Joshi
- Department of Biology, Rhodes College, Memphis, TN 38112, USA
| | | | - Bayly S Wheeler
- Department of Biology, Rhodes College, Memphis, TN 38112, USA
|
17
|
Decombe S, Loll F, Caccianini L, Affannoukoué K, Izeddin I, Mozziconacci J, Escudé C, Lopes J. Epigenetic rewriting at centromeric DNA repeats leads to increased chromatin accessibility and chromosomal instability. Epigenetics Chromatin 2021; 14:35. [PMID: 34321103 PMCID: PMC8317386 DOI: 10.1186/s13072-021-00410-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/02/2021] [Accepted: 07/18/2021] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: Centromeric regions of human chromosomes contain large numbers of tandemly repeated α-satellite sequences. These sequences are covered with constitutive heterochromatin, which is enriched in trimethylation of histone H3 on lysine 9 (H3K9me3). Although well studied using artificial chromosomes and global perturbations, the contribution of this epigenetic mark to chromatin structure and genome stability remains poorly understood in a more natural context. RESULTS: Using transcription activator-like effectors (TALEs) fused to a histone lysine demethylase (KDM4B), we were able to reduce the level of H3K9me3 on the α-satellite repeats of human chromosome 7. We show that the removal of H3K9me3 affects chromatin structure by increasing the accessibility of DNA repeats to the TALE protein. Tethering TALE-demethylase to centromeric repeats impairs the recruitment of HP1α and proteins of the Chromosomal Passenger Complex (CPC) to this specific centromere without affecting CENP-A loading. Finally, the epigenetic rewriting by TALE-KDM4B specifically affects the stability of chromosome 7 upon mitosis, highlighting the importance of H3K9me3 in centromere integrity and chromosome stability, mediated by the recruitment of HP1α and the CPC. CONCLUSION: Our cellular model makes it possible to demonstrate the direct role of the pericentromeric H3K9me3 epigenetic mark in centromere integrity and function in a natural context and opens interesting possibilities for further studies of the role of the H3K9me3 mark.
Affiliation(s)
- Sheldon Decombe
- Laboratoire Structure et Instabilité des Génomes, INSERM U1154, CNRS UM7196, Muséum National d'Histoire Naturelle, 43 rue Cuvier, 75005, Paris, France; DCCBR, University of Toronto, Toronto, ON, M5S 3E1, Canada
| | - François Loll
- Laboratoire Structure et Instabilité des Génomes, INSERM U1154, CNRS UM7196, Muséum National d'Histoire Naturelle, 43 rue Cuvier, 75005, Paris, France; INSERM, UMR 1229, Regenerative Medicine and Skeleton, Université de Nantes, ONIRIS, 44042, Nantes, France
| | - Laura Caccianini
- Laboratoire Physico-Chimie, Institut Curie, CNRS UMR168, Paris-Science Lettres, Sorbonne Université, 75005, Paris, France
| | - Kévin Affannoukoué
- Institut Langevin, ESPCI Paris, PSL Université, CNRS, 75005, Paris, France; Institut Fresnel, Aix Marseille Université CNRS Centrale Marseille, Marseille, France
| | - Ignacio Izeddin
- Institut Langevin, ESPCI Paris, PSL Université, CNRS, 75005, Paris, France
| | - Julien Mozziconacci
- Laboratoire Structure et Instabilité des Génomes, INSERM U1154, CNRS UM7196, Muséum National d'Histoire Naturelle, 43 rue Cuvier, 75005, Paris, France
| | - Christophe Escudé
- Laboratoire Structure et Instabilité des Génomes, INSERM U1154, CNRS UM7196, Muséum National d'Histoire Naturelle, 43 rue Cuvier, 75005, Paris, France
| | - Judith Lopes
- Laboratoire Structure et Instabilité des Génomes, INSERM U1154, CNRS UM7196, Muséum National d'Histoire Naturelle, 43 rue Cuvier, 75005, Paris, France.
|
18
|
Lopes M, Louzada S, Gama-Carvalho M, Chaves R. Genomic Tackling of Human Satellite DNA: Breaking Barriers through Time. Int J Mol Sci 2021; 22:4707. [PMID: 33946766 PMCID: PMC8125562 DOI: 10.3390/ijms22094707] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Revised: 04/24/2021] [Accepted: 04/27/2021] [Indexed: 12/12/2022] Open
Abstract
(Peri)centromeric repetitive sequences and, more specifically, satellite DNA (satDNA) sequences, constitute a major human genomic component. SatDNA sequences can vary in a large number of features, including nucleotide composition, complexity, and abundance. Several satDNA families have been identified and characterized in the human genome over time, albeit at different speeds. Human satDNA families present a high degree of sub-variability, leading to the definition of various subfamilies with different organization and clustered localization. Evolution of satDNA analysis has enabled the progressive characterization of satDNA features. Despite recent advances in the sequencing of centromeric arrays, comprehensive genomic studies to assess their variability are still required to provide accurate and proportional representation of satDNA (peri)centromeric/acrocentric short arm sequences. Approaches combining multiple techniques have been successfully applied and seem to be the path to follow for generating integrated knowledge in the promising field of human satDNA biology.
Affiliation(s)
- Mariana Lopes
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Sandra Louzada
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Margarida Gama-Carvalho
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
| | - Raquel Chaves
- Laboratory of Cytogenomics and Animal Genomics (CAG), Department of Genetics and Biotechnology (DGB), University of Trás-os-Montes and Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Biosystems and Integrative Sciences Institute (BioISI), Faculty of Sciences, University of Lisbon, 1749-016 Lisbon, Portugal
|
19
|
Thakur J, Packiaraj J, Henikoff S. Sequence, Chromatin and Evolution of Satellite DNA. Int J Mol Sci 2021; 22:4309. [PMID: 33919233 PMCID: PMC8122249 DOI: 10.3390/ijms22094309] [Citation(s) in RCA: 92] [Impact Index Per Article: 30.7] [Received: 04/01/2021] [Revised: 04/16/2021] [Accepted: 04/17/2021] [Indexed: 12/15/2022] Open
Abstract
Satellite DNA consists of abundant tandem repeats that play important roles in cellular processes, including chromosome segregation, genome organization and chromosome end protection. Most satellite DNA repeat units are either of nucleosomal length or 5–10 bp long and occupy centromeric, pericentromeric or telomeric regions. Due to high repetitiveness, satellite DNA sequences have largely been absent from genome assemblies. Although few conserved satellite-specific sequence motifs have been identified, DNA curvature, dyad symmetries and inverted repeats are features of various satellite DNAs in several organisms. Satellite DNA sequences are either embedded in highly compact gene-poor heterochromatin or specialized chromatin that is distinct from euchromatin. Nevertheless, some satellite DNAs are transcribed into non-coding RNAs that may play important roles in satellite DNA function. Intriguingly, satellite DNAs are among the most rapidly evolving genomic elements, such that a large fraction is species-specific in most organisms. Here we describe the different classes of satellite DNA sequences, their satellite-specific chromatin features, and how these features may contribute to satellite DNA biology and evolution. We also discuss how the evolution of functional satellite DNA classes may contribute to speciation in plants and animals.
Affiliation(s)
- Jitendra Thakur
- Department of Biology, Emory University, Atlanta, GA 30322, USA
| | - Jenika Packiaraj
- Department of Biology, Emory University, Atlanta, GA 30322, USA
| | - Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Fred Hutchinson Cancer Research Center, Howard Hughes Medical Institute, Seattle, WA 98109, USA
|
20
|
Smith OK, Limouse C, Fryer KA, Teran NA, Sundararajan K, Heald R, Straight AF. Identification and characterization of centromeric sequences in Xenopus laevis. Genome Res 2021; 31:958-967. [PMID: 33875480 PMCID: PMC8168581 DOI: 10.1101/gr.267781.120] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/23/2020] [Accepted: 04/08/2021] [Indexed: 11/24/2022]
Abstract
Centromeres play an essential role in cell division by specifying the site of kinetochore formation on each chromosome for mitotic spindle attachment. Centromeres are defined epigenetically by the histone H3 variant Centromere Protein A (Cenpa). Cenpa nucleosomes maintain the centromere by designating the site for new Cenpa assembly after dilution by replication. Vertebrate centromeres assemble on tandem arrays of repetitive sequences, but the function of repeat DNA in centromere formation has been challenging to dissect due to the difficulty in manipulating centromeres in cells. Xenopus laevis egg extracts assemble centromeres in vitro, providing a system for studying centromeric DNA functions. However, centromeric sequences in Xenopus laevis have not been extensively characterized. In this study, we combine Cenpa ChIP-seq with a k-mer-based analysis approach to identify the Xenopus laevis centromere repeat sequences. By in situ hybridization, we show that Xenopus laevis centromeres contain diverse repeat sequences, and we map the centromere position on each Xenopus laevis chromosome using the distribution of centromere-enriched k-mers. Our identification of Xenopus laevis centromere sequences enables previously unapproachable centromere genomic studies. Our approach should be broadly applicable for the analysis of centromere and other repetitive sequences in any organism.
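The idea of finding centromere-enriched k-mers by comparing ChIP reads with input reads can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the pipeline used in the study: the function names, the value of `k`, the pseudocount, and the `min_ratio` threshold are all hypothetical choices.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurrence across a collection of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def enriched_kmers(chip_reads, input_reads, k=25, min_ratio=2.0, pseudocount=1.0):
    """Return {k-mer: ratio} for k-mers over-represented in ChIP vs. input.

    Counts are normalized by the total number of k-mers in each library;
    the pseudocount avoids division by zero for k-mers absent from input.
    """
    chip = count_kmers(chip_reads, k)
    ctrl = count_kmers(input_reads, k)
    chip_total = sum(chip.values()) or 1
    ctrl_total = sum(ctrl.values()) or 1
    enriched = {}
    for kmer, n in chip.items():
        chip_freq = (n + pseudocount) / chip_total
        ctrl_freq = (ctrl.get(kmer, 0) + pseudocount) / ctrl_total
        ratio = chip_freq / ctrl_freq
        if ratio >= min_ratio:
            enriched[kmer] = ratio
    return enriched
```

In a sketch like this, the positions of the enriched k-mers along an assembly could then be tallied in windows to suggest where a centromere lies; real data would also need reverse-complement handling and a more careful statistical model.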
Affiliation(s)
- Owen K Smith
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305-5307, USA; Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Charles Limouse
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305-5307, USA
| | - Kelsey A Fryer
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305-5307, USA; Department of Genetics, Stanford University School of Medicine, Stanford, California 94305-5120, USA
| | - Nicole A Teran
- Department of Genetics, Stanford University School of Medicine, Stanford, California 94305-5120, USA
| | - Kousik Sundararajan
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305-5307, USA
| | - Rebecca Heald
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California 94720-3200, USA
| | - Aaron F Straight
- Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305-5307, USA
|
21
|
Dvorkina T, Bzikadze AV, Pevzner PA. The string decomposition problem and its applications to centromere analysis and assembly. Bioinformatics 2021; 36:i93-i101. [PMID: 32657390 PMCID: PMC7428072 DOI: 10.1093/bioinformatics/btaa454] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/28/2022] Open
Abstract
Motivation: Recent attempts to assemble extra-long tandem repeats (such as centromeres) faced the challenge of translating long error-prone reads from the nucleotide alphabet into the alphabet of repeat units. Human centromeres represent a particularly complex type of high-order repeats (HORs) formed by chromosome-specific monomers. Given a set of all human monomers, translating a read from a centromere into the monomer alphabet is modeled as the String Decomposition Problem. The accurate translation of reads into the monomer alphabet turns the notoriously difficult problem of assembling centromeres from reads (in the nucleotide alphabet) into a more tractable problem of assembling centromeres from translated reads. Results: We describe a StringDecomposer (SD) algorithm for solving this problem, benchmark it on the set of long error-prone Oxford Nanopore reads generated by the Telomere-to-Telomere consortium and identify a novel (rare) monomer that extends the set of known X-chromosome specific monomers. Our identification of a novel monomer emphasizes the importance of identification of all (even rare) monomers for future centromere assembly efforts and evolutionary studies. To further analyze novel monomers, we applied SD to the set of recently generated long accurate Pacific Biosciences HiFi reads. This analysis revealed that the set of known human monomers and HORs remains incomplete. SD opens a possibility to generate a complete set of human monomers and HORs for use in the ongoing efforts to generate the complete assembly of the human genome. Availability and implementation: StringDecomposer is publicly available on https://github.com/ablab/stringdecomposer. Supplementary information: Supplementary data are available at Bioinformatics online.
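To make the idea of translating a read into the monomer alphabet concrete, the following Python sketch splits a read into consecutive blocks and labels each block with the monomer that minimizes the total edit distance. This is a naive block-wise dynamic program written for illustration only; it is not the StringDecomposer algorithm or its API, and the function names, the `max_stretch` parameter, and the toy monomers are assumptions.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def decompose(read, monomers, max_stretch=2):
    """Label a read with monomer names, minimizing summed edit distance.

    monomers: dict mapping a name to its sequence (hypothetical input format).
    max_stretch: how far a block length may differ from its monomer length.
    Returns (list of names, total cost).
    """
    n = len(read)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: minimal cost to decompose read[:i]
    back = [None] * (n + 1)  # back[i]: (block start, monomer name)
    best[0] = 0
    for end in range(1, n + 1):
        for name, seq in monomers.items():
            shortest = max(1, len(seq) - max_stretch)
            for length in range(shortest, len(seq) + max_stretch + 1):
                start = end - length
                if start < 0 or best[start] == inf:
                    continue
                cost = best[start] + edit_distance(read[start:end], seq)
                if cost < best[end]:
                    best[end] = cost
                    back[end] = (start, name)
    if best[n] == inf:
        raise ValueError("read cannot be tiled with the given monomers")
    names, pos = [], n
    while pos > 0:
        pos, name = back[pos]
        names.append(name)
    return names[::-1], best[n]


if __name__ == "__main__":
    toy_monomers = {"A": "ACGTAC", "B": "TTGGCC"}
    print(decompose("ACGTACTTGGCCACGAAC", toy_monomers))
```

The brute-force search over block boundaries is only practical for short toy inputs; real centromeric reads with ~171 bp monomers call for the far more efficient formulation described in the paper.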
Affiliation(s)
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
|
22
|
Bzikadze AV, Pevzner PA. Automated assembly of centromeres from ultra-long error-prone reads. Nat Biotechnol 2020; 38:1309-1316. [PMID: 32665660 PMCID: PMC10718184 DOI: 10.1038/s41587-020-0582-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 08/17/2019] [Accepted: 05/29/2020] [Indexed: 12/12/2022]
Abstract
Centromeric variation has been linked to cancer and infertility, but centromere sequences contain multiple tandem repeats and can only be assembled manually from long error-prone reads. Here we describe the centroFlye algorithm for centromere assembly using long error-prone reads, and apply it to assemble human centromeres on chromosomes 6 and X. Our analyses reveal putative breakpoints in the manual reconstruction of the human X centromere, demonstrate that human X chromosome is partitioned into repeat subfamilies and provide initial insights into centromere evolution. We anticipate that centroFlye could be applied to automatically close remaining multimegabase gaps in the reference human genome.
Affiliation(s)
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA.
|
23
|
Pálinkás HL, Békési A, Róna G, Pongor L, Papp G, Tihanyi G, Holub E, Póti Á, Gemma C, Ali S, Morten MJ, Rothenberg E, Pagano M, Szűts D, Győrffy B, Vértessy BG. Genome-wide alterations of uracil distribution patterns in human DNA upon chemotherapeutic treatments. eLife 2020; 9:e60498. [PMID: 32956035 PMCID: PMC7505663 DOI: 10.7554/elife.60498] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/29/2020] [Accepted: 08/23/2020] [Indexed: 12/17/2022] Open
Abstract
Numerous anti-cancer drugs perturb thymidylate biosynthesis and lead to genomic uracil incorporation contributing to their antiproliferative effect. Still, it is not yet characterized if uracil incorporations have any positional preference. Here, we aimed to uncover genome-wide alterations in uracil pattern upon drug treatments in human cancer cell line models derived from HCT116. We developed a straightforward U-DNA sequencing method (U-DNA-Seq) that was combined with in situ super-resolution imaging. Using a novel robust analysis pipeline, we found broad regions with elevated probability of uracil occurrence both in treated and non-treated cells. Correlation with chromatin markers and other genomic features shows that non-treated cells possess uracil in the late replicating constitutive heterochromatic regions, while drug treatment induced a shift of incorporated uracil towards segments that are normally more active/functional. Data were corroborated by colocalization studies via dSTORM microscopy. This approach can be applied to study the dynamic spatio-temporal nature of genomic uracil.
Affiliation(s)
- Hajnalka L Pálinkás
- Genome Metabolism Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- Doctoral School of Multidisciplinary Medical Science, University of Szeged, Szeged, Hungary
| | - Angéla Békési
- Genome Metabolism Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | - Gergely Róna
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
- Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, United States
- Perlmutter Cancer Center, New York University School of Medicine, New York, United States
- Howard Hughes Medical Institute, New York University School of Medicine, New York, United States
| | - Lőrinc Pongor
- Cancer Biomarker Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Bioinformatics and 2nd Department of Pediatrics, Semmelweis University, Budapest, Hungary
| | - Gábor Papp
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | - Gergely Tihanyi
- Genome Metabolism Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | - Eszter Holub
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
| | - Ádám Póti
- Genome Stability Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
| | - Carolina Gemma
- Department of Surgery and Cancer, Imperial College London, Hammersmith Hospital Campus, London, United Kingdom
| | - Simak Ali
- Department of Surgery and Cancer, Imperial College London, Hammersmith Hospital Campus, London, United Kingdom
| | - Michael J Morten
- Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, United States
| | - Eli Rothenberg
- Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, United States
| | - Michele Pagano
- Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, United States
- Perlmutter Cancer Center, New York University School of Medicine, New York, United States
- Howard Hughes Medical Institute, New York University School of Medicine, New York, United States
| | - Dávid Szűts
- Genome Stability Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
| | - Balázs Győrffy
- Cancer Biomarker Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Bioinformatics and 2nd Department of Pediatrics, Semmelweis University, Budapest, Hungary
| | - Beáta G Vértessy
- Genome Metabolism Research Group, Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Department of Applied Biotechnology and Food Sciences, Budapest University of Technology and Economics, Budapest, Hungary
|
24
|
Construction and analysis of artificial chromosomes with de novo holocentromeres in Caenorhabditis elegans. Essays Biochem 2020; 64:233-249. [PMID: 32756873 DOI: 10.1042/ebc20190067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/14/2020] [Revised: 07/16/2020] [Accepted: 07/20/2020] [Indexed: 02/07/2023]
Abstract
Artificial chromosomes (ACs), generated in yeast (YACs) and human cells (HACs), have facilitated our understanding of the trans-acting proteins, cis-acting elements, such as the centromere, and epigenetic environments that are necessary to maintain chromosome stability. The centromere is the unique chromosomal region that assembles the kinetochore and connects to microtubules to orchestrate chromosome movement during cell division. While monocentromeres are the most commonly characterized centromere organization found in studied organisms, diffused holocentromeres along the chromosome length are observed in some plants, insects and nematodes. Based on the well-established DNA microinjection method in holocentric Caenorhabditis elegans, concatemerization of foreign DNA can efficiently generate megabase-sized extrachromosomal arrays (Exs), or worm ACs (WACs), for analyzing the mechanisms of WAC formation, de novo centromere formation, and segregation through mitosis and meiosis. This review summarizes the structural, size and stability characteristics of WACs. Incorporating LacO repeats in WACs and expressing LacI::GFP allows real-time tracking of newly formed WACs in vivo, whereas expressing LacI::GFP-chromatin modifier fusions can specifically adjust the chromatin environment of WACs. The WACs mature from passive transmission to autonomous segregation by establishing a holocentromere efficiently in a few cell cycles. Importantly, WAC formation does not require any C. elegans genomic DNA sequence. Thus, DNA substrates injected can be changed to evaluate the effects of DNA sequence and structure in WAC segregation. By injecting a complex mixture of DNA, a less repetitive WAC can be generated and propagated in successive generations for DNA sequencing and analysis of the established holocentromere on the WAC.
|
25
|
Martins NMC, Cisneros-Soberanis F, Pesenti E, Kochanova NY, Shang WH, Hori T, Nagase T, Kimura H, Larionov V, Masumoto H, Fukagawa T, Earnshaw WC. H3K9me3 maintenance on a human artificial chromosome is required for segregation but not centromere epigenetic memory. J Cell Sci 2020; 133:jcs242610. [PMID: 32576667 PMCID: PMC7390644 DOI: 10.1242/jcs.242610] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/10/2019] [Accepted: 06/11/2020] [Indexed: 12/24/2022] Open
Abstract
Most eukaryotic centromeres are located within heterochromatic regions. Paradoxically, heterochromatin can also antagonize de novo centromere formation, and some centromeres lack it altogether. In order to investigate the importance of heterochromatin at centromeres, we used epigenetic engineering of a synthetic alphoid^tetO human artificial chromosome (HAC), to which chimeric proteins can be targeted. By tethering the JMJD2D demethylase (also known as KDM4D), we removed heterochromatin mark H3K9me3 (histone 3 lysine 9 trimethylation) specifically from the HAC centromere. This caused no short-term defects, but long-term tethering reduced HAC centromere protein levels and triggered HAC mis-segregation. However, centromeric CENP-A was maintained at a reduced level. Furthermore, HAC centromere function was compatible with an alternative low-H3K9me3, high-H3K27me3 chromatin signature, as long as residual levels of H3K9me3 remained. When JMJD2D was released from the HAC, H3K9me3 levels recovered over several days back to initial levels along with CENP-A and CENP-C centromere levels, and mitotic segregation fidelity. Our results suggest that a minimal level of heterochromatin is required to stabilize mitotic centromere function but not for maintaining centromere epigenetic memory, and that a homeostatic pathway maintains heterochromatin at centromeres. This article has an associated First Person interview with the first authors of the paper.
Affiliation(s)
- Elisa Pesenti
- Wellcome Trust Centre for Cell Biology, Edinburgh, UK
| | - Wei-Hao Shang
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
| | - Tetsuya Hori
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
| | - Hiroshi Kimura
- Cell Biology Unit, Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Japan
| | - Vladimir Larionov
- National Cancer Institute, National Institutes of Health, Bethesda, USA
| | - Tatsuo Fukagawa
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
|
26
|
Mahlke MA, Nechemia-Arbely Y. Guarding the Genome: CENP-A-Chromatin in Health and Cancer. Genes (Basel) 2020; 11:genes11070810. [PMID: 32708729 PMCID: PMC7397030 DOI: 10.3390/genes11070810] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/19/2020] [Revised: 07/10/2020] [Accepted: 07/15/2020] [Indexed: 02/07/2023] Open
Abstract
Faithful chromosome segregation is essential for the maintenance of genomic integrity and requires functional centromeres. Centromeres are epigenetically defined by the histone H3 variant, centromere protein A (CENP-A). Here we highlight current knowledge regarding CENP-A-containing chromatin structure, specification of centromere identity, regulation of CENP-A deposition and possible contribution to cancer formation and/or progression. CENP-A overexpression is common among many cancers and predicts poor prognosis. Overexpression of CENP-A increases rates of CENP-A deposition ectopically at sites of high histone turnover, occluding CCCTC-binding factor (CTCF) binding. Ectopic CENP-A deposition leads to mitotic defects, centromere dysfunction and chromosomal instability (CIN), a hallmark of cancer. CENP-A overexpression is often accompanied by overexpression of its chaperone Holliday Junction Recognition Protein (HJURP), leading to epigenetic addiction in which increased levels of HJURP and CENP-A become necessary to support rapidly dividing p53 deficient cancer cells. Alterations in CENP-A posttranslational modifications are also linked to chromosome segregation errors and CIN. Collectively, CENP-A is pivotal to genomic stability through centromere maintenance, perturbation of which can lead to tumorigenesis.
Affiliation(s)
- Megan A. Mahlke
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA 15261, USA
| | - Yael Nechemia-Arbely
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Correspondence: Tel.: +1-412-623-3228; Fax: +1-412-623-7828
|
27
|
Mikheenko A, Bzikadze AV, Gurevich A, Miga KH, Pevzner PA. TandemTools: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats. Bioinformatics 2020; 36:i75-i83. [PMID: 32657355 PMCID: PMC7355294 DOI: 10.1093/bioinformatics/btaa440] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: Extra-long tandem repeats (ETRs) are widespread in eukaryotic genomes and play an important role in fundamental cellular processes, such as chromosome segregation. Although emerging long-read technologies have enabled ETR assemblies, the accuracy of such assemblies is difficult to evaluate since there are no tools for their quality assessment. Moreover, since the mapping of error-prone reads to ETRs remains an open problem, it is not clear how to polish draft ETR assemblies. RESULTS: To address these problems, we developed the TandemTools software that includes the TandemMapper tool for mapping reads to ETRs and the TandemQUAST tool for polishing ETR assemblies and their quality assessment. We demonstrate that TandemTools not only reveals errors in ETR assemblies but also improves the recently generated assemblies of human centromeres. AVAILABILITY AND IMPLEMENTATION: https://github.com/ablab/TandemTools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA 92093, USA
| | - Alexey Gurevich
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199034, Russia
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
| | - Pavel A Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA
|
28
|
Miga KH. Centromere studies in the era of 'telomere-to-telomere' genomics. Exp Cell Res 2020; 394:112127. [PMID: 32504677 DOI: 10.1016/j.yexcr.2020.112127] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/08/2020] [Revised: 05/23/2020] [Accepted: 05/30/2020] [Indexed: 12/17/2022]
Abstract
We are entering into an exciting era of genomics where truly complete, high-quality assemblies of human chromosomes are available end-to-end, or from 'telomere-to-telomere' (T2T). This technological advance offers a new opportunity to include endogenous human centromeric regions in high-resolution, sequence-based studies. These emerging reference maps are expected to reveal a new functional landscape in the human genome, where centromere proteins, transcriptional regulation, and spatial organization can be examined with base-level resolution across different stages of development and disease. Such studies will depend on innovative assembly methods of extremely long tandem repeats (ETRs), or satellite DNAs, paired with the development of new, orthogonal validation methods to ensure accuracy and completeness. This review reflects the progress in centromere genomics, credited by recent advancements in long-read sequencing and assembly methods. In doing so, I will discuss the challenges that remain and the promise for a new period of scientific discovery for satellite DNA biology and centromere function.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, 95064, USA.
|
29
|
Sullivan LL, Sullivan BA. Genomic and functional variation of human centromeres. Exp Cell Res 2020; 389:111896. [PMID: 32035947 PMCID: PMC7140587 DOI: 10.1016/j.yexcr.2020.111896] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/19/2019] [Revised: 01/29/2020] [Accepted: 02/05/2020] [Indexed: 10/25/2022]
Abstract
Centromeres are central to chromosome segregation and genome stability, and thus their molecular foundations are important for understanding their function and the ways in which they go awry. Human centromeres typically form at large megabase-sized arrays of alpha satellite DNA for which there is little genomic understanding due to its repetitive nature. Consequently, it has been difficult to achieve genome assemblies at centromeres using traditional next generation sequencing approaches, so that centromeres represent gaps in the current human genome assembly. The role of alpha satellite DNA has been debated since centromeres can form, albeit rarely, on non-alpha satellite DNA. Conversely, the simple presence of alpha satellite DNA is not sufficient for centromere function since chromosomes with multiple alpha satellite arrays only exhibit a single location of centromere assembly. Here, we discuss the organization of human centromeres as well as genomic and functional variation in human centromere location, and current understanding of the genomic and epigenetic mechanisms that underlie centromere flexibility in humans.
Affiliation(s)
- Beth A Sullivan
- Department of Molecular Genetics and Microbiology, USA; Division of Human Genetics, Duke University School of Medicine, Durham, NC, 27710, USA.
|
30
|
Gambogi CW, Dawicki-McKenna JM, Logsdon GA, Black BE. The unique kind of human artificial chromosome: Bypassing the requirement for repetitive centromere DNA. Exp Cell Res 2020; 391:111978. [PMID: 32246994 DOI: 10.1016/j.yexcr.2020.111978] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/05/2019] [Revised: 03/23/2020] [Accepted: 03/25/2020] [Indexed: 12/20/2022]
Abstract
Centromeres are essential components of all eukaryotic chromosomes, including artificial/synthetic ones built in the laboratory. In humans, centromeres are typically located on repetitive α-satellite DNA, and these sequences are the "major ingredient" in first-generation human artificial chromosomes (HACs). Repetitive centromeric sequences present a major challenge for the design of synthetic mammalian chromosomes because they are difficult to synthesize, assemble, and characterize. Additionally, in most eukaryotes, centromeres are defined epigenetically. Here, we review the role of the genetic and epigenetic contributions to establishing centromere identity, highlighting recent work to hijack the epigenetic machinery to initiate centromere identity on a new generation of HACs built without α-satellite DNA. We also discuss the opportunities and challenges in developing useful unique sequence-based HACs.
Affiliation(s)
- Craig W Gambogi
- Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, Penn Center for Genome Integrity, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Jennine M Dawicki-McKenna
- Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, Penn Center for Genome Integrity, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, 98195, USA
| | - Ben E Black
- Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, Penn Center for Genome Integrity, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA.
|
31
|
Talbert PB, Henikoff S. What makes a centromere? Exp Cell Res 2020; 389:111895. [PMID: 32035948 DOI: 10.1016/j.yexcr.2020.111895] [Citation(s) in RCA: 101] [Impact Index Per Article: 25.3] [Received: 11/23/2019] [Revised: 01/18/2020] [Accepted: 02/05/2020] [Indexed: 12/26/2022]
Abstract
Centromeres are the eukaryotic chromosomal sites at which the kinetochore forms and attaches to spindle microtubules to orchestrate chromosomal segregation in mitosis and meiosis. Although centromeres are essential for cell division, their sequences are not conserved and evolve rapidly. Centromeres vary dramatically in size and organization. Here we categorize their diversity and explore the evolutionary forces shaping them. Nearly all centromeres favor AT-rich DNA that is gene-free and transcribed at a very low level. Repair of frequent centromere-proximal breaks probably contributes to their rapid sequence evolution. Point centromeres are only ~125 bp and are specified by common protein-binding motifs, whereas short regional centromeres are 1-5 kb, typically have unique sequences, and may have pericentromeric repeats adapted to facilitate centromere clustering. Transposon-rich centromeres are often ~100-300 kb and are favored by RNAi machinery that silences transposons, by suppression of meiotic crossovers at centromeres, and by the ability of some transposons to target centromeres. Megabase-length satellite centromeres arise in plants and animals with asymmetric female meiosis that creates centromere competition, and favors satellite monomers one or two nucleosomes in length that position and stabilize centromeric nucleosomes. Holocentromeres encompass the length of a chromosome and may differ dramatically between mitosis and meiosis. We propose a model in which low level transcription of centromeres facilitates the formation of non-B DNA that specifies centromeres and promotes loading of centromeric nucleosomes.
Affiliation(s)
- Paul B Talbert
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109, USA
| | - Steven Henikoff
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109, USA.
|
32
|
Ling YH, Lin Z, Yuen KWY. Genetic and epigenetic effects on centromere establishment. Chromosoma 2019; 129:1-24. [PMID: 31781852 DOI: 10.1007/s00412-019-00727-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/27/2019] [Revised: 09/24/2019] [Accepted: 10/10/2019] [Indexed: 01/19/2023]
Abstract
Endogenous chromosomes contain centromeres to direct equal chromosomal segregation in mitosis and meiosis. The location and function of existing centromeres is usually maintained through cell cycles and generations. Recent studies have investigated how the centromere-specific histone H3 variant CENP-A is assembled and replenished after DNA replication to epigenetically propagate the centromere identity. However, existing centromeres occasionally become inactivated, with or without change in underlying DNA sequences, or lost after chromosomal rearrangements, resulting in acentric chromosomes. New centromeres, known as neocentromeres, may form on ectopic, non-centromeric chromosomal regions to rescue acentric chromosomes from being lost, or form dicentric chromosomes if the original centromere is still active. In addition, de novo centromeres can form after chromatinization of purified DNA that is exogenously introduced into cells. Here, we review the phenomena of naturally occurring and experimentally induced new centromeres and summarize the genetic (DNA sequence) and epigenetic features of these new centromeres. We compare the characteristics of new and native centromeres to understand whether there are different requirements for centromere establishment and propagation. Based on our understanding of the mechanisms of new centromere formation, we discuss the perspectives of developing more stably segregating human artificial chromosomes to facilitate gene delivery in therapeutics and research.
Affiliation(s)
- Yick Hin Ling
- School of Biological Sciences, The University of Hong Kong, Kadoorie Biological Sciences Building, Pokfulam Road, Hong Kong
| | - Zhongyang Lin
- School of Biological Sciences, The University of Hong Kong, Kadoorie Biological Sciences Building, Pokfulam Road, Hong Kong
| | - Karen Wing Yee Yuen
- School of Biological Sciences, The University of Hong Kong, Kadoorie Biological Sciences Building, Pokfulam Road, Hong Kong.
|
33
|
Cechova M, Harris RS, Tomaszkiewicz M, Arbeithuber B, Chiaromonte F, Makova KD. High Satellite Repeat Turnover in Great Apes Studied with Short- and Long-Read Technologies. Mol Biol Evol 2019; 36:2415-2431. [PMID: 31273383 PMCID: PMC6805231 DOI: 10.1093/molbev/msz156] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/12/2019] [Revised: 06/12/2019] [Accepted: 06/13/2019] [Indexed: 12/23/2022] Open
Abstract
Satellite repeats are a structural component of centromeres and telomeres, and in some instances, their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: 1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and 2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males versus females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions.
Affiliation(s)
- Monika Cechova
  - Department of Biology, Pennsylvania State University, University Park, PA
- Robert S Harris
  - Department of Biology, Pennsylvania State University, University Park, PA
- Francesca Chiaromonte
  - Department of Statistics, Pennsylvania State University, University Park, PA
  - EMbeDS, Sant’Anna School of Advanced Studies, Pisa, Italy
  - Center for Medical Genomics, Penn State, University Park, PA
- Kateryna D Makova
  - Department of Biology, Pennsylvania State University, University Park, PA
  - Center for Medical Genomics, Penn State, University Park, PA
|
34
|
Discovery of 33mer in chromosome 21 - the largest alpha satellite higher order repeat unit among all human somatic chromosomes. Sci Rep 2019; 9:12629. [PMID: 31477765 PMCID: PMC6718397 DOI: 10.1038/s41598-019-49022-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/25/2019] [Accepted: 08/13/2019] [Indexed: 11/10/2022] Open
Abstract
The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation, aneuploidy, hallmarks of cancers and birth defects. In primate genomes centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of crucial importance, satellites have been understudied because of gaps in sequencing - genomic “black holes”. Bioinformatical studies of genomic sequences open possibilities to revolutionize understanding of repetitive DNA datasets. Here, using robust (Global Repeat Map) algorithm we identified in hg38 sequence of human chromosome 21 complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. Novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes and novel 23mer reverse HOR is distant far from the centromere. Also, we discovered that for hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present; that seems to require an additional more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular base of Down syndrome and of aneuploidies in general. Since the chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, distinctions between alpha satellite HORs in chromosomes 21 and 13 here identified might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding pathogenesis of diseases and development of new therapeutic approaches.
|
35
|
Logsdon GA, Gambogi CW, Liskovykh MA, Barrey EJ, Larionov V, Miga KH, Heun P, Black BE. Human Artificial Chromosomes that Bypass Centromeric DNA. Cell 2019; 178:624-639.e19. [PMID: 31348889 PMCID: PMC6657561 DOI: 10.1016/j.cell.2019.06.006] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 10/19/2018] [Revised: 04/07/2019] [Accepted: 06/03/2019] [Indexed: 11/29/2022]
Abstract
Recent breakthroughs with synthetic budding yeast chromosomes expedite the creation of synthetic mammalian chromosomes and genomes. Mammals, unlike budding yeast, depend on the histone H3 variant, CENP-A, to epigenetically specify the location of the centromere, the locus essential for chromosome segregation. Prior human artificial chromosomes (HACs) required large arrays of centromeric α-satellite repeats harboring binding sites for the DNA sequence-specific binding protein, CENP-B. We report the development of a type of HAC that functions independently of these constraints. Formed by an initial CENP-A nucleosome seeding strategy, a construct lacking repetitive centromeric DNA formed several self-sufficient HACs that showed no uptake of genomic DNA. In contrast to traditional α-satellite HAC formation, the non-repetitive construct can form functional HACs without CENP-B or initial CENP-A nucleosome seeding, revealing distinct paths to centromere formation for different DNA sequence types. Our developments streamline the construction and characterization of HACs to facilitate mammalian synthetic genome efforts.
Affiliation(s)
- Glennis A Logsdon
  - Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Craig W Gambogi
  - Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Mikhail A Liskovykh
  - Developmental Therapeutics Branch, National Cancer Institute, Bethesda, MD 20892, USA
- Evelyne J Barrey
  - Wellcome Trust Centre for Cell Biology, Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Vladimir Larionov
  - Developmental Therapeutics Branch, National Cancer Institute, Bethesda, MD 20892, USA
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Patrick Heun
  - Wellcome Trust Centre for Cell Biology, Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK
- Ben E Black
  - Department of Biochemistry and Biophysics, Graduate Program in Biochemistry and Molecular Biophysics, and Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
36
|
DNA replication acts as an error correction mechanism to maintain centromere identity by restricting CENP-A to centromeres. Nat Cell Biol 2019; 21:743-754. [PMID: 31160708 DOI: 10.1038/s41556-019-0331-4] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 02/03/2018] [Accepted: 04/18/2019] [Indexed: 12/17/2022]
Abstract
Chromatin assembled with the histone H3 variant CENP-A is the heritable epigenetic determinant of human centromere identity. Using genome-wide mapping and reference models for 23 human centromeres, CENP-A binding sites are identified within the megabase-long, repetitive α-satellite DNAs at each centromere. CENP-A is shown in early G1 to be assembled into nucleosomes within each centromere and onto 11,390 transcriptionally active sites on the chromosome arms. DNA replication is demonstrated to remove ectopically loaded, non-centromeric CENP-A. In contrast, tethering of centromeric CENP-A to the sites of DNA replication through the constitutive centromere associated network (CCAN) is shown to enable precise reloading of centromere-bound CENP-A onto the same DNA sequences as in its initial prereplication loading. Thus, DNA replication acts as an error correction mechanism for maintaining centromere identity through its removal of non-centromeric CENP-A coupled with CCAN-mediated retention and precise reloading of centromeric CENP-A.
|
37
|
Miga KH. Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population. Genes (Basel) 2019; 10:E352. [PMID: 31072070 PMCID: PMC6562703 DOI: 10.3390/genes10050352] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 04/02/2019] [Revised: 05/03/2019] [Accepted: 05/03/2019] [Indexed: 12/30/2022] Open
Abstract
The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support that these regions present a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect and incorporate this uncharacterized sequence variation into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.
Affiliation(s)
- Karen H Miga
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, CA 95064, USA
|
38
|
Abstract
Animal and plant centromeres are embedded in repetitive "satellite" DNA, but are thought to be epigenetically specified. To define genetic characteristics of centromeres, we surveyed satellite DNA from diverse eukaryotes and identified variation in <10-bp dyad symmetries predicted to adopt non-B-form conformations. Organisms lacking centromeric dyad symmetries had binding sites for sequence-specific DNA-binding proteins with DNA-bending activity. For example, human and mouse centromeres are depleted for dyad symmetries, but are enriched for non-B-form DNA and are associated with binding sites for the conserved DNA-binding protein CENP-B, which is required for artificial centromere function but is paradoxically nonessential. We also detected dyad symmetries and predicted non-B-form DNA structures at neocentromeres, which form at ectopic loci. We propose that centromeres form at non-B-form DNA because of dyad symmetries or are strengthened by sequence-specific DNA binding proteins. This may resolve the CENP-B paradox and provide a general basis for centromere specification.
Affiliation(s)
- Sivakanthan Kasinathan
  - Medical Scientist Training Program, University of Washington School of Medicine, Seattle, WA
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA
- Steven Henikoff
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA
  - Howard Hughes Medical Institute, Seattle, WA
|
39
|
Black EM, Giunta S. Repetitive Fragile Sites: Centromere Satellite DNA As a Source of Genome Instability in Human Diseases. Genes (Basel) 2018; 9:E615. [PMID: 30544645 PMCID: PMC6315641 DOI: 10.3390/genes9120615] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 11/05/2018] [Revised: 12/03/2018] [Accepted: 12/03/2018] [Indexed: 12/31/2022] Open
Abstract
Maintenance of an intact genome is essential for cellular and organismal homeostasis. The centromere is a specialized chromosomal locus required for faithful genome inheritance at each round of cell division. Human centromeres are composed of large tandem arrays of repetitive alpha-satellite DNA, which are often sites of aberrant rearrangements that may lead to chromosome fusions and genetic abnormalities. While the centromere has an essential role in chromosome segregation during mitosis, the long and repetitive nature of the highly identical repeats has greatly hindered in-depth genetic studies, and complete annotation of all human centromeres is still lacking. Here, we review our current understanding of human centromere genetics and epigenetics as well as recent investigations into the role of centromere DNA in disease, with a special focus on cancer, aging, and human immunodeficiency-centromeric instability-facial anomalies (ICF) syndrome. We also highlight the causes and consequences of genomic instability at these large repetitive arrays and describe the possible sources of centromere fragility. The novel connection between alpha-satellite DNA instability and human pathological conditions emphasizes the importance of obtaining a truly complete human genome assembly and accelerating our understanding of centromere repeats' role in physiology and beyond.
Affiliation(s)
- Elizabeth M Black
  - Laboratory of Chromosome and Cell Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA
- Simona Giunta
  - Laboratory of Chromosome and Cell Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA
|
40
|
McNulty SM, Sullivan BA. Alpha satellite DNA biology: finding function in the recesses of the genome. Chromosome Res 2018; 26:115-138. [PMID: 29974361 DOI: 10.1007/s10577-018-9582-3] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 05/03/2018] [Accepted: 06/14/2018] [Indexed: 02/05/2023]
Abstract
Repetitive DNA, formerly referred to by the misnomer "junk DNA," comprises a majority of the human genome. One class of this DNA, alpha satellite, comprises up to 10% of the genome. Alpha satellite is enriched at all human centromere regions and is competent for de novo centromere assembly. Because of the highly repetitive nature of alpha satellite, it has been difficult to achieve genome assemblies at centromeres using traditional next-generation sequencing approaches, and thus, centromeres represent gaps in the current human genome assembly. Moreover, alpha satellite DNA is transcribed into repetitive noncoding RNA and contributes to a large portion of the transcriptome. Recent efforts to characterize these transcripts and their function have uncovered pivotal roles for satellite RNA in genome stability, including silencing "selfish" DNA elements and recruiting centromere and kinetochore proteins. This review will describe the genomic and epigenetic features of alpha satellite DNA, discuss recent findings of noncoding transcripts produced from distinct alpha satellite arrays, and address current progress in the functional understanding of this oft-neglected repetitive sequence. We will discuss unique challenges of studying human satellite DNAs and RNAs and point toward new technologies that will continue to advance our understanding of this largely untapped portion of the genome.
Affiliation(s)
- Shannon M McNulty
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC, 27710, USA
  - Division of Human Genetics, Duke University Medical Center, Durham, NC, 27710, USA
|
41
|
Zhu Q, Hoong N, Aslanian A, Hara T, Benner C, Heinz S, Miga KH, Ke E, Verma S, Soroczynski J, Yates JR, Hunter T, Verma IM. Heterochromatin-Encoded Satellite RNAs Induce Breast Cancer. Mol Cell 2018; 70:842-853.e7. [PMID: 29861157 DOI: 10.1016/j.molcel.2018.04.023] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 06/21/2017] [Revised: 02/22/2018] [Accepted: 04/26/2018] [Indexed: 12/19/2022]
Abstract
Heterochromatic repetitive satellite RNAs are extensively transcribed in a variety of human cancers, including BRCA1 mutant breast cancer. Aberrant expression of satellite RNAs in cultured cells induces the DNA damage response, activates cell cycle checkpoints, and causes defects in chromosome segregation. However, the mechanism by which satellite RNA expression leads to genomic instability is not well understood. Here we provide evidence that increased levels of satellite RNAs in mammary glands induce tumor formation in mice. Using mass spectrometry, we further show that genomic instability induced by satellite RNAs occurs through interactions with BRCA1-associated protein networks required for the stabilization of DNA replication forks. Additionally, de-stabilized replication forks likely promote the formation of RNA-DNA hybrids in cells expressing satellite RNAs. These studies lay the foundation for developing novel therapeutic strategies that block the effects of non-coding satellite RNAs in cancer cells.
Affiliation(s)
- Quan Zhu
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Nien Hoong
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Aaron Aslanian
  - Molecular and Cell Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
  - Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Toshiro Hara
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Christopher Benner
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Sven Heinz
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Eugene Ke
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Sachin Verma
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Jan Soroczynski
  - Molecular and Cell Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- John R Yates
  - Department of Chemical Physiology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Tony Hunter
  - Molecular and Cell Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
- Inder M Verma
  - Laboratory of Genetics, Salk Institute for Biological Studies, La Jolla, CA 92037, USA
|
42
|
Nergadze SG, Piras FM, Gamba R, Corbo M, Cerutti F, McCarter JGW, Cappelletti E, Gozzo F, Harman RM, Antczak DF, Miller D, Scharfe M, Pavesi G, Raimondi E, Sullivan KF, Giulotto E. Birth, evolution, and transmission of satellite-free mammalian centromeric domains. Genome Res 2018; 28:789-799. [PMID: 29712753 PMCID: PMC5991519 DOI: 10.1101/gr.231159.117] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Received: 10/11/2017] [Accepted: 04/13/2018] [Indexed: 11/25/2022]
Abstract
Mammalian centromeres are associated with highly repetitive DNA (satellite DNA), which has so far hindered molecular analysis of this chromatin domain. Centromeres are epigenetically specified, and binding of the CENPA protein is their main determinant. In previous work, we described the first example of a natural satellite-free centromere on Equus caballus Chromosome 11. Here, we investigated the satellite-free centromeres of Equus asinus by using ChIP-seq with anti-CENPA antibodies. We identified an extraordinarily high number of centromeres lacking satellite DNA (16 of 31). All of them lay in LINE- and AT-rich regions. A subset of these centromeres is associated with DNA amplification. The location of CENPA binding domains can vary in different individuals, giving rise to epialleles. The analysis of epiallele transmission in hybrids (three mules and one hinny) showed that centromeric domains are inherited as Mendelian traits, but their position can slide in one generation. Conversely, centromere location is stable during mitotic propagation of cultured cells. Our results demonstrate that the presence of more than half of centromeres void of satellite DNA is compatible with genome stability and species survival. The presence of amplified DNA at some centromeres suggests that these arrays may represent an intermediate stage toward satellite DNA formation during evolution. The fact that CENPA binding domains can move within relatively restricted regions (a few hundred kilobases) suggests that the centromeric function is physically limited by epigenetic boundaries.
Affiliation(s)
- Solomon G Nergadze
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Francesca M Piras
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Riccardo Gamba
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Marco Corbo
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Federico Cerutti
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Joseph G W McCarter
  - Centre for Chromosome Biology, School of Natural Sciences, National University of Ireland, Galway, H91 TK33, Ireland
- Eleonora Cappelletti
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Francesco Gozzo
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Rebecca M Harman
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, New York 14850, USA
- Douglas F Antczak
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, New York 14850, USA
- Donald Miller
  - Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, New York 14850, USA
- Maren Scharfe
  - Genomanalytik (GMAK), Helmholtz Centre for Infection Research (HZI), 38124 Braunschweig, Germany
- Giulio Pavesi
  - Department of Biosciences, University of Milano, 20122 Milano, Italy
- Elena Raimondi
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
- Kevin F Sullivan
  - Centre for Chromosome Biology, School of Natural Sciences, National University of Ireland, Galway, H91 TK33, Ireland
- Elena Giulotto
  - Department of Biology and Biotechnology "Lazzaro Spallanzani," University of Pavia, 27100 Pavia, Italy
|
43
|
Jain M, Olsen HE, Turner DJ, Stoddart D, Bulazel KV, Paten B, Haussler D, Willard HF, Akeson M, Miga KH. Linear assembly of a human centromere on the Y chromosome. Nat Biotechnol 2018; 36:321-323. [PMID: 29553574 PMCID: PMC5886786 DOI: 10.1038/nbt.4109] [Citation(s) in RCA: 167] [Impact Index Per Article: 27.8] [Received: 08/08/2017] [Accepted: 02/22/2018] [Indexed: 01/21/2023]
Abstract
The human genome reference sequence remains incomplete owing to the challenge of assembling long tracts of near-identical tandem repeats in centromeres. We implemented a nanopore sequencing strategy to generate high-quality reads that span hundreds of kilobases of highly repetitive DNA in a human Y chromosome centromere. Combining these data with short-read variant validation, we assembled and characterized the centromeric region of a human Y chromosome.
Affiliation(s)
- Miten Jain
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
- Hugh E Olsen
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
- Kira V Bulazel
  - Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
- David Haussler
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
- Huntington F Willard
  - Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina USA
  - Geisinger National, Bethesda, Maryland USA
- Mark Akeson
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
- Karen H Miga
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California USA
  - Duke Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina USA
|
44
|
Lampson MA, Black BE. Cellular and Molecular Mechanisms of Centromere Drive. Cold Spring Harb Symp Quant Biol 2018; 82:249-257. [PMID: 29440567 PMCID: PMC6041145 DOI: 10.1101/sqb.2017.82.034298] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 11/26/2022]
Abstract
The asymmetric outcome of female meiosis I, whereby an entire set of chromosomes are discarded into a polar body, presents an opportunity for selfish genetic elements to cheat the process and disproportionately segregate to the egg. Centromeres, the chromosomal loci that connect to spindle microtubules, could potentially act as selfish elements and "drive" in meiosis. We review the current understanding of the genetic and epigenetic contributions to centromere identity and describe recent progress in a powerful model system to study centromere drive in mice. The progress includes mechanistic findings regarding two main requirements for a centromere to exploit the asymmetric outcome of female meiosis. The first is an asymmetry between centromeres of homologous chromosomes, and we found this is accomplished through massive changes in the abundance of the repetitive DNA underlying centromeric chromatin. The second requirement is an asymmetry in the meiotic spindle, which is achieved through signaling from the oocyte cortex that leads to asymmetry in a posttranslational modification of tubulin, tyrosination. Together, these two asymmetries culminate in the biased segregation of expanded centromeres to the egg, and we describe a mechanistic framework to understand this process.
Affiliation(s)
- Michael A Lampson
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104
- Ben E Black
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6059
|
45
|
Ai H, Ai Y, Meng F. GenomeLandscaper: Landscape analysis of genome-fingerprints maps assessing chromosome architecture. Sci Rep 2018; 8:1026. [PMID: 29348569 PMCID: PMC5773709 DOI: 10.1038/s41598-018-19366-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/20/2017] [Accepted: 12/28/2017] [Indexed: 01/07/2023] Open
Abstract
Assessing correctness of an assembled chromosome architecture is a central challenge. We create a geometric analysis method (called GenomeLandscaper) to conduct landscape analysis of genome-fingerprints maps (GFM), trace large-scale repetitive regions, and assess their impacts on the global architectures of assembled chromosomes. We develop an alignment-free method for phylogenetics analysis. The human Y chromosomes (GRCh.chrY, HuRef.chrY and YH.chrY) are analysed as a proof-of-concept study. We construct a galaxy of genome-fingerprints maps (GGFM) for them, and a landscape compatibility among relatives is observed. But a long sharp straight line on the GGFM breaks such a landscape compatibility, distinguishing GRCh38p1.chrY (and throughout GRCh38p7.chrY) from GRCh37p13.chrY, HuRef.chrY and YH.chrY. We delete a 1.30-Mbp target segment to rescue the landscape compatibility, matching the antecedent GRCh37p13.chrY. We re-locate it into the modelled centromeric and pericentromeric region of GRCh38p10.chrY, matching a gap placeholder of GRCh37p13.chrY. We decompose it into sub-constituents (such as BACs, interspersed repeats, and tandem repeats) and trace their homologues by phylogenetics analysis. We elucidate that most examined tandem repeats are of reasonable quality, but the BAC-sized repeats, 173U1020C (176.46 Kbp) and 5U41068C (205.34 Kbp), are likely over-repeated. These results offer unique insights into the centromeric and pericentromeric regions of the human Y chromosomes.
Affiliation(s)
- Hannan Ai
  - State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
  - Department of Electrical and Computer Engineering, College of Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Yuncan Ai
  - State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
- Fanmei Meng
  - State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China
|
46
|
Henikoff S, Thakur J, Kasinathan S, Talbert PB. Remarkable Evolutionary Plasticity of Centromeric Chromatin. Cold Spring Harb Symp Quant Biol 2017; 82:71-82. [PMID: 29196559 DOI: 10.1101/sqb.2017.82.033605] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/31/2022]
Abstract
Centromeres were familiar to cell biologists in the late 19th century, but for most eukaryotes the basis for centromere specification has remained enigmatic. Much attention has been focused on the cenH3 (CENP-A) histone variant, which forms the foundation of the centromere. To investigate the DNA sequence requirements for centromere specification, we applied a variety of epigenomic approaches, which have revealed surprising diversity in centromeric chromatin properties. Whereas each point centromere of budding yeast is occupied by a single precisely positioned tetrameric nucleosome with one cenH3 molecule, the "regional" centromeres of fission yeast contain unphased presumably octameric nucleosomes with two cenH3s. In Caenorhabditis elegans, kinetochores assemble all along the chromosome at sites of cenH3 nucleosomes that resemble budding yeast point centromeres, whereas holocentric insects lack cenH3 entirely. The "satellite" centromeres of most animals and plants consist of cenH3-containing particles that are precisely positioned over homogeneous tandem repeats, but in humans, different α-satellite subfamilies are occupied by CENP-A nucleosomes with very different conformations. We suggest that this extraordinary evolutionary diversity of centromeric chromatin architectures can be understood in terms of the simplicity of the task of equal chromosome segregation that is continually subverted by selfish DNA sequences.
Affiliation(s)
- Steven Henikoff
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
- Jitendra Thakur
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
- Sivakanthan Kasinathan
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
  - Medical Scientist Training Program, University of Washington School of Medicine, Seattle, Washington 98195
- Paul B Talbert
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
  - Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
|
47
|
McNulty SM, Sullivan LL, Sullivan BA. Human Centromeres Produce Chromosome-Specific and Array-Specific Alpha Satellite Transcripts that Are Complexed with CENP-A and CENP-C. Dev Cell 2017; 42:226-240.e6. [PMID: 28787590 DOI: 10.1016/j.devcel.2017.07.001] [Citation(s) in RCA: 132] [Impact Index Per Article: 18.9] [Received: 01/30/2017] [Revised: 05/24/2017] [Accepted: 07/03/2017] [Indexed: 11/28/2022]
Abstract
Human centromeres are defined by alpha satellite DNA arrays that are distinct and chromosome specific. Most human chromosomes contain multiple alpha satellite arrays that are competent for centromere assembly. Here, we show that human centromeres are defined by chromosome-specific RNAs linked to underlying organization of distinct alpha satellite arrays. Active and inactive arrays on the same chromosome produce discrete sets of transcripts in cis. Non-coding RNAs produced from active arrays are complexed with CENP-A and CENP-C, while inactive-array transcripts associate with CENP-B and are generally less stable. Loss of CENP-A does not affect transcript abundance or stability. However, depletion of array-specific RNAs reduces CENP-A and CENP-C at the targeted centromere via faulty CENP-A loading, arresting cells before mitosis. This work shows that each human alpha satellite array produces a unique set of non-coding transcripts, and RNAs present at active centromeres are necessary for kinetochore assembly and cell-cycle progression.
Affiliation(s)
- Shannon M McNulty
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Lori L Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
- Beth A Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA; Division of Human Genetics, Duke University Medical Center, Durham, NC 27710, USA
48
Analysis of global DNA methylation changes in primary human fibroblasts in the early phase following X-ray irradiation. PLoS One 2017; 12:e0177442. [PMID: 28489894 PMCID: PMC5425224 DOI: 10.1371/journal.pone.0177442] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/30/2016] [Accepted: 04/27/2017] [Indexed: 01/09/2023] Open
Abstract
Epigenetic alterations may contribute to the generation of cancer cells in a multi-step process of tumorigenesis following irradiation of normal body cells. Primary human fibroblasts with intact cell cycle checkpoints were used as a model to test whether X-ray irradiation with 2 and 4 Gray induces direct epigenetic effects (within the first cell cycle) in the exposed cells. ELISA-based fluorometric assays were consistent with slightly reduced global DNA methylation and hydroxymethylation, however the observed between-group differences were usually not significant. Similarly, bisulfite pyrosequencing of interspersed LINE-1 repeats and centromeric α-satellite DNA did not detect significant methylation differences between irradiated and non-irradiated cultures. Methylation of interspersed ALU repeats appeared to be slightly increased (one percentage point; p = 0.01) at 6 h after irradiation with 4 Gy. Single-cell analysis showed comparable variations in repeat methylation among individual cells in both irradiated and control cultures. Radiation-induced changes in global repeat methylation, if any, were much smaller than methylation variation between different fibroblast strains. Interestingly, α-satellite DNA methylation positively correlated with gestational age. Finally, 450K methylation arrays mainly targeting genes and CpG islands were used for global DNA methylation analysis. There were no detectable methylation differences in genic (promoter, 5' UTR, first exon, gene body, 3' UTR) and intergenic regions between irradiated and control fibroblast cultures. Although we cannot exclude minor effects, i.e. on individual CpG sites, collectively our data suggest that global DNA methylation remains rather stable in irradiated normal body cells in the early phase of DNA damage response.
49
Nechemia-Arbely Y, Fachinetti D, Miga KH, Sekulic N, Soni GV, Kim DH, Wong AK, Lee AY, Nguyen K, Dekker C, Ren B, Black BE, Cleveland DW. Human centromeric CENP-A chromatin is a homotypic, octameric nucleosome at all cell cycle points. J Cell Biol 2017; 216:607-621. [PMID: 28235947 PMCID: PMC5350519 DOI: 10.1083/jcb.201608083] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 08/23/2016] [Revised: 11/18/2016] [Accepted: 01/17/2017] [Indexed: 12/11/2022] Open
Abstract
In this study, the authors use new reference models for 23 human centromeres and find that at all cell cycle phases centromeric CENP-A chromatin complexes are octameric nucleosomes with two molecules of CENP-A. This finding refutes previous models that have suggested that hemisomes may briefly transition to octameric nucleosomes. Chromatin assembled with centromere protein A (CENP-A) is the epigenetic mark of centromere identity. Using new reference models, we now identify sites of CENP-A and histone H3.1 binding within the megabase, α-satellite repeat–containing centromeres of 23 human chromosomes. The overwhelming majority (97%) of α-satellite DNA is found to be assembled with histone H3.1–containing nucleosomes with wrapped DNA termini. In both G1 and G2 cell cycle phases, the 2–4% of α-satellite assembled with CENP-A protects DNA lengths centered on 133 bp, consistent with octameric nucleosomes with DNA unwrapping at entry and exit. CENP-A chromatin is shown to contain equimolar amounts of CENP-A and histones H2A, H2B, and H4, with no H3. Solid-state nanopore analyses show it to be nucleosomal in size. Thus, in contrast to models for hemisomes that briefly transition to octameric nucleosomes at specific cell cycle points or heterotypic nucleosomes containing both CENP-A and histone H3, human CENP-A chromatin complexes are octameric nucleosomes with two molecules of CENP-A at all cell cycle phases.
Affiliation(s)
- Yael Nechemia-Arbely
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Daniele Fachinetti
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064
- Nikolina Sekulic
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Gautam V Soni
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, 2628 CJ Delft, Netherlands
- Dong Hyun Kim
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Adeline K Wong
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Ah Young Lee
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Kristen Nguyen
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Cees Dekker
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, 2628 CJ Delft, Netherlands
- Bing Ren
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
- Ben E Black
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- Don W Cleveland
  - Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093
50
Integrity of the human centromere DNA repeats is protected by CENP-A, CENP-C, and CENP-T. Proc Natl Acad Sci U S A 2017; 114:1928-1933. [PMID: 28167779 DOI: 10.1073/pnas.1615133114] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Indexed: 11/18/2022] Open
Abstract
Centromeres are highly specialized chromatin domains that enable chromosome segregation and orchestrate faithful cell division. Human centromeres are composed of tandem arrays of α-satellite DNA, which spans up to several megabases. Little is known about the mechanisms that maintain integrity of the long arrays of α-satellite DNA repeats. Here, we monitored centromeric repeat stability in human cells using chromosome-orientation fluorescent in situ hybridization (CO-FISH). This assay detected aberrant centromeric CO-FISH patterns consistent with sister chromatid exchange at the frequency of 5% in primary tissue culture cells, whereas higher levels were seen in several cancer cell lines and during replicative senescence. To understand the mechanism(s) that maintains centromere integrity, we examined the contribution of the centromere-specific histone variant CENP-A and members of the constitutive centromere-associated network (CCAN), CENP-C, CENP-T, and CENP-W. Depletion of CENP-A and CCAN proteins led to an increase in centromere aberrations, whereas enhancing chromosome missegregation by alternative methods did not, suggesting that CENP-A and CCAN proteins help maintain centromere integrity independently of their role in chromosome segregation. Furthermore, superresolution imaging of centromeric CO-FISH using structured illumination microscopy implied that CENP-A protects α-satellite repeats from extensive rearrangements. Our study points toward the presence of a centromere-specific mechanism that actively maintains α-satellite repeat integrity during human cell proliferation.