1
Glunčić M, Vlahović I, Rosandić M, Paar V. Novel Concept of Alpha Satellite Cascading Higher-Order Repeats (HORs) and Precise Identification of 15mer and 20mer Cascading HORs in Complete T2T-CHM13 Assembly of Human Chromosome 15. Int J Mol Sci 2024; 25:4395. [PMID: 38673983 PMCID: PMC11050224 DOI: 10.3390/ijms25084395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/28/2024] Open
Abstract
Unraveling the intricate centromere structure of human chromosomes holds profound implications, illuminating fundamental genetic mechanisms and potentially advancing our comprehension of genetic disorders and therapeutic interventions. This study rigorously identified and structurally analyzed alpha satellite higher-order repeats (HORs) within the centromere of human chromosome 15 in the complete T2T-CHM13 assembly using the high-precision GRM2023 algorithm. The most extensive alpha satellite HOR array in chromosome 15 reveals a novel cascading HOR, housing 429 15mer HOR copies, containing 4-, 7- and 11-monomer subfragments. Within each row of cascading HORs, all alpha satellite monomers are of distinct types, as in regular Willard's HORs. However, different HOR copies within the same cascading 15mer HOR contain more than one monomer of the same type. Each canonical 15mer HOR copy comprises 15 monomers belonging to only 9 different monomer types. Notably, 65% of the 429 15mer cascading HOR copies exhibit canonical structures, while 35% display variant configurations. Identified as the second most extensive alpha satellite HOR, another novel cascading HOR within human chromosome 15 encompasses 164 20mer HOR copies, each featuring two subfragments. Moreover, a distinct pattern emerges as interspersed 25mer/26mer structures differing from regular Willard's HORs and giving rise to a 34-monomer subfragment. Only a minor 18mer HOR array of 12 HOR copies is of the regular Willard's type. These revelations highlight the complexity within the chromosome 15 centromeric region, accentuating deviations from anticipated highly regular patterns and hinting at profound information encoding and functional potential within the human centromere.
Affiliation(s)
- Matko Glunčić
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
- Ines Vlahović
  - Algebra LAB, Algebra University College, 10000 Zagreb, Croatia
- Marija Rosandić
  - Department of Internal Medicine, University Hospital Centre Zagreb, 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
- Vladimir Paar
  - Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
  - Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia
2
Ruzanov P, Evdokimova V, Pachva MC, Minkovich A, Zhang Z, Langman S, Gassmann H, Thiel U, Orlic-Milacic M, Zaidi SH, Peltekova V, Heisler LE, Sharma M, Cox ME, McKee TD, Zaidi M, Lapouble E, McPherson JD, Delattre O, Radvanyi L, Burdach SE, Stein LD, Sorensen PH. Oncogenic ETS fusions promote DNA damage and proinflammatory responses via pericentromeric RNAs in extracellular vesicles. J Clin Invest 2024; 134:e169470. [PMID: 38530366 PMCID: PMC11060741 DOI: 10.1172/jci169470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 03/12/2024] [Indexed: 03/28/2024] Open
Abstract
Aberrant expression of the E26 transformation-specific (ETS) transcription factors characterizes numerous human malignancies. Many of these proteins, including EWS:FLI1 and EWS:ERG fusions in Ewing sarcoma (EwS) and TMPRSS2:ERG in prostate cancer (PCa), drive oncogenic programs via binding to GGAA repeats. We report here that both EWS:FLI1 and ERG bind and transcriptionally activate GGAA-rich pericentromeric heterochromatin. The respective pathogen-like HSAT2 and HSAT3 RNAs, together with LINE, SINE, ERV, and other repeat transcripts, are expressed in EwS and PCa tumors, secreted in extracellular vesicles (EVs), and are highly elevated in plasma of patients with EwS with metastatic disease. High human satellite 2 and 3 (HSAT2,3) levels in EWS:FLI1- or ERG-expressing cells and tumors were associated with induction of G2/M checkpoint, mitotic spindle, and DNA damage programs. These programs were also activated in EwS EV-treated fibroblasts, coincident with accumulation of HSAT2,3 RNAs, proinflammatory responses, mitotic defects, and senescence. Mechanistically, HSAT2,3-enriched cancer EVs induced cGAS-TBK1 innate immune signaling and formation of cytosolic granules positive for double-strand RNAs, RNA-DNA, and cGAS. Hence, aberrantly expressed ETS proteins derepress pericentromeric heterochromatin, yielding pathogenic RNAs that transmit genotoxic stress and inflammation to local and distant sites. Monitoring HSAT2,3 plasma levels and preventing their dissemination may thus improve therapeutic strategies and blood-based diagnostics.
Affiliation(s)
- Peter Ruzanov
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Manideep C. Pachva
  - Department of Molecular Oncology, British Columbia Cancer Research Centre and
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Alon Minkovich
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Zhenbo Zhang
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Sofya Langman
  - Department of Molecular Oncology, British Columbia Cancer Research Centre and
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Hendrik Gassmann
  - Department of Pediatrics, Children’s Cancer Research Center, Kinderklinik München Schwabing, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Uwe Thiel
  - Department of Pediatrics, Children’s Cancer Research Center, Kinderklinik München Schwabing, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Syed H. Zaidi
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Vanya Peltekova
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Manju Sharma
  - Vancouver Prostate Centre, Vancouver, British Columbia, Canada
- Michael E. Cox
  - Vancouver Prostate Centre, Vancouver, British Columbia, Canada
- Trevor D. McKee
  - STTARR Innovation Centre, Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Pathomics Inc., Toronto, Ontario, Canada
- Mark Zaidi
  - Pathomics Inc., Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Eve Lapouble
  - Unité Génétique Somatique (UGS), Institut Curie, Centre Hospitalier Paris, France
- John D. McPherson
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
  - Department of Biochemistry and Molecular Medicine, University of California Davis Comprehensive Cancer Center, Sacramento, California, USA
- Olivier Delattre
  - Unité Génétique Somatique (UGS), Institut Curie, Centre Hospitalier Paris, France
  - Diversity and Plasticity of Childhood tumors, INSERM U830, Institut Curie Research Center, PSL Research University, Paris, France
- Laszlo Radvanyi
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
  - Department of Immunology, University of Toronto, Toronto, Ontario, Canada
- Stefan E.G. Burdach
  - Department of Molecular Oncology, British Columbia Cancer Research Centre and
  - Department of Pediatrics, Children’s Cancer Research Center, Kinderklinik München Schwabing, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
  - CCC München Comprehensive Cancer Center, DKTK German Cancer Consortium, Munich, Germany
  - Institute of Pathology, Translation Pediatric Cancer Research Action, School of Medicine, Technical University of Munich, Munich, Germany
- Lincoln D. Stein
  - Ontario Institute for Cancer Research, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Poul H. Sorensen
  - Department of Molecular Oncology, British Columbia Cancer Research Centre and
  - Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
3
Naish M, Henderson IR. The structure, function, and evolution of plant centromeres. Genome Res 2024; 34:161-178. [PMID: 38485193 PMCID: PMC10984392 DOI: 10.1101/gr.278409.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2024]
Abstract
Centromeres are essential regions of eukaryotic chromosomes responsible for the formation of kinetochore complexes, which connect to spindle microtubules during cell division. Notably, although centromeres maintain a conserved function in chromosome segregation, the underlying DNA sequences are diverse both within and between species and are predominantly repetitive in nature. The repeat content of centromeres includes high-copy tandem repeats (satellites), and/or specific families of transposons. The functional region of the centromere is defined by loading of a specific histone 3 variant (CENH3), which nucleates the kinetochore and shows dynamic regulation. In many plants, the centromeres are composed of satellite repeat arrays that are densely DNA methylated and invaded by centrophilic retrotransposons. In some cases, the retrotransposons become the sites of CENH3 loading. We review the structure of plant centromeres, including monocentric, holocentric, and metapolycentric architectures, which vary in the number and distribution of kinetochore attachment sites along chromosomes. We discuss how variation in CENH3 loading can drive genome elimination during early cell divisions of plant embryogenesis. We review how epigenetic state may influence centromere identity and discuss evolutionary models that seek to explain the paradoxically rapid change of centromere sequences observed across species, including the potential roles of recombination. We outline putative modes of selection that could act within the centromeres, as well as the role of repeats in driving cycles of centromere evolution. Although our primary focus is on plant genomes, we draw comparisons with animal and fungal centromeres to derive a eukaryote-wide perspective of centromere structure and function.
Affiliation(s)
- Matthew Naish
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
- Ian R Henderson
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
4
Annapragada AV, Niknafs N, White JR, Bruhm DC, Cherry C, Medina JE, Adleff V, Hruban C, Mathios D, Foda ZH, Phallen J, Scharpf RB, Velculescu VE. Genome-wide repeat landscapes in cancer and cell-free DNA. Sci Transl Med 2024; 16:eadj9283. [PMID: 38478628 DOI: 10.1126/scitranslmed.adj9283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 02/16/2024] [Indexed: 03/22/2024]
Abstract
Genetic changes in repetitive sequences are a hallmark of cancer and other diseases, but characterizing these has been challenging using standard sequencing approaches. We developed a de novo kmer finding approach, called ARTEMIS (Analysis of RepeaT EleMents in dISease), to identify repeat elements from whole-genome sequencing. Using this method, we analyzed 1.2 billion kmers in 2837 tissue and plasma samples from 1975 patients, including those with lung, breast, colorectal, ovarian, liver, gastric, head and neck, bladder, cervical, thyroid, or prostate cancer. We identified tumor-specific changes in these patients in 1280 repeat element types from the LINE, SINE, LTR, transposable element, and human satellite families. These included changes to known repeats and 820 elements that were not previously known to be altered in human cancer. Repeat elements were enriched in regions of driver genes, and their representation was altered by structural changes and epigenetic states. Machine learning analyses of genome-wide repeat landscapes and fragmentation profiles in cfDNA detected patients with early-stage lung or liver cancer in cross-validated and externally validated cohorts. In addition, these repeat landscapes could be used to noninvasively identify the tissue of origin of tumors. These analyses reveal widespread changes in repeat landscapes of human cancers and provide an approach for their detection and characterization that could benefit early detection and disease monitoring of patients with cancer.
Affiliation(s)
- Akshaya V Annapragada
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Noushin Niknafs
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- James R White
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Daniel C Bruhm
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Christopher Cherry
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Jamie E Medina
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Vilmos Adleff
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Carolyn Hruban
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Dimitrios Mathios
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Zachariah H Foda
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Jillian Phallen
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Robert B Scharpf
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Victor E Velculescu
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
5
Chung TH, Zhuravskaya A, Makeyev EV. Regulation potential of transcribed simple repeated sequences in developing neurons. Hum Genet 2023:10.1007/s00439-023-02626-1. [PMID: 38153590 DOI: 10.1007/s00439-023-02626-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 11/28/2023] [Indexed: 12/29/2023]
Abstract
Simple repeated sequences (SRSs), defined as tandem iterations of microsatellite- to satellite-sized DNA units, occupy a substantial part of the human genome. Some of these elements are known to be transcribed in the context of repeat expansion disorders. Mounting evidence suggests that the transcription of SRSs may also contribute to normal cellular functions. Here, we used genome-wide bioinformatics approaches to systematically examine SRS transcriptional activity in cells undergoing neuronal differentiation. We identified thousands of long noncoding RNAs containing >200-nucleotide-long SRSs (SRS-lncRNAs), with hundreds of these transcripts significantly upregulated in the neural lineage. We show that SRS-lncRNAs often originate from telomere-proximal regions and that they have a strong potential to form multivalent contacts with a wide range of RNA-binding proteins. Our analyses also uncovered a cluster of neurally upregulated SRS-lncRNAs encoded in a centromere-proximal part of chromosome 9, which underwent an evolutionarily recent segmental duplication. Using a newly established in vitro system for rapid neuronal differentiation of induced pluripotent stem cells, we demonstrate that at least some of the bioinformatically predicted SRS-lncRNAs, including those encoded in the segmentally duplicated part of chromosome 9, indeed increase their expression in developing neurons to readily detectable levels. These and other lines of evidence suggest that many SRSs may be expressed in a cell type and developmental stage-specific manner, providing a valuable resource for further studies focused on the functional consequences of SRS-lncRNAs in the normal development of the human brain, as well as in the context of neurodevelopmental disorders.
Affiliation(s)
- Tek Hong Chung
  - Centre for Developmental Neurobiology, New Hunt's House, King's College London, London, SE1 1UL, UK
- Anna Zhuravskaya
  - Centre for Developmental Neurobiology, New Hunt's House, King's College London, London, SE1 1UL, UK
- Eugene V Makeyev
  - Centre for Developmental Neurobiology, New Hunt's House, King's College London, London, SE1 1UL, UK
6
Chaisson MJP, Sulovari A, Valdmanis PN, Miller DE, Eichler EE. Advances in the discovery and analyses of human tandem repeats. Emerg Top Life Sci 2023; 7:361-381. [PMID: 37905568 PMCID: PMC10806765 DOI: 10.1042/etls20230074] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Revised: 10/18/2023] [Accepted: 10/18/2023] [Indexed: 11/02/2023]
Abstract
Long-read sequencing platforms provide unparalleled access to the structure and composition of all classes of tandemly repeated DNA from STRs to satellite arrays. This review summarizes our current understanding of their organization within the human genome, their importance with respect to disease, as well as the advances and challenges in understanding their genetic diversity and functional effects. Novel computational methods are being developed to visualize and associate these complex patterns of human variation with disease, expression, and epigenetic differences. We predict accurate characterization of this repeat-rich form of human variation will become increasingly relevant to both basic and clinical human genetics.
Affiliation(s)
- Mark J P Chaisson
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, U.S.A
  - The Genomic and Epigenomic Regulation Program, USC Norris Cancer Center, University of Southern California, Los Angeles, CA 90089, U.S.A
- Arvis Sulovari
  - Computational Biology, Cajal Neuroscience Inc, Seattle, WA 98102, U.S.A
- Paul N Valdmanis
  - Division of Medical Genetics, Department of Medicine, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
- Danny E Miller
  - Department of Laboratory Medicine and Pathology, University of Washington, Seattle, WA 98195, U.S.A
  - Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA 98195, U.S.A
  - Department of Pediatrics, University of Washington, Seattle, WA 98195, U.S.A
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, U.S.A
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, U.S.A
7
Zhang Y, Chu J, Cheng H, Li H. De novo reconstruction of satellite repeat units from sequence data. Genome Res 2023; 33:gr.278005.123. [PMID: 37918962 PMCID: PMC10760446 DOI: 10.1101/gr.278005.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Accepted: 10/18/2023] [Indexed: 11/04/2023]
Abstract
Satellite DNA are long tandemly repeating sequences in a genome and may be organized as high-order repeats (HORs). They are enriched in centromeres and are challenging to assemble. Existing algorithms for identifying satellite repeats either require the complete assembly of satellites or only work for simple repeat structures without HORs. Here we describe Satellite Repeat Finder (SRF), a new algorithm for reconstructing satellite repeat units and HORs from accurate reads or assemblies without prior knowledge on repeat structures. Applying SRF to real sequence data, we show that SRF could reconstruct known satellites in human and well-studied model organisms. We also find satellite repeats are pervasive in various other species, accounting for up to 12% of their genome contents but are often underrepresented in assemblies. With the rapid progress in genome sequencing, SRF will help the annotation of new genomes and the study of satellite DNA evolution even if such repeats are not fully assembled.
Affiliation(s)
- Yujie Zhang
  - Harvard School of Public Health, Boston, Massachusetts 02115, USA
- Justin Chu
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Haoyu Cheng
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Heng Li
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02115, USA
8
Horton CA, Alexandari AM, Hayes MGB, Marklund E, Schaepe JM, Aditham AK, Shah N, Suzuki PH, Shrikumar A, Afek A, Greenleaf WJ, Gordân R, Zeitlinger J, Kundaje A, Fordyce PM. Short tandem repeats bind transcription factors to tune eukaryotic gene expression. Science 2023; 381:eadd1250. [PMID: 37733848 DOI: 10.1126/science.add1250] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 05/24/2022] [Accepted: 07/26/2023] [Indexed: 09/23/2023]
Abstract
Short tandem repeats (STRs) are enriched in eukaryotic cis-regulatory elements and alter gene expression, yet how they regulate transcription remains unknown. We found that STRs modulate transcription factor (TF)-DNA affinities and apparent on-rates by about 70-fold by directly binding TF DNA-binding domains, with energetic impacts exceeding many consensus motif mutations. STRs maximize the number of weakly preferred microstates near target sites, thereby increasing TF density, with impacts well predicted by statistical mechanics. Confirming that STRs also affect TF binding in cells, neural networks trained only on in vivo occupancies predicted effects identical to those observed in vitro. Approximately 90% of TFs preferentially bound STRs that need not resemble known motifs, providing a cis-regulatory mechanism to target TFs to genomic sites.
Affiliation(s)
- Connor A Horton
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Amr M Alexandari
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Michael G B Hayes
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Emil Marklund
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Julia M Schaepe
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Arjun K Aditham
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
- Nilay Shah
  - Stowers Institute for Medical Research, Kansas City, MO 64110, USA
- Peter H Suzuki
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Avanti Shrikumar
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Ariel Afek
  - Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot 7610001, Israel
- Raluca Gordân
  - Center for Genomic and Computational Biology, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27710, USA
  - Department of Computer Science, Duke University, Durham, NC 27708, USA
  - Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC 27710, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO 64110, USA
  - The University of Kansas Medical Center, Kansas City, KS 66103, USA
- Anshul Kundaje
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Polly M Fordyce
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  - ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
9
Ninomiya K, Yamazaki T, Hirose T. Satellite RNAs: emerging players in subnuclear architecture and gene regulation. EMBO J 2023; 42:e114331. [PMID: 37526230 PMCID: PMC10505914 DOI: 10.15252/embj.2023114331] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 07/13/2023] [Accepted: 07/22/2023] [Indexed: 08/02/2023] Open
Abstract
Satellite DNA is characterized by long, tandemly repeated sequences mainly found in centromeres and pericentromeric chromosomal regions. The recent advent of telomere-to-telomere sequencing data revealed the complete sequences of satellite regions, including centromeric α-satellites and pericentromeric HSat1-3, which together comprise ~5.7% of the human genome. Despite possessing constitutive heterochromatin features, these regions are transcribed to produce long noncoding RNAs with highly repetitive sequences that associate with specific sets of proteins to play various regulatory roles. In certain stress or pathological conditions, satellite RNAs are induced to assemble mesoscopic membraneless organelles. Specifically, under heat stress, nuclear stress bodies (nSBs) are scaffolded by HSat3 lncRNAs, which sequester hundreds of RNA-binding proteins. Upon removal of the stressor, nSBs recruit additional regulatory proteins, including protein kinases and RNA methylases, which modify the previously sequestered nSB components. The sequential recruitment of substrates and enzymes enables nSBs to efficiently regulate the splicing of hundreds of pre-mRNAs under limited temperature conditions. This review discusses the structural features and regulatory roles of satellite RNAs in intracellular architecture and gene regulation.
Affiliation(s)
- Kensuke Ninomiya
  - Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
- Tetsuro Hirose
  - Graduate School of Frontier Biosciences, Osaka University, Suita, Japan
  - Institute for Open and Transdisciplinary Research Initiatives (OTRI), Osaka University, Suita, Japan
10
Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, Hook PW, Koren S, Rautiainen M, Alexandrov IA, Allen J, Asri M, Bzikadze AV, Chen NC, Chin CS, Diekhans M, Flicek P, Formenti G, Fungtammasan A, Garcia Giron C, Garrison E, Gershman A, Gerton JL, Grady PGS, Guarracino A, Haggerty L, Halabian R, Hansen NF, Harris R, Hartley GA, Harvey WT, Haukness M, Heinz J, Hourlier T, Hubley RM, Hunt SE, Hwang S, Jain M, Kesharwani RK, Lewis AP, Li H, Logsdon GA, Lucas JK, Makalowski W, Markovic C, Martin FJ, Mc Cartney AM, McCoy RC, McDaniel J, McNulty BM, Medvedev P, Mikheenko A, Munson KM, Murphy TD, Olsen HE, Olson ND, Paulin LF, Porubsky D, Potapova T, Ryabov F, Salzberg SL, Sauria MEG, Sedlazeck FJ, Shafin K, Shepelev VA, Shumate A, Storer JM, Surapaneni L, Taravella Oill AM, Thibaud-Nissen F, Timp W, Tomaszkiewicz M, Vollger MR, Walenz BP, Watwood AC, Weissensteiner MH, Wenger AM, Wilson MA, Zarate S, Zhu Y, Zook JM, Eichler EE, O'Neill RJ, Schatz MC, Miga KH, Makova KD, Phillippy AM. The complete sequence of a human Y chromosome. Nature 2023; 621:344-354. [PMID: 37612512 PMCID: PMC10752217 DOI: 10.1038/s41586-023-06457-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 56.0] [Received: 12/02/2022] [Accepted: 07/19/2023] [Indexed: 08/25/2023]
Abstract
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Affiliation(s)
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Oxford Nanopore Technologies Inc., Oxford, UK
- Monika Cechova
  - Faculty of Informatics, Masaryk University, Brno, Czech Republic
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Savannah J Hoyt
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Dylan J Taylor
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Nicolas Altemose
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Paul W Hook
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ivan A Alexandrov
  - Federal Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
  - Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
  - Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv-Yafo, Israel
- Jamie Allen
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mobin Asri
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Andrey V Bzikadze
  - Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA, USA
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Chen-Shan Chin
  - GeneDX Holdings Corp, Stamford, CT, USA
  - Foundation of Biological Data Science, Belmont, CA, USA
- Mark Diekhans
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Carlos Garcia Giron
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ariel Gershman
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jennifer L Gerton
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - University of Kansas Medical Center, Kansas City, MO, USA
- Patrick G S Grady
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Genomics Research Centre, Human Technopole, Milan, Italy
- Leanne Haggerty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Reza Halabian
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Nancy F Hansen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Robert Harris
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Gabrielle A Hartley
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Jakob Heinz
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Thibaut Hourlier
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sarah E Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen Hwang
  - XDBio Program, Johns Hopkins University, Baltimore, MD, USA
- Miten Jain
  - Department of Bioengineering, Department of Physics, Northeastern University, Boston, MA, USA
- Rupesh K Kesharwani
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Heng Li
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Julian K Lucas
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Wojciech Makalowski
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Christopher Markovic
  - Genome Technology Access Center at the McDonnell Genome Institute, Washington University, St. Louis, MO, USA
- Fergal J Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ann M Mc Cartney
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Rajiv C McCoy
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Jennifer McDaniel
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Brandy M McNulty
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Medvedev
  - Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
  - Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
- Alla Mikheenko
  - Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
  - UCL Queen Square Institute of Neurology, UCL, London, UK
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Terence D Murphy
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Hugh E Olsen
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Nathan D Olson
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Luis F Paulin
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Steven L Salzberg
  - Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
  - Department of Computer Science, Rice University, Houston, TX, USA
- Alaina Shumate
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Likhitha Surapaneni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Angela M Taravella Oill
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Winston Timp
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Marta Tomaszkiewicz
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
  - Department of Biomedical Engineering, Pennsylvania State University, State College, PA, USA
- Mitchell R Vollger
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Brian P Walenz
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Allison C Watwood
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Melissa A Wilson
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Samantha Zarate
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Yiming Zhu
  - Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Justin M Zook
  - Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Investigator, Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Rachel J O'Neill
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
- Michael C Schatz
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Karen H Miga
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kateryna D Makova
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
11
Hallast P, Ebert P, Loftus M, Yilmaz F, Audano PA, Logsdon GA, Bonder MJ, Zhou W, Höps W, Kim K, Li C, Hoyt SJ, Dishuck PC, Porubsky D, Tsetsos F, Kwon JY, Zhu Q, Munson KM, Hasenfeld P, Harvey WT, Lewis AP, Kordosky J, Hoekzema K, O'Neill RJ, Korbel JO, Tyler-Smith C, Eichler EE, Shi X, Beck CR, Marschall T, Konkel MK, Lee C. Assembly of 43 human Y chromosomes reveals extensive complexity and variation. Nature 2023; 621:355-364. [PMID: 37612510 PMCID: PMC10726138 DOI: 10.1038/s41586-023-06425-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/30/2022] [Accepted: 07/11/2023] [Indexed: 08/25/2023]
Abstract
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
Affiliation(s)
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Peter Ebert
  - Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
  - Core Unit Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
  - Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Mark Loftus
  - Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Feyza Yilmaz
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Peter A Audano
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marc Jan Bonder
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Weichen Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Wolfram Höps
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Kwondo Kim
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Chong Li
  - Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Savannah J Hoyt
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Philip C Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Fotios Tsetsos
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Jee Young Kwon
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Qihui Zhu
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Patrick Hasenfeld
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jennifer Kordosky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Rachel J O'Neill
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - The University of Connecticut Health Center, Farmington, CT, USA
- Jan O Korbel
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Xinghua Shi
  - Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Christine R Beck
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
  - The University of Connecticut Health Center, Farmington, CT, USA
- Tobias Marschall
  - Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
  - Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Miriam K Konkel
  - Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
12
Abstract
The p-arms of the five human acrocentric chromosomes bear nucleolar organizer regions (NORs) comprising ribosomal gene (rDNA) repeats that are organized in a homogeneous tandem array and transcribed in a telomere-to-centromere direction. Precursor ribosomal RNA transcripts are processed and assembled into ribosomal subunits, the nucleolus being the physical manifestation of this process. I review current understanding of nucleolar chromosome biology and describe current exploration into a role for the NOR chromosomal context. Full DNA sequences for acrocentric p-arms are now emerging, aided by the current revolution in long-read sequencing and genome assembly. Acrocentric p-arms vary from 10.1 to 16.7 Mb, accounting for ∼2.2% of the genome. Bordering rDNA arrays, distal junctions, and proximal junctions are shared among the p-arms, with distal junctions showing evidence of functionality. The remaining p-arm sequences comprise multiple satellite DNA classes and segmental duplications that facilitate recombination between heterologous chromosomes, which is likely also involved in Robertsonian translocations.
Affiliation(s)
- Brian McStay
  - Centre for Chromosome Biology, College of Science and Engineering, University of Galway, Galway, Ireland
13
Ponomartsev N, Zilov D, Gushcha E, Travina A, Sergeev A, Enukashvily N. Overexpression of Pericentromeric HSAT2 DNA Increases Expression of EMT Markers in Human Epithelial Cancer Cell Lines. Int J Mol Sci 2023; 24:ijms24086918. [PMID: 37108080 PMCID: PMC10138405 DOI: 10.3390/ijms24086918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Revised: 04/02/2023] [Accepted: 04/04/2023] [Indexed: 04/29/2023] Open
Abstract
Pericentromeric tandemly repeated DNA of human satellites 1, 2, and 3 (HS1, HS2, and HS3) is actively transcribed in some cells. However, the functionality of the transcription remains obscure. Studies in this area have been hampered by the absence of a gapless genome assembly. The aim of our study was to map a transcript that we have previously described as HS2/HS3 on chromosomes using a newly published gapless genome assembly T2T-CHM13, and create a plasmid overexpressing the transcript to assess the influence of HS2/HS3 transcription on cancer cells. We report here that the sequence of the transcript is tandemly repeated on nine chromosomes (1, 2, 7, 9, 10, 16, 17, 22, and Y). A detailed analysis of its genomic localization and annotation in the T2T-CHM13 assembly revealed that the sequence belonged to HSAT2 (HS2) but not to the HS3 family of tandemly repeated DNA. The transcript was found on both strands of HSAT2 arrays. The overexpression of the HSAT2 transcript increased the transcription of the genes encoding the proteins involved in the epithelial-to-mesenchymal transition, EMT (SNAI1, ZEB1, and SNAI2), and the genes that mark cancer-associated fibroblasts (VIM, COL1A1, COL11A1, and ACTA2) in cancer cell lines A549 and HeLa. Co-transfection of the overexpression plasmid and antisense oligonucleotides eliminated the transcription of EMT genes observed after HSAT2 overexpression. Antisense oligonucleotides also decreased transcription of the EMT genes induced by transforming growth factor beta 1 (TGFβ1). Thus, our study suggests HSAT2 lncRNA transcribed from the pericentromeric tandemly repeated DNA is involved in EMT regulation in cancer cells.
Affiliation(s)
- Nikita Ponomartsev
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Danil Zilov
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
  - Applied Genomics Laboratory, SCAMT Institute, ITMO University, Saint Petersburg 191002, Russia
- Ekaterina Gushcha
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Alexandra Travina
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Alexander Sergeev
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
- Natella Enukashvily
  - Institute of Cytology, Russian Academy of Sciences, St. Petersburg 194064, Russia
14
Talbert P, Henikoff S. Centromere drive: chromatin conflict in meiosis. Curr Opin Genet Dev 2022; 77:102005. [PMID: 36372007 DOI: 10.1016/j.gde.2022.102005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 10/08/2022] [Accepted: 10/24/2022] [Indexed: 11/13/2022]
Abstract
Centromeres are essential loci in eukaryotes that are necessary for the faithful segregation of chromosomes in mitosis and meiosis. Centromeres organize the kinetochore, the protein machine that attaches sister chromatids or homologous chromosomes to spindle microtubules and regulates their disjunction. Centromeres have both genetic and epigenetic determinants, which can come into conflict in asymmetric female meiosis in seed plants and animals. The centromere drive model was proposed to describe this conflict and explain how it leads to the rapid evolution of both centromeres and kinetochores. Recent studies confirm key aspects of the centromere drive model, clarify its mechanisms, and implicate rapid centromere/kinetochore evolution in hybrid inviability between species.
Affiliation(s)
- Paul Talbert
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
- Steven Henikoff
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Center, 1100 Fairview Ave N, Seattle, WA 98109, USA.
15
Dumetier B, Sauter C, Hajmirza A, Pernon B, Aucagne R, Fournier C, Row C, Guidez F, Rossi C, Lepage C, Delva L, Callanan MB. Repeat Element Activation-Driven Inflammation: Role of NFκB and Implications in Normal Development and Cancer? Biomedicines 2022; 10:biomedicines10123101. [PMID: 36551854 PMCID: PMC9775655 DOI: 10.3390/biomedicines10123101] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/18/2022] [Revised: 11/14/2022] [Accepted: 11/23/2022] [Indexed: 12/04/2022] Open
Abstract
The human genome is composed of unique DNA sequences that encode proteins and unique sequence noncoding RNAs that are essential for normal development and cellular differentiation. More than 50% of the human genome also consists of repetitive sequences (tandem and interspersed repeats) that are now known to contribute dynamically to genetic diversity in populations, to be transcriptionally active under certain physiological conditions, and to be aberrantly active in disease states including cancer, where consequences are pleiotropic with impact on cancer cell phenotypes and on the tumor immune microenvironment. Repeat element-derived RNAs play unique roles in exogenous and endogenous cell signaling under normal and disease conditions. A key component of repeat element-derived transcript-dependent signaling occurs via triggering of innate immune receptor signaling that then feeds forward to inflammatory responses through interferon and NFκB signaling. It has recently been shown that cancer cells display abnormal transcriptional activity of repeat elements and that this is linked to either aggressive disease and treatment failure or to improved prognosis/treatment response, depending on cell context and the amplitude of the so-called 'viral mimicry' response that is engaged. 'Viral mimicry' refers to a cellular state of active antiviral response triggered by endogenous nucleic acids often derived from aberrantly transcribed endogenous retrotransposons and other repeat elements. In this paper, the literature regarding transcriptional activation of repeat elements and engagement of inflammatory signaling in normal development (focusing on hematopoiesis) and cancer is reviewed with an emphasis on the role of innate immune receptor signaling, in particular by dsRNA receptors of the RIG-I-like receptor family and interferons/NFκB. How repeat element-derived RNA reprograms cell identity through RNA-guided chromatin state modulation is also discussed.
Affiliation(s)
- Baptiste Dumetier
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
  - Correspondence: (B.D.); (M.B.C.)
- Camille Sauter
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
- Azadeh Hajmirza
  - Institute for Research in Immunology and Cancer, Montreal, QC H3C 3J7, Canada
- Baptiste Pernon
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
- Romain Aucagne
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
  - Unit for Innovation in Genetics and Epigenetics in Oncology, Dijon University Hospital, 21000 Dijon, France
  - CRIGEN, Crispr-Functional Genomics, Dijon University Hospital and University of Burgundy, 21000 Dijon, France
- Cyril Fournier
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
  - Unit for Innovation in Genetics and Epigenetics in Oncology, Dijon University Hospital, 21000 Dijon, France
- Céline Row
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
  - Unit for Innovation in Genetics and Epigenetics in Oncology, Dijon University Hospital, 21000 Dijon, France
- Fabien Guidez
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
- Cédric Rossi
  - School of Medicine, Stanford University, Stanford, CA 94305, USA
- Côme Lepage
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
- Laurent Delva
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
- Mary B. Callanan
  - Faculty of Medicine, INSERM1231, University of Burgundy, 21000 Dijon, France
  - Unit for Innovation in Genetics and Epigenetics in Oncology, Dijon University Hospital, 21000 Dijon, France
  - CRIGEN, Crispr-Functional Genomics, Dijon University Hospital and University of Burgundy, 21000 Dijon, France
  - Correspondence: (B.D.); (M.B.C.)
16
Mirceta M, Shum N, Schmidt MHM, Pearson CE. Fragile sites, chromosomal lesions, tandem repeats, and disease. Front Genet 2022; 13:985975. [PMID: 36468036 PMCID: PMC9714581 DOI: 10.3389/fgene.2022.985975] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/04/2022] [Accepted: 09/02/2022] [Indexed: 09/16/2023] Open
Abstract
Expanded tandem repeat DNAs are associated with various unusual chromosomal lesions, despiralizations, multi-branched inter-chromosomal associations, and fragile sites. Fragile sites cytogenetically manifest as localized gaps or discontinuities in chromosome structure and are an important genetic, biological, and health-related phenomenon. Common fragile sites (∼230), present in most individuals, are induced by aphidicolin and can be associated with cancer; of the 27 molecularly-mapped common sites, none are associated with a particular DNA sequence motif. Rare fragile sites (≳40 known), present in ≤5% of the population (possibly in as few as a single individual), can be associated with neurodevelopmental disease. All 10 molecularly-mapped folate-sensitive fragile sites, the largest category of rare fragile sites, are caused by gene-specific CGG/CCG tandem repeat expansions that are aberrantly CpG methylated and include FRAXA, FRAXE, FRAXF, FRA2A, FRA7A, FRA10A, FRA11A, FRA11B, FRA12A, and FRA16A. The minisatellite-associated rare fragile sites, FRA10B and FRA16B, can be induced by AT-rich DNA-ligands or nucleotide analogs. Despiralized lesions and multi-branched inter-chromosomal associations at the heterochromatic satellite repeats of chromosomes 1, 9, and 16 are inducible by de-methylating agents like 5-azadeoxycytidine and can spontaneously arise in patients with ICF syndrome (Immunodeficiency, Centromeric instability and Facial anomalies) with mutations in genes regulating DNA methylation. ICF individuals have hypomethylated satellites I-III, alpha-satellites, and subtelomeric repeats. Ribosomal repeats and subtelomeric D4Z4 megasatellites/macrosatellites are associated with chromosome location, fragility, and disease. Telomere repeats can also form fragile sites. Dietary deficiencies of folate or vitamin B12, or drug insults, are associated with megaloblastic and/or pernicious anemia, in which chromosomes display fragile sites.
The recent discovery of many new tandem repeat expansion loci, with varied repeat motifs, where motif lengths can range from mono-nucleotides to megabase units, could be the molecular cause of new fragile sites, or other chromosomal lesions. This review focuses on repeat-associated fragility, covering their induction, cytogenetics, epigenetics, cell type specificity, genetic instability (repeat instability, micronuclei, deletions/rearrangements, and sister chromatid exchange), unusual heritability, disease association, and penetrance. Understanding tandem repeat-associated chromosomal fragile sites provides insight into chromosome structure, genome packaging, genetic instability, and disease.
Affiliation(s)
- Mila Mirceta
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, The Peter Gilgan Centre for Research and Learning, Toronto, ON, Canada
  - Program of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Natalie Shum
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, The Peter Gilgan Centre for Research and Learning, Toronto, ON, Canada
  - Program of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Monika H. M. Schmidt
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, The Peter Gilgan Centre for Research and Learning, Toronto, ON, Canada
  - Program of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Christopher E. Pearson
  - Program of Genetics and Genome Biology, The Hospital for Sick Children, The Peter Gilgan Centre for Research and Learning, Toronto, ON, Canada
  - Program of Molecular Genetics, University of Toronto, Toronto, ON, Canada