1
Oketch DJA, Giulietti M, Piva F. Copy Number Variations in Pancreatic Cancer: From Biological Significance to Clinical Utility. Int J Mol Sci 2023; 25:391. [PMID: 38203561 PMCID: PMC10779192 DOI: 10.3390/ijms25010391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 12/20/2023] [Accepted: 12/24/2023] [Indexed: 01/12/2024] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is the most common type of pancreatic cancer, characterized by high tumor heterogeneity and a poor prognosis. Inter- and intra-tumoral heterogeneity is a major obstacle to effective PDAC treatment; therefore, exploring this heterogeneity and its underlying mechanisms is highly desirable for improving PDAC prognosis. Gene copy number variations (CNVs) are increasingly recognized as a common and heritable source of inter-individual variation in genomic sequence. In this review, we outline the origin, main characteristics, and pathological aspects of CNVs. We then describe the occurrence of CNVs in PDAC, including those that have been clearly shown to have a pathogenic role, and further highlight some key examples of their involvement in tumor development and progression. The ability to efficiently identify and analyze CNVs in tumor samples is important to support translational research and foster precision oncology, as copy number variants can be utilized to guide clinical decisions. We provide insights into the CNV landscapes and the role of both somatic and germline CNVs in PDAC, which could lead to significant advances in diagnosis, prognosis, and treatment. Although there has been significant progress in this field, understanding the full contribution of CNVs to the genetic basis of PDAC will require further research, with more accurate CNV assays such as single-cell techniques and larger cohorts than have been studied to date.
Affiliation(s)
- Matteo Giulietti
- Department of Specialistic Clinical and Odontostomatological Sciences, Polytechnic University of Marche, 60131 Ancona, Italy
- Francesco Piva
- Department of Specialistic Clinical and Odontostomatological Sciences, Polytechnic University of Marche, 60131 Ancona, Italy
2
Liu N, Li H, Li M, Gao Y, Yan H. Prenatally diagnosed 16p11.2 copy number variations by SNP Array: A retrospective case series. Clin Chim Acta 2023; 538:15-21. [PMID: 36374846 DOI: 10.1016/j.cca.2022.10.016] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/23/2022] [Revised: 10/14/2022] [Accepted: 10/17/2022] [Indexed: 01/12/2023]
Abstract
OBJECTIVE: The 16p11.2 copy number variations (CNVs) are increasingly recognized as one of the most frequent genomic disorders, with a broad spectrum of phenotypes. The fetal phenotype associated with 16p11.2 CNVs is poorly described. The current study presents a prenatal series of 16p11.2 CNVs and provides a better understanding of this submicroscopic imbalance in prenatal diagnosis. METHOD: A retrospective case series was extracted from a single tertiary referral center performing prenatal single nucleotide polymorphism (SNP) array from April 2017 to December 2021. The maternal demographics, indication for amniocentesis, ultrasound findings, SNP array results, inheritance of the CNVs, and pregnancy outcomes were studied. RESULTS: We identified 30 fetuses carrying 16p11.2 CNVs, representing 0.35% (30/8578) of prenatal SNP array results. The series included 17 fetuses with a proximal deletion, 7 with a distal deletion, 4 with a proximal duplication, and 2 with a distal duplication. Prenatal ultrasound anomalies were reported in 80% of these cases. The most common presentation was vertebral anomalies (9/30). Other features noted in more than one fetus were increased nuchal translucency/nuchal fold (NT/NF) (5/30), absent/hypoplastic nasal bone (3/30), polyhydramnios (3/30), ventricular septal defect (VSD) (2/30), unilateral mild ventriculomegaly (2/30), fetal growth restriction (FGR) (2/30), and right aortic arch (2/30). All 9 vertebral anomalies were present in fetuses harboring a proximal deletion (9/17). Familial transmission was confirmed in 44% of cases (11/25), and termination of pregnancy was requested in 62.1% (18/29) of cases. CONCLUSION: The 16p11.2 CNVs can have variable prenatal phenotypes, and these CNVs are frequently inherited from parents with a milder or normal phenotype. Our results underline that vertebral deformities were frequent in cases of 16p11.2 proximal deletion, and further demonstrate the incomplete penetrance of the CNVs.
Affiliation(s)
- Nian Liu
- Department of Health Toxicology, MOE Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; Prenatal Diagnostic Center, Genetic Lab, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hui Li
- Prenatal Diagnostic Center, Genetic Lab, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Manman Li
- Prenatal Diagnostic Center, Genetic Lab, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Yanduo Gao
- Department of Ultrasound, Maternal and Child Health Hospital of Hubei Province, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hong Yan
- Department of Health Toxicology, MOE Key Lab of Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.
3
Logsdon GA, Eichler EE. The Dynamic Structure and Rapid Evolution of Human Centromeric Satellite DNA. Genes (Basel) 2022; 14:92. [PMID: 36672831 PMCID: PMC9859433 DOI: 10.3390/genes14010092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/15/2022] [Revised: 12/22/2022] [Accepted: 12/24/2022] [Indexed: 12/31/2022] Open
Abstract
The complete sequence of a human genome provided our first comprehensive view of the organization of satellite DNA associated with heterochromatin. We review how our understanding of the genetic architecture and epigenetic properties of human centromeric DNA has advanced as a result. Preliminary studies of human and nonhuman ape centromeres reveal complex, saltatory mutational changes organized around distinct evolutionary layers. Pockets of regional hypomethylation within higher-order α-satellite DNA, termed centromere dip regions, appear to define the site of kinetochore attachment in all human chromosomes, although such epigenetic features can vary even within the same chromosome. Sequence resolution of satellite DNA is providing new insights into centromeric function with potential implications for improving our understanding of human biology and health.
Affiliation(s)
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
4
Fingerhut JM, Yamashita YM. The regulation and potential functions of intronic satellite DNA. Semin Cell Dev Biol 2022; 128:69-77. [PMID: 35469677 DOI: 10.1016/j.semcdb.2022.04.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/25/2022] [Revised: 04/11/2022] [Accepted: 04/12/2022] [Indexed: 12/15/2022]
Abstract
Satellite DNAs are arrays of tandem repeats found in the eukaryotic genome. They are mainly found in pericentromeric heterochromatin and have been believed to be mostly inert, leading satellite DNAs to be erroneously regarded as junk. Recent studies have started to elucidate the function of satellite DNA, yet little is known about the peculiar case where satellite DNA is found within the introns of protein coding genes, resulting in incredibly large introns, a phenomenon termed intron gigantism. Studies in Drosophila demonstrated that satellite DNA-containing introns are transcribed with the gene and require specialized mechanisms to overcome the burdens imposed by the extremely long stretches of repetitive DNA. Whether intron gigantism confers any benefit or serves any functional purpose for cells and/or organisms remains elusive. Here we review our current understanding of intron gigantism: where it is found, the challenges it imposes, how it is regulated and what purpose it may serve.
Affiliation(s)
- Jaclyn M Fingerhut
- Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA.
- Yukiko M Yamashita
- Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Howard Hughes Medical Institute, Cambridge, MA, USA.
5
A classical revival: Human satellite DNAs enter the genomics era. Semin Cell Dev Biol 2022; 128:2-14. [PMID: 35487859 DOI: 10.1016/j.semcdb.2022.04.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/21/2022] [Revised: 04/11/2022] [Accepted: 04/12/2022] [Indexed: 12/30/2022]
Abstract
The classical human satellite DNAs, also referred to as human satellites 1, 2 and 3 (HSat1, HSat2, HSat3, or collectively HSat1-3), occur on most human chromosomes as large, pericentromeric tandem repeat arrays, which together constitute roughly 3% of the human genome (100 megabases, on average). Even though HSat1-3 were among the first human DNA sequences to be isolated and characterized at the dawn of molecular biology, they have remained almost entirely missing from the human genome reference assembly for 20 years, hindering studies of their sequence, regulation, and potential structural roles in the nucleus. Recently, the Telomere-to-Telomere Consortium produced the first truly complete assembly of a human genome, paving the way for new studies of HSat1-3 with modern genomic tools. This review provides an account of the history and current understanding of HSat1-3, with a view towards future studies of their evolution and roles in health and disease.
6
Altemose N, Logsdon GA, Bzikadze AV, Sidhwani P, Langley SA, Caldas GV, Hoyt SJ, Uralsky L, Ryabov FD, Shew CJ, Sauria MEG, Borchers M, Gershman A, Mikheenko A, Shepelev VA, Dvorkina T, Kunyavskaya O, Vollger MR, Rhie A, McCartney AM, Asri M, Lorig-Roach R, Shafin K, Aganezov S, Olson D, de Lima LG, Potapova T, Hartley GA, Haukness M, Kerpedjiev P, Gusev F, Tigyi K, Brooks S, Young A, Nurk S, Koren S, Salama SR, Paten B, Rogaev EI, Streets A, Karpen GH, Dernburg AF, Sullivan BA, Straight AF, Wheeler TJ, Gerton JL, Eichler EE, Phillippy AM, Timp W, Dennis MY, O'Neill RJ, Zook JM, Schatz MC, Pevzner PA, Diekhans M, Langley CH, Alexandrov IA, Miga KH. Complete genomic and epigenetic maps of human centromeres. Science 2022; 376:eabl4178. [PMID: 35357911 PMCID: PMC9233505 DOI: 10.1126/science.abl4178] [Citation(s) in RCA: 182] [Impact Index Per Article: 91.0] [Indexed: 12/20/2022]
Abstract
Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural rearrangements, including in active centromeric repeat arrays. Analysis of centromere-associated sequences uncovered a strong relationship between the position of the centromere and the evolution of the surrounding DNA through layered repeat expansions. Furthermore, comparisons of chromosome X centromeres across a diverse panel of individuals illuminated high degrees of structural, epigenetic, and sequence variation in these complex and rapidly evolving regions.
Affiliation(s)
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
- Pragya Sidhwani
- Department of Biochemistry, Stanford University, Stanford, CA, USA
- Sasha A. Langley
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Savannah J. Hoyt
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Lev Uralsky
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Colin J. Shew
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Olga Kunyavskaya
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Ryan Lorig-Roach
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Sergey Aganezov
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Daniel Olson
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Gabrielle A. Hartley
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Fedor Gusev
- Vavilov Institute of General Genetics, Moscow, Russia
- Kristof Tigyi
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Shelise Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Alice Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sofie R. Salama
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Benedict Paten
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
- Evgeny I. Rogaev
- Sirius University of Science and Technology, Sochi, Russia
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School, Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
- Aaron Streets
- Department of Bioengineering, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Gary H. Karpen
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- BioEngineering and BioMedical Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Abby F. Dernburg
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Institute for Quantitative Biosciences (QB3), University of California, Berkeley, Berkeley, CA, USA
- Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
- Travis J. Wheeler
- Department of Computer Science, University of Montana, Missoula, MT, USA
- Jennifer L. Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical School, Department of Biochemistry and Molecular Biology and Cancer Center, University of Kansas, Kansas City, KS, USA
- Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Megan Y. Dennis
- Genome Center, MIND Institute, and Department of Biochemistry and Molecular Medicine, School of Medicine, University of California, Davis, Davis, CA, USA
- Rachel J. O'Neill
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California at San Diego, San Diego, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- Ivan A. Alexandrov
- Vavilov Institute of General Genetics, Moscow, Russia
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California Santa Cruz, CA, USA
7
Abstract
We are entering a new era in genomics where entire centromeric regions are accurately represented in human reference assemblies. Access to these high-resolution maps will enable new surveys of sequence and epigenetic variation in the population and offer new insight into satellite array genomics and centromere function. Here, we focus on the sequence organization and evolution of alpha satellites, which are credited as the genetic and genomic definition of human centromeres due to their interaction with inner kinetochore proteins and their importance in the development of human artificial chromosome assays. We provide an overview of alpha satellite repeat structure and array organization in the context of these high-quality reference data sets; discuss the emergence of variation-based surveys; and provide perspective on the role of this new source of genetic and epigenetic variation in the context of chromosome biology, genome instability, and human disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA; Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
- Ivan A Alexandrov
- Department of Genomics and Human Genetics, Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow 119991, Russia; Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg 199004, Russia; Research Center of Biotechnology of the Russian Academy of Sciences, Moscow 119071, Russia
8
Regulatory roles of nucleolus organizer region-derived long non-coding RNAs. Mamm Genome 2021; 33:402-411. [PMID: 34436664 DOI: 10.1007/s00335-021-09906-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/21/2021] [Accepted: 08/20/2021] [Indexed: 12/13/2022]
Abstract
The nucleolus is the largest sub-nuclear domain, serving primarily as the site of ribosome biogenesis. Tightly regulated nucleolar function is vital to the cell, not only for maintaining proper protein synthesis but also for responding to different types of cellular stress. Recently, several long non-coding RNAs (lncRNAs) were found to be part of the regulatory network that modulates nucleolar functions. Several of these lncRNAs are encoded in the ribosomal DNA (rDNA) repeats or are transcribed from genomic regions located near the nucleolus organizer regions (NORs). In this review, we first discuss the current understanding of the sequence of the NORs and variations between different NORs. We then focus on the NOR-derived lncRNAs in mammalian cells and their functions in rRNA transcription and the organization of nucleolar structure under different cellular conditions. The identification of these lncRNAs reveals the great potential of the NORs to harbor novel genes involved in the regulation of nucleolar functions.
9
Saint-Leandre B, Capy P, Hua-Van A, Filée J. piRNA and Transposon Dynamics in Drosophila: A Female Story. Genome Biol Evol 2021; 12:931-947. [PMID: 32396626 PMCID: PMC7337185 DOI: 10.1093/gbe/evaa094] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Accepted: 05/06/2020] [Indexed: 12/12/2022] Open
Abstract
The germlines of metazoans contain transposable elements (TEs) causing genetic instability and affecting fitness. To protect the germline from TE activity, gonads of metazoans produce TE-derived PIWI-interacting RNAs (piRNAs) that silence TE expression. In Drosophila, our understanding of piRNA biogenesis is mainly based on studies of the Drosophila melanogaster female germline. However, it is not known whether piRNA functions are also important in the male germline or whether and how piRNAs are affected by the global genomic context. To address these questions, we compared genome sequences, transcriptomes, and small RNA libraries extracted from entire testes and ovaries of two sister species: D. melanogaster and Drosophila simulans. We found that most TE-derived piRNAs were produced in ovaries and that piRNA pathway genes were strongly overexpressed in ovaries compared with testes, indicating that the silencing of TEs by the piRNA pathway mainly took place in the female germline. To study the relationship between host piRNAs and TE landscape, we analyzed TE genomic features and how they correlate with piRNA production in the two species. In D. melanogaster, we found that TE-derived piRNAs target recently active TEs. In contrast, although Drosophila simulans TEs do not display any features of recent activity, the host still intensively produced silencing piRNAs targeting old TE relics. Together, our results show that the piRNA silencing response mainly takes place in Drosophila ovaries and indicate that the host piRNA response is implemented following a burst of TE activity and could persist long after the extinction of active TE families.
Affiliation(s)
- Bastien Saint-Leandre
- Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS, Université Paris-Sud, IRD, Université Paris-Saclay, Gif-sur-Yvette, France
- Pierre Capy
- Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS, Université Paris-Sud, IRD, Université Paris-Saclay, Gif-sur-Yvette, France
- Aurelie Hua-Van
- Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS, Université Paris-Sud, IRD, Université Paris-Saclay, Gif-sur-Yvette, France
- Jonathan Filée
- Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS, Université Paris-Sud, IRD, Université Paris-Saclay, Gif-sur-Yvette, France
10
Abdullaev ET, Umarova IR, Arndt PF. Modelling segmental duplications in the human genome. BMC Genomics 2021; 22:496. [PMID: 34215180 PMCID: PMC8254307 DOI: 10.1186/s12864-021-07789-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/03/2020] [Accepted: 06/10/2021] [Indexed: 11/22/2022] Open
Abstract
Background: Segmental duplications (SDs) are long DNA sequences that are repeated in a genome and have high sequence identity. In contrast to repetitive elements, they are often unique and only sometimes have multiple copies in a genome. There are several well-studied mechanisms responsible for segmental duplications: non-allelic homologous recombination, non-homologous end joining and replication slippage. Such duplications play an important role in evolution; however, we do not have a full understanding of the dynamic properties of the duplication process. Results: We study segmental duplications through a graph representation where nodes represent genomic regions and edges represent duplications between them. The resulting network (the SD network) is quite complex and has distinct features which allow us to make inferences about the evolution of segmental duplications. We propose a network growth model that explains features of the SD network, thus giving us insights into the dynamics of segmental duplications in the human genome. Based on our analysis of genomes of other species, the network growth model seems to be applicable to multiple mammalian genomes. Conclusions: Our analysis suggests that duplication rates of genomic loci grow linearly with the number of copies of a duplicated region. Several scenarios explaining such preferential duplication rates were suggested. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07789-7.
Affiliation(s)
- Eldar T Abdullaev
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63/73, Berlin, 14195, Germany.
- Iren R Umarova
- Faculty of Computational Mathematics and Cybernetics, Moscow State University, Leninskiye Gory 1-52, Moscow, 119991, Russia
- Peter F Arndt
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63/73, Berlin, 14195, Germany
11
Heerschop S, Fagrouch Z, Verschoor EJ, Zischler H. Pinpointing the PRDM9-PRDM7 Gene Duplication Event During Primate Divergence. Front Genet 2021; 12:593725. [PMID: 33719332 PMCID: PMC7943923 DOI: 10.3389/fgene.2021.593725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2020] [Accepted: 02/01/2021] [Indexed: 12/03/2022] Open
Abstract
Studies on the function of PRDM9 in model systems and its evolution during vertebrate divergence shed light on the basic molecular mechanisms of hybrid sterility and its evolutionary consequences. However, information regarding the PRDM9 homolog PRDM7, whose origin is placed in the primate evolutionary tree, as well as information about the fast-evolving DNA-binding zinc finger array of strepsirrhine PRDM9, is scarce. Thus, we aimed to narrow down the date of the duplication event leading to the emergence of PRDM7 during primate evolution by comparing the phylogenetic tree reconstructions of representative primate samples of PRDM orthologs and paralogs. To confirm our PRDM7 paralogization pattern, database-deposited sequences were used to test the presence/absence patterns expected from the paralogization timing. In addition, we extended the existing phylogenetic tree of haplorrhine PRDM9 zinc fingers with their strepsirrhine counterparts. The inclusion of strepsirrhine zinc fingers completes the PRDM9 primate phylogeny. Moreover, the updated phylogeny of PRDM9 zinc fingers showed distinct clusters of strepsirrhine, tarsier, and anthropoid degenerated zinc fingers. Here, we show that PRDM7 emerged on the branch leading to the most recent common ancestor of catarrhines; therefore, its origin is more recent than previously expected. A more detailed character evolutionary study suggests that PRDM7 may have evolved differently in Cercopithecoidea as compared to Hominoidea: it lacks the first four exons in Old World monkey orthologs and exon 10 in Papionini orthologs. Dating the origin of PRDM7 is essential for further studies investigating why Hominoidea representatives need another putative histone methyltransferase in the testis.
Affiliation(s)
- Sacha Heerschop
- Division of Anthropology, Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
| | - Zahra Fagrouch
- Department of Virology, Biomedical Primate Research Centre, Rijswijk, Netherlands
| | - Ernst J Verschoor
- Department of Virology, Biomedical Primate Research Centre, Rijswijk, Netherlands
| | - Hans Zischler
- Division of Anthropology, Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University Mainz, Mainz, Germany
|
12
|
Mihìc P, Hédouin S, Francastel C. Centromeres Transcription and Transcripts for Better and for Worse. Prog Mol Subcell Biol 2021; 60:169-201. [PMID: 34386876 DOI: 10.1007/978-3-030-74889-0_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
Centromeres are chromosomal regions that are essential for the faithful transmission of genetic material through each cell division. They represent the chromosomal platform on which a protein complex, the kinetochore, assembles and mediates attachment to the mitotic spindle. In most organisms, centromeres assemble on large arrays of tandem satellite repeats, although their DNA sequences and organization are highly divergent among species. It has become evident that centromeres are not defined by underlying DNA sequences, but are instead epigenetically defined by the deposition of the centromere-specific histone H3 variant, CENP-A. In addition, although long regarded as silent chromosomal loci, centromeres are in fact transcriptionally competent in most species, albeit at low levels in normal somatic cells, where the resulting transcripts participate in centromere architecture, identity, and function. In this chapter, we discuss the various roles proposed for centromere transcription and their transcripts, and the potential molecular mechanisms involved. We also discuss pathological cases in which unscheduled transcription of centromeric repeats or aberrant accumulation of their transcripts is a signature of chromosomal instability diseases. In sum, tight regulation of centromeric satellite repeat transcription is critical for healthy development and tissue homeostasis, and thus prevents the emergence of disease states.
Affiliation(s)
- Pia Mihìc
- Université De Paris, Epigenetics and Cell Fate, CNRS UMR7216, Paris, France
| | - Sabrine Hédouin
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Claire Francastel
- Université De Paris, Epigenetics and Cell Fate, CNRS UMR7216, Paris, France.
|
13
|
Nurk S, Walenz BP, Rhie A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res 2020; 30:1291-1305. [PMID: 32801147 PMCID: PMC7545148 DOI: 10.1101/gr.263566.120] [Citation(s) in RCA: 320] [Impact Index Per Article: 80.0] [Received: 03/14/2020] [Accepted: 08/04/2020] [Indexed: 12/14/2022]
Abstract
Complete and accurate genome assemblies form the basis of most downstream genomic analyses and are of critical importance. Recent genome assembly projects have relied on a combination of noisy long-read sequencing and accurate short-read sequencing, with the former offering greater assembly continuity and the latter providing higher consensus accuracy. The recently introduced Pacific Biosciences (PacBio) HiFi sequencing technology bridges this divide by delivering long reads (>10 kbp) with high per-base accuracy (>99.9%). Here we present HiCanu, a modification of the Canu assembler designed to leverage the full potential of HiFi reads via homopolymer compression, overlap-based error correction, and aggressive false overlap filtering. We benchmark HiCanu with a focus on the recovery of haplotype diversity, major histocompatibility complex (MHC) variants, satellite DNAs, and segmental duplications. For diploid human genomes sequenced to 30× HiFi coverage, HiCanu achieved superior accuracy and allele recovery compared to the current state of the art. On the effectively haploid CHM13 human cell line, HiCanu achieved an NG50 contig size of 77 Mbp with a per-base consensus accuracy of 99.999% (QV50), surpassing recent assemblies of high-coverage, ultralong Oxford Nanopore Technologies (ONT) reads in terms of both accuracy and continuity. This HiCanu assembly correctly resolves 337 out of 341 validation BACs sampled from known segmental duplications and provides the first preliminary assemblies of nine complete human centromeric regions. Although gaps and errors still remain within the most challenging regions of the genome, these results represent a significant advance toward the complete assembly of human genomes.
Affiliation(s)
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Brian P Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Mitchell R Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Robert Grothe
- Pacific Biosciences, Menlo Park, California 94025, USA
| | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California 95064, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA
|
14
|
Miga KH. Centromere studies in the era of 'telomere-to-telomere' genomics. Exp Cell Res 2020; 394:112127. [PMID: 32504677 DOI: 10.1016/j.yexcr.2020.112127] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/08/2020] [Revised: 05/23/2020] [Accepted: 05/30/2020] [Indexed: 12/17/2022]
Abstract
We are entering an exciting era of genomics where truly complete, high-quality assemblies of human chromosomes are available end-to-end, or from 'telomere-to-telomere' (T2T). This technological advance offers a new opportunity to include endogenous human centromeric regions in high-resolution, sequence-based studies. These emerging reference maps are expected to reveal a new functional landscape in the human genome, where centromere proteins, transcriptional regulation, and spatial organization can be examined with base-level resolution across different stages of development and disease. Such studies will depend on innovative assembly methods of extremely long tandem repeats (ETRs), or satellite DNAs, paired with the development of new, orthogonal validation methods to ensure accuracy and completeness. This review reflects the progress in centromere genomics, credited to recent advancements in long-read sequencing and assembly methods. In doing so, I will discuss the challenges that remain and the promise for a new period of scientific discovery for satellite DNA biology and centromere function.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.
|
15
|
Ma JY, Feng X, Tian XY, Chen LN, Fan XY, Guo L, Li S, Yin S, Luo SM, Ou XH. The repair of endo/exogenous DNA double-strand breaks and its effects on meiotic chromosome segregation in oocytes. Hum Mol Genet 2020; 28:3422-3430. [PMID: 31384951 DOI: 10.1093/hmg/ddz156] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/03/2019] [Revised: 06/28/2019] [Accepted: 06/30/2019] [Indexed: 11/14/2022] Open
Abstract
Germ cell-derived genomic structure variants not only drive the evolution of species but also induce developmental defects in offspring. Genomic structure variants are of different types, but most of them originate from DNA double-strand breaks (DSBs). It is still not well known whether DNA DSBs exist in adult mammalian oocytes and how growing and fully grown oocytes repair DNA DSBs induced by endogenous or exogenous factors. In this study, we detected endogenous DNA DSBs in growing and fully grown mouse oocytes and found that the DNA DSBs mainly localized at the centromere-adjacent regions, which are also copy number variation hotspots. When exogenous DNA DSBs were introduced by etoposide, we found that Rad51-mediated homologous recombination (HR) was used to repair the broken DNA. However, HR repair caused the chromatin to become intertwined and impaired homologous chromosome segregation in oocytes. Although we did not detect evidence of HR repair of endogenous centromere-adjacent DNA DSBs, we found that Rad52 and RNA:DNA hybrids colocalized with these DNA DSBs, indicating that a Rad52-dependent DNA repair pathway might exist in oocytes. In summary, our results not only demonstrated an association between endogenous DNA DSBs and genomic structure variants but also revealed one specific DNA DSB repair mechanism in oocytes.
Affiliation(s)
- Jun-Yu Ma
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Xie Feng
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Xin-Yi Tian
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Lei-Ning Chen
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Xiao-Yan Fan
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Lei Guo
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Sen Li
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Shen Yin
- College of Life Sciences, Qingdao Agricultural University, Qingdao, China
| | - Shi-Ming Luo
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Xiang-Hong Ou
- Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China
|
16
|
Brasó-Vives M, Povolotskaya IS, Hartasánchez DA, Farré X, Fernandez-Callejo M, Raveendran M, Harris RA, Rosene DL, Lorente-Galdos B, Navarro A, Marques-Bonet T, Rogers J, Juan D. Copy number variants and fixed duplications among 198 rhesus macaques (Macaca mulatta). PLoS Genet 2020; 16:e1008742. [PMID: 32392208 PMCID: PMC7241854 DOI: 10.1371/journal.pgen.1008742] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/16/2019] [Revised: 05/21/2020] [Accepted: 03/27/2020] [Indexed: 01/01/2023] Open
Abstract
The rhesus macaque is an abundant species of Old World monkey and a valuable model organism for biomedical research due to its close phylogenetic relationship to humans. Copy number variation is one of the main sources of genomic diversity within and between species and a widely recognized cause of inter-individual differences in disease risk. However, copy number differences among rhesus macaques and between the human and macaque genomes, as well as the relevance of this diversity to research involving this nonhuman primate, remain understudied. Here we present a high-resolution map of sequence copy number for the rhesus macaque genome constructed from a dataset of 198 individuals. Our results show that about one-eighth of the rhesus macaque reference genome is composed of recently duplicated regions, either copy number variable regions or fixed duplications. Comparison with human genomic copy number maps based on previously published data shows that, despite overall similarities in the genome-wide distribution of these regions, there are specific differences at the chromosome level. Some of these create differences in the copy number profile between human disease genes and their rhesus macaque orthologs. Our results highlight the importance of addressing the number of copies of target genes in the design of experiments and caution against human-centered assumptions in research conducted with model organisms. Overall, we present a genome-wide copy number map from a large sample of rhesus macaque individuals, representing an important novel contribution concerning the evolution of copy number in primate genomes.
Affiliation(s)
- Marina Brasó-Vives
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- Laboratoire de Biométrie et Biologie Évolutive UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
| | - Inna S. Povolotskaya
- Veltischev Research and Clinical Institute for Pediatrics of the Pirogov Russian National Research Medical University, Moscow, Russia
| | - Diego A. Hartasánchez
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
| | - Xavier Farré
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
| | - Marcos Fernandez-Callejo
- National Centre for Genomic Analysis-Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - R. Alan Harris
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Douglas L. Rosene
- Department of Anatomy and Neurobiology, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Belen Lorente-Galdos
- Department of Neuroscience, Yale School of Medicine, New Haven, Connecticut, United States of America
| | - Arcadi Navarro
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia, Spain
| | - Tomas Marques-Bonet
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
- National Centre for Genomic Analysis-Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Catalonia, Spain
| | - Jeffrey Rogers
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - David Juan
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Parc de Recerca Biomèdica de Barcelona, Barcelona, Catalonia, Spain
|
17
|
Genome-wide unique insertion sequences among five Brucella species and demonstration of differential identification of Brucella by multiplex PCR assay. Sci Rep 2020; 10:6368. [PMID: 32286356 PMCID: PMC7156498 DOI: 10.1038/s41598-020-62472-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/23/2019] [Accepted: 03/02/2020] [Indexed: 11/17/2022] Open
Abstract
Brucellosis is a neglected zoonotic disease caused by the alphaproteobacterial genus Brucella, which comprises facultative intracellular pathogenic species that can infect both animals and humans. In this study, we aimed to identify genome-wide unique insertion sequence (IS) elements among Brucella abortus, B. melitensis, B. ovis, B. suis and B. canis for use in species differentiation by conducting an intensive in silico-based comparative genomic analysis. As a result, 25, 27, 37, 86 and 3 unique ISs were identified, respectively, and they had a striking pattern of distribution among the species. Specifically, a particular IS would be present in four species with 100% identity while being completely absent in the fifth species. However, the flanking regions of that IS element would be highly identical and conserved in all five species. Species-specific primers designed on these conserved flanking regions resulted in two different amplicons, grouping the species into two: one that possesses the IS and the other that lacks it. Looking for the species-specific amplicon size of a particular species was sufficient to identify it irrespective of biovar. A multiplex PCR developed using these primers resulted in successful differentiation of the five species irrespective of biovars, with significant specificity and sensitivity when examined on clinical samples.
|
18
|
McCartney AM, Hyland EM, Cormican P, Moran RJ, Webb AE, Lee KD, Hernandez-Rodriguez J, Prado-Martinez J, Creevey CJ, Aspden JL, McInerney JO, Marques-Bonet T, O'Connell MJ. Gene Fusions Derived by Transcriptional Readthrough are Driven by Segmental Duplication in Human. Genome Biol Evol 2020; 11:2678-2690. [PMID: 31400206 PMCID: PMC6764479 DOI: 10.1093/gbe/evz163] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Accepted: 07/17/2019] [Indexed: 12/14/2022] Open
Abstract
Gene fusion occurs when two or more individual genes with independent open reading frames become juxtaposed under the same open reading frame, creating a new fused gene. A small number of gene fusions described in detail have been associated with novel functions, for example, the hominid-specific PIPSL gene, TNFSF12, and the TWE-PRIL gene family. We use Sequence Similarity Networks and species-level comparisons of great ape genomes to identify 45 new genes that have emerged by transcriptional readthrough, that is, transcription-derived gene fusion. For 35 of these putative gene fusions, we have been able to assess available RNAseq data to determine whether there are reads that map to each breakpoint. A total of 29 of the putative gene fusions had annotated transcripts (9/29 of which are human-specific). We carried out RT-qPCR in a range of human tissues (placenta, lung, liver, brain, and testes) and found that 23 of the putative gene fusion events were expressed in at least one tissue. Examining the available ribosome foot-printing data, we find evidence for translation of three of the fused genes in human. Finally, we find enrichment for transcription-derived gene fusions in regions of known segmental duplication in human. Together, our results implicate chromosomal structural variation brought about by segmental duplication in the emergence of novel transcripts and translated protein products.
Affiliation(s)
- Ann M McCartney
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland; Computational and Molecular Evolutionary Biology Group, School of Biology, Faculty of Biological Sciences, The University of Leeds, United Kingdom
| | - Edel M Hyland
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland; Institute for Global Food Security, Queens University Belfast, United Kingdom
| | - Paul Cormican
- Teagasc Animal and Bioscience Research Department, Animal & Grassland Research and Innovation Centre, Teagasc, Grange, Dunsany, County Meath, Ireland
| | - Raymond J Moran
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland; Computational and Molecular Evolutionary Biology Group, School of Biology, Faculty of Biological Sciences, The University of Leeds, United Kingdom
| | - Andrew E Webb
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland
| | - Kate D Lee
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland; School of Biological Sciences, University of Auckland, New Zealand; School of Fundamental Sciences, Massey University, New Zealand
| | | | - Javier Prado-Martinez
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain; Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, United Kingdom
| | - Christopher J Creevey
- Institute for Global Food Security, Queens University Belfast, United Kingdom; Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, United Kingdom
| | - Julie L Aspden
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, The University of Leeds, United Kingdom
| | - James O McInerney
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, M13 9PL, United Kingdom; School of Life Sciences, Faculty of Medicine and Health Sciences, The University of Nottingham, NG7 2RD, United Kingdom
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010, Barcelona, Spain; NAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallés, Barcelona, Spain
| | - Mary J O'Connell
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Ireland; Computational and Molecular Evolutionary Biology Group, School of Biology, Faculty of Biological Sciences, The University of Leeds, United Kingdom; School of Life Sciences, Faculty of Medicine and Health Sciences, The University of Nottingham, NG7 2RD, United Kingdom
|
19
|
Vollger MR, Logsdon GA, Audano PA, Sulovari A, Porubsky D, Peluso P, Wenger AM, Concepcion GT, Kronenberg ZN, Munson KM, Baker C, Sanders AD, Spierings DCJ, Lansdorp PM, Surti U, Hunkapiller MW, Eichler EE. Improved assembly and variant detection of a haploid human genome using single-molecule, high-fidelity long reads. Ann Hum Genet 2020; 84:125-140. [PMID: 31711268 PMCID: PMC7015760 DOI: 10.1111/ahg.12364] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Received: 08/14/2019] [Revised: 10/17/2019] [Accepted: 10/18/2019] [Indexed: 01/14/2023]
Abstract
The sequence and assembly of human genomes using long-read sequencing technologies has revolutionized our understanding of structural variation and genome organization. We compared the accuracy, continuity, and gene annotation of genome assemblies generated from either high-fidelity (HiFi) or continuous long-read (CLR) datasets from the same complete hydatidiform mole human genome. We find that the HiFi sequence data assemble an additional 10% of duplicated regions and more accurately represent the structure of tandem repeats, as validated with orthogonal analyses. As a result, an additional 5 Mbp of pericentromeric sequences are recovered in the HiFi assembly, resulting in a 2.5-fold increase in the NG50 within 1 Mbp of the centromere (HiFi 480.6 kbp, CLR 191.5 kbp). Additionally, the HiFi genome assembly was generated in significantly less time with fewer computational resources than the CLR assembly. Although the HiFi assembly has significantly improved continuity and accuracy in many complex regions of the genome, it still falls short of the assembly of centromeric DNA and the largest regions of segmental duplication using existing assemblers. Despite these shortcomings, our results suggest that HiFi may be the most effective standalone technology for de novo assembly of human genomes.
Affiliation(s)
- Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- These authors contributed equally to this work
| | - Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- These authors contributed equally to this work
| | - Peter A. Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Paul Peluso
- Pacific Biosciences of California, Inc., Menlo Park, CA 94025, USA
| | - Aaron M. Wenger
- Pacific Biosciences of California, Inc., Menlo Park, CA 94025, USA
| | | | | | - Katherine M. Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Ashley D. Sanders
- European Molecular Biology Laboratory, Genome Biology Unit, 69117, Heidelberg, Germany
| | - Diana C.J. Spierings
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV Groningen, The Netherlands
| | - Peter M. Lansdorp
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV Groningen, The Netherlands
- Terry Fox Laboratory, BC Cancer Agency, Vancouver, BC V5Z 1L3, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Urvashi Surti
- Department of Pathology, University of Pittsburgh School of Medicine, and University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
| | | | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
|
20
|
Bracewell R, Chatla K, Nalley MJ, Bachtrog D. Dynamic turnover of centromeres drives karyotype evolution in Drosophila. eLife 2019; 8:e49002. [PMID: 31524597 PMCID: PMC6795482 DOI: 10.7554/elife.49002] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 06/03/2019] [Accepted: 09/12/2019] [Indexed: 12/21/2022] Open
Abstract
Centromeres are the basic unit for chromosome inheritance, but their evolutionary dynamics are poorly understood. We generate high-quality reference genomes for multiple Drosophila obscura group species to reconstruct karyotype evolution. All chromosomes in this lineage were ancestrally telocentric and the creation of metacentric chromosomes in some species was driven by de novo seeding of new centromeres at ancestrally gene-rich regions, independently of chromosomal rearrangements. The emergence of centromeres resulted in a drastic size increase due to repeat accumulation, and dozens of genes previously located in euchromatin are now embedded in pericentromeric heterochromatin. Metacentric chromosomes secondarily became telocentric in the pseudoobscura subgroup through centromere repositioning and a pericentric inversion. The former (peri)centric sequences left behind shrank dramatically in size after their inactivation, yet contain remnants of their evolutionary past, including increased repeat content and a heterochromatic environment. Centromere movements are accompanied by rapid turnover of the major satellite DNA detected in (peri)centromeric regions.
Affiliation(s)
- Ryan Bracewell
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Kamalakar Chatla
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Matthew J Nalley
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
| | - Doris Bachtrog
- Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
|
21
|
A New Portrait of Constitutive Heterochromatin: Lessons from Drosophila melanogaster. Trends Genet 2019; 35:615-631. [PMID: 31320181 DOI: 10.1016/j.tig.2019.06.002] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 02/04/2019] [Revised: 06/05/2019] [Accepted: 06/06/2019] [Indexed: 12/14/2022]
Abstract
Constitutive heterochromatin represents a significant portion of eukaryotic genomes, but its functions still need to be elucidated. Even in the most up-to-date genetics and molecular biology textbooks, constitutive heterochromatin is portrayed mainly as the 'silent' component of eukaryotic genomes. However, there may be more complexity to the relationship between heterochromatin and gene expression. In the fruit fly Drosophila melanogaster, a model for heterochromatin studies, about one-third of the genome is heterochromatic and is concentrated in the centric, pericentric, and telomeric regions of the chromosomes. Recent findings indicate that hundreds of D. melanogaster genes can 'live and work' properly within constitutive heterochromatin. The genomic size of these genes is generally larger than that of euchromatic genes, and together they account for a significant fraction of the entire constitutive heterochromatin. Thus, this peculiar genome component, in spite of its ability to induce silencing, in fact has the means to be quite dynamic. A major aim of this review is to revisit the 'dogma of silent heterochromatin'.
|
22
|
Miga KH. Centromeric Satellite DNAs: Hidden Sequence Variation in the Human Population. Genes (Basel) 2019; 10:E352. [PMID: 31072070 PMCID: PMC6562703 DOI: 10.3390/genes10050352] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 04/02/2019] [Revised: 05/03/2019] [Accepted: 05/03/2019] [Indexed: 12/30/2022] Open
Abstract
The central goal of medical genomics is to understand the inherited basis of sequence variation that underlies human physiology, evolution, and disease. Functional association studies currently ignore millions of bases that span each centromeric region and acrocentric short arm. These regions are enriched in long arrays of tandem repeats, or satellite DNAs, that are known to vary extensively in copy number and repeat structure in the human population. Satellite sequence variation in the human genome is often so large that it is detected cytogenetically, yet due to the lack of a reference assembly and informatics tools to measure this variability, contemporary high-resolution disease association studies are unable to detect causal variants in these regions. Nevertheless, recently uncovered associations between satellite DNA variation and human disease support the view that these regions represent a substantial and biologically important fraction of human sequence variation. Therefore, there is a pressing and unmet need to detect this uncharacterized sequence variation and incorporate it into broad studies of human evolution and medical genomics. Here I discuss the current knowledge of satellite DNA variation in the human genome, focusing on centromeric satellites and their potential implications for disease.
Affiliation(s)
- Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, CA 95064, USA.
| |
|
23
|
Analysis of 62 hybrid assembled human Y chromosomes exposes rapid structural changes and high rates of gene conversion. PLoS Genet 2017; 13:e1006834. [PMID: 28846694 PMCID: PMC5591018 DOI: 10.1371/journal.pgen.1006834] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 10/07/2016] [Revised: 09/08/2017] [Accepted: 05/22/2017] [Indexed: 11/21/2022] Open
Abstract
The human Y chromosome does not recombine across its male-specific part and is therefore an excellent marker of human migrations. It also plays an important role in male fertility. However, its evolution is difficult to fully understand because of repetitive sequences, inverted repeats and the potentially large role of gene conversion. Here we perform an evolutionary analysis of 62 Y chromosomes of Danish descent sequenced using a wide range of library insert sizes and high coverage, thus allowing large regions of these chromosomes to be well assembled. These include 17 father-son pairs, which we use to validate variation calling. Using a recent method that can integrate variants based on both mapping and de novo assembly, we genotype 10898 SNVs and 2903 indels (max length of 27241 bp) in our sample and show by father-son concordance and experimental validation that the non-recurrent SNP and indel variation on the Y chromosome tree is called very accurately. This includes variation called in a 0.9 Mb centromeric heterochromatic region, which is by far the most variable in the Y chromosome. Among the variation are also longer sequence stretches not present in the reference genome but shared with the chimpanzee Y chromosome. We analyzed 2.7 Mb of large inverted repeats (palindromes) for variation patterns among the two palindrome arms and identified 603 mutation and 416 gene conversion events. We find clear evidence for GC-biased gene conversion in the palindromes (and a balancing AT mutation bias), but irrespective of this, also a strong bias of gene conversion towards the ancestral state, suggesting that palindromic gene conversion may alleviate Muller’s ratchet. Finally, we also find a large number of large-scale gene duplications and deletions in the palindromic regions (at least 24) and find that such events can consist of complex combinations of simultaneous insertions and deletions of long stretches of the Y chromosome.
The Y chromosome is extraordinary in many respects; it is non-recombining along most of its length, it carries many testis-expressed genes that are often found in palindromes and thus in several copies, and it is generally highly repetitive with very few unique genes. Its evolutionary process is not well understood in general because short-read mapping in such complex sequence is difficult. We combine de novo assembly and mapping to investigate evolution in more than 60% of the length of 62 Y chromosomes of Danish descent. We find that Y chromosome evolution is very dynamic even among the set of closely related Y chromosomes in Denmark with many cases of complex duplications and deletions of large regions including whole genes, clear evidence of GC-biased gene conversion in the palindromes and a tendency for gene conversion to revert mutations to their ancestral state.
|
24
|
Miga KH. The Promises and Challenges of Genomic Studies of Human Centromeres. PROGRESS IN MOLECULAR AND SUBCELLULAR BIOLOGY 2017; 56:285-304. [PMID: 28840242 DOI: 10.1007/978-3-319-58592-5_12] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
Abstract
Human centromeres are genomic regions that act as sites of kinetochore assembly to ensure proper chromosome segregation during mitosis and meiosis. Although the biological importance of centromeres in genome stability, and ultimately cell viability, is well understood, the complete sequence content and organization of these multi-megabase-sized regions remain unknown. The lack of a high-resolution reference assembly inhibits standard bioinformatics protocols, and as a result, sequence-based studies involving human centromeres lag far behind the advances made for the non-repetitive sequences in the human genome. In this chapter, I introduce what is known about the genomic organization of the highly repetitive regions spanning human centromeres, and discuss the challenges these sequences pose for assembly, alignment, and data interpretation. Overcoming these obstacles is expected to usher in a new era for centromere genomics, which will offer new discoveries in basic cell biology and human biomedical research.
Affiliation(s)
- Karen H Miga
- Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA, USA.
| |
|
25
|
Giulotto E, Raimondi E, Sullivan KF. The Unique DNA Sequences Underlying Equine Centromeres. PROGRESS IN MOLECULAR AND SUBCELLULAR BIOLOGY 2017; 56:337-354. [PMID: 28840244 DOI: 10.1007/978-3-319-58592-5_14] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/23/2022]
Abstract
Centromeres are highly distinctive genetic loci whose function is specified largely by epigenetic mechanisms. Understanding the role of DNA sequences in centromere function has been a daunting task due to the highly repetitive nature of centromeres in animal chromosomes. The discovery of a centromere devoid of satellite DNA in the domestic horse consolidated observations on the epigenetic nature of centromere identity, showing that entirely natural chromosomes could function without satellite DNA cues. Horses belong to the genus Equus which exhibits a very high degree of evolutionary plasticity in centromere position and DNA sequence composition. Examination of horses has revealed that the position of the satellite-free centromere is variable among individuals. Analysis of centromere location and composition in other Equus species, including domestic donkey and zebras, confirms that the satellite-less configuration of centromeres is common in this group which has undergone particularly rapid karyotype evolution. These features have established the equids as a new mammalian system in which to investigate the molecular organization, dynamics and evolutionary behaviour of centromeres.
Affiliation(s)
- Elena Giulotto
- Dipartimento di Biologia e Biotecnologie, Università di Pavia, Via Ferrata 1, 27100, Pavia, Italy.
| | - Elena Raimondi
- Dipartimento di Biologia e Biotecnologie, Università di Pavia, Via Ferrata 1, 27100, Pavia, Italy
| | - Kevin F Sullivan
- National University of Ireland Galway, University Road, Galway, Ireland
| |
|
26
|
Cacheux L, Ponger L, Gerbault-Seureau M, Richard FA, Escudé C. Diversity and distribution of alpha satellite DNA in the genome of an Old World monkey: Cercopithecus solatus. BMC Genomics 2016; 17:916. [PMID: 27842493 PMCID: PMC5109768 DOI: 10.1186/s12864-016-3246-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/14/2016] [Accepted: 11/02/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Alpha satellite is the major repeated DNA element of primate centromeres. Evolution of these tandemly repeated sequences has led to the existence of numerous families of monomers exhibiting specific organizational patterns. The limited amount of information available in non-human primates is a restriction to the understanding of the evolutionary dynamics of alpha satellite DNA. Results: We carried out the targeted high-throughput sequencing of alpha satellite monomers and dimers from the Cercopithecus solatus genome, an Old World monkey from the Cercopithecini tribe. Computational approaches were used to infer the existence of sequence families and to study how these families are organized with respect to each other. While previous studies had suggested that alpha satellites in Old World monkeys were poorly diversified, our analysis provides evidence for the existence of at least four distinct families of sequences within the studied species and of higher order organizational patterns. Fluorescence in situ hybridization using oligonucleotide probes that are able to target each family in a specific way showed that the different families had distinct distributions on chromosomes and were not homogeneously distributed between chromosomes. Conclusions: Our new approach provides an unprecedented and comprehensive view of the diversity and organization of alpha satellites in a species outside the hominoid group. We consider these data with respect to previously known alpha satellite families and to potential mechanisms for satellite DNA evolution. Applying this approach to other species will open new perspectives regarding the integration of satellite DNA into comparative genomic and cytogenetic studies. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3246-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lauriane Cacheux
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France; Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Loïc Ponger
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Michèle Gerbault-Seureau
- Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France
| | - Florence Anne Richard
- Département Systématique et Evolution, Institut de Systématique, Evolution, Biodiversité, UMR 7205 MNHN, CNRS, UPMC, EPHE, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France; Université Versailles St-Quentin, Montigny-le-Bretonneux, France
| | - Christophe Escudé
- Département Régulations, Développement et Diversité Moléculaire, Structure et Instabilité des Génomes, INSERM U1154, CNRS UMR7196, Sorbonne Universités, Muséum national d'Histoire naturelle, Paris, France.
| |
|
27
|
Fan X, Supiwong W, Weise A, Mrasek K, Kosyakova N, Tanomtong A, Pinthong K, Trifonov VA, Cioffi MDB, Grothmann P, Liehr T, Oliveira EH. Comprehensive characterization of evolutionary conserved breakpoints in four New World Monkey karyotypes compared to Chlorocebus aethiops and Homo sapiens. Heliyon 2015; 1:e00042. [PMID: 27441227 PMCID: PMC4945616 DOI: 10.1016/j.heliyon.2015.e00042] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/15/2015] [Revised: 10/20/2015] [Accepted: 10/23/2015] [Indexed: 11/21/2022] Open
Abstract
Comparative cytogenetic analyses in New World Monkeys (NWMs) using human multicolor banding (MCB) probe sets have not previously been done. Here we report on an MCB-based FISH-banding study complemented with selected locus-specific and heterochromatin-specific probes in four NWMs and one Old World Monkey (OWM) species, i.e. in Alouatta caraya (ACA), Callithrix jacchus (CJA), Cebus apella (CAP), Saimiri sciureus (SSC), and Chlorocebus aethiops (CAE), respectively. In total, 107 individual evolutionary conserved breakpoints (ECBs) among those species were identified and compared with those of other species in previous reports. Especially for chromosomal regions syntenic to human chromosomes 6, 8, 9, 10, 11, 12 and 16, previously cryptic rearrangements could be observed. 50.4% (54/107) of NWM-ECBs colocalized with those of OWMs, 62.6% (62/99) of NWM-ECBs were related to those of Hylobates lar (HLA), and 66.3% (71/107) of NWM-ECBs corresponded with those known from other mammals. Furthermore, human fragile sites (FS) were aligned with the ECBs found in the five studied species and, interestingly, 66.3% of ECBs colocalized with those fragile sites. Overall, this study presents detailed chromosomal maps of one OWM and four NWM species. These data will be helpful for further investigation of chromosome evolution in NWMs and hominoids in general and are a prerequisite for correct interpretation of future sequencing-based genomic studies in those species.
Key Words
- ACA, Alouatta caraya
- Atelidae
- BACs, bacterial artificial chromosomes
- CAE, Chlorocebus aethiops
- CAP, Cebus apella
- CJA, Callithrix jacchus
- Cebidae
- EC, evolutionary conserved
- ECBs, evolutionary conserved breakpoints
- Evolutionary conserved breakpoints
- Evolutionary genetics
- FISH, fluorescence in situ hybridization
- FS, fragile site
- Fragile sites
- Genetics
- HCM, heterochromatin mix
- HLA, Hylobates lar
- HSA, Homo sapiens
- HSBs, homologous syntenic blocks
- MCB, multicolor banding
- Multicolor banding
- NGS, Next-generation sequencing
- NOR, nucleolus organizer region
- NWMs, New World Monkeys
- New World Monkeys
- OWMs, Old World Monkeys
- Old World Monkeys
- SSC, Saimiri sciureus
- subCTM, sub-centromere/subtelomere-specific multicolor (FISH)
- wcp, whole human chromosome painting
Affiliation(s)
- Xiaobo Fan
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Kollegiengasse 10, D-07743 Jena, Germany
| | - Weerayuth Supiwong
- Department of Biology Faculty of Science, KhonKaen University, 123 Moo 16 Mittapap Rd., Muang District, KhonKaen 40002, Thailand
| | - Anja Weise
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Kollegiengasse 10, D-07743 Jena, Germany
| | - Kristin Mrasek
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Kollegiengasse 10, D-07743 Jena, Germany
| | - Nadezda Kosyakova
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Kollegiengasse 10, D-07743 Jena, Germany
| | - Alongkoad Tanomtong
- Department of Biology Faculty of Science, KhonKaen University, 123 Moo 16 Mittapap Rd., Muang District, KhonKaen 40002, Thailand
| | - Krit Pinthong
- Department of Biology Faculty of Science, KhonKaen University, 123 Moo 16 Mittapap Rd., Muang District, KhonKaen 40002, Thailand
| | | | - Marcelo de Bello Cioffi
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, São Carlos, SP, Brazil
| | - Pierre Grothmann
- Serengeti-Park Hodenhagen GmbH, Am Safaripark 1, 29693, Hodenhagen, Germany
| | - Thomas Liehr
- Jena University Hospital, Friedrich Schiller University, Institute of Human Genetics, Kollegiengasse 10, D-07743 Jena, Germany
| | - Edivaldo H.C. de Oliveira
- Faculdade de Ciências Naturais, ICEN, Universidade Federal do Pará, Campus Universitário do Guamá, 66075-110 Belém-PA, Brazil
| |
|
28
|
Evolution of Vertebrate Adam Genes; Duplication of Testicular Adams from Ancient Adam9/9-like Loci. PLoS One 2015; 10:e0136281. [PMID: 26308360 PMCID: PMC4550289 DOI: 10.1371/journal.pone.0136281] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 11/26/2014] [Accepted: 08/02/2015] [Indexed: 01/20/2023] Open
Abstract
Members of the disintegrin metalloproteinase (ADAM) family have important functions in regulating cell-cell and cell-matrix interactions as well as cell signaling. There are two major types of ADAMs: the somatic ADAMs (sADAMs) that have a significant presence in somatic tissues, and the testicular ADAMs (tADAMs) that are expressed predominantly in the testis. Genes encoding tADAMs can be further divided into two groups: group I (intronless) and group II (intron-containing). To date, tAdams have only been reported in placental mammals, and their evolutionary origin and relationship to sAdams remain largely unknown. Using phylogenetic and syntenic tools, we analyzed the Adam genes in various vertebrates ranging from fishes to placental mammals. Our analyses reveal duplication and loss of some sAdams in certain vertebrate species. In particular, there exists an Adam9-like gene in non-mammalian vertebrates but not in mammals. We also identified putative group I and group II tAdams in all amniote species that have been examined. These tAdam homologues are more closely related to Adams 9 and 9-like than to other sAdams. In all amniote species examined, group II tAdams lie in close vicinity to Adam9 and hence likely arose from tandem duplication, whereas group I tAdams likely originated through retroposition because of their lack of introns. Clusters of multiple group I tAdams are also common, suggesting tandem duplication after retroposition. Therefore, Adam9/9-like and some of the derived tAdam loci are likely preferred targets for tandem duplication and/or retroposition. Consistent with this hypothesis, we identified a young retroposed gene that duplicated recently from Adam9 in the opossum. As a result of gene duplication, some tAdams were pseudogenized in certain species, whereas others acquired new expression patterns and functions. The rapid duplication of Adam genes makes a major contribution to the diversity of ADAMs in various vertebrate species.
|
29
|
Parks MM, Lawrence CE, Raphael BJ. Detecting non-allelic homologous recombination from high-throughput sequencing data. Genome Biol 2015; 16:72. [PMID: 25886137 PMCID: PMC4425883 DOI: 10.1186/s13059-015-0633-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 01/23/2015] [Accepted: 03/16/2015] [Indexed: 12/27/2022] Open
Abstract
Non-allelic homologous recombination (NAHR) is a common mechanism for generating genome rearrangements and is implicated in numerous genetic disorders, but its detection in high-throughput sequencing data poses a serious challenge. We present a probabilistic model of NAHR and demonstrate its ability to find NAHR in low-coverage sequencing data from 44 individuals. We identify NAHR-mediated deletions or duplications in 109 of 324 potential NAHR loci in at least one of the individuals. These calls segregate by ancestry, are more common in closely spaced repeats, often result in duplicated genes or pseudogenes, and affect highly studied genes such as GBA and CYP2E1.
Affiliation(s)
- Matthew M Parks
- Division of Applied Mathematics, Brown University, Providence, USA.
| | - Charles E Lawrence
- Division of Applied Mathematics, Brown University, Providence, USA; Center for Computational Molecular Biology, Brown University, Providence, USA.
| | - Benjamin J Raphael
- Center for Computational Molecular Biology, Brown University, Providence, USA; Department of Computer Science, Brown University, Providence, USA.
| |
|
30
|
Urdinguio RG, Bayón GF, Dmitrijeva M, Toraño EG, Bravo C, Fraga MF, Bassas L, Larriba S, Fernández AF. Aberrant DNA methylation patterns of spermatozoa in men with unexplained infertility. Hum Reprod 2015; 30:1014-28. [PMID: 25753583 DOI: 10.1093/humrep/dev053] [Citation(s) in RCA: 118] [Impact Index Per Article: 13.1] [Received: 05/28/2014] [Accepted: 02/16/2015] [Indexed: 12/17/2022] Open
Abstract
STUDY QUESTION: Are there DNA methylation alterations in sperm that could explain the reduced biological fertility of male partners from couples with unexplained infertility? SUMMARY ANSWER: DNA methylation patterns, not only at specific loci but also at Alu Yb8 repetitive sequences, are altered in infertile individuals compared with fertile controls. WHAT IS KNOWN ALREADY: Aberrant DNA methylation of sperm has been associated with human male infertility in patients demonstrating either deficiencies in the process of spermatogenesis or low semen quality. STUDY DESIGN, SIZE, DURATION: Case and control prospective study. This study compares 46 sperm samples obtained from 17 normospermic fertile men and 29 normospermic infertile patients. PARTICIPANTS/MATERIALS, SETTING, METHODS: Illumina Infinium HD Human Methylation 450K arrays were used to identify genomic regions showing differences in sperm DNA methylation patterns between five fertile and seven infertile individuals. Additionally, global DNA methylation of sperm was measured using the Methylamp Global DNA Methylation Quantification Ultra kit (Epigentek) in 14 samples, and DNA methylation at several repetitive sequences (LINE-1, Alu Yb8, NBL2, D4Z4) was measured by bisulfite pyrosequencing in 44 sperm samples. A sperm-specific DNA methylation pattern was obtained by comparing the sperm methylomes with the DNA methylomes of differentiated somatic cells using data obtained from methylation arrays (Illumina 450K) of blood, neural and glial cells deposited in public databases. MAIN RESULTS AND THE ROLE OF CHANCE: In this study we conduct, for the first time, a genome-wide study to identify alterations of sperm DNA methylation in individuals with unexplained infertility that may account for the differences in their biological fertility compared with fertile individuals. We have identified 2752 CpGs showing aberrant DNA methylation patterns, and more importantly, these differentially methylated CpGs were significantly associated with CpG sites which are specifically methylated in sperm when compared with somatic cells. We also found statistically significant (P < 0.001) associations between DNA hypomethylation and regions corresponding to those which, in somatic cells, are enriched in the repressive histone mark H3K9me3, and between DNA hypermethylation and regions enriched in H3K4me1 and CTCF, suggesting that the relationship between chromatin context and aberrant DNA methylation of sperm in infertile men could be locus-dependent. Finally, we also show that DNA methylation levels, not only at specific loci but also at several repetitive sequences (LINE-1, Alu Yb8, NBL2, D4Z4), were lower in sperm than in somatic cells. Interestingly, sperm samples from infertile patients showed significantly lower DNA methylation levels at Alu Yb8 repetitive sequences than controls. LIMITATIONS, REASONS FOR CAUTION: Our results are descriptive and further studies would be needed to elucidate the functional effects of aberrant DNA methylation on male fertility. WIDER IMPLICATIONS OF THE FINDINGS: Overall, our data suggest that aberrant sperm DNA methylation might contribute to fertility impairment in couples with unexplained infertility and they provide a promising basis for future research. STUDY FUNDING/COMPETING INTERESTS: This work has been financially supported by Fundación Cientifica de la AECC (to R.G.U.); IUOPA (to G.F.B.); FICYT (to E.G.T.); the Spanish National Research Council (CSIC; 200820I172 to M.F.F.); Fundación Ramón Areces (to M.F.F.); the Plan Nacional de I+D+I 2008-2011/2013-2016/FEDER (PI11/01728 to A.F.F., PI12/01080 to M.F.F. and PI12/00361 to S.L.); the PN de I+D+I 2008-20011 and the Generalitat de Catalunya (2009SGR01490). A.F.F. is sponsored by ISCIII-Subdirección General de Evaluación y Fomento de la Investigación (CP11/00131). S.L. is sponsored by the Researchers Stabilization Program from the Spanish National Health System (CES09/020). The IUOPA is supported by the Obra Social Cajastur, Spain.
Affiliation(s)
- Rocío G Urdinguio
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Gustavo F Bayón
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Marija Dmitrijeva
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Estela G Toraño
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Cristina Bravo
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Mario F Fraga
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain; Department of Immunology and Oncology, National Center for Biotechnology, CNB-CSIC, Cantoblanco, Madrid 28049, Spain
| | - Lluís Bassas
- Laboratory of Seminology and Embryology, Andrology Service-Fundació Puigvert, Barcelona 08025, Spain
| | - Sara Larriba
- Human Molecular Genetics Group-IDIBELL, L'Hospitalet de Llobregat, Barcelona 08908, Spain
| | - Agustín F Fernández
- Cancer Epigenetics Laboratory, Institute of Oncology of Asturias (IUOPA), HUCA, Universidad de Oviedo, Oviedo 33006, Spain
| |
|
31
|
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 2015; 47:291-5. [PMID: 25642630 DOI: 10.1038/ng.3211] [Citation(s) in RCA: 2860] [Impact Index Per Article: 317.8] [Received: 03/07/2014] [Accepted: 01/07/2015] [Indexed: 12/16/2022]
Abstract
Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
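The idea in this abstract can be illustrated with a small sketch. Under the LD Score regression model, the expected association statistic of variant j is approximately E[chi2_j] = N * h2 * l_j / M + N * a + 1, where l_j is the LD Score, N the sample size, M the number of variants, h2 the heritability, and a the contribution of confounding. The slope therefore reflects polygenicity and the intercept reflects confounding. The Python below is a toy illustration only: the function names, the noise-free data, and all parameter values are assumptions, and it uses plain unweighted least squares rather than the weighted regression used in practice.

```python
def expected_chi2(ld_score, n_samples, h2, n_snps, confounding):
    """Expected chi-square under the model: N*h2*l_j/M + N*a + 1."""
    return n_samples * h2 * ld_score / n_snps + n_samples * confounding + 1.0


def fit_line(x, y):
    """Unweighted least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical toy values (not taken from the paper).
N, M, H2, A = 50_000, 1_000_000, 0.5, 1e-6
ld_scores = [10.0, 50.0, 100.0, 200.0]

# Noise-free statistics generated directly from the model.
chi2 = [expected_chi2(l, N, H2, M, A) for l in ld_scores]

intercept, slope = fit_line(ld_scores, chi2)
h2_estimate = slope * M / N  # slope = N*h2/M, so h2 = slope*M/N

print(f"intercept (1 + N*a): {intercept:.3f}")
print(f"heritability estimate: {h2_estimate:.3f}")
```

An intercept above 1 points to confounding (here 1 + N*a), while a nonzero slope points to a polygenic signal that scales with LD Score.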
|
32
|
Bolon YT, Stec AO, Michno JM, Roessler J, Bhaskar PB, Ries L, Dobbels AA, Campbell BW, Young NP, Anderson JE, Grant DM, Orf JH, Naeve SL, Muehlbauer GJ, Vance CP, Stupar RM. Genome resilience and prevalence of segmental duplications following fast neutron irradiation of soybean. Genetics 2014; 198:967-81. [PMID: 25213171 PMCID: PMC4224183 DOI: 10.1534/genetics.114.170340] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 06/25/2014] [Accepted: 09/02/2014] [Indexed: 01/14/2023] Open
Abstract
Fast neutron radiation has been used as a mutagen to develop extensive mutant collections. However, the genome-wide structural consequences of fast neutron radiation are not well understood. Here, we examine the genome-wide structural variants observed among 264 soybean [Glycine max (L.) Merrill] plants sampled from a large fast neutron-mutagenized population. While deletion rates were similar to previous reports, surprisingly high rates of segmental duplication were also found throughout the genome. Duplication coverage extended across entire chromosomes and often prevailed at chromosome ends. High-throughput resequencing analysis of selected mutants resolved specific chromosomal events, including the rearrangement junctions for a large deletion, a tandem duplication, and a translocation. Genetic mapping associated a large deletion on chromosome 10 with a quantitative change in seed composition for one mutant. A tandem duplication event, located on chromosome 17 in a second mutant, was found to cosegregate with a short petiole mutant phenotype, and thus may serve as an example of a morphological change attributable to a DNA copy number gain. Overall, this study provides insight into the resilience of the soybean genome, the patterns of structural variation resulting from fast neutron mutagenesis, and the utility of fast neutron-irradiated mutants as a source of novel genetic losses and gains.
Affiliation(s)
- Yung-Tsi Bolon
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Adrian O Stec
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Jean-Michel Michno
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Jeffrey Roessler
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Pudota B Bhaskar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Landon Ries
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Austin A Dobbels
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Benjamin W Campbell
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Nathan P Young
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Justin E Anderson
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - David M Grant
- Corn Insects and Crop Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, Ames, Iowa 50011
| | - James H Orf
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Seth L Naeve
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| | - Gary J Muehlbauer
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108 Department of Plant Biology, University of Minnesota, St. Paul, Minnesota 55108
| | - Carroll P Vance
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108 Plant Science Research Unit, United States Department of Agriculture-Agricultural Research Service, St. Paul, Minnesota 55108
| | - Robert M Stupar
- Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, Minnesota 55108
| |
Collapse
|
33
|
Prendergast JGD, Chambers EV, Semple CAM. Sequence-level mechanisms of human epigenome evolution. Genome Biol Evol 2014; 6:1758-71. [PMID: 24966180 PMCID: PMC4122940 DOI: 10.1093/gbe/evu142] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023] Open
Abstract
DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage.
Affiliation(s)
- Emily V Chambers
  - The Roslin Institute, The University of Edinburgh, Midlothian, United Kingdom
- Colin A M Semple
  - MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, United Kingdom
|
34
|
Giannuzzi G, Migliavacca E, Reymond A. Novel H3K4me3 marks are enriched at human- and chimpanzee-specific cytogenetic structures. Genome Res 2014; 24:1455-68. [PMID: 24916972 PMCID: PMC4158755 DOI: 10.1101/gr.167742.113] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/26/2023]
Abstract
Human and chimpanzee genomes are 98.8% identical within comparable sequences. However, they differ structurally in nine pericentric inversions, one fusion that originated human chromosome 2, and content and localization of heterochromatin and lineage-specific segmental duplications. The possible functional consequences of these cytogenetic and structural differences are not fully understood and their possible involvement in speciation remains unclear. We show that subtelomeric regions—regions that have a species-specific organization, are more divergent in sequence, and are enriched in genes and recombination hotspots—are significantly enriched for species-specific histone modifications that decorate transcription start sites in different tissues in both human and chimpanzee. The human lineage-specific chromosome 2 fusion point and ancestral centromere locus as well as chromosome 1 and 18 pericentric inversion breakpoints showed enrichment of human-specific H3K4me3 peaks in the prefrontal cortex. Our results reveal an association between plastic regions and potential novel regulatory elements.
Affiliation(s)
- Giuliana Giannuzzi
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Eugenia Migliavacca
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Alexandre Reymond
  - Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
35
|
Light S, Basile W, Elofsson A. Orphans and new gene origination, a structural and evolutionary perspective. Curr Opin Struct Biol 2014; 26:73-83. [DOI: 10.1016/j.sbi.2014.05.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 01/20/2014] [Revised: 05/07/2014] [Accepted: 05/16/2014] [Indexed: 12/28/2022]
|
36
|
Altemose N, Miga KH, Maggioni M, Willard HF. Genomic characterization of large heterochromatic gaps in the human genome assembly. PLoS Comput Biol 2014; 10:e1003628. [PMID: 24831296 PMCID: PMC4022460 DOI: 10.1371/journal.pcbi.1003628] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Received: 08/02/2013] [Accepted: 03/26/2014] [Indexed: 01/24/2023] Open
Abstract
The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.
At least 5–10% of the human genome remains unassembled, unmapped, and poorly characterized. The reference assembly annotates these missing regions as multi-megabase heterochromatic gaps, found primarily near centromeres and on the short arms of the acrocentric chromosomes. This missing fraction of the genome consists predominantly of long arrays of near-identical tandem repeats called satellite DNA. Due to the repetitive nature of satellite DNA, sequence assembly algorithms cannot uniquely align overlapping sequence reads, and thus satellite-rich domains have been omitted from the reference assembly and from most genome-wide studies of variation and function. Existing methods for analyzing some satellite DNAs cannot be easily extended to a large portion of satellites whose repeat structures are complex and largely uncharacterized, such as Human Satellites 2 and 3 (HSat2,3). Here we characterize HSat2,3 using a novel approach that does not depend on having a well-defined repeat structure. By classifying genome-wide HSat2,3 sequences into subfamilies and localizing them to chromosomes, we have generated an initial HSat2,3 genomic reference, which serves as a critical foundation for future studies of variation and function in these regions. This approach should be generally applicable to other classes of satellite DNA, in both the human genome and other complex genomes.
Affiliation(s)
- Nicolas Altemose
  - Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
- Karen H. Miga
  - Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
- Mauro Maggioni
  - Department of Mathematics, Duke University, Durham, North Carolina, United States of America
- Huntington F. Willard
  - Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
|
37
|
Abstract
The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function.
Affiliation(s)
- Megan E. Aldrup-MacDonald
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
  - Division of Human Genetics, Duke University, Durham, NC 27710, USA
- Beth A. Sullivan
  - Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710, USA
  - Division of Human Genetics, Duke University, Durham, NC 27710, USA
  - Author to whom correspondence should be addressed; Tel.: +1-919-684-9038
|
38
|
Castronovo C, Valtorta E, Crippa M, Tedoldi S, Romitti L, Amione MC, Guerneri S, Rusconi D, Ballarati L, Milani D, Grosso E, Cavalli P, Giardino D, Bonati MT, Larizza L, Finelli P. Design and validation of a pericentromeric BAC clone set aimed at improving diagnosis and phenotype prediction of supernumerary marker chromosomes. Mol Cytogenet 2013; 6:45. [PMID: 24171812 PMCID: PMC4176193 DOI: 10.1186/1755-8166-6-45] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 08/08/2013] [Accepted: 10/08/2013] [Indexed: 12/17/2022] Open
Abstract
Background Small supernumerary marker chromosomes (sSMCs) are additional, structurally abnormal chromosomes, generally smaller than chromosome 20 of the same metaphase spread. Due to their small size, they are difficult to characterize by conventional cytogenetics alone. In regard to their clinical effects, sSMCs are a heterogeneous group: in particular, sSMCs containing pericentromeric euchromatin are likely to be associated with abnormal outcomes, although exceptions have been reported. To improve characterization of the genetic content of sSMCs, several approaches might be applied based on different molecular and molecular-cytogenetic assays, e.g., fluorescent in situ hybridization (FISH), array-based comparative genomic hybridization (array CGH), and multiplex ligation-dependent probe amplification (MLPA). To provide a complementary tool for the characterization of sSMCs, we constructed and validated a new, FISH-based, pericentromeric Bacterial Artificial Chromosome (BAC) clone set that with a high resolution spans the most proximal euchromatic sequences of all human chromosome arms, excluding the acrocentric short arms. Results By FISH analysis, we assayed 561 pericentromeric BAC probes and excluded 75 that showed a wrong chromosomal localization. The remaining 486 probes were used to establish 43 BAC-based pericentromeric panels. Each panel consists of a core, which with a high resolution covers the most proximal euchromatic ~0.7 Mb (on average) of each chromosome arm and generally bridges the heterochromatin/euchromatin junction, as well as clones located proximally and distally to the core. The pericentromeric clone set was subsequently validated by the characterization of 19 sSMCs. Using the core probes, we could rapidly distinguish between heterochromatic (1/19) and euchromatic (11/19) sSMCs, and estimate the euchromatic DNA content, which ranged from approximately 0.13 to more than 10 Mb. 
The characterization was not completed for seven sSMCs due to a lack of information about the covered region in the reference sequence (1/19) or sample insufficiency (6/19). Conclusions Our results demonstrate that this pericentromeric clone set is useful as an alternative tool for sSMC characterization, primarily in cases of very small SMCs that contain either heterochromatin exclusively or a tiny amount of euchromatic sequence, and also in cases of low-level or cryptic mosaicism. The resulting data will foster knowledge of human proximal euchromatic regions involved in chromosomal imbalances, thereby improving genotype–phenotype correlations.
Affiliation(s)
- Chiara Castronovo
  - Laboratorio di Citogenetica Medica e Genetica Molecolare, IRCCS Istituto Auxologico Italiano, via Ariosto 13, 20145, Milano, Italy.
|
39
|
Livnat A. Interaction-based evolution: how natural selection and nonrandom mutation work together. Biol Direct 2013; 8:24. [PMID: 24139515 PMCID: PMC4231362 DOI: 10.1186/1745-6150-8-24] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/25/2013] [Accepted: 09/26/2013] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND The modern evolutionary synthesis leaves unresolved some of the most fundamental, long-standing questions in evolutionary biology: What is the role of sex in evolution? How does complex adaptation evolve? How can selection operate effectively on genetic interactions? More recently, the molecular biology and genomics revolutions have raised a host of critical new questions, through empirical findings that the modern synthesis fails to explain: for example, the discovery of de novo genes; the immense constructive role of transposable elements in evolution; genetic variance and biochemical activity that go far beyond what traditional natural selection can maintain; perplexing cases of molecular parallelism; and more. PRESENTATION OF THE HYPOTHESIS Here I address these questions from a unified perspective, by means of a new mechanistic view of evolution that offers a novel connection between selection on the phenotype and genetic evolutionary change (while relying, like the traditional theory, on natural selection as the only source of feedback on the fit between an organism and its environment). I hypothesize that the mutation that is of relevance for the evolution of complex adaptation, while not Lamarckian or "directed" to increase fitness, is not random, but is instead the outcome of a complex and continually evolving biological process that combines information from multiple loci into one. This allows selection on a fleeting combination of interacting alleles at different loci to have a hereditary effect according to the combination's fitness. TESTING AND IMPLICATIONS OF THE HYPOTHESIS This proposed mechanism addresses the problem of how beneficial genetic interactions can evolve under selection, and also offers an intuitive explanation for the role of sex in evolution, which focuses on sex as the generator of genetic combinations. 
Importantly, it also implies that genetic variation that has appeared neutral through the lens of traditional theory can actually experience selection on interactions and thus has a much greater adaptive potential than previously considered. Empirical evidence for the proposed mechanism from both molecular evolution and evolution at the organismal level is discussed, and multiple predictions are offered by which it may be tested. REVIEWERS This article was reviewed by Nigel Goldenfeld (nominated by Eugene V. Koonin), Jürgen Brosius and W. Ford Doolittle.
Affiliation(s)
- Adi Livnat
  - Department of Biological Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
|
40
|
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022] Open
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Seattle, Washington, United States of America
|
41
|
Floutsakou I, Agrawal S, Nguyen TT, Seoighe C, Ganley ARD, McStay B. The shared genomic architecture of human nucleolar organizer regions. Genome Res 2013; 23:2003-12. [PMID: 23990606 PMCID: PMC3847771 DOI: 10.1101/gr.157941.113] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.7] [Indexed: 01/30/2023]
Abstract
The short arms of the five acrocentric human chromosomes harbor sequences that direct the assembly and function of the nucleolus, one of the key functional domains of the nucleus, yet they are absent from the current human genome assembly. Here we describe the genomic architecture of these human nucleolar organizers. Sequences distal and proximal to ribosomal gene arrays are conserved among the acrocentric chromosomes, suggesting they are sites of frequent recombination. Although previously believed to be heterochromatic, characterization of these two flanking regions reveals that they share a complex genomic architecture similar to other euchromatic regions of the genome, but they have distinct genomic characteristics. Proximal sequences are almost entirely segmentally duplicated, similar to the regions bordering centromeres. In contrast, the distal sequence is predominantly unique to the acrocentric short arms and is dominated by a very large inverted repeat. We show that the distal element is localized to the periphery of the nucleolus, where it appears to anchor the ribosomal gene repeats. This, combined with its complex chromatin structure and transcriptional activity, suggests that this region is involved in nucleolar organization. Our results provide a platform for investigating the role of NORs in nucleolar formation and function, and open the door for determining the role of these regions in the well-known empirical association of nucleoli with pathology.
Affiliation(s)
- Ioanna Floutsakou
  - Centre for Chromosome Biology, School of Natural Sciences, National University of Ireland, Galway, Galway, Ireland
|
42
|
Sassa T. The Role of Human-Specific Gene Duplications During Brain Development and Evolution. J Neurogenet 2013; 27:86-96. [DOI: 10.3109/01677063.2013.789512] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
|
43
|
Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep 2013; 3:2179-90. [PMID: 23791531 DOI: 10.1016/j.celrep.2013.05.031] [Citation(s) in RCA: 403] [Impact Index Per Article: 36.6] [Received: 12/19/2012] [Revised: 04/17/2013] [Accepted: 05/21/2013] [Indexed: 01/01/2023] Open
Abstract
Understanding the extent of genomic transcription and its functional relevance is a central goal in genomics research. However, detailed genome-wide investigations of transcriptome complexity in major mammalian organs have been scarce. Here, using extensive RNA-seq data, we show that transcription of the genome is substantially more widespread in the testis than in other organs across representative mammals. Furthermore, we reveal that meiotic spermatocytes and especially postmeiotic round spermatids have remarkably diverse transcriptomes, which explains the high transcriptome complexity of the testis as a whole. The widespread transcriptional activity in spermatocytes and spermatids encompasses protein-coding and long noncoding RNA genes but also poorly conserved intergenic sequences, suggesting that it may not be of immediate functional relevance. Rather, our analyses of genome-wide epigenetic data suggest that this prevalent transcription, which most likely promoted the birth of new genes during evolution, is facilitated by an overall permissive chromatin in these germ cells that results from extensive chromatin remodeling.
|
44
|
Genovese G, Handsaker RE, Li H, Altemose N, Lindgren AM, Chambert K, Pasaniuc B, Price AL, Reich D, Morton CC, Pollak MR, Wilson JG, McCarroll SA. Using population admixture to help complete maps of the human genome. Nat Genet 2013; 45:406-14, 414e1-2. [PMID: 23435088 PMCID: PMC3683849 DOI: 10.1038/ng.2565] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 07/24/2012] [Accepted: 01/31/2013] [Indexed: 12/16/2022]
Abstract
Tens of millions of base pairs of euchromatic human genome sequence, including many protein-coding genes, have no known location in the human genome. We describe an approach for localizing the human genome's missing pieces using the patterns of genome sequence variation created by population admixture. We mapped the locations of 70 scaffolds spanning 4 million base pairs of the human genome's unplaced euchromatic sequence, including more than a dozen protein-coding genes, and identified 8 new large interchromosomal segmental duplications. We find that most of these sequences are hidden in the genome's heterochromatin, particularly its pericentromeric regions. Many cryptic, pericentromeric genes are expressed at the RNA level and have been maintained intact for millions of years while their expression patterns diverged from those of paralogous genes elsewhere in the genome. We describe how knowledge of the locations of these sequences can inform disease association and genome biology studies.
Affiliation(s)
- Giulio Genovese
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
|
45
|
Fountzilas G, Dafni U, Bobos M, Kotoula V, Batistatou A, Xanthakis I, Papadimitriou C, Kostopoulos I, Koletsa T, Tsolaki E, Televantou D, Timotheadou E, Koutras A, Klouvas G, Samantas E, Pisanidis N, Karanikiotis C, Sfakianaki I, Pavlidis N, Gogas H, Linardou H, Kalogeras KT, Pectasides D, Dimopoulos MA. Evaluation of the prognostic role of centromere 17 gain and HER2/topoisomerase II alpha gene status and protein expression in patients with breast cancer treated with anthracycline-containing adjuvant chemotherapy: pooled analysis of two Hellenic Cooperative Oncology Group (HeCOG) phase III trials. BMC Cancer 2013; 13:163. [PMID: 23537287 PMCID: PMC3621498 DOI: 10.1186/1471-2407-13-163] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 11/21/2012] [Accepted: 03/20/2013] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND The HER2 gene has been established as a valid biological marker for the treatment of breast cancer patients with trastuzumab and probably other agents, such as paclitaxel and anthracyclines. The TOP2A gene has been associated with response to anthracyclines. Limited information exists on the relationship of HER2/TOP2A gene status in the presence of centromere 17 (CEP17) gain with outcome of patients treated with anthracycline-containing adjuvant chemotherapy. METHODS Formalin-fixed paraffin-embedded tumor tissue samples from 1031 patients with high-risk operable breast cancer, enrolled in two consecutive phase III trials, were assessed in a central laboratory by fluorescence in situ hybridization for HER2/TOP2A gene amplification and CEP17 gain (CEP17 probe). Amplification of HER2 and of TOP2A was defined as a gene/CEP17 ratio of >2.2 and ≥2.0, respectively, or a gene copy number higher than 6. Additionally, HER2, TopoIIa, ER/PgR and Ki67 protein expression was assessed by immunohistochemistry (IHC) and patients were classified according to their IHC phenotype. Treatment consisted of epirubicin-based adjuvant chemotherapy followed by hormonal therapy and radiation, as indicated. RESULTS HER2 amplification was found in 23.7% of the patients and TOP2A amplification in 10.1%. In total, 41.8% of HER2-amplified tumors demonstrated TOP2A co-amplification. The median (range) of HER2, TOP2A and CEP17 gain was 2.55 (0.70-45.15), 2.20 (0.70-26.15) and 2.00 (0.70-26.55), respectively. Forty percent of the tumors had CEP17 gain (51% of those with HER2 amplification). Adjusting for treatment groups in the Cox model, HER2 amplification, TOP2A amplification, CEP17 gain and HER2/TOP2A co-amplification were not associated with time to relapse or time to death. 
CONCLUSION HER2 amplification, TOP2A amplification, CEP17 gain and HER2/TOP2A co-amplification were not associated with outcome in high-risk breast cancer patients treated with anthracycline-based adjuvant chemotherapy. TRIAL REGISTRATION Australian New Zealand Clinical Trials Registry (ANZCTR) ACTRN12611000506998 and ACTRN12609001036202.
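The amplification criteria stated in METHODS (a gene/CEP17 ratio of >2.2 for HER2 and ≥2.0 for TOP2A, or a gene copy number higher than 6) can be illustrated with a short sketch. This is not the trial's analysis code: the names `RATIO_RULES`, `COPY_NUMBER_CUTOFF`, and `is_amplified` are hypothetical, and the sketch assumes that "higher than 6" means a copy count strictly greater than 6.

```python
import operator

# Ratio rules as stated in the abstract: HER2 uses a strict ">" 2.2 cut-off,
# TOP2A an inclusive ">=" 2.0 cut-off. All names here are illustrative only.
RATIO_RULES = {
    "HER2": (operator.gt, 2.2),
    "TOP2A": (operator.ge, 2.0),
}
# Assumption: "gene copy number higher than 6" is read as strictly greater.
COPY_NUMBER_CUTOFF = 6


def is_amplified(gene: str, gene_copies: float, cep17_copies: float) -> bool:
    """Return True if the gene meets either amplification criterion."""
    if cep17_copies <= 0:
        raise ValueError("CEP17 copy number must be positive")
    compare, threshold = RATIO_RULES[gene]
    ratio = gene_copies / cep17_copies
    # Amplified if the gene/CEP17 ratio passes its cut-off,
    # or if the absolute gene copy number exceeds the copy-number cut-off.
    return compare(ratio, threshold) or gene_copies > COPY_NUMBER_CUTOFF
```

With these rules, a ratio of exactly 2.0 counts as amplified for TOP2A but not for HER2, and a tumor with 7 gene copies over 4 CEP17 copies (ratio 1.75) is still called amplified through the copy-number criterion, which is how CEP17 gain can mask amplification under a ratio-only rule.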
Affiliation(s)
- George Fountzilas
  - Department of Medical Oncology, Papageorgiou Hospital, Aristotle University of Thessaloniki School of Medicine, Thessaloniki, Greece.
|
46
|
Shaffer LG, Ballif BC, Theisen A, Rorem E, Bejjani BA, Torchia BA. In the middle of it all: a centered approach to chromosome analysis. Expert Opin Med Diagn 2013; 2:221-9. [PMID: 23485141 DOI: 10.1517/17530059.2.2.221] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022]
Abstract
BACKGROUND The pericentromeric areas immediately flanking the centromeres are prone to instability owing to their high levels of repetitive sequences. This genomic instability makes the pericentromeric regions ideal candidates for the investigation of chromosomal abnormalities resulting in genetic disease. However, it is this instability that confounds attempts to analyze these regions of the genome. The sequencing of the human genome, while illuminating the complexity of the pericentromeric regions, has enabled the development of high-resolution microarrays for the characterization of chromosomal abnormalities. OBJECTIVE The MarkerChip™ was developed specifically to target the pericentromeres for the identification and characterization of pericentromeric chromosomal abnormalities. METHODS The authors' experience with this microarray in their clinical diagnostic laboratory is reviewed. RESULTS/DISCUSSION The MarkerChip demonstrates the utility of constructing a microarray for the analysis of chromosome abnormalities with coverage concentrated on areas of the genome particularly susceptible to rearrangement.
Collapse
Affiliation(s)
- Lisa G Shaffer
- Signature Genomic Laboratories, 120 N Pine St, Spokane, WA 99202, USA +1 509 474 6840 ; +1 509 474 6839 ;
|
47
|
Watson CT, Garg P, Sharp AJ. Comment on "genomic hypomethylation in the human germline associates with selective structural mutability in the human genome". PLoS Genet 2013; 9:e1003332. [PMID: 23468658 PMCID: PMC3585013 DOI: 10.1371/journal.pgen.1003332] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] Open
Affiliation(s)
- Corey T. Watson
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Paras Garg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Andrew J. Sharp
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
|
48
|
Yang L, Zou M, Fu B, He S. Genome-wide identification, characterization, and expression analysis of lineage-specific genes within zebrafish. BMC Genomics 2013; 14:65. [PMID: 23368736 PMCID: PMC3599513 DOI: 10.1186/1471-2164-14-65] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 09/22/2012] [Accepted: 01/29/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The genomic basis of teleost phenotypic complexity remains obscure, despite increasing availability of genome and transcriptome sequence data. Fish-specific genome duplication cannot provide sufficient explanation for the morphological complexity of teleosts, considering the relatively large number of extinct basal ray-finned fishes. RESULTS In this study, we performed comparative genomic analysis to discover the Conserved Teleost-Specific Genes (CTSGs) and orphan genes within zebrafish and found that these two sets of lineage-specific genes may have played important roles during zebrafish embryogenesis. Lineage-specific genes within zebrafish share many of the characteristics of their counterparts in other species: shorter length, fewer exons, higher GC content, and less frequent transcript support. Chromosomal location analysis indicated that neither the CTSGs nor the orphan genes were distributed evenly in the chromosomes of zebrafish. The significant enrichment of immunity proteins in CTSGs annotated by gene ontology (GO) or predicted ab initio may imply that defense against pathogens may be an important reason for the diversification of teleosts. The evolutionary origin of the lineage-specific genes was determined and a very high percentage of lineage-specific genes were generated via gene duplications. The temporal and spatial expression profile of lineage-specific genes obtained from expressed sequence tag (EST) and RNA-seq data revealed two novel properties: in addition to showing highly tissue-preferred expression, lineage-specific genes are also highly temporally restricted, namely they are expressed in narrower time windows than evolutionarily conserved genes and are specifically enriched in later-stage embryos and early larval stages.
CONCLUSIONS Our study provides the first systematic identification of two different sets of lineage-specific genes within zebrafish and provides valuable information leading towards a better understanding of the molecular mechanisms of the genomic basis of teleost phenotypic complexity for future studies.
Affiliation(s)
- Liandong Yang
- The Key Laboratory of Aquatic Biodiversity and Conservation of Chinese Academy of Sciences, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei 430072, People's Republic of China
|
49
|
Lorente-Galdos B, Bleyhl J, Santpere G, Vives L, Ramírez O, Hernandez J, Anglada R, Cooper GM, Navarro A, Eichler EE, Marques-Bonet T. Accelerated exon evolution within primate segmental duplications. Genome Biol 2013; 14:R9. [PMID: 23360670 PMCID: PMC3906575 DOI: 10.1186/gb-2013-14-1-r9] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/11/2012] [Revised: 12/20/2012] [Accepted: 01/29/2013] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND The identification of signatures of natural selection has long been used as an approach to understanding the unique features of any given species. Genes within segmental duplications are overlooked in most studies of selection due to the limitations of draft nonhuman genome assemblies and to the methodological reliance on accurate gene trees, which are difficult to obtain for duplicated genes. RESULTS In this work, we detected exons with an accumulation of high-quality nucleotide differences between the human assembly and shotgun sequencing reads from single human and macaque individuals. Comparing the observed rates of nucleotide differences between coding exons and their flanking intronic sequences with a likelihood-ratio test, we identified 74 exons with evidence for rapid coding sequence evolution during the evolution of humans and Old World monkeys. Fifty-five percent of rapidly evolving exons were either partially or totally duplicated, which is a significant enrichment of the 6% rate observed across all human coding exons. CONCLUSIONS Our results provide a more comprehensive view of the action of selection upon segmental duplications, which are the most complex regions of our genomes. In light of these findings, we suggest that segmental duplications could be subjected to rapid evolution more frequently than previously thought.
Affiliation(s)
- Belen Lorente-Galdos
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Jonathan Bleyhl
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Gabriel Santpere
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Laura Vives
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Oscar Ramírez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Jessica Hernandez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Roger Anglada
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Gregory M Cooper
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Arcadi Navarro
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, Seattle, Washington 98195, USA
- Tomas Marques-Bonet
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
|
50
|
Research proceedings on primate comparative genomics. Zool Res 2013; 33:108-18. [DOI: 10.3724/sp.j.1141.2012.01108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
|