1
Pucker B, Holtgräwe D, Rosleff Sörensen T, Stracke R, Viehöver P, Weisshaar B. A De Novo Genome Sequence Assembly of the Arabidopsis thaliana Accession Niederzenz-1 Displays Presence/Absence Variation and Strong Synteny. PLoS One 2016; 11:e0164321. [PMID: 27711162 PMCID: PMC5053417 DOI: 10.1371/journal.pone.0164321] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 08/06/2016] [Accepted: 09/22/2016] [Indexed: 11/23/2022]
Abstract
Arabidopsis thaliana is the most important model organism for fundamental plant biology. The genome diversity of different accessions of this species has been intensively studied, for example in the 1001 Genomes Project, which led to the identification of many single nucleotide polymorphisms (SNPs) and small insertions and deletions (InDels). In addition, presence/absence variation (PAV), copy number variation (CNV) and mobile genetic elements contribute to genomic differences between A. thaliana accessions. To address larger genome rearrangements between the A. thaliana reference accession Columbia-0 (Col-0) and another accession of about average distance to Col-0, we created a de novo next-generation sequencing (NGS)-based assembly from the accession Niederzenz-1 (Nd-1). The result was evaluated with respect to assembly strategy and synteny to Col-0. We provide a high-quality genome sequence of the A. thaliana accession Nd-1 (LXSY01000000). The assembly displays an N50 of 0.590 Mbp and covers 99% of the Col-0 reference sequence. Scaffolds from the de novo assembly were positioned on the basis of sequence similarity to the reference. Errors in this automatic scaffold anchoring were manually corrected based on an analysis of reciprocal best BLAST hits (RBHs) of genes. Comparison of the final Nd-1 assembly to the reference revealed duplications and deletions (PAV). We identified 826 insertions and 746 deletions in Nd-1. Randomly selected PAV candidates were experimentally validated. Our Nd-1 de novo assembly allowed reliable identification of larger genic and intergenic variants, which is difficult or error-prone with short-read mapping approaches alone. While overall sequence similarity as well as synteny is very high, we detected short and larger (affecting more than 100 bp) differences between Col-0 and Nd-1 based on bi-directional comparisons. The de novo assembly provided here, together with additional assemblies that will certainly be published in the future, will make it possible to describe the pan-genome of A. thaliana.
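The N50 quoted in the abstract above is a standard contiguity statistic: the length L such that sequences of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation follows. The function name `n50` and the example lengths are illustrative assumptions, not code or data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in bp (not taken from the Nd-1 assembly).
example_lengths = [800_000, 590_000, 400_000, 200_000, 100_000]
print(n50(example_lengths))
```

For these made-up lengths the two longest scaffolds already cover more than half of the 2.09 Mbp total, so the function returns the length of the second one.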
Affiliation(s)
- Boas Pucker
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Daniela Holtgräwe
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Thomas Rosleff Sörensen
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Ralf Stracke
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Prisca Viehöver
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Bernd Weisshaar (corresponding author)
- Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
2
Liu JM, Zhao JY, Lu PP, Chen M, Guo CH, Xu ZS, Ma YZ. The E-Subgroup Pentatricopeptide Repeat Protein Family in Arabidopsis thaliana and Confirmation of the Responsiveness PPR96 to Abiotic Stresses. Front Plant Sci 2016; 7:1825. [PMID: 27994613 PMCID: PMC5136568 DOI: 10.3389/fpls.2016.01825] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 09/28/2016] [Accepted: 11/21/2016] [Indexed: 05/20/2023]
Abstract
Pentatricopeptide repeat (PPR) proteins are found extensively in all eukaryotes, yet their functions remain largely unknown. Mining potential stress-responsive PPRs and checking whether known PPR editing factors are affected by stress treatments would help to elucidate the regulatory mechanisms of PPRs involved in biotic and abiotic stress. Here, we explored the characteristics and origin of the 105 E subgroup PPRs in Arabidopsis thaliana. Phylogenetic analysis categorized the E subgroup PPRs into five discrete groups (Cluster I to V), and they may have a common origin in both A. thaliana and rice. An in silico expression analysis of the 105 E subgroup PPRs in A. thaliana was performed using available microarray data. Thirty-four PPRs were differentially expressed during A. thaliana seed imbibition, seed development, and flower development. To explore potential stress-responsive PPRs, A. thaliana seedlings were subjected to different abiotic stresses, and differential expression of 92 PPRs was observed. qPCR data of E subgroup PPRs under stress conditions revealed that the expression of 5 PPRs was responsive to abiotic stresses. In addition, PPR96 is involved in plant responses to salt, abscisic acid (ABA), and oxidative stress. A T-DNA insertion mutation inactivating PPR96 expression results in plant insensitivity to salt, ABA, and oxidative stress. The PPR96 protein is localized in the mitochondria, and transcription levels of several stress-responsive genes were altered under abiotic stress treatments. Our results suggest that PPR96 may play an important role in connecting the regulation of oxidative respiration and environmental responses in A. thaliana.
Affiliation(s)
- Jia-Ming Liu
- Key Laboratory of Molecular Cytogenetics and Genetic Breeding of Heilongjiang Province, College of Life Science and Technology, Harbin Normal University, Harbin, China
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
- Juan-Ying Zhao
- Key Laboratory of Molecular Cytogenetics and Genetic Breeding of Heilongjiang Province, College of Life Science and Technology, Harbin Normal University, Harbin, China
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
- Pan-Pan Lu
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
- Ming Chen
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
- Chang-Hong Guo
- Key Laboratory of Molecular Cytogenetics and Genetic Breeding of Heilongjiang Province, College of Life Science and Technology, Harbin Normal University, Harbin, China
- Zhao-Shi Xu
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
- Correspondence: Zhao-Shi Xu
- You-Zhi Ma
- Key Laboratory of Biology and Genetic Improvement of Triticeae Crops, Ministry of Agriculture, Institute of Crop Science, Chinese Academy of Agricultural Sciences/National Key Facility for Crop Gene Resources and Genetic Improvement, Beijing, China
3
Weiss-Schneeweiss H, Emadzade K, Jang TS, Schneeweiss G. Evolutionary consequences, constraints and potential of polyploidy in plants. Cytogenet Genome Res 2013; 140:137-50. [PMID: 23796571 PMCID: PMC3859924 DOI: 10.1159/000351727] [Citation(s) in RCA: 133] [Impact Index Per Article: 12.1] [Indexed: 11/19/2022]
Abstract
Polyploidy, the possession of more than 2 complete genomes, is a major force in plant evolution known to affect the genetic and genomic constitution and the phenotype of an organism, which will have consequences for its ecology and geography as well as for lineage diversification and speciation. In this review, we discuss phylogenetic patterns in the incidence of polyploidy including possible underlying causes, the role of polyploidy for diversification, the effects of polyploidy on geographical and ecological patterns, and putative underlying mechanisms as well as chromosome evolution and evolution of repetitive DNA following polyploidization. Spurred by technological advances, a lot has been learned about these aspects both in model and increasingly also in nonmodel species. Despite this enormous progress, long-standing questions about polyploidy still cannot be unambiguously answered, due to frequently idiosyncratic outcomes and insufficient integration of different organizational levels (from genes to ecology), but likely this will change in the near future. See also the sister article focusing on animals by Choleva and Janko in this themed issue.
Affiliation(s)
- H. Weiss-Schneeweiss
- Department of Systematic and Evolutionary Botany, University of Vienna, Rennweg 14, AT–1030 Vienna (Austria)
- K. Emadzade
- Department of Systematic and Evolutionary Botany, University of Vienna, Rennweg 14, AT–1030 Vienna (Austria)
- T.-S. Jang
- Department of Systematic and Evolutionary Botany, University of Vienna, Rennweg 14, AT–1030 Vienna (Austria)
- G.M. Schneeweiss
- Department of Systematic and Evolutionary Botany, University of Vienna, Rennweg 14, AT–1030 Vienna (Austria)
4
Liu SL, Adams KL. Dramatic change in function and expression pattern of a gene duplicated by polyploidy created a paternal effect gene in the Brassicaceae. Mol Biol Evol 2010; 27:2817-28. [PMID: 20616146 DOI: 10.1093/molbev/msq169] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Indexed: 12/19/2022]
Abstract
New gene formation by polyploidy has been an ongoing process during the evolution of various eukaryotes that has contributed greatly to the large number of genes in their genomes. After duplication, some genes that are retained can acquire new functions or expression patterns, or subdivide their functions or expression patterns between duplicates. Here, we show that SHORT SUSPENSOR (SSP) and Brassinosteroid Kinase 1 (BSK1) are paralogs duplicated by a polyploidy event that occurred in the Brassicaceae family about 23 Ma. SSP is involved in paternal control of zygote elongation in Arabidopsis thaliana by transcription in the sperm cells of pollen and then translation in the zygote, whereas BSK1 is involved in brassinosteroid signal transduction. Comparative analysis of expression in 63 different organs and developmental stages revealed that BSK1 and SSP have opposite expression patterns in pollen compared with all other parts of the plant. We determined that BSK1 retains the ancestral expression pattern and function. Thus, SSP has diverged in function after duplication from a component of the brassinosteroid signaling pathway to a paternal regulator of the timing of zygote elongation. The ancestral function of SSP was lost by deletions in the kinase domain. Our sequence rate analysis revealed that SSP but not BSK1 has experienced a greatly accelerated rate of amino acid sequence changes and relaxation of purifying selection. In addition, SSP has been duplicated to create a new gene (SSP-like1) with a completely different expression pattern, a shorter coding sequence that has lost a critical functional domain, and a greatly accelerated rate of amino acid sequence evolution along with evidence for positive selection, together indicative of neofunctionalization. This study illustrates two dramatic examples of neofunctionalization following gene duplication by complete changes in expression pattern and function. 
In addition, our findings indicate that paternal control of zygote elongation by SSP is an evolutionarily recent innovation in the Brassicaceae family.
Affiliation(s)
- Shao-Lun Liu
- UBC Botanical Garden and Centre for Plant Research, Vancouver, British Columbia, Canada
5
Zhou X, Lin Z, Ma H. Phylogenetic detection of numerous gene duplications shared by animals, fungi and plants. Genome Biol 2010; 11:R38. [PMID: 20370904 PMCID: PMC2884541 DOI: 10.1186/gb-2010-11-4-r38] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 12/01/2009] [Revised: 02/04/2010] [Accepted: 04/06/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Gene duplication is considered a major driving force for evolution of genetic novelty, thereby facilitating functional divergence and organismal diversity, including the process of speciation. Animals, fungi and plants are major eukaryotic kingdoms and the divergences between them are some of the most significant evolutionary events. Although gene duplications in each lineage have been studied extensively in various contexts, the extent of gene duplication prior to the split of plants and animals/fungi is not clear. RESULTS Here, we have studied gene duplications in early eukaryotes by phylogenetic relative dating. We have reconstructed gene families (with one or more orthogroups) with members from both animals/fungi and plants by using two different clustering strategies. Extensive phylogenetic analyses of the gene families show that, among nearly 2,600 orthogroups identified, at least 300 of them still retain duplication that occurred before the divergence of the three kingdoms. We further found evidence that such duplications were also detected in some highly divergent protists, suggesting that these duplication events occurred in the ancestors of most major extant eukaryotic groups. CONCLUSIONS Our phylogenetic analyses show that numerous gene duplications happened at the early stage of eukaryotic evolution, probably before the separation of known major eukaryotic lineages. We discuss the implication of our results in the contexts of different models of eukaryotic phylogeny. One possible explanation for the large number of gene duplication events is one or more large-scale duplications, possibly whole genome or segmental duplication(s), which provides a genomic basis for the successful radiation of early eukaryotes.
Affiliation(s)
- Xiaofan Zhou
- Department of Biology, the Pennsylvania State University, University Park, Pennsylvania 16802, USA
6
Patrushev LI, Minkevich IG. The problem of the eukaryotic genome size. Biochemistry (Mosc) 2009; 73:1519-52. [PMID: 19216716 DOI: 10.1134/s0006297908130117] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
Abstract
The current state of knowledge concerning the unsolved problem of the huge interspecific eukaryotic genome size variations not correlating with the species phenotypic complexity (C-value enigma also known as C-value paradox) is reviewed. Characteristic features of eukaryotic genome structure and molecular mechanisms that are the basis of genome size changes are examined in connection with the C-value enigma. It is emphasized that endogenous mutagens, including reactive oxygen species, create a constant nuclear environment where any genome evolves. An original quantitative model and general conception are proposed to explain the C-value enigma. In accordance with the theory, the noncoding sequences of the eukaryotic genome provide genes with global and differential protection against chemical mutagens and (in addition to the anti-mutagenesis and DNA repair systems) form a new, third system that protects eukaryotic genetic information. The joint action of these systems controls the spontaneous mutation rate in coding sequences of the eukaryotic genome. It is hypothesized that the genome size is inversely proportional to functional efficiency of the anti-mutagenesis and/or DNA repair systems in a particular biological species. In this connection, a model of eukaryotic genome evolution is proposed.
Affiliation(s)
- L I Patrushev
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, 117997, Russia.
7
Cardoso JCR, de Vet ECJM, Louro B, Elgar G, Clark MS, Power DM. Persistence of duplicated PAC1 receptors in the teleost, Sparus auratus. BMC Evol Biol 2007; 7:221. [PMID: 17997850 PMCID: PMC2245808 DOI: 10.1186/1471-2148-7-221] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 06/06/2007] [Accepted: 11/12/2007] [Indexed: 12/05/2022]
Abstract
Background: Duplicated genes are common in vertebrate genomes. Their persistence is assumed to be either a consequence of gain of novel function (neofunctionalisation) or partitioning of the function of the ancestral molecule (sub-functionalisation). Surprisingly few studies have evaluated the extent of such modifications despite the numerous duplicated receptor and ligand genes identified in vertebrate genomes to date. In order to study the importance of function in the maintenance of duplicated genes, sea bream (Sparus auratus) PAC1 receptors, sequence homologues of the mammalian receptor specific for PACAP (Pituitary Adenylate Cyclase-Activating Polypeptide), were studied. These receptors belong to family 2 GPCRs and most of their members are duplicated in teleosts, although the reason why both persist in the genome is unknown. Results: Duplicate sea bream PACAP receptor genes (sbPAC1A and sbPAC1B), members of family 2 GPCRs, were isolated and share 77% amino acid sequence identity. RT-PCR with specific primers for each gene revealed that they have a differential tissue distribution which overlaps with the distribution of the single mammalian receptor. Furthermore, in common with mammals, the teleost genes undergo alternative splicing and a PAC1Ahop1 isoform has been characterised. Duplicated orthologous receptors have also been identified in other teleost genomes and their distribution profile suggests that function may be species specific. Functional analysis of the paralogous sbPAC1s in Cos7 cells revealed that they are strongly stimulated in the presence of mammalian PACAP27 and PACAP38 and far less with VIP (Vasoactive Intestinal Peptide). The sbPAC1 receptors are equally stimulated (log EC50 values for maximal cAMP production) in the presence of PACAP27 (-8.74 ± 0.29 M and -9.15 ± 0.21 M, respectively for sbPAC1A and sbPAC1B, P > 0.05) and PACAP38 (-8.54 ± 0.18 M and -8.92 ± 0.24 M, respectively for sbPAC1A and sbPAC1B, P > 0.05). Human VIP was found to stimulate sbPAC1A (-7.23 ± 0.20 M) more strongly than sbPAC1B (-6.57 ± 0.14 M, P < 0.05) and human secretin (SCT), which has not so far been identified in fish genomes, caused negligible stimulation of both receptors. Conclusion: The existence of functionally divergent duplicate sbPAC1 receptors is in line with previously proposed theories about the origin and maintenance of duplicated genes. Sea bream PAC1 duplicate receptors resemble the typical mammalian PAC1, and PACAP peptides were found to be more effective than VIP in stimulating cAMP production, although sbPAC1A was more responsive to VIP than sbPAC1B. These results, together with the highly divergent pattern of tissue distribution, suggest that a process involving neofunctionalisation occurred after receptor duplication within the fish lineage and probably accounts for their persistence in the genome. The characterisation of further duplicated receptors and their ligands should provide insights into the evolution and function of novel protein-protein interactions associated with the vertebrate radiation.
Affiliation(s)
- João C R Cardoso
- CCMAR, Molecular and Comparative Endocrinology, University of Algarve, 8005-139 Faro, Portugal.
8
Nakatani Y, Takeda H, Kohara Y, Morishita S. Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Res 2007; 17:1254-65. [PMID: 17652425 PMCID: PMC1950894 DOI: 10.1101/gr.6316407] [Citation(s) in RCA: 357] [Impact Index Per Article: 21.0] [Indexed: 11/24/2022]
Abstract
Although several vertebrate genomes have been sequenced, little is known about the genome evolution of early vertebrates and how large-scale genomic changes such as the two rounds of whole-genome duplications (2R WGD) affected evolutionary complexity and novelty in vertebrates. Reconstructing the ancestral vertebrate genome is highly nontrivial because of the difficulty in identifying traces originating from the 2R WGD. To resolve this problem, we developed a novel method capable of pinning down remains of the 2R WGD in the human and medaka fish genomes using invertebrate tunicate and sea urchin genes to define ohnologs, i.e., paralogs produced by the 2R WGD. We validated the reconstruction using the chicken genome, which was not considered in the reconstruction step, and observed that many ancestral proto-chromosomes were retained in the chicken genome and had one-to-one correspondence to chicken microchromosomes, thereby confirming the reconstructed ancestral genomes. Our reconstruction revealed a contrast between the slow karyotype evolution after the second WGD and the rapid, lineage-specific genome reorganizations that occurred in the ancestral lineages of major taxonomic groups such as teleost fishes, amphibians, reptiles, and marsupials.
Affiliation(s)
- Yoichiro Nakatani
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan
- Corresponding author. Fax: 81-47-136-3977
- Hiroyuki Takeda
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan
- Yuji Kohara
- Center for Genetic Resource Information, National Institute of Genetics, Mishima 411-8540, Japan
- Shinichi Morishita
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan
- Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency (JST), Tokyo 102-8666, Japan
- Corresponding author. Fax: 81-47-136-3977
9
Evolution of secretin family GPCR members in the metazoa. BMC Evol Biol 2006; 6:108. [PMID: 17166275 PMCID: PMC1764030 DOI: 10.1186/1471-2148-6-108] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.1] [Received: 08/09/2006] [Accepted: 12/13/2006] [Indexed: 11/10/2022]
Abstract
Background: Comparative approaches using protostome and deuterostome data have greatly contributed to understanding gene function and organismal complexity. The family 2 G-protein coupled receptors (GPCRs) are one of the largest and best studied hormone and neuropeptide receptor families. They are suggested to have arisen from a single ancestral gene via duplication events. Despite the recent identification of receptor members in protostome and early deuterostome genomes, relatively little is known about their function or origin during metazoan divergence. In this study a comprehensive description of family 2 GPCR evolution is given based on in silico and expression analyses of the invertebrate receptor genes. Results: Family 2 GPCR members were identified in the invertebrate genomes of the nematodes C. elegans and C. briggsae, the arthropods D. melanogaster and A. gambiae (mosquito) and in the tunicate C. intestinalis. This suggests that they are of ancient origin and have evolved through gene/genome duplication events. Sequence comparisons and phylogenetic analyses have demonstrated that the immediate gene environment, with regard to gene content, is conserved between the protostome and deuterostome receptor genomic regions. They also showed that the protostome genes are more like the deuterostome Corticotrophin Releasing Factor (CRF) and Calcitonin/Calcitonin Gene-Related Peptide (CAL/CGRP) receptor members than the other family 2 GPCR members. The evolution of family 2 GPCRs in deuterostomes is characterised by acquisition of new family members, with SCT (Secretin) receptors only present in tetrapods. Gene structure is characterised by an increase in intron number with organismal complexity, with the exception of the vertebrate CAL/CGRP receptors. Conclusion: The family 2 GPCR members provide a good example of gene duplication events occurring in tandem with increasing organismal complexity during metazoan evolution. The putative ancestral receptors are proposed to be more like the deuterostome CAL/CGRP and CRF receptors and this may be associated with their fundamental role in calcium regulation and the stress response, both of which are essential for survival.
10
Itzkovitz S, Tlusty T, Alon U. Coding limits on the number of transcription factors. BMC Genomics 2006; 7:239. [PMID: 16984633 PMCID: PMC1590034 DOI: 10.1186/1471-2164-7-239] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Received: 05/25/2006] [Accepted: 09/19/2006] [Indexed: 12/02/2022]
Abstract
Background: Transcription factor proteins bind specific DNA sequences to control the expression of genes. They contain DNA binding domains which belong to several super-families, each with a specific mechanism of DNA binding. The total number of transcription factors encoded in a genome increases with the number of genes in the genome. Here, we examined the number of transcription factors from each super-family in diverse organisms. Results: We find that the number of transcription factors from most super-families appears to be bounded. For example, the number of winged helix factors does not generally exceed 300, even in very large genomes. The magnitude of the maximal number of transcription factors from each super-family seems to correlate with the number of DNA bases effectively recognized by the binding mechanism of that super-family. Coding theory predicts that such upper bounds on the number of transcription factors should exist, in order to minimize cross-binding errors between transcription factors. This theory further predicts that factors with similar binding sequences should tend to have similar biological effect, so that errors based on mis-recognition are minimal. We present evidence that transcription factors with similar binding sequences tend to regulate genes with similar biological functions, supporting this prediction. Conclusion: The present study suggests limits on the transcription factor repertoire of cells, and suggests coding constraints that might apply more generally to the mapping between binding sites and biological function.
Affiliation(s)
- Shalev Itzkovitz
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Dept. Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
- Tsvi Tlusty
- Dept. Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
- Uri Alon
- Dept. Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
- Dept. Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel
11
McEwen GK, Woolfe A, Goode D, Vavouri T, Callaway H, Elgar G. Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis. Genome Res 2006; 16:451-65. [PMID: 16533910 PMCID: PMC1457030 DOI: 10.1101/gr.4143406] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.7] [Indexed: 12/25/2022]
Abstract
Fish-mammal genomic comparisons have proved powerful in identifying conserved noncoding elements likely to be cis-regulatory in nature, and the majority of those tested in vivo have been shown to act as tissue-specific enhancers associated with genes involved in transcriptional regulation of development. Although most of these elements share little sequence identity to each other, a small number are remarkably similar and appear to be the product of duplication events. Here, we searched for duplicated conserved noncoding elements in the human genome, using comparisons with Fugu to select putative cis-regulatory sequences. We identified 124 families of duplicated elements, each containing between two and five members, that are highly conserved within and between vertebrate genomes. In 74% of cases, we were able to assign a specific set of paralogous genes with annotation relating to transcriptional regulation and/or development to each family, thus removing much of the ambiguity in identifying associated genes. We find that duplicate elements have the potential to up-regulate reporter gene expression in a tissue-specific manner and that expression domains often overlap, but are not necessarily identical, between family members. Over two thirds of the families are conserved in duplicate in fish and appear to predate the large-scale duplication events thought to have occurred at the origin of vertebrates. We propose a model whereby gene duplication and the evolution of cis-regulatory elements can be considered in the context of increased morphological diversity and the emergence of the modern vertebrate body plan.
Affiliation(s)
- Gayle K. McEwen
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, United Kingdom
- MRC Biostatistics Unit, Institute of Public Health, Cambridge CB2 2SR, United Kingdom
| | - Adam Woolfe
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, United Kingdom
| | - Debbie Goode
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
| | - Tanya Vavouri
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SB, United Kingdom
| | - Heather Callaway
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
| | - Greg Elgar
- School of Biological and Chemical Sciences, Queen Mary, University of London, London E1 4NS, United Kingdom
- Corresponding author. Fax: 0044 207 882 3000
|
12
|
Durand D, Hoberman R. Diagnosing duplications – can it be done? Trends Genet 2006; 22:156-64. [PMID: 16442663 DOI: 10.1016/j.tig.2006.01.002] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Received: 09/16/2005] [Revised: 11/30/2005] [Accepted: 01/11/2006] [Indexed: 01/10/2023]
Abstract
New genes arise through duplication and modification of DNA sequences on a range of scales: single gene duplication, duplication of large chromosomal fragments and whole-genome duplication. Each duplication mechanism has specific characteristics that influence the fate of the resulting duplicates, such as the size of the duplicated fragment, the potential for dosage imbalance, the preservation or disruption of regulatory control and genomic context. The ability to diagnose or identify the mechanism that produced a pair of paralogs has the potential to increase our ability to reconstruct evolutionary history, to understand the processes that govern genome evolution and to make functional predictions based on paralogy. The recent availability of large amounts of whole-genome sequence, often from several closely related species, has stimulated a wealth of new computational methods to diagnose gene duplications.
Affiliation(s)
- Dannie Durand
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
|
13
|
Abstract
Recent sequencing efforts and experiments have advanced our understanding of genome evolution in yeasts, particularly the Saccharomyces yeasts. The ancestral genome of the Saccharomyces sensu stricto complex has been subject to both whole-genome duplication, followed by massive sequence loss and divergence, and segmental duplication. In addition, the subtelomeric regions are subject to further duplications and rearrangements via ectopic exchanges. Translocations and other gross chromosomal rearrangements that break down syntenic relationships occur; however, they do not appear to be a driving force of speciation. Analysis of single genomes has been fruitful for hypothesis generation such as the whole-genome duplication, but comparative genomics between close and more distant species has proven to be a powerful tool in testing these hypotheses as well as elucidating evolutionary processes acting on the genome. Future work on population genomics and experimental evolution will keep yeast at the forefront of studies in genome evolution.
Affiliation(s)
- Gianni Liti
- Institute of Genetics, University of Nottingham, Queen's Medical Centre, Nottingham NG7 2UH, United Kingdom.
|
14
|
Wang H, Yu L, Lai F, Liu L, Wang J. Molecular evidence for asymmetric evolution of sister duplicated blocks after cereal polyploidy. PLANT MOLECULAR BIOLOGY 2005; 59:63-74. [PMID: 16217602 DOI: 10.1007/s11103-005-4414-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 02/02/2005] [Accepted: 03/22/2005] [Indexed: 05/04/2023]
Abstract
Polyploidy (genome duplication) is thought to have contributed to the evolution of the eukaryotic genome, but complex genome structures and massive gene loss during evolution have complicated detection of these ancestral duplication events. The major factors determining the fate of duplicated genes are currently unclear, as are the processes by which duplicated genes evolve after polyploidy. Fine-scale analysis between homologous regions may allow us to better understand post-polyploidy evolution. Here, using gene-by-gene and gene-by-genome strategies, we identified the S5 region and four homologous regions within the japonica genome. Additional phylogenomic analyses of the comparable duplicated blocks indicate that four successive duplication events gave rise to these five regions, allowing us to propose a model for this local chromosomal evolution. According to this model, gene loss may play a major role in post-duplication genetic evolution at the segmental level. Moreover, we found molecular evidence that one of the sister duplicated blocks experienced more gene loss and a more rapid evolution subsequent to two recent duplication events. Given that these two recent duplication events were likely involved in polyploidy, this asymmetric evolution (gene loss and gene divergence) may be one possible mechanism accounting for the diploidization at the segmental level.
Affiliation(s)
- Hongbin Wang
- The State Key Laboratory for Biocontrol and The Key Laboratory of Gene Engineering of Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China
|
15
|
Abstract
Teleost fish, which roughly make up half of the extant vertebrate species, exhibit an amazing level of biodiversity affecting their morphology, ecology and behaviour as well as many other aspects of their biology. This huge variability makes fish extremely attractive for the study of many biological questions, particularly of those related to evolution. New insights gained from different teleost species and sequencing projects have recently revealed several peculiar features of fish genomes that might have played a role in fish evolution and speciation. There is now substantial evidence that a round of tetraploidization/rediploidization has taken place during the early evolution of the ray-finned fish lineage, and that hundreds of duplicate pairs generated by this event have been maintained over hundreds of millions of years of evolution. Differential loss or subfunction partitioning of such gene duplicates might have been involved in the generation of fish variability. In contrast to mammalian genomes, teleost genomes also contain multiple families of active transposable elements, which might have played a role in speciation by affecting hybrid sterility and viability. Finally, the amazing diversity of sex determination systems and the plasticity of sex chromosomes observed in teleost might have been involved in both pre- and postmating reproductive isolation. Comparison of data generated by current and future genome projects as well as complementary studies in other species will allow one to approach the molecular and evolutionary mechanisms underlying genome diversity in fish, and will certainly significantly contribute to our understanding of gene evolution and function in humans and other vertebrates.
Affiliation(s)
- J-N Volff
- BioFuture Research Group, Physiologische Chemie I, Biozentrum, University of Würzburg, am Hubland, D-97074 Würzburg, Germany.
|
16
|
Wang X, Shi X, Hao B, Ge S, Luo J. Duplication and DNA segmental loss in the rice genome: implications for diploidization. THE NEW PHYTOLOGIST 2005; 165:937-46. [PMID: 15720704 DOI: 10.1111/j.1469-8137.2004.01293.x] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.8] [Indexed: 05/21/2023]
Abstract
- Large-scale duplication events have been recently uncovered in the rice genome, but different interpretations were proposed regarding the extent of the duplications.
- Through analysing the 370 Mb genome sequences assembled into 12 chromosomes of Oryza sativa subspecies indica, we detected 10 duplicated blocks on all 12 chromosomes that contained 47% of the total predicted genes. Based on the phylogenetic analysis, we inferred that this was a result of a genome duplication that occurred c. 70 million years ago, supporting the polyploidy origin of the rice genome. In addition, a segmental duplication was also identified involving chromosomes 11 and 12, which occurred c. 5 million years ago.
- Following the duplications, there have been large-scale chromosomal rearrangements and deletions. About 30-65% of duplicated genes were lost shortly after the duplications, leading to a rapid diploidization.
- Together with other lines of evidence, we propose that polyploidization is still an ongoing process in grasses of polyploidy origins.
Affiliation(s)
- Xiyin Wang
- College of Life Sciences, National Laboratory of Plant Genetic Engineering and Protein Engineering, Center of Bioinformatics, Peking University, Beijing 100871, China
|
17
|
Abstract
Recent analyses of complete genome sequences have revealed that many genomes have been duplicated in their evolutionary past. Such events have been associated with important biological transitions, major leaps in evolution and adaptive radiations of species. Here, we consider recently developed computational methods to detect such ancient large-scale gene duplication events. Several new approaches have been used to show that large-scale gene duplications are more common than previously thought.
Affiliation(s)
- Yves Van de Peer
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent, Belgium.
|
18
|
Simillion C, Vandepoele K, Van de Peer Y. Recent developments in computational approaches for uncovering genomic homology. Bioessays 2004; 26:1225-35. [PMID: 15499578 DOI: 10.1002/bies.20127] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
Abstract
Identifying genomic homology within and between genomes is essential when studying genome evolution. In the past years, different computational techniques have been developed to detect homology even when the actual similarity between homologous segments is low. Depending on the strategy used, these methods search for pairs of chromosomal segments between which either both gene content and order are conserved or gene content only. However, due to the fact that, after their divergence, homologous segments can lose a different set of genes, these methods still often fail to detect genomic homology. Recently, more advanced approaches have been developed that can combine gene order and content information of multiple genomic segments.
Affiliation(s)
- Cedric Simillion
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, Belgium
|