51
Lipovich L, Hou ZC, Jia H, Sinkler C, McGowen M, Sterner KN, Weckle A, Sugalski AB, Pipes L, Gatti DL, Mason CE, Sherwood CC, Hof PR, Kuzawa CW, Grossman LI, Goodman M, Wildman DE. High-throughput RNA sequencing reveals structural differences of orthologous brain-expressed genes between western lowland gorillas and humans. J Comp Neurol 2015; 524:288-308. [PMID: 26132897 DOI: 10.1002/cne.23843] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 03/30/2015] [Revised: 06/20/2015] [Accepted: 06/23/2015] [Indexed: 12/22/2022]
Abstract
The human brain and human cognitive abilities are strikingly different from those of other great apes despite relatively modest genome sequence divergence. However, little is presently known about the interspecies divergence in gene structure and transcription that might contribute to these phenotypic differences. To date, most comparative studies of gene structure in the brain have examined humans, chimpanzees, and macaque monkeys. To add to this body of knowledge, we analyze here the brain transcriptome of the western lowland gorilla (Gorilla gorilla gorilla), an African great ape species that is phylogenetically closely related to humans, but with a brain that is approximately one-third the size. Manual transcriptome curation from a sample of the planum temporale region of the neocortex revealed 12 protein-coding genes and one noncoding-RNA gene with exons in the gorilla unmatched by public transcriptome data from the orthologous human loci. These interspecies gene structure differences accounted for a total of 134 amino acids in proteins found in the gorilla that were absent from protein products of the orthologous human genes. Proteins varying in structure between human and gorilla were involved in immunity and energy metabolism, suggesting their relevance to phenotypic differences. This gorilla neocortical transcriptome comprises an empirical, not homology- or prediction-driven, resource for orthologous gene comparisons between human and gorilla. These findings provide a unique repository of the sequences and structures of thousands of genes transcribed in the gorilla brain, pointing to candidate genes that may contribute to the traits distinguishing humans from other closely related great apes.
Affiliation(s)
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; Department of Neurology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Zhuo-Cheng Hou
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; Department of Animal Genetics, China Agricultural University, Beijing, China
- Hui Jia
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Christopher Sinkler
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Michael McGowen
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; School of Biological and Chemical Sciences, Queen Mary, University of London, London, United Kingdom
- Kirstin N Sterner
- Department of Anthropology, University of Oregon, Eugene, Oregon, 97403
- Amy Weckle
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801; Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
- Amara B Sugalski
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Lenore Pipes
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
- Domenico L Gatti
- Department of Biochemistry and Molecular Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201; Cardiovascular Research Institute, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, 10021
- Chet C Sherwood
- Department of Anthropology and the Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, 20052
- Patrick R Hof
- Fishberg Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, 10029; New York Consortium in Evolutionary Primatology, New York, New York, 10024
- Lawrence I Grossman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201
- Morris Goodman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; Department of Anatomy and Cell Biology, School of Medicine, Wayne State University, Detroit, Michigan, 48201
- Derek E Wildman
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, Michigan, 48201; Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, 61801; Department of Molecular and Integrative Physiology, University of Illinois, Urbana, Illinois, 61801
52
Abstract
Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition events and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old World Monkeys, including humans), but a surprisingly large number of retrocopies (∼10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and show no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes.
Affiliation(s)
- Fábio C P Navarro
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil; Dep. de Bioquímica, Universidade de São Paulo, Brazil
- Pedro A F Galante
- Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
53
Abstract
The world of primate genomics is expanding rapidly in new and exciting ways owing to lowered costs and new technologies in molecular methods and bioinformatics. The primate order is composed of 78 genera and 478 species, including human. Taxonomic inferences are complex and likely a consequence of ongoing hybridization, introgression, and reticulate evolution among closely related taxa. Recently, we applied large-scale sequencing methods and extensive taxon sampling to generate a highly resolved phylogeny that affirms, reforms, and extends previous depictions of primate speciation. The next stage of research uses this phylogeny as a foundation for investigating genome content, structure, and evolution across primates. Ongoing and future applications of a robust primate phylogeny are discussed, highlighting advancements in adaptive evolution of genes and genomes, taxonomy and conservation management of endangered species, next-generation genomic technologies, and biomedicine.
Affiliation(s)
- Jill Pecon-Slattery
- Laboratory of Genomic Diversity, National Cancer Institute, Frederick, Maryland 21702; Current Affiliation: Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, Virginia 22630
54
Genome-wide patterns of genetic variation among silkworms. Mol Genet Genomics 2015; 290:1575-87. [PMID: 25749967 DOI: 10.1007/s00438-015-1017-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/17/2014] [Accepted: 02/20/2015] [Indexed: 10/23/2022]
Abstract
Although the draft genome sequence of silkworm has been available for a decade, its genetic variations, especially structural variations, are far from well explored. In this study, we identified 1,298,659 SNPs and 9,731 indels, of which 32% of SNPs and 92.2% of indels were novel compared to previous silkworm re-sequencing analysis. In addition, we applied a read depth-based approach to investigate copy number variations among 21 silkworm strains at the genome-wide level. This effort resulted in 562 duplicated and 41 deleted CNV regions, among which 442 CNVs were newly identified. Functional annotation of genes affected by these genetic variations reveals that these genes span a wide spectrum of molecular functions, such as immunity and drug detoxification, which are important for the adaptive evolution of silkworms. We further validated the predicted CNV regions using q-PCR. Of the selected regions, 94.7% (36/38) showed divergent copy numbers compared to a single-copy gene, OR2. In addition, potential presence/absence variations were also observed in our study: 11 genes are present in the reference genome, but absent in other strains. Overall, we draw an integrative map of silkworm genetic variation at the genome-wide level. The identification of genetic variations in this study improves our understanding of the roles these variants play in shaping phenotypic variations between wild and domesticated silkworms.
55
Farré M, Robinson TJ, Ruiz-Herrera A. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity. Bioessays 2015; 37:479-88. [PMID: 25739389 DOI: 10.1002/bies.201400174] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 09/22/2014] [Revised: 02/12/2015] [Accepted: 02/13/2015] [Indexed: 12/23/2022]
Abstract
Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Campus UAB, Barcelona, Spain
56
Milligan MJ, Lipovich L. Pseudogene-derived lncRNAs: emerging regulators of gene expression. Front Genet 2015; 5:476. [PMID: 25699073 PMCID: PMC4316772 DOI: 10.3389/fgene.2014.00476] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 09/11/2014] [Accepted: 12/25/2014] [Indexed: 01/11/2023]
Abstract
In the more than one decade since the completion of the Human Genome Project, the prevalence of non-protein-coding functional elements in the human genome has emerged as a key revelation in post-genomic biology. Highlighted by the ENCODE (Encyclopedia of DNA Elements) and FANTOM (Functional Annotation of Mammals) consortia, these elements include tens of thousands of pseudogenes, as well as comparably numerous long non-coding RNA (lncRNA) genes. Pseudogene transcription and function remain insufficiently understood. However, the field is of great importance for human disease due to the high sequence similarity between pseudogenes and their parental protein-coding genes, which generates the potential for sequence-specific regulation. Recent case studies have established essential and coordinated roles of both pseudogenes and lncRNAs in development and disease in metazoan systems, including functional impacts of lncRNA transcription at pseudogene loci on the regulation of the pseudogenes’ parental genes. This review synthesizes the nascent evidence for regulatory modalities jointly exerted by lncRNAs and pseudogenes in human disease, and for recent evolutionary origins of these systems.
Affiliation(s)
- Michael J Milligan
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA
57
58
Thierry A, Khanna V, Créno S, Lafontaine I, Ma L, Bouchier C, Dujon B. Macrotene chromosomes provide insights to a new mechanism of high-order gene amplification in eukaryotes. Nat Commun 2015; 6:6154. [PMID: 25635677 PMCID: PMC4317496 DOI: 10.1038/ncomms7154] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/28/2014] [Accepted: 12/15/2014] [Indexed: 12/30/2022]
Abstract
Copy number variation of chromosomal segments is now recognized as a major source of genetic polymorphism within natural populations of eukaryotes, as well as a possible cause of genetic diseases in humans, including cancer, but its molecular bases remain incompletely understood. In the baker's yeast Saccharomyces cerevisiae, a variety of low-order amplifications (segmental duplications) were observed after adaptation to limiting environmental conditions or recovery from gene dosage imbalance, and interpreted in terms of replication-based mechanisms associated or not with homologous recombination. Here we show the emergence of novel high-order amplification structures, with corresponding overexpression of embedded genes, during evolution under favourable growth conditions of severely unfit yeast cells bearing genetically disabled genomes. Such events form massively extended chromosomes, which we propose to call macrotene, whose characteristics suggest the products of intrachromosomal rolling-circle type of replication structures, probably initiated by increased accidental template switches under important cellular stress conditions.
Affiliation(s)
- Agnès Thierry
- Institut Pasteur, Unité de Génétique moléculaire des levures, CNRS UMR3525, Sorbonne Universités, UPMC, Univ. Paris 06 UFR927, 25, rue du Docteur Roux, F-75724 Paris, France
- Varun Khanna
- Institut Pasteur, Unité de Génétique moléculaire des levures, CNRS UMR3525, Sorbonne Universités, UPMC, Univ. Paris 06 UFR927, 25, rue du Docteur Roux, F-75724 Paris, France
- Sophie Créno
- Institut Pasteur, Genomic platform, 28, rue du Docteur Roux, F-75724 Paris, France
- Ingrid Lafontaine
- Institut Pasteur, Unité de Génétique moléculaire des levures, CNRS UMR3525, Sorbonne Universités, UPMC, Univ. Paris 06 UFR927, 25, rue du Docteur Roux, F-75724 Paris, France
- Laurence Ma
- Institut Pasteur, Genomic platform, 28, rue du Docteur Roux, F-75724 Paris, France
- Christiane Bouchier
- Institut Pasteur, Genomic platform, 28, rue du Docteur Roux, F-75724 Paris, France
- Bernard Dujon
- Institut Pasteur, Unité de Génétique moléculaire des levures, CNRS UMR3525, Sorbonne Universités, UPMC, Univ. Paris 06 UFR927, 25, rue du Docteur Roux, F-75724 Paris, France
59
Novel TBX5 duplication in a Japanese family with Holt-Oram syndrome. Pediatr Cardiol 2015; 36:244-7. [PMID: 25274398 DOI: 10.1007/s00246-014-1028-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/11/2014] [Accepted: 09/23/2014] [Indexed: 10/24/2022]
Abstract
Holt-Oram syndrome is an autosomal dominant disorder characterized by upper limb malformations in the preaxial radial ray and cardiac septation and/or a conduction abnormality. It has been demonstrated that Holt-Oram syndrome is caused by mutations in the T-box transcription factor gene TBX5. Numerous germline mutations (more than 90) of this gene have been reported; however, TBX5 mutations are only identified in up to 74% of typical Holt-Oram syndrome patients. We report a Japanese family with 2 affected individuals with the typical Holt-Oram syndrome phenotype, namely bilateral asymmetrical radial ray deformities and an atrial septal defect. An array-based comparative genomic hybridization study revealed an 11-kb duplication at 12q24.1. Moreover, a multiplex ligation-dependent probe amplification study confirmed the duplication of exons 1-6 of TBX5. Although a small duplication in TBX5 (6 bases) has been reported, a large duplication of this gene has not been described previously in typical Holt-Oram syndrome patients. All typical Holt-Oram syndrome cases in which a mutation is not identified should be screened for TBX5 exon duplications.
60
Extensive copy-number variation of young genes across stickleback populations. PLoS Genet 2014; 10:e1004830. [PMID: 25474574 PMCID: PMC4256280 DOI: 10.1371/journal.pgen.1004830] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 06/20/2014] [Accepted: 10/16/2014] [Indexed: 12/30/2022]
Abstract
Duplicate genes emerge as copy-number variations (CNVs) at the population level, and remain copy-number polymorphic until they are fixed or lost. The successful establishment of such structural polymorphisms in the genome plays an important role in evolution by promoting genetic diversity, complexity and innovation. To characterize the early evolutionary stages of duplicate genes and their potential adaptive benefits, we combine comparative genomics with population genomics analyses to evaluate the distribution and impact of CNVs across natural populations of an eco-genomic model, the three-spined stickleback. With whole genome sequences of 66 individuals from populations inhabiting three distinct habitats, we find that CNVs generally occur at low frequencies and are often only found in one of the 11 populations surveyed. A subset of CNVs, however, displays copy-number differentiation between populations, showing elevated within-population frequencies consistent with local adaptation. By comparing teleost genomes to identify lineage-specific genes and duplications in sticklebacks, we highlight rampant gene content differences among individuals in which over 30% of young duplicate genes are CNVs. These CNV genes are evolving rapidly at the molecular level and are enriched with functional categories associated with environmental interactions, depicting the dynamic early copy-number polymorphic stage of genes during population differentiation. After a locus is duplicated in a genome, individuals from a population instantaneously differ in the number of copies of this locus producing a copy-number variation (CNV). Over time, the joint effects of selection and other evolutionary forces will act to either eliminate the extra genetic copy or retain it. Depending on this evolutionary interplay, young duplications, including newly duplicated genes, can persist for millions of years as CNVs. 
CNVs may especially be prevalent between populations that have colonized and adapted to disparate environments in which selective pressures differ. Using whole genome sequences from several populations of three-spined sticklebacks that inhabit different environments, we find that a third of young duplicated genes are CNVs. These young CNV genes are enriched with environmental response functions and evolving rapidly at the molecular level, making them promising candidates for a role in the rapid ecological adaptation to novel environments.
61
Field B, Osbourn A. Order in the playground: Formation of plant gene clusters in dynamic chromosomal regions. Mob Genet Elements 2014; 2:46-50. [PMID: 22754752 PMCID: PMC3383449 DOI: 10.4161/mge.19348] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Abstract
In eukaryotes the arrangement of genes along the chromosome is not as random as it at first appeared, and distinctive clusters of functionally related but non-homologous genes can be found in the genomes of certain animals and fungi. These include the major histocompatibility complex in mammals and gene clusters for nutrient use and secondary metabolite production in fungi. A growing number of functional gene clusters for different types of secondary metabolite are now being discovered in plant genomes. However, the molecular mechanisms and evolutionary pressures behind their formation are poorly understood. Here we discuss the implications of our recent investigation into the origin of two functional gene clusters in the model plant Arabidopsis thaliana.
62
Ottolini B, Hornsby MJ, Abujaber R, MacArthur JAL, Badge RM, Schwarzacher T, Albertson DG, Bevins CL, Solnick JV, Hollox EJ. Evidence of convergent evolution in humans and macaques supports an adaptive role for copy number variation of the β-defensin-2 gene. Genome Biol Evol 2014; 6:3025-38. [PMID: 25349268 PMCID: PMC4255768 DOI: 10.1093/gbe/evu236] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
β-defensins are a family of important peptides of innate immunity, involved in host defense, immunomodulation, reproduction, and pigmentation. Genes encoding β-defensins show evidence of birth-and-death evolution, adaptation by amino acid sequence changes, and extensive copy number variation (CNV) within humans and other species. The role of CNV in the adaptation of β-defensins to new functions remains unclear, as does the adaptive role of CNV in general. Here, we fine-map CNV of a cluster of β-defensins in humans and rhesus macaques. Remarkably, we found that the structure of the CNV is different between primates, with distinct mutational origins and CNV boundaries defined by retroviral long terminal repeat elements. Although the human β-defensin CNV region is 322 kb and encompasses several genes, including β-defensins, a long noncoding RNA gene, and testes-specific zinc-finger transcription factors, the orthologous region in the rhesus macaque shows CNV of a 20-kb region, containing only a single gene, the ortholog of the human β-defensin-2 gene. Despite its independent origins, the range of gene copy numbers in the rhesus macaque is similar to humans. In addition, the rhesus macaque gene has been subject to divergent positive selection at the amino acid level following its initial duplication event between 3 and 9.5 Ma, suggesting adaptation of this gene as the macaque successfully colonized novel environments outside Africa. Therefore, the molecular phenotype of β-defensin-2 CNV has undergone convergent evolution, and this gene shows evidence of adaptation at the amino acid level in rhesus macaques.
Affiliation(s)
- Michael J Hornsby
- Department of Microbiology and Immunology, University of California Davis School of Medicine
- Razan Abujaber
- Department of Genetics, University of Leicester, United Kingdom
- Jacqueline A L MacArthur
- Helen Diller Family Comprehensive Cancer Center, University of California San Francisco; Present address: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Richard M Badge
- Department of Genetics, University of Leicester, United Kingdom
- Donna G Albertson
- Helen Diller Family Comprehensive Cancer Center, University of California San Francisco; Present address: Bluestone Center for Clinical Research, New York University College of Dentistry, New York, New York
- Charles L Bevins
- Department of Microbiology and Immunology, University of California Davis School of Medicine
- Jay V Solnick
- Department of Microbiology and Immunology, University of California Davis School of Medicine; Department of Medicine, Center for Comparative Medicine, and the California National Primate Research Center, University of California
- Edward J Hollox
- Department of Genetics, University of Leicester, United Kingdom
63
Kapusta A, Feschotte C. Volatile evolution of long noncoding RNA repertoires: mechanisms and biological implications. Trends Genet 2014; 30:439-52. [PMID: 25218058 PMCID: PMC4464757 DOI: 10.1016/j.tig.2014.08.004] [Citation(s) in RCA: 204] [Impact Index Per Article: 20.4] [Received: 07/09/2014] [Revised: 08/15/2014] [Accepted: 08/16/2014] [Indexed: 02/08/2023]
Abstract
Thousands of genes encoding long noncoding RNAs (lncRNAs) have been identified in all vertebrate genomes thus far examined. The list of lncRNAs partaking in arguably important biochemical, cellular, and developmental activities is steadily growing. However, it is increasingly clear that lncRNA repertoires are subject to weak functional constraint and rapid turnover during vertebrate evolution. We discuss here some of the factors that may explain this apparent paradox, including relaxed constraint on sequence to maintain lncRNA structure/function, extensive redundancy in the regulatory circuits in which lncRNAs act, as well as adaptive and non-adaptive forces such as genetic drift. We explore the molecular mechanisms promoting the birth and rapid evolution of lncRNA genes, with an emphasis on the influence of bidirectional transcription and transposable elements, two pervasive features of vertebrate genomes. Together these properties reveal a remarkably dynamic and malleable noncoding transcriptome which may represent an important source of robustness and evolvability.
Affiliation(s)
- Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA.
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA.
64
Phylogenetic investigation of human FGFR-bearing paralogons favors piecemeal duplication theory of vertebrate genome evolution. Mol Phylogenet Evol 2014; 81:49-60. [PMID: 25245952 DOI: 10.1016/j.ympev.2014.09.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/09/2014] [Revised: 09/09/2014] [Accepted: 09/11/2014] [Indexed: 11/23/2022]
Abstract
BACKGROUND: Understanding the genetic mechanisms underlying the organismal complexity and origin of novelties during vertebrate history is one of the central goals of evolutionary biology. Ohno (1970) was the first to postulate that whole genome duplications (WGD) have played a vital role in the evolution of new gene functions, permitting an increase in morphological, physiological and anatomical complexity during early vertebrate history. RESULTS: Here, we analyze the evolutionary history of the human FGFR-bearing paralogon (human autosomes 4/5/8/10) by the phylogenetic analysis of multigene families with triplicate and quadruplicate distribution on these chromosomes. Our results categorized the histories of 21 families into discrete co-duplicated groups. Genes of a particular co-duplicated group exhibit identical evolutionary history and have duplicated in concert with each other, whereas genes belonging to different groups have dissimilar histories and have not duplicated concurrently. CONCLUSION: Taken together with our previously published data, we submit that there is sufficient empirical evidence to disprove the 1R/2R hypothesis and to support the general prediction that the vertebrate genome evolved by relatively small-scale, regional duplication events that spread across the history of life.
65
Molecular phylogeny and predicted 3D structure of plant beta-D-N-acetylhexosaminidase. ScientificWorldJournal 2014; 2014:186029. [PMID: 25165734 PMCID: PMC4129151 DOI: 10.1155/2014/186029] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/02/2014] [Revised: 06/14/2014] [Accepted: 06/21/2014] [Indexed: 11/21/2022]
Abstract
beta-D-N-Acetylhexosaminidase, a family 20 glycosyl hydrolase, catalyzes the removal of β-1,4-linked N-acetylhexosamine residues from oligosaccharides and their conjugates. We constructed a phylogenetic tree of β-hexosaminidases to analyze the evolutionary history and predicted functions of plant hexosaminidases. Phylogenetic analysis reveals the complex evolutionary history of plant β-hexosaminidase, which can be described by gene duplication events. The 3D structure of tomato β-hexosaminidase (β-Hex-Sl) was predicted by homology modeling using 1now as a template. Structural conformity studies of the best-fit model showed that more than 98% of the residues lie inside the favoured and allowed regions, whereas only 0.9% lie in the unfavourable region. The predicted 3D structure contains 531 amino acid residues with glycosyl hydrolase20b domain-I and glycosyl hydrolase20 superfamily domain-II, including the (β/α)8 barrel in the central part. The α and β contents of the modeled structure were found to be 33.3% and 12.2%, respectively. Eleven amino acids were found to be involved in the ligand-binding site; Asp(330) and Glu(331) could play important roles in enzyme-catalyzed reactions. The predicted model provides a structural framework that can act as a guide to develop a hypothesis for β-Hex-Sl mutagenesis experiments for exploring the functions of this class of enzymes in the plant kingdom.
66
A high resolution map of mammalian X chromosome fragile regions assessed by large-scale comparative genomics. Mamm Genome 2014; 25:618-35. [PMID: 25086724 DOI: 10.1007/s00335-014-9537-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/30/2014] [Accepted: 07/14/2014] [Indexed: 10/24/2022]
Abstract
Chromosomal evolution involves multiple changes at structural and numerical levels. These changes, which are related to variation in gene number and location, can be tracked by the identification of syntenic blocks (SB). First reports proposed that ~180-280 SB might be shared by mouse and human species. More recently, further studies including additional genomes have identified up to ~1,400 SB during the evolution of eutherian species. A considerable number of studies regarding the X chromosome's structure and evolution have been undertaken because of its extraordinary biological impact on reproductive fitness and speciation. Some have identified evolutionary breakpoint regions and fragile sites at specific locations in the human X chromosome. However, mapping these regions to date has involved low-to-moderate resolution techniques, which may have led to underestimating their total number and to inaccurate locations. The present study used a combination of bioinformatics methods to identify, at base-pair level, chromosomal rearrangements occurring during X chromosome evolution in 13 mammalian species. A comparative technique using four different algorithms was used to optimize the detection of hotspot regions in the human X chromosome. We identified a significant interspecific variation in SB size which was related to genetic information gain regarding the human X chromosome. We found that human hotspot regions were enriched in LINE-1 and Alu transposable elements, which may have led to intraspecific chromosome rearrangement events. New fragile regions located in the human X chromosome have also been postulated. We estimate that the high resolution map of X chromosome fragile sites presented here constitutes useful data for future studies on mammalian evolution and human disease.
67
Abstract
The field of nonhuman primate genomics is undergoing rapid change and making impressive progress. Exploiting new technologies for DNA sequencing, researchers have generated new whole-genome sequence assemblies for multiple primate species over the past 6 years. In addition, investigations of within-species genetic variation, gene expression and RNA sequences, conservation of non-protein-coding regions of the genome, and other aspects of comparative genomics are moving at an accelerating speed. This progress is opening a wide array of new research opportunities in the analysis of comparative primate genome content and evolution. It also creates new possibilities for the use of nonhuman primates as model organisms in biomedical research. This transition, based on both new technology and the new information being generated in regard to human genetics, provides an important justification for reevaluating the research goals, strategies, and study designs used in primate genetics and genomics.
68
Ebert G, Steininger A, Weißmann R, Boldt V, Lind-Thomsen A, Grune J, Badelt S, Heßler M, Peiser M, Hitzler M, Jensen LR, Müller I, Hu H, Arndt PF, Kuss AW, Tebel K, Ullmann R. Distribution of segmental duplications in the context of higher order chromatin organisation of human chromosome 7. BMC Genomics 2014; 15:537. [PMID: 24973960 PMCID: PMC4092221 DOI: 10.1186/1471-2164-15-537] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/09/2013] [Accepted: 06/17/2014] [Indexed: 12/21/2022]
Abstract
BACKGROUND Segmental duplications (SDs) are not evenly distributed along chromosomes. The reasons for this biased susceptibility to SD insertion are poorly understood. Accumulation of SDs is associated with increased genomic instability, which can lead to structural variants and genomic disorders such as the Williams-Beuren syndrome. Despite these adverse effects, SDs have become fixed in the human genome. Focusing on chromosome 7, which is particularly rich in interstitial SDs, we have investigated the distribution of SDs in the context of evolution and the three dimensional organisation of the chromosome in order to gain insights into the mutual relationship of SDs and chromatin topology. RESULTS Intrachromosomal SDs preferentially accumulate in those segments of chromosome 7 that are homologous to marmoset chromosome 2. Although this formerly compact segment has been re-distributed to three different sites during primate evolution, we can show by means of public data on long distance chromatin interactions that these three intervals, and consequently the paralogous SDs mapping to them, have retained their spatial proximity in the nucleus. Focusing on SD clusters implicated in the aetiology of the Williams-Beuren syndrome locus we demonstrate by cross-species comparison that these SDs have inserted at the borders of a topological domain and that they flank regions with distinct DNA conformation. CONCLUSIONS Our study suggests a link of nuclear architecture and the propagation of SDs across chromosome 7, either by promoting regional SD insertion or by contributing to the establishment of higher order chromatin organisation themselves. The latter could compensate for the high risk of structural rearrangements and thus may have contributed to their evolutionary fixation in the human genome.
Affiliation(s)
- Grit Ebert
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Free University Berlin, 14195 Berlin, Germany
- Anne Steininger
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Free University Berlin, 14195 Berlin, Germany
- Robert Weißmann
- Department of Human Genetics, University Medicine Greifswald, and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Fleischmannstraße 42-44, 17475 Greifswald, Germany
- Vivien Boldt
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Free University Berlin, 14195 Berlin, Germany
- Allan Lind-Thomsen
- Wilhelm Johannsen Centre for Functional Genome Research, Department of Cellular and Molecular Medicine, University of Copenhagen, Blegdamsvej 3, DK-2200 Copenhagen, Denmark
- Jana Grune
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Stefan Badelt
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Institute for Theoretical Chemistry, University of Vienna, Waehringer Straße 17, A-1090 Vienna, Austria
- Melanie Heßler
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Matthias Peiser
- Unit Experimental Research, Department of Product Safety, Federal Institute for Bundeswehr Institute of Radiobiology affiliated, the University of Ulm, Neuherbergstraße 11, 80937 Munich, Germany
- Manuel Hitzler
- Unit Experimental Research, Department of Product Safety, Federal Institute for Bundeswehr Institute of Radiobiology affiliated, the University of Ulm, Neuherbergstraße 11, 80937 Munich, Germany
- Lars R Jensen
- Department of Human Genetics, University Medicine Greifswald, and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Fleischmannstraße 42-44, 17475 Greifswald, Germany
- Ines Müller
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Hao Hu
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Peter F Arndt
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Andreas W Kuss
- Department of Human Genetics, University Medicine Greifswald, and Interfaculty Institute of Genetics and Functional Genomics, University of Greifswald, Fleischmannstraße 42-44, 17475 Greifswald, Germany
- Katrin Tebel
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Reinhard Ullmann
- Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
69
Prendergast JGD, Chambers EV, Semple CAM. Sequence-level mechanisms of human epigenome evolution. Genome Biol Evol 2014; 6:1758-71. [PMID: 24966180 PMCID: PMC4122940 DOI: 10.1093/gbe/evu142] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023]
Abstract
DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage.
Affiliation(s)
- Emily V Chambers
- The Roslin Institute, The University of Edinburgh, Midlothian, United Kingdom
- Colin A M Semple
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, United Kingdom
70
Abstract
Research into when and where modern humans originated and how they differ from, and interacted with, other now-extinct forms of human has so far been the realm of archaeologists and paleoanthropologists. However, over the past decade, molecular geneticists have begun to study genomes of extinct humans. Here, I discuss where we stand today with respect to understanding how modern humans came to differ from Neandertals and other human forms that existed until about 30,000 years ago.
Affiliation(s)
- Svante Pääbo
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany.
71
Yeo RA, Gangestad SW, Walton E, Ehrlich S, Pommy J, Turner JA, Liu J, Mayer AR, Schulz SC, Ho BC, Bustillo JR, Wassink TH, Sponheim SR, Morrow EM, Calhoun VD. Genetic influences on cognitive endophenotypes in schizophrenia. Schizophr Res 2014; 156:71-5. [PMID: 24768440 PMCID: PMC4699552 DOI: 10.1016/j.schres.2014.03.022] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 11/17/2013] [Revised: 03/18/2014] [Accepted: 03/20/2014] [Indexed: 02/04/2023]
Abstract
BACKGROUND Cognitive deficits are prominent in schizophrenia and represent promising endophenotypes for genetic research. METHODS The current study investigated the importance of two conceptually distinct genetic aggregates, one based on copy number variations (uncommon deletion burden), and one based on single nucleotide polymorphisms identified in recent risk studies (genetic risk score). The impact of these genetic factors, and their interaction, was examined on cognitive endophenotypes defined by principal component analysis (PCA) in a multi-center sample of 50 patients with schizophrenia and 86 controls. PCA was used to identify three different types of executive function (EF: planning, fluency, and inhibition), and in separate analyses, a measure of general cognitive ability (GCA). RESULTS Cognitive deficits were prominent among individuals with schizophrenia, but no group differences were evident for either genetic factor. Among patients, the deletion burden measures predicted cognitive deficits across the three EF components and GCA. Further, an interaction was noted between the two genetic factors for both EF and GCA, and the observed patterns of interaction suggested antagonistic epistasis. In general, the set of genetic interactions examined predicted a substantial portion of variance in these cognitive endophenotypes. LIMITATIONS Though adequately powered, our sample size is small for a genetic study. CONCLUSIONS These results draw attention to genetic interactions and the possibility that genetic influences on cognition differ in patients and controls.
Affiliation(s)
- Ronald A. Yeo
- Department of Psychology, University of New Mexico, Albuquerque, NM, USA; The Mind Research Network, Albuquerque, NM, USA
- Esther Walton
- MGH/MIT/HMS Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA; Department of Child and Adolescent Psychiatry, University Hospital Carl Gustav Carus, Dresden University of Technology, Dresden, Germany
- Stefan Ehrlich
- MGH/MIT/HMS Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA, USA; Department of Child and Adolescent Psychiatry, University Hospital Carl Gustav Carus, Dresden University of Technology, Dresden, Germany; Department of Psychiatry, Massachusetts General Hospital, Boston, MA
- Jessica Pommy
- Department of Psychology, University of New Mexico, Albuquerque, NM, USA
- Jessica A. Turner
- The Mind Research Network, Albuquerque, NM, USA; Dept of Psychology and the Neuroscience Institute, Georgia State University, Atlanta, GA
- Jingyu Liu
- The Mind Research Network, Albuquerque, NM, USA; Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
- S. Charles Schulz
- Department of Psychiatry, University of Minnesota, Minneapolis, MN, USA
- Beng-Choon Ho
- Department of Psychiatry, Carver College of Medicine, University of Iowa, Iowa City, IA, USA
- Juan R. Bustillo
- The Mind Research Network, Albuquerque, NM, USA; Department of Psychiatry, University of New Mexico, Albuquerque, NM, USA
- Thomas H. Wassink
- Department of Psychiatry, Carver College of Medicine, University of Iowa, Iowa City, IA, USA
- Scott R. Sponheim
- Department of Psychiatry, University of Minnesota, Minneapolis, MN, USA; Minneapolis Veterans Administration Health Care System, Minneapolis, MN, USA
- Eric M. Morrow
- Department of Molecular Biology, Cell Biology and Biochemistry, Laboratory for Molecular Medicine, Brown University, Providence, RI, USA
- Vince D. Calhoun
- The Mind Research Network, Albuquerque, NM, USA; Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, USA
72
Cook DE, Bayless AM, Wang K, Guo X, Song Q, Jiang J, Bent AF. Distinct Copy Number, Coding Sequence, and Locus Methylation Patterns Underlie Rhg1-Mediated Soybean Resistance to Soybean Cyst Nematode. PLANT PHYSIOLOGY 2014; 165:630-647. [PMID: 24733883 PMCID: PMC4044848 DOI: 10.1104/pp.114.235952] [Citation(s) in RCA: 98] [Impact Index Per Article: 9.8] [Received: 01/15/2014] [Accepted: 03/19/2014] [Indexed: 05/18/2023]
Abstract
Copy number variation of kilobase-scale genomic DNA segments, beyond presence/absence polymorphisms, can be an important driver of adaptive traits. Resistance to Heterodera glycines (Rhg1) is a widely utilized quantitative trait locus that makes the strongest known contribution to resistance against soybean cyst nematode (SCN), Heterodera glycines, the most damaging pathogen of soybean (Glycine max). Rhg1 was recently discovered to be a complex locus at which resistance-conferring haplotypes carry up to 10 tandem repeat copies of a 31-kb DNA segment, and three disparate genes present on each repeat contribute to SCN resistance. Here, we use whole-genome sequencing, fiber-FISH (fluorescence in situ hybridization), and other methods to discover the genetic variation at Rhg1 across 41 diverse soybean accessions. Based on copy number variation, transcript abundance, nucleic acid polymorphisms, and differentially methylated DNA regions, we find that SCN resistance is associated with multicopy Rhg1 haplotypes that form two distinct groups. The tested high-copy-number Rhg1 accessions, including plant introduction (PI) 88788, contain a flexible number of copies (seven to 10) of the 31-kb Rhg1 repeat. The identified low-copy-number Rhg1 group, including PI 548402 (Peking) and PI 437654, contains three copies of the Rhg1 repeat and a newly identified allele of Glyma18g02590 (a predicted α-SNAP [α-soluble N-ethylmaleimide-sensitive factor attachment protein]). There is strong evidence for a shared origin of the two resistance-conferring multicopy Rhg1 groups and subsequent independent evolution. Differentially methylated DNA regions also were identified within Rhg1 that correlate with SCN resistance. These data provide insights into copy number variation of multigene segments, using as the example a disease resistance trait of high economic importance.
Affiliation(s)
- David E Cook
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Adam M Bayless
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Kai Wang
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Xiaoli Guo
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Qijian Song
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Jiming Jiang
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
- Andrew F Bent
- Department of Plant Pathology (D.E.C., A.M.B., X.G., A.F.B.) and Department of Horticulture (K.W., J.J.), University of Wisconsin, Madison, Wisconsin 53706; and Soybean Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, Maryland 20705 (Q.S.)
73
Ambreen S, Khalil F, Abbasi AA. Integrating large-scale phylogenetic datasets to dissect the ancient evolutionary history of vertebrate genome. Mol Phylogenet Evol 2014; 78:1-13. [PMID: 24821622 DOI: 10.1016/j.ympev.2014.05.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/04/2014] [Revised: 04/17/2014] [Accepted: 05/01/2014] [Indexed: 11/18/2022]
Abstract
BACKGROUND The vertebrate genome often contains closely spaced sets of paralogous genes from distinct gene families on typically two, three or four different chromosomes (paralogons). This type of genome architecture is widely considered to be a remnant of whole genome duplication events (WGD/2R). RESULTS Taking advantage of the well-annotated and high-quality human genomic sequence map as well as the ever-increasing accessibility of large-scale genomic sequence data from a diverse range of animal species, we investigated the evolutionary history of potential quadruplicated regions residing on human HOX-cluster bearing chromosomes (chromosomes 2/7/12/17). For this purpose, a detailed phylogenetic analysis was performed for those multigene families that include members on at least three of the four HOX-bearing chromosomes. A topology comparison approach categorized the members of 63 families into distinct co-duplicated groups. Distinct gene families belonging to a particular co-duplicated group exhibit similar evolutionary histories and hence have duplicated concurrently, whereas genes of two different co-duplicated groups do not share their history and have not duplicated in concert with each other. CONCLUSIONS These results, based on a large-scale phylogenetic dataset, yielded no evidence in favor of polyploidization events; instead, it appears that triplicated and quadruplicated genomic segments on the human HOX-bearing chromosomes arose by small-scale duplication events that occurred at widely different time points in animal evolution.
Affiliation(s)
- Sadaf Ambreen
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
- Faiqa Khalil
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
- Amir Ali Abbasi
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan.
74
Guo X, Zheng S, Dang H, Pace RG, Stonebraker JR, Jones CD, Boellmann F, Yuan G, Haridass P, Fedrigo O, Corcoran DL, Seibold MA, Ranade SS, Knowles MR, O'Neal WK, Voynow JA. Genome reference and sequence variation in the large repetitive central exon of human MUC5AC. Am J Respir Cell Mol Biol 2014; 50:223-32. [PMID: 24010879 DOI: 10.1165/rcmb.2013-0235oc] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022]
Abstract
Despite modern sequencing efforts, the difficulty in assembly of highly repetitive sequences has prevented resolution of human genome gaps, including some in the coding regions of genes with important biological functions. One such gene, MUC5AC, encodes a large, secreted mucin, which is one of the two major secreted mucins in human airways. The MUC5AC region contains a gap in the human genome reference (hg19) across the large, highly repetitive, and complex central exon. This exon is predicted to contain imperfect tandem repeat sequences and multiple conserved cysteine-rich (CysD) domains. To resolve the MUC5AC genomic gap, we used high-fidelity long PCR followed by single molecule real-time (SMRT) sequencing. This technology yielded long sequence reads and robust coverage that allowed for de novo sequence assembly spanning the entire repetitive region. Furthermore, we used SMRT sequencing of PCR amplicons covering the central exon to identify genetic variation in four individuals. The results demonstrated the presence of segmental duplications of CysD domains, insertions/deletions (indels) of tandem repeats, and single nucleotide variants. Additional studies demonstrated that one of the identified tandem repeat insertions is tagged by nonexonic single nucleotide polymorphisms. Taken together, these data illustrate the successful utility of SMRT sequencing long reads for de novo assembly of large repetitive sequences to fill the gaps in the human genome. Characterization of the MUC5AC gene and the sequence variation in the central exon will facilitate genetic and functional studies for this critical airway mucin.
Affiliation(s)
- Xueliang Guo
- 1 Cystic Fibrosis/Pulmonary Research and Treatment Center, and
75
|
Zhao Q, Han MJ, Sun W, Zhang Z. Copy number variations among silkworms. BMC Genomics 2014; 15:251. [PMID: 24684762 PMCID: PMC3997817 DOI: 10.1186/1471-2164-15-251] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 07/31/2013] [Accepted: 03/25/2014] [Indexed: 11/10/2022]
Abstract
Background Copy number variations (CNVs), which are an important source of genetic and phenotypic variation, have been shown to be associated with disease as well as important QTLs, especially in domesticated animals. However, little is known about CNVs in silkworm. Results In this study, we constructed the first CNV map based on genome-wide analysis of CNVs in domesticated silkworm. Using next-generation sequencing as well as quantitative PCR (qPCR), we identified ~319 CNVs in total, and almost half of them (~49%) were distributed on uncharacterized chromosome. The CNVs covered 10.8 Mb, which is about 2.3% of the entire silkworm genome. Furthermore, approximately 61% of CNVs directly overlapped with SDs in silkworm. The genes in CNVs are mainly related to reproduction, immunity, detoxification and signal recognition, which is consistent with observations in mammals. Conclusions An initial CNV map for silkworm is described in this study, and this map provides new information on genetic variation in silkworm. Furthermore, the silkworm CNVs may play important roles in reproduction, immunity, detoxification and signal recognition. This study provides insight into the evolution of the silkworm genome and an invaluable resource for insect genomics research.
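The summary figures in this abstract can be cross-checked with simple arithmetic. The sketch below is not part of the cited study; it only derives what the quoted approximate values imply (genome size, mean CNV length, rough counts), and all variable names are illustrative assumptions.

```python
# Approximate figures as quoted in the abstract above.
cnv_span_mb = 10.8           # total sequence covered by CNVs, in megabases
genome_fraction = 0.023      # "about 2.3%" of the silkworm genome
total_cnvs = 319             # "~319 CNVs in total"
uncharacterized_share = 0.49 # "~49%" on uncharacterized chromosome
sd_overlap_share = 0.61      # "approximately 61%" overlapping SDs

# Genome size implied by the coverage figures, in megabases.
implied_genome_mb = cnv_span_mb / genome_fraction

# Mean CNV length implied by total span and count, in kilobases.
mean_cnv_kb = cnv_span_mb * 1000 / total_cnvs

# Rough CNV counts implied by the quoted percentages.
uncharacterized_cnvs = round(total_cnvs * uncharacterized_share)
sd_overlapping_cnvs = round(total_cnvs * sd_overlap_share)

print(f"implied genome size: {implied_genome_mb:.0f} Mb")
print(f"mean CNV length: {mean_cnv_kb:.1f} kb")
print(f"CNVs on uncharacterized chromosome: ~{uncharacterized_cnvs}")
print(f"CNVs overlapping SDs: ~{sd_overlapping_cnvs}")
```

Because every input is itself rounded in the abstract, the derived values are estimates only.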
Affiliation(s)
- Ze Zhang
- Laboratory of Evolutionary and Functional Genomics, School of Life Sciences, Chongqing University, Chongqing 400044, China.
76
Abstract
To understand the emergence of human higher cognition, we must understand its biological substrate--the cerebral cortex, which considers itself the crowning achievement of evolution. Here, we describe how advances in developmental neurobiology, coupled with those in genetics, including adaptive protein evolution via gene duplications and the emergence of novel regulatory elements, can provide insights into the evolutionary mechanisms culminating in the human cerebrum. Given that the massive expansion of the cortical surface and elaboration of its connections in humans originates from developmental events, understanding the genetic regulation of cell number, neuronal migration to proper layers, columns, and regions, and ultimately their differentiation into specific phenotypes, is critical. The pre- and postnatal environment also interacts with the cellular substrate to yield a basic network that is refined via selection and elimination of synaptic connections, a process that is prolonged in humans. This knowledge provides essential insight into the pathogenesis of human-specific neuropsychiatric disorders.
Affiliation(s)
- Daniel H Geschwind
- Program in Neurogenetics, Department of Neurology, University of California, Los Angeles, Los Angeles, CA, 90095, USA.
77
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022]
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, Seattle, Washington, United States of America
78
Fawcett JA, Innan H. The role of gene conversion in preserving rearrangement hotspots in the human genome. Trends Genet 2013; 29:561-8. [PMID: 23953668 DOI: 10.1016/j.tig.2013.07.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 04/18/2013] [Revised: 06/20/2013] [Accepted: 07/08/2013] [Indexed: 11/27/2022]
Abstract
Hotspots of non-allelic homologous recombination (NAHR) have a crucial role in creating genetic diversity and are also associated with dozens of genomic disorders. Recent studies suggest that many human NAHR hotspots have been preserved throughout the evolution of primates. NAHR hotspots are likely to remain active as long as the segmental duplications (SDs) promoting NAHR retain sufficient similarity. Here, we propose an evolutionary model of SDs that incorporates the effect of gene conversion and compare it with a null model that assumes SDs evolve independently without gene conversion. The gene conversion model predicts a much longer lifespan of NAHR hotspots compared with the null model. We show that the literature on copy number variants (CNVs) and genomic disorders, and also the results of additional analysis of CNVs, are all more consistent with the gene conversion model.
Affiliation(s)
- Jeffrey A Fawcett
- Graduate University for Advanced Studies, Hayama, Kanagawa 240-0193, Japan
|
79
|
|
80
|
Paudel Y, Madsen O, Megens HJ, Frantz LAF, Bosse M, Bastiaansen JWM, Crooijmans RPMA, Groenen MAM. Evolutionary dynamics of copy number variation in pig genomes in the context of adaptation and domestication. BMC Genomics 2013; 14:449. [PMID: 23829399 PMCID: PMC3716681 DOI: 10.1186/1471-2164-14-449] [Citation(s) in RCA: 101] [Impact Index Per Article: 9.2] [Received: 01/14/2013] [Accepted: 07/01/2013] [Indexed: 12/23/2022] Open
Abstract
Background Copy number variable regions (CNVRs) can result in drastic phenotypic differences and may therefore be subject to selection during domestication. Studying copy number variation in relation to domestication is highly relevant in pigs because of their very rich natural and domestication history that resulted in many different phenotypes. To investigate the evolutionary dynamic of CNVRs, we applied read depth method on next generation sequence data from 16 individuals, comprising wild boars and domestic pigs from Europe and Asia. Results We identified 3,118 CNVRs with an average size of 13 kilobases comprising a total of 39.2 megabases of the pig genome and 545 overlapping genes. Functional analyses revealed that CNVRs are enriched with genes related to sensory perception, neurological process and response to stimulus, suggesting their contribution to adaptation in the wild and behavioral changes during domestication. Variations of copy number (CN) of antimicrobial related genes suggest an ongoing process of evolution of these genes to combat food-borne pathogens. Likewise, some genes related to the omnivorous lifestyle of pigs, like genes involved in detoxification, were observed to be CN variable. A small portion of CNVRs was unique to domestic pigs and may have been selected during domestication. The majority of CNVRs, however, is shared between wild and domesticated individuals, indicating that domestication had minor effect on the overall diversity of CNVRs. Also, the excess of CNVRs in non-genic regions implies that a major part of these variations is likely to be (nearly) neutral. Comparison between different populations showed that larger populations have more CNVRs, highlighting that CNVRs are, like other genetic variation such as SNPs and microsatellites, reflecting demographic history rather than phenotypic diversity. Conclusion CNVRs in pigs are enriched for genes related to sensory perception, neurological process, and response to stimulus. 
The majority of CNVRs ascertained in domestic pigs are also variable in wild boars, suggesting that the domestication of the pig did not result in a change in CNVRs in domesticated pigs. The majority of variable regions were found to reflect demographic patterns rather than phenotypic.
Affiliation(s)
- Yogesh Paudel
- Animal Breeding and Genomics Centre, Wageningen University, De Elst 1, Wageningen, WD, 6708, The Netherlands.
|
81
|
Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data. Microarrays 2013; 2:171-85. [PMID: 27605188 PMCID: PMC5003459 DOI: 10.3390/microarrays2030171] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 05/02/2013] [Revised: 06/04/2013] [Accepted: 06/05/2013] [Indexed: 11/23/2022]
Abstract
Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.
|
82
|
Sassa T. The Role of Human-Specific Gene Duplications During Brain Development and Evolution. J Neurogenet 2013; 27:86-96. [DOI: 10.3109/01677063.2013.789512] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
|
83
|
Zhang YB, Liu TK, Jiang J, Shi J, Liu Y, Li S, Gui JF. Identification of a novel Gig2 gene family specific to non-amniote vertebrates. PLoS One 2013; 8:e60588. [PMID: 23593256 PMCID: PMC3617106 DOI: 10.1371/journal.pone.0060588] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 11/06/2012] [Accepted: 02/28/2013] [Indexed: 12/15/2022] Open
Abstract
Gig2 (grass carp reovirus (GCRV)-induced gene 2) was first identified as a novel fish interferon (IFN)-stimulated gene (ISG). Overexpression of a zebrafish Gig2 gene can protect cultured fish cells from virus infection. In the present study, we identify a novel gene family that is comprised of genes homologous to the previously characterized Gig2. EST/GSS search and in silico cloning identify 190 Gig2 homologous genes in 51 vertebrate species ranging from lampreys to amphibians. A further large-scale search of vertebrate and invertebrate genome databases indicates that the Gig2 gene family is specific to non-amniotes including lampreys, sharks/rays, ray-finned fishes and amphibians. Phylogenetic analysis and synteny analysis reveal lineage-specific expansion of the Gig2 gene family and also provide valuable evidence for the fish-specific genome duplication (FSGD) hypothesis. Although Gig2 family proteins exhibit no significant sequence similarity to any known proteins, a typical Gig2 protein appears to consist of two conserved parts: an N-terminus that bears very low homology to the catalytic domains of poly(ADP-ribose) polymerases (PARPs), and a novel C-terminal domain that is unique to this gene family. Expression profiling of zebrafish Gig2 family genes shows that some duplicate pairs have diverged in function via acquisition of novel spatial and/or temporal expression under stresses. The specificity of this gene family to non-amniotes might contribute to a large extent to distinct physiology in non-amniote vertebrates.
Affiliation(s)
- Yi-Bing Zhang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Ting-Kai Liu
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Jun Jiang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Jun Shi
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Ying Liu
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Shun Li
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
- Jian-Fang Gui
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
|
84
|
Currall BB, Chiang C, Talkowski ME, Morton CC. Mechanisms for Structural Variation in the Human Genome. Current Genetic Medicine Reports 2013; 1:81-90. [PMID: 23730541 DOI: 10.1007/s40142-013-0012-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
Abstract
It has been known for several decades that genetic variation involving changes to chromosomal structure (i.e., structural variants) can contribute to disease; however this relationship has been brought into acute focus in recent years largely based on innovative new genomics approaches and technology. Structural variants (SVs) arise from improperly repaired DNA double-strand breaks (DSB). DSBs are a frequent occurrence in all cells and two major pathways are involved in their repair: homologous recombination and non-homologous end joining. Errors during these repair mechanisms can result in SVs that involve losses, gains and rearrangements ranging from a few nucleotides to entire chromosomal arms. Factors such as rearrangements, hotspots and induced DSBs are implicated in the formation of SVs. While de novo SVs are often associated with disease, some SVs are conserved within human subpopulations and may have had a meaningful influence on primate evolution. As the ability to sequence the whole human genome rapidly evolves, the diversity of SVs is illuminated, including very complex rearrangements involving multiple DSBs in a process recently designated as "chromothripsis". Elucidating mechanisms involved in the etiology of SVs informs disease pathogenesis as well as the dynamic function associated with the biology and evolution of human genomes.
Affiliation(s)
- Benjamin B Currall
- Departments of Obstetrics, Gynecology and Reproductive Biology, Brigham and Women's Hospital and Harvard Medical School, New Research Building, Room 160D, 77 Avenue Louis Pasteur, Boston, MA 02115, USA. Harvard Medical School, Boston, MA, USA
|
85
|
Lorente-Galdos B, Bleyhl J, Santpere G, Vives L, Ramírez O, Hernandez J, Anglada R, Cooper GM, Navarro A, Eichler EE, Marques-Bonet T. Accelerated exon evolution within primate segmental duplications. Genome Biol 2013; 14:R9. [PMID: 23360670 PMCID: PMC3906575 DOI: 10.1186/gb-2013-14-1-r9] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/11/2012] [Revised: 12/20/2012] [Accepted: 01/29/2013] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND The identification of signatures of natural selection has long been used as an approach to understanding the unique features of any given species. Genes within segmental duplications are overlooked in most studies of selection due to the limitations of draft nonhuman genome assemblies and to the methodological reliance on accurate gene trees, which are difficult to obtain for duplicated genes. RESULTS In this work, we detected exons with an accumulation of high-quality nucleotide differences between the human assembly and shotgun sequencing reads from single human and macaque individuals. Comparing the observed rates of nucleotide differences between coding exons and their flanking intronic sequences with a likelihood-ratio test, we identified 74 exons with evidence for rapid coding sequence evolution during the evolution of humans and Old World monkeys. Fifty-five percent of rapidly evolving exons were either partially or totally duplicated, which is a significant enrichment of the 6% rate observed across all human coding exons. CONCLUSIONS Our results provide a more comprehensive view of the action of selection upon segmental duplications, which are the most complex regions of our genomes. In light of these findings, we suggest that segmental duplications could be subjected to rapid evolution more frequently than previously thought.
Affiliation(s)
- Belen Lorente-Galdos
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Jonathan Bleyhl
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Gabriel Santpere
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Laura Vives
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Oscar Ramírez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Jessica Hernandez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Roger Anglada
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Gregory M Cooper
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Arcadi Navarro
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, Seattle, Washington 98195, USA
- Tomas Marques-Bonet
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
|
86
|
Asrar Z, Haq F, Abbasi AA. Fourfold paralogy regions on human HOX-bearing chromosomes: role of ancient segmental duplications in the evolution of vertebrate genome. Mol Phylogenet Evol 2012; 66:737-47. [PMID: 23142696 DOI: 10.1016/j.ympev.2012.10.024] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 06/10/2012] [Revised: 10/27/2012] [Accepted: 10/29/2012] [Indexed: 01/26/2023]
Abstract
BACKGROUND Susumu Ohno's idea that modern vertebrates are degenerate polyploids (concept referred as 2R hypothesis) has been the subject of intense debate for past four decades. It was proposed that intra-genomic synteny regions (paralogons) in human genome are remains of ancient polyploidization events that occurred early in the vertebrate history. The quadruplicated paralogon centered on human HOX clusters is taken as evidence that human HOX-bearing chromosomes were structured by two rounds of whole genome duplication (WGD) events. RESULTS Evolutionary history of human HOX-bearing chromosomes (chromosomes 2/7/12/17) was evaluated by the phylogenetic analysis of multigene families with triplicated or quadruplicated distribution on these chromosomes. Topology comparison approach categorized the members of 44 families into four distinct co-duplicated groups. Distinct gene families belonging to a particular co-duplicated group, exhibit similar evolutionary history and hence have duplicated simultaneously, whereas genes of two distinct co-duplicated groups do not share their evolutionary history and have not duplicated in concert with each other. CONCLUSION The recovery of co-duplicated groups suggests that "ancient segmental duplications and rearrangements" is the most rational model of evolutionary events that have generated the triplicated and quadruplicated paralogy regions seen on the human HOX-bearing chromosomes.
Affiliation(s)
- Zainab Asrar
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
|
87
|
Mácha J, Teichmanová R, Sater AK, Wells DE, Tlapáková T, Zimmerman LB, Krylov V. Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis. BMC Genomics 2012; 13:315. [PMID: 22800176 PMCID: PMC3472169 DOI: 10.1186/1471-2164-13-315] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 01/30/2012] [Accepted: 06/25/2012] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND The X and Y sex chromosomes are conspicuous features of placental mammal genomes. Mammalian sex chromosomes arose from an ordinary pair of autosomes after the proto-Y acquired a male-determining gene and degenerated due to suppression of X-Y recombination. Analysis of earlier steps in X chromosome evolution has been hampered by the long interval between the origins of teleost and amniote lineages as well as scarcity of X chromosome orthologs in incomplete avian genome assemblies. RESULTS This study clarifies the genesis and remodelling of the Eutherian X chromosome by using a combination of sequence analysis, meiotic map information, and cytogenetic localization to compare amniote genome organization with that of the amphibian Xenopus tropicalis. Nearly all orthologs of human X genes localize to X. tropicalis chromosomes 2 and 8, consistent with an ancestral X-conserved region and a single X-added region precursor. This finding contradicts a previous hypothesis of three evolutionary strata in this region. Homologies between human, opossum, chicken and frog chromosomes suggest a single X-added region predecessor in therian mammals, corresponding to opossum chromosomes 4 and 7. A more ancient X-added ancestral region, currently extant as a major part of chicken chromosome 1, is likely to have been present in the progenitor of synapsids and sauropsids. Analysis of X chromosome gene content emphasizes conservation of single protein coding genes and the role of tandem arrays in formation of novel genes. CONCLUSIONS Chromosomal regions orthologous to Therian X chromosomes have been located in the genome of the frog X. tropicalis. These X chromosome ancestral components experienced a series of fusion and breakage events to give rise to avian autosomes and mammalian sex chromosomes. The early branching tetrapod X. tropicalis' simple diploid genome and robust synteny to amniotes greatly enhances studies of vertebrate chromosome evolution.
Affiliation(s)
- Jaroslav Mácha
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Radka Teichmanová
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Amy K Sater
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Dan E Wells
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Tereza Tlapáková
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Lyle B Zimmerman
- Division of Developmental Biology, MRC-National Institute for Medical Research, Mill Hill, London, NW7 1AA, UK
- Vladimír Krylov
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
|
88
|
Liu GE, Bickhart DM. Copy number variation in the cattle genome. Funct Integr Genomics 2012; 12:609-24. [DOI: 10.1007/s10142-012-0289-9] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 04/02/2012] [Revised: 06/13/2012] [Accepted: 06/20/2012] [Indexed: 11/29/2022]
|
89
|
Abstract
New genes are a major source of genetic innovation in genomes. However, until recently, understanding how new genes originate and how they evolve was hampered by the lack of appropriate genetic datasets. The advent of the genomic era brought about a revolution in the amount of data available to study new genes. For the first time, decades-old theoretical principles could be tested empirically and novel and unexpected avenues of research opened up. This chapter explores how genomic data can and is being used to study both the origin and evolution of new genes and the surprising discoveries made thus far.
|
90
|
Marotta M, Piontkivska H, Tanaka H. Molecular trajectories leading to the alternative fates of duplicate genes. PLoS One 2012; 7:e38958. [PMID: 22720000 PMCID: PMC3375281 DOI: 10.1371/journal.pone.0038958] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/21/2012] [Accepted: 05/14/2012] [Indexed: 11/21/2022] Open
Abstract
Gene duplication generates extra gene copies in which mutations can accumulate without risking the function of pre-existing genes. Such mutations modify duplicates and contribute to evolutionary novelties. However, the vast majority of duplicates appear to be short-lived and experience duplicate silencing within a few million years. Little is known about the molecular mechanisms leading to these alternative fates. Here we delineate differing molecular trajectories of a relatively recent duplication event between humans and chimpanzees by investigating molecular properties of a single duplicate: DNA sequences, gene expression and promoter activities. The inverted duplication of the Glutathione S-transferase Theta 2 (GSTT2) gene had occurred at least 7 million years ago in the common ancestor of African great apes and is preserved in chimpanzees (Pan troglodytes), whereas a deletion polymorphism is prevalent in humans. The alternative fates are associated with expression divergence between these species, and reduced expression in humans is regulated by silencing mutations that have been propagated between duplicates by gene conversion. In contrast, selective constraint preserved duplicate divergence in chimpanzees. The difference in evolutionary processes left a unique DNA footprint in which dying duplicates are significantly more similar to each other (99.4%) than preserved ones. Such molecular trajectories could provide insights for the mechanisms underlying duplicate life and death in extant genomes.
Affiliation(s)
- Michael Marotta
- Department of Molecular Genetics, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Kent, Ohio, United States of America
- Hisashi Tanaka
- Department of Molecular Genetics, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
|
91
|
Audemard E, Schiex T, Faraut T. Detecting long tandem duplications in genomic sequences. BMC Bioinformatics 2012; 13:83. [PMID: 22568762 PMCID: PMC3464658 DOI: 10.1186/1471-2105-13-83] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 09/27/2011] [Accepted: 05/08/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. RESULTS In this paper, we introduce ReD Tandem, a software using a flow based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non coding elements such as pseudo-genes or RNA genes. CONCLUSIONS ReD Tandem allows to identify large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene based which ignores duplications involving non coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also allow to improve existing annotations.
Affiliation(s)
- Eric Audemard
- Unité de Biométrie et Intelligence Artificielle, UR 875, INRA, Toulouse, France.
|
92
|
Ranz JM, Parsch J. Newly evolved genes: moving from comparative genomics to functional studies in model systems. How important is genetic novelty for species adaptation and diversification? Bioessays 2012; 34:477-83. [PMID: 22461005 DOI: 10.1002/bies.201100177] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 12/23/2022]
Abstract
Genes are gained and lost over the course of evolution. A recent study found that over 1,800 new genes have appeared during primate evolution and that an unexpectedly high proportion of these genes are expressed in the human brain. But what are the molecular functions of newly evolved genes and what is their impact on an organism's fitness? The acquisition of new genes may provide a rich source of genetic diversity that fuels evolutionary innovation. Although gene manipulation experiments are not feasible in humans, studies in model organisms, such as Drosophila melanogaster, have shown that new genes can quickly become integrated into genetic networks and become essential for survival or fertility. Future studies of new genes, especially chimeric genes, and their functions will help determine the role of genetic novelty in the adaptation and diversification of species.
Affiliation(s)
- José M Ranz
- Department of Ecology and Evolutionary Biology, University of California-Irvine, CA, USA.
|
93
|
Abbasi AA, Hanif H. Phylogenetic history of paralogous gene quartets on human chromosomes 1, 2, 8 and 20 provides no evidence in favor of the vertebrate octoploidy hypothesis. Mol Phylogenet Evol 2012; 63:922-7. [PMID: 22425707 DOI: 10.1016/j.ympev.2012.02.028] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 09/22/2011] [Revised: 02/08/2012] [Accepted: 02/27/2012] [Indexed: 01/24/2023]
Abstract
Fourfold paralogy regions in the human genome have been considered historical remnants of whole-genome duplication events predicted to have occurred early in vertebrate evolution. Taking advantage of the well-annotated and high-quality human genomic sequence map as well as the ever-increasing accessibility of large-scale genomic sequence data from a diverse range of animal species, we investigated the prediction that the ancestral vertebrate genome was shaped by two rapid rounds of whole-genome duplication within a period of 10 million years. Both the map self-comparison approach and a phylogenetic analysis revealed that gene families identified as tetralogous on human chromosomes 1/2/8/20 arose by small-scale duplication events that occurred at widely different time points in animal evolution. Furthermore, the data discount the likelihood that tree topologies of the form ((A,B)(C,D)) are best explained by the octoploidy hypothesis. We instead propose that such symmetrical tree patterns are also consistent with local duplications and rearrangement events.
Affiliation(s)
- Amir Ali Abbasi
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan.
|
94
|
Bickhart DM, Hou Y, Schroeder SG, Alkan C, Cardone MF, Matukumalli LK, Song J, Schnabel RD, Ventura M, Taylor JF, Garcia JF, Van Tassell CP, Sonstegard TS, Eichler EE, Liu GE. Copy number variation of individual cattle genomes using next-generation sequencing. Genome Res 2012; 22:778-90. [PMID: 22300768 DOI: 10.1101/gr.133967.111] [Citation(s) in RCA: 220] [Impact Index Per Article: 18.3] [Indexed: 01/27/2023]
Abstract
Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one Holstein, and one Hereford) and one indicine (Nelore) cattle. Within mapped chromosomal sequence, we identified 1265 CNV regions comprising ~55.6-Mbp sequence, 476 of which (~38%) have not previously been reported. We validated this sequence-based CNV call set with array comparative genomic hybridization (aCGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH), achieving a validation rate of 82% and a false positive rate of 8%. We further estimated absolute copy numbers for genomic segments and annotated genes in each individual. Surveys of the top 25 most variable genes revealed that the Nelore individual had the lowest copy numbers in 13 cases (~52%, χ² test; P-value <0.05). In contrast, genes related to pathogen- and parasite-resistance, such as CATHL4 and ULBP17, were highly duplicated in the Nelore individual relative to the taurine cattle, while genes involved in lipid transport and metabolism, including APOL3 and FABP2, were highly duplicated in the beef breeds. These CNV regions also harbor genes like BPIFA2A (BSP30A) and WC1, suggesting that some CNVs may be associated with breed-specific differences in adaptation, health, and production traits. By providing the first individualized cattle CNV and segmental duplication maps and genome-wide gene copy number estimates, we enable future CNV studies into highly duplicated regions in the cattle genome.
Affiliation(s)
- Derek M Bickhart
- USDA-ARS, ANRI, Bovine Functional Genomics Laboratory, Beltsville, Maryland 20705, USA
|
95
|
Abstract
Structural variation (SV) encompasses diverse types of genomic variants including deletions, duplications, inversions, transpositions, translocations, and complex rearrangements, and is now recognized to be an abundant class of genetic variation in mammals. Different individuals, or strains, of a given species can differ by thousands of variants. However, despite a large number of studies over the past decade and impressive progress on many fronts, there remain significant gaps in our knowledge, particularly in species other than human. Arguably the most relevant among these are genetically tractable models such as mouse, rat, and dog. The emergence of efficient and affordable DNA sequencing technologies presents an opportunity to make rapid progress toward understanding the nature, origin, and function of SV in these, and other, domesticated species. Here, we summarize the current state of knowledge of SV in mammals, with a focus on the similarities and differences between domesticated species and human. We then present methods to identify SV breakpoints from next-generation sequence (NGS) data by paired-end mapping, split-read mapping, and local assembly, and discuss challenges that arise when interpreting these data in lineages with complex breeding histories and incomplete reference genomes. We further describe technical modifications that allow for identification of variants involving repetitive DNA elements such as transposons and segmental duplications. Finally, we explore a few of the key biological insights that can be gained by applying NGS methods to model organisms.
Affiliation(s)
- Ira M Hall
- Department of Biochemistry and Molecular Genetics, University of Virginia School of Medicine, Charlottesville, VA, USA.
|
96
|
Vu TH, Coccaro EF, Eichler EE, Girirajan S. Genomic architecture of aggression: rare copy number variants in intermittent explosive disorder. Am J Med Genet B Neuropsychiatr Genet 2011; 156B:808-16. [PMID: 21812102 PMCID: PMC3168586 DOI: 10.1002/ajmg.b.31225] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 04/26/2011] [Accepted: 07/11/2011] [Indexed: 12/29/2022]
Abstract
Copy number variants (CNVs) are known to be associated with complex neuropsychiatric disorders (e.g., schizophrenia and autism) but have not been explored in the isolated features of aggressive behaviors such as intermittent explosive disorder (IED). IED is characterized by recurrent episodes of aggression in which individuals act impulsively and grossly out of proportion from the involved stressors. Previous studies have identified genetic variants in the serotonergic pathway that play a role in susceptibility to this behavior, but additional contributors have not been identified. Therefore, to further delineate possible genetic influences, we investigated CNVs in individuals diagnosed with IED and/or personality disorder (PD). We carried out array comparative genomic hybridization on 113 samples of individuals with isolated features of IED (n = 90) or PD (n = 23). We detected a recurrent 1.35-Mbp deletion on chromosome 1q21.1 in one IED subject and a novel ∼350-kbp deletion on chromosome 16q22.3q23.1 in another IED subject. While five recent reports have suggested the involvement of an ∼1.6-Mbp 15q13.3 deletion in individuals with behavioral problems, particularly aggression, we report an absence of such events in our study of individuals specifically selected for aggression. We did, however, detect a smaller ∼430-kbp 15q13.3 duplication containing CHRNA7 in one individual with PD. While these results suggest a possible role for rare CNVs in identifying genes underlying IED or PD, further studies on a large number of well-characterized individuals are necessary.
Affiliation(s)
- Tiffany H Vu
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, USA.
- Emil F Coccaro
- Department of Psychiatry and Behavioral Neuroscience, University of Chicago, Chicago, Illinois
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Howard Hughes Medical Institute, University of Washington School of Medicine, Seattle, Washington
- Santhosh Girirajan
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington
- Correspondence to: Santhosh Girirajan, MBBS, Ph.D., Department of Genome Sciences, University of Washington, Foege S-413A, Box 355065, 3720 15th Ave NE, Seattle, WA 98195. E-mail:
|
97
|
Ponting CP, Nellåker C, Meader S. Rapid turnover of functional sequence in human and other genomes. Annu Rev Genomics Hum Genet 2011; 12:275-99. [PMID: 21721940 DOI: 10.1146/annurev-genom-090810-183115] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
Abstract
The amount of a genome's sequence that is functional has been surprisingly difficult to estimate accurately. This has severely hindered analyses asking whether the amount of functional genomic sequence correlates with organismal complexity. Most studies estimate these amounts by considering nucleotide substitution rates within aligned sequences. These approaches show reduced power to identify sequence that is aligned, functional, and constrained only within narrowly defined phyla. The neutral indel model exploits insertions or deletions (indels) rather than substitutions in predicting functional sequence. Surprisingly, this method indicates that half of all functional sequence is specific to individual eutherian lineages. This review considers the rates at which coding or noncoding and functional or nonfunctional sequence changes among mammalian genomes. In contrast to the slow rate at which protein-coding sequence changes, functional noncoding sequence appears to change or be turned over at rapid rates in mammals.
Affiliation(s)
- Chris P Ponting
- Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3QX, United Kingdom.
|
98
|
|
99
|
Yu P, Wang C, Xu Q, Feng Y, Yuan X, Yu H, Wang Y, Tang S, Wei X. Detection of copy number variations in rice using array-based comparative genomic hybridization. BMC Genomics 2011; 12:372. [PMID: 21771342 PMCID: PMC3156786 DOI: 10.1186/1471-2164-12-372] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Received: 04/14/2011] [Accepted: 07/20/2011] [Indexed: 01/02/2023]
Abstract
Background: Copy number variations (CNVs) can create new genes, change gene dosage, reshape gene structures, and modify elements regulating gene expression. As with all types of genetic variation, CNVs may influence phenotypic variation and gene expression. CNVs are thus considered major sources of genetic variation. Little is known, however, about their contribution to genetic variation in rice. Results: To detect CNVs, we used a set of NimbleGen whole-genome comparative genomic hybridization arrays containing 718,256 oligonucleotide probes with a median probe spacing of 500 bp. We compiled a high-resolution map of CNVs in the rice genome, showing 641 CNVs between the genomes of the rice cultivars 'Nipponbare' (from O. sativa ssp. japonica) and 'Guang-lu-ai 4' (from O. sativa ssp. indica). The CNVs identified vary in size from 1.1 kb to 180.7 kb, and encompass approximately 7.6 Mb of the rice genome. The largest regions showing copy gain and loss are 37.4 kb on chromosome 4 and 180.7 kb on chromosome 8. In addition, 85 DNA segments were identified, including some genic sequences. Contracted genes greatly outnumbered duplicated ones. Many of the contracted genes corresponded to either the same genes or genes involved in the same biological processes; this was also the case for genes involved in disease and defense. Conclusion: We detected CNVs in rice by array-based comparative genomic hybridization. These CNVs contain known genes. Further discussion of CNVs is important, as they are linked to variation among rice varieties, and are likely to contribute to subspecific characteristics.
Affiliation(s)
- Ping Yu
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Hangzhou, China
|
100
|
Chung D, Kuan PF, Li B, Sanalkumar R, Liang K, Bresnick EH, Dewey C, Keleş S. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data. PLoS Comput Biol 2011; 7:e1002111. [PMID: 21779159 PMCID: PMC3136429 DOI: 10.1371/journal.pcbi.1002111] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 01/20/2011] [Accepted: 05/18/2011] [Indexed: 11/19/2022]
Abstract
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
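The notion of allocating multi-reads as fractional counts can be sketched as follows. This is a simplified, single-pass illustration and not the weighted alignment scheme of the paper: the function name, the use of uni-read counts per locus as weights, and the pseudocount are all assumptions introduced here.

```python
from collections import defaultdict


def allocate_multireads(uni_counts, multi_reads, pseudocount=1.0):
    """Split each multi-read across its candidate loci as fractional counts.

    uni_counts:  dict mapping locus -> number of uniquely mapped reads.
    multi_reads: list of lists; each inner list holds the candidate loci
                 of one multi-read.
    Weights are proportional to (uni-read count + pseudocount), which is an
    illustrative choice rather than the published method.
    """
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    totals = defaultdict(float)
    for locus, count in uni_counts.items():
        totals[locus] += count
    for candidates in multi_reads:
        weights = [uni_counts.get(locus, 0) + pseudocount for locus in candidates]
        norm = sum(weights)
        for locus, weight in zip(candidates, weights):
            # Each multi-read contributes a total of exactly one count.
            totals[locus] += weight / norm
    return dict(totals)


if __name__ == "__main__":
    uni = {"A": 3, "B": 0}
    multi = [["A", "B"]]
    print(allocate_multireads(uni, multi))
```

Each multi-read adds one count in total, so overall depth rises by the number of multi-reads while loci with more unique support receive the larger share.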
Affiliation(s)
- Dongjun Chung
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Pei Fen Kuan
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Bo Li
- Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, United States of America
- Rajendran Sanalkumar
- Wisconsin Institutes for Medical Research, UW Carbone Cancer Center, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States of America
- Kun Liang
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Emery H. Bresnick
- Wisconsin Institutes for Medical Research, UW Carbone Cancer Center, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, Wisconsin, United States of America
- Colin Dewey
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Computer Sciences, University of Wisconsin, Madison, Wisconsin, United States of America
- Sündüz Keleş
- Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America
|