151
Zhang Q, Su B. Evolutionary origin and human-specific expansion of a cancer/testis antigen gene family. Mol Biol Evol 2014; 31:2365-75. [PMID: 24916032 DOI: 10.1093/molbev/msu188] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022] [Open Access]
Abstract
Cancer/testis (CT) antigens are encoded by germline genes and are aberrantly expressed in a number of human cancers. Interestingly, CT antigens frequently belong to gene families that are highly expressed in germ cells. Here, we present an evolutionary analysis of the CTAGE (cutaneous T-cell-lymphoma-associated antigen) gene family to delineate its molecular history and functional significance during primate evolution. Comparisons among human, chimpanzee, gorilla, orangutan, macaque, marmoset, and other mammals show a rapid, primate-specific expansion of the CTAGE family, which started with an ancestral retroposition in the haplorhini ancestor. Subsequent DNA-based duplications led to the proliferation of single-exon CTAGE copies in catarrhines, especially in humans. Positive selection was identified on the single-exon copies, in contrast to functional constraint on the multiexon copies. Further sequence analysis suggests that the newly derived CTAGE genes may have obtained regulatory elements from long terminal repeats. Our results indicate the dynamic evolution of primate genomes, and the recent expansion of this CT antigen family in humans may have conferred advantageous phenotypic traits during early human evolution.
Affiliation(s)
- Qu Zhang
  - Department of Human Evolutionary Biology, Graduate School of Art and Science, Harvard University
- Bing Su
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
152
Interplay of interlocus gene conversion and crossover in segmental duplications under a neutral scenario. G3-GENES GENOMES GENETICS 2014; 4:1479-89. [PMID: 24906640 PMCID: PMC4132178 DOI: 10.1534/g3.114.012435] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]
Abstract
Interlocus gene conversion is a major evolutionary force that drives the concerted evolution of duplicated genomic regions. Theoretical models have successfully addressed the effects of interlocus gene conversion and the importance of crossover in the evolutionary fate of gene families and duplications, but they have not considered complex recombination scenarios, such as the presence of hotspots. To study the interplay between interlocus gene conversion and crossover, we have developed a forward-time simulator that allows the exploration of a wide range of interlocus gene conversion rates under different crossover models. Using it, we have analyzed patterns of nucleotide variation and linkage disequilibrium within and between duplicate regions, focusing on a neutral scenario with constant population size and validating our results against existing theoretical models. We show that the interaction of gene conversion and crossover is nontrivial and that the location of crossover junctions is a fundamental determinant of levels of variation and linkage disequilibrium in duplicated regions. We also show that if crossover activity between duplications is strong enough, recurrent interlocus gene conversion events can break linkage disequilibrium within duplicates. Given the complex nature of interlocus gene conversion and crossover, we provide a framework to explore their interplay and to help increase knowledge of molecular evolution within segmental duplications under more complex scenarios, such as demographic changes or natural selection.
153
Ambreen S, Khalil F, Abbasi AA. Integrating large-scale phylogenetic datasets to dissect the ancient evolutionary history of vertebrate genome. Mol Phylogenet Evol 2014; 78:1-13. [PMID: 24821622 DOI: 10.1016/j.ympev.2014.05.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 03/04/2014] [Revised: 04/17/2014] [Accepted: 05/01/2014] [Indexed: 11/18/2022]
Abstract
BACKGROUND: The vertebrate genome often contains closely spaced sets of paralogous genes from distinct gene families on typically two, three or four different chromosomes (paralogons). This type of genome architecture is widely considered to be a remnant of whole genome duplication events (WGD/2R). RESULTS: Taking advantage of the well-annotated and high-quality human genomic sequence map as well as the ever-increasing accessibility of large-scale genomic sequence data from a diverse range of animal species, we investigated the evolutionary history of potential quadruplicated regions residing on human HOX-cluster bearing chromosomes (chromosomes 2/7/12/17). For this purpose, a detailed phylogenetic analysis was performed for those multigene families that include members on at least three of the four HOX-bearing chromosomes. A topology comparison approach categorized the members of 63 families into distinct co-duplicated groups. Distinct gene families belonging to a particular co-duplicated group exhibit similar evolutionary histories and hence have duplicated concurrently, whereas genes of two different co-duplicated groups do not share their history and have not duplicated in concert with each other. CONCLUSIONS: These results, based on a large-scale phylogenetic dataset, yielded no evidence in favor of polyploidization events; instead, it appears that triplicated and quadruplicated genomic segments on the human HOX-bearing chromosomes arose by small-scale duplication events that occurred at widely different time points in animal evolution.
Affiliation(s)
- Sadaf Ambreen
  - National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
- Faiqa Khalil
  - National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
- Amir Ali Abbasi
  - National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
154
Wang A, Fu M, Jiang X, Mao Y, Li X, Tao S. Evolution of the F-box gene family in Euarchontoglires: gene number variation and selection patterns. PLoS One 2014; 9:e94899. [PMID: 24727786 PMCID: PMC3984280 DOI: 10.1371/journal.pone.0094899] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/22/2013] [Accepted: 03/21/2014] [Indexed: 02/06/2023] [Open Access]
Abstract
F-box proteins are substrate adaptors used by the SKP1–CUL1–F-box protein (SCF) complex, a type of E3 ubiquitin ligase complex in the ubiquitin proteasome system (UPS). SCF-mediated ubiquitylation regulates proteolysis of hundreds of cellular proteins involved in key signaling and disease systems. However, our knowledge of the evolution of the F-box gene family in Euarchontoglires is limited. In the present study, 559 F-box genes and nine related pseudogenes were identified in eight genomes. Lineage-specific gene gain and loss events occurred during the evolution of Euarchontoglires, resulting in varying F-box gene numbers ranging from 66 to 81 among the eight species. Both tandem duplication and retrotransposition were found to have contributed to the increase of F-box gene number, whereas mutation in the F-box domain was the main mechanism responsible for reduction in the number of F-box genes, resulting in a balance of expansion and contraction in the F-box gene family. Thus, the Euarchontoglire F-box gene family evolved under a birth-and-death model. Signatures of positive selection were detected in substrate-recognizing domains of multiple F-box proteins, and adaptive changes played a role in evolution of the Euarchontoglire F-box gene family. In addition, single nucleotide polymorphism (SNP) distributions were found to be highly non-random among different regions of F-box genes in 1092 human individuals, with domain regions having a significantly lower number of non-synonymous SNPs.
Affiliation(s)
- Ailan Wang
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Mingchuan Fu
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Xiaoqian Jiang
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Yuanhui Mao
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Xiangchen Li
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
- Shiheng Tao
  - State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
  - Bioinformatics Center, Northwest A&F University, Yangling, Shaanxi, China
155
Bickhart DM, Liu GE. The challenges and importance of structural variation detection in livestock. Front Genet 2014; 5:37. [PMID: 24600474 PMCID: PMC3927395 DOI: 10.3389/fgene.2014.00037] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Received: 11/30/2013] [Accepted: 01/31/2014] [Indexed: 01/25/2023] [Open Access]
Abstract
Recent studies in humans and other model organisms have demonstrated that structural variants (SVs) comprise a substantial proportion of variation among individuals of each species. Many of these variants have been linked to debilitating diseases in humans, thereby cementing the importance of refining methods for their detection. Despite progress in the field, reliable detection of SVs remains a problem even for human subjects. Many of the underlying problems that make SVs difficult to detect in humans are amplified in livestock species, whose lower quality genome assemblies and incomplete gene annotation can often give rise to false positive SV discoveries. Regardless of the challenges, SV detection is just as important for livestock researchers as it is for human researchers, given that several productive traits and diseases have been linked to copy number variations (CNVs) in cattle, sheep, and pigs. Already, there is evidence that many beneficial SVs have been artificially selected in livestock, such as a duplication of the agouti signaling protein gene that causes white coat color in sheep. In this review, we list current SV and CNV discoveries in livestock and discuss the problems that hinder routine discovery and tracking of these polymorphisms. We also discuss the impacts of selective breeding on CNV and SV frequencies and mention how SV genotyping could be used in the future to improve genetic selection.
Affiliation(s)
- Derek M Bickhart
  - Animal Improvement Programs Laboratory, United States Department of Agriculture-Agricultural Research Service, Beltsville, MD, USA
- George E Liu
  - Bovine Functional Genomics Laboratory, United States Department of Agriculture-Agricultural Research Service, Beltsville, MD, USA
156
Zhang Q. Using pseudogene database to identify lineage-specific genes and pseudogenes in humans and chimpanzees. J Hered 2014; 105:436-43. [PMID: 24399747 DOI: 10.1093/jhered/est097] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/21/2022]
Abstract
It has been revealed that gene content changes, that is, gene gains or losses, have played an important role in the evolution of modern humans. As one of the major contributors to gene content changes, gene pseudogenization is abundant in mammalian genomes, and approximately 20,000 pseudogenes have been identified in ape genomes. How to exploit the rich information embedded in pseudogenes is therefore an interesting question. Here, I present a bioinformatic pipeline that utilizes a pseudogene database to identify both lineage-specific genes and pseudogenes in humans and chimpanzees. I found 6 human-specific gene gains (HSGs), 1 chimpanzee-specific gene gain, and 4 chimpanzee-specific pseudogenes, most not discovered in previous studies. Further analysis showed that HSGs have been evolving under strong purifying selection and are broadly expressed, indicating strong functional constraint. This study demonstrates the use of pseudogene information in comparative genomics and suggests that new genes arising during primate evolution may acquire essential functions in a short time. The pipeline developed here could also be applied to other species.
Affiliation(s)
- Qu Zhang
  - Department of Human Evolutionary Biology, Graduate School of Art and Science, Harvard University, 11 Divinity Avenue, Cambridge, MA 02138
157
Behura SK, Severson DW. Association of microsatellite pairs with segmental duplications in insect genomes. BMC Genomics 2013; 14:907. [PMID: 24359442 PMCID: PMC3878106 DOI: 10.1186/1471-2164-14-907] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/17/2013] [Accepted: 12/16/2013] [Indexed: 11/30/2022] [Open Access]
Abstract
Background Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences of length greater than 1 kb which are duplicated with a high degree of sequence identity (greater than 90%) causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then, these primary duplications undergo multiple secondary duplications. However, the mechanism of how duplicated sequences are aggregated in the first place is not well understood. Results By analyzing the distribution of microsatellite sequences among twenty insect species in a genome-wide manner it was found that pairs of microsatellites along with the intervening sequences were duplicated multiple times in each genome. They were found as low copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis for number of paired microsatellites and number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to get richer in segmental duplication suggesting a “rich-gets-richer” mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny suggesting that association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications but those sequences are rarely retained in the orthologous genes between species. Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome. Conclusion The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects.
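The operational definition of a segmental duplication quoted in this abstract (length greater than 1 kb, sequence identity greater than 90%) can be illustrated with a short filter. This is only a sketch under stated assumptions: the `Alignment` record, its field names, and the example coordinates are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical pairwise alignment between two loci of one genome."""
    locus_a: str
    locus_b: str
    length_bp: int    # aligned length in base pairs
    identity: float   # fraction of identical aligned positions (0.0 to 1.0)


# Thresholds as stated in the abstract: > 1 kb and > 90% identity.
MIN_LENGTH_BP = 1_000
MIN_IDENTITY = 0.90


def is_segmental_duplication(aln: Alignment) -> bool:
    """True if the alignment exceeds both thresholds (strict comparisons)."""
    return aln.length_bp > MIN_LENGTH_BP and aln.identity > MIN_IDENTITY


def segmental_duplications(alignments):
    """Keep only the alignments that qualify as segmental duplications."""
    return [a for a in alignments if is_segmental_duplication(a)]


# Made-up example records, for illustration only.
example = [
    Alignment("chr2L:1000", "chr2L:90000", 2500, 0.95),  # long and similar
    Alignment("chr2L:5000", "chr3R:12000", 800, 0.99),   # too short
    Alignment("chrX:700", "chrX:40000", 4000, 0.85),     # too divergent
]
hits = segmental_duplications(example)
```

Only the first example record passes both thresholds; a locus pair of exactly 1,000 bp would be excluded because the abstract's wording is "greater than".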
Affiliation(s)
- Susanta K Behura
  - Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
158
Taillefer E, Miller J. Exhaustive computation of exact duplications via super and non-nested local maximal repeats. J Bioinform Comput Biol 2013; 12:1350018. [PMID: 24467757 DOI: 10.1142/s0219720013500182] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/05/2023]
Abstract
We propose and implement a method to obtain all duplicated sequences (repeats) from a chromosome or whole genome. Unlike existing approaches, our method makes it possible to simultaneously identify and classify repeats into super, local, and non-nested local maximal repeats. Computational verification demonstrates that maximal repeats for a genome of several gigabases can be identified in a reasonable time, enabling us to identify these maximal repeats for any sequenced genome. The algorithm used for the identification relies on an enhanced suffix array data structure to achieve practical space and time efficiency, to identify and classify the maximal repeats, and to perform further post-processing on the identified duplicated sequences. The simplicity and effectiveness of the implementation make the method readily extendible to more sophisticated computations. Maxmers can be exhaustively accounted for in a few minutes for genome sequences a dozen megabases in length and in less than a day or two for genome sequences a few gigabases in length. One application of duplicated sequence identification is the study of duplicated sequence length distributions, which are found to exhibit, for large lengths, a persistent power-law behavior. Variation in the estimated exponents of this power law is studied among different species and among successive assembly release versions of the same species. This makes the characterization of the power-law regime of sequenced genomes via maximal repeat identification and classification an important task for the derivation of models that would help us to elucidate sequence duplication and genome evolution.
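To make the notion of a maximal repeat concrete, here is a brute-force sketch. It is not the enhanced suffix array algorithm the abstract refers to, and it does not attempt the super or non-nested local classification; it enumerates every substring, so it is only suitable for toy inputs. The function name `maximal_repeats`, the `min_len` parameter, and the sentinel characters are assumptions made for illustration. A substring is treated as a maximal repeat if it occurs at least twice and cannot be extended to the left or to the right without losing an occurrence.

```python
from collections import defaultdict


def maximal_repeats(seq: str, min_len: int = 1) -> dict:
    """Return {repeat: [start positions]} for all maximal repeats in seq.

    Brute force: O(n^2) substrings are indexed, so use short inputs only.
    """
    n = len(seq)
    occurrences = defaultdict(list)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            occurrences[seq[i:j]].append(i)

    result = {}
    for word, starts in occurrences.items():
        if len(starts) < 2:
            continue  # not a repeat
        # Characters flanking each occurrence; "^" and "$" mark sequence ends.
        left = {seq[s - 1] if s > 0 else "^" for s in starts}
        right = {seq[s + len(word)] if s + len(word) < n else "$"
                 for s in starts}
        # More than one distinct flank on each side means every one-character
        # extension occurs fewer times than the word itself.
        if len(left) > 1 and len(right) > 1:
            result[word] = starts
    return result


toy = maximal_repeats("ACGTACGA", min_len=2)
```

For the toy sequence above, `ACG` (at positions 0 and 4) is the only maximal repeat of length two or more: `AC` and `CG` also occur twice, but each can be extended by one base without losing an occurrence.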
Affiliation(s)
- Eddy Taillefer
  - Physics and Biology Unit, Okinawa Institute of Science and Technology, 1919-1 Tancha, Onna-son, Kunigami-gun 904-0412, Japan
159
Juan D, Rico D, Marques-Bonet T, Fernández-Capetillo Ó, Valencia A. Late-replicating CNVs as a source of new genes. Biol Open 2013; 2:1402-11. [PMID: 24285712 PMCID: PMC3863426 DOI: 10.1242/bio.20136924] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 10/15/2013] [Accepted: 10/23/2013] [Indexed: 01/09/2023] [Open Access]
Abstract
Asynchronous replication of the genome has been associated with different rates of point mutation and copy number variation (CNV) in human populations. Here, our aim was to investigate whether the bias in the generation of CNV that is associated with DNA replication timing might have conditioned the birth of new protein-coding genes during evolution. We show that genes that were duplicated during primate evolution are more commonly found among the human genes located in late-replicating CNV regions. We traced the relationship between replication timing and the evolutionary age of duplicated genes. Strikingly, we found that there is a significant enrichment of evolutionarily younger duplicates in late-replicating regions of the human and mouse genomes. Indeed, the presence of duplicates in late-replicating regions gradually decreases as the evolutionary time since duplication increases. Our results suggest that the accumulation of recent duplications in late-replicating CNV regions is an active process influencing genome evolution.
Affiliation(s)
- David Juan
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Daniel Rico
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Tomas Marques-Bonet
  - Institut Catala de Recerca i Estudis Avancats (ICREA) and Institut de Biologia Evolutiva (UPF/CSIC), Dr Aiguader 88, PRBB, 08003 Barcelona, Spain
- Óscar Fernández-Capetillo
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Center (CNIO), Melchor Fernández Almagro 3, 28029 Madrid, Spain
160
Katju V, Bergthorsson U. Copy-number changes in evolution: rates, fitness effects and adaptive significance. Front Genet 2013; 4:273. [PMID: 24368910 PMCID: PMC3857721 DOI: 10.3389/fgene.2013.00273] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.8] [Received: 10/07/2013] [Accepted: 11/18/2013] [Indexed: 11/13/2022] [Open Access]
Abstract
Gene copy-number differences due to gene duplications and deletions are rampant in natural populations and play a crucial role in the evolution of genome complexity. Per-locus analyses of gene duplication rates in the pre-genomic era revealed that gene duplication rates are much higher than the per nucleotide substitution rate. Analyses of gene duplication and deletion rates in mutation accumulation lines of model organisms have revealed that these high rates of copy-number mutations occur at a genome-wide scale. Furthermore, comparisons of the spontaneous duplication and deletion rates to copy-number polymorphism data and bioinformatic-based estimates of duplication rates from sequenced genomes suggest that the vast majority of gene duplications are detrimental and removed by natural selection. The rate at which new gene copies appear in populations greatly influences their evolutionary dynamics and standing gene copy-number variation in populations. The opportunity for mutations that result in the maintenance of duplicate copies, either through neofunctionalization or subfunctionalization, also depends on the equilibrium frequency of additional gene copies in the population, and hence on the spontaneous gene duplication (and loss) rate. The duplication rate may therefore have profound effects on the role of adaptation in the evolution of duplicated genes as well as important consequences for the evolutionary potential of organisms. We further discuss the broad ramifications of this standing gene copy-number variation on fitness and adaptive potential from a population-genetic and genome-wide perspective.
Affiliation(s)
- Vaishali Katju
  - Department of Biology, University of New Mexico, Albuquerque, NM, USA
161
Buckner RL, Krienen FM. The evolution of distributed association networks in the human brain. Trends Cogn Sci 2013; 17:648-65. [DOI: 10.1016/j.tics.2013.09.017] [Citation(s) in RCA: 475] [Impact Index Per Article: 43.2] [Received: 07/31/2013] [Revised: 09/28/2013] [Accepted: 09/30/2013] [Indexed: 01/25/2023]
162
Shao M, Lin Y, Moret B. Sorting genomes with rearrangements and segmental duplications through trajectory graphs. BMC Bioinformatics 2013; 14 Suppl 15:S9. [PMID: 24564345 PMCID: PMC3851842 DOI: 10.1186/1471-2105-14-s15-s9] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022] [Open Access]
Abstract
We study the problem of sorting genomes under an evolutionary model that includes genomic rearrangements and segmental duplications. We propose an iterative algorithm to improve any initial evolutionary trajectory between two genomes in terms of parsimony. Our algorithm is based on a new graphical model, the trajectory graph, which models not only the final states of two genomes but also an existing evolutionary trajectory between them. We show that redundant rearrangements in the trajectory correspond to certain cycles in the trajectory graph, and prove that our algorithm converges to an optimal trajectory for any initial trajectory involving only rearrangements.
163
Wu J, Zhang W, Huang S, He Z, Cheng Y, Wang J, Lam TW, Peng Z, Yiu SM. SOAPfusion: a robust and effective computational fusion discovery tool for RNA-seq reads. Bioinformatics 2013; 29:2971-8. [DOI: 10.1093/bioinformatics/btt522] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 12/16/2022] [Open Access]
164
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022] [Open Access]
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
  - Howard Hughes Medical Institute, Seattle, Washington, United States of America
165
Campbell CD, Eichler EE. Properties and rates of germline mutations in humans. Trends Genet 2013; 29:575-84. [PMID: 23684843 PMCID: PMC3785239 DOI: 10.1016/j.tig.2013.04.005] [Citation(s) in RCA: 188] [Impact Index Per Article: 17.1] [Received: 01/28/2013] [Revised: 04/05/2013] [Accepted: 04/18/2013] [Indexed: 11/25/2022]
Abstract
All genetic variation arises via new mutations; therefore, determining the rate and biases for different classes of mutation is essential for understanding the genetics of human disease and evolution. Decades of mutation rate analyses have focused on a relatively small number of loci because of technical limitations. However, advances in sequencing technology have allowed for empirical assessments of genome-wide rates of mutation. Recent studies have shown that 76% of new mutations originate in the paternal lineage and provide unequivocal evidence for an increase in mutation with paternal age. Although most analyses have focused on single nucleotide variants (SNVs), studies have begun to provide insight into the mutation rate for other classes of variation, including copy number variants (CNVs), microsatellites, and mobile element insertions (MEIs). Here, we review the genome-wide analyses for the mutation rate of several types of variants and suggest areas for future research.
Affiliation(s)
- Evan E. Eichler
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195
  - Howard Hughes Medical Institute, Seattle, WA 98195
166
Giannuzzi G, Pazienza M, Huddleston J, Antonacci F, Malig M, Vives L, Eichler EE, Ventura M. Hominoid fission of chromosome 14/15 and the role of segmental duplications. Genome Res 2013; 23:1763-73. [PMID: 24077392 PMCID: PMC3814877 DOI: 10.1101/gr.156240.113] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
Ape chromosomes homologous to human chromosomes 14 and 15 were generated by a fission event of an ancestral submetacentric chromosome, where the two chromosomes were joined head-to-tail. The hominoid ancestral chromosome most closely resembles the macaque chromosome 7. In this work, we provide insights into the evolution of human chromosomes 14 and 15, performing a comparative study between macaque boundary region 14/15 and the orthologous human regions. We construct a 1.6-Mb contig of macaque BAC clones in the region orthologous to the ancestral hominoid fission site and use it to define the structural changes that occurred on human 14q pericentromeric and 15q subtelomeric regions. We characterize the novel euchromatin–heterochromatin transition region (∼20 Mb) acquired during the neocentromere establishment on chromosome 14, and find it was mainly derived through pericentromeric duplications from ancestral hominoid chromosomes homologous to human 2q14–qter and 10. Further, we show a relationship between evolutionary hotspots and low-copy repeat loci for chromosome 15, revealing a possible role of segmental duplications not only in mediating but also in “stitching” together rearrangement breakpoints.
Affiliation(s)
- Giuliana Giannuzzi
- Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
|
167
|
Lee K, Nguyen DT, Choi M, Cha SY, Kim JH, Dadi H, Seo HG, Seo K, Chun T, Park C. Analysis of cattle olfactory subgenome: the first detail study on the characteristics of the complete olfactory receptor repertoire of a ruminant. BMC Genomics 2013; 14:596. [PMID: 24004971 PMCID: PMC3766653 DOI: 10.1186/1471-2164-14-596] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 04/13/2013] [Accepted: 08/24/2013] [Indexed: 11/21/2022] Open
Abstract
Background: Mammalian olfactory receptors (ORs) are encoded by the largest mammalian multigene family. Understanding the OR gene repertoire in the cattle genome could help link genetic differences in these genes to variations in olfaction in cattle. Results: We report here a whole genome analysis of the olfactory receptor genes of Bos taurus using conserved OR gene-specific motifs and known OR protein sequences from diverse species. Our analysis, using the current cattle genome assembly UMD 3.1 covering 99.9% of the cattle genome, shows that the cattle genome contains 1,071 OR-related sequences including 881 functional, 190 pseudo, and 352 partial OR sequences. The OR genes are located in 49 clusters on 26 cattle chromosomes. We classified them into 18 families consisting of 4 Class I and 14 Class II families and these were further grouped into 272 subfamilies. Comparative analyses of the OR genes of cattle, pigs, humans, mice, and dogs showed that 6.0% (n = 53) of functional OR cattle genes were species-specific. We also showed that significant copy number variations are present in the OR repertoire of the cattle from the analysis of 10 selected OR genes. Conclusion: Our analysis revealed the almost complete OR gene repertoire from an individual cattle genome. Though the number of OR genes was lower than in pigs, the analysis of the genetic system of cattle ORs showed close similarities to that of the pig.
Affiliation(s)
- Kyooyeol Lee
- Department of Animal Biotechnology, Konkuk University, 263 Achasan-ro, Gwangjin-gu, Seoul 143-701, Korea.
|
168
|
Floutsakou I, Agrawal S, Nguyen TT, Seoighe C, Ganley ARD, McStay B. The shared genomic architecture of human nucleolar organizer regions. Genome Res 2013; 23:2003-12. [PMID: 23990606 PMCID: PMC3847771 DOI: 10.1101/gr.157941.113] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Indexed: 01/30/2023]
Abstract
The short arms of the five acrocentric human chromosomes harbor sequences that direct the assembly and function of the nucleolus, one of the key functional domains of the nucleus, yet they are absent from the current human genome assembly. Here we describe the genomic architecture of these human nucleolar organizers. Sequences distal and proximal to ribosomal gene arrays are conserved among the acrocentric chromosomes, suggesting they are sites of frequent recombination. Although previously believed to be heterochromatic, characterization of these two flanking regions reveals that they share a complex genomic architecture similar to other euchromatic regions of the genome, but they have distinct genomic characteristics. Proximal sequences are almost entirely segmentally duplicated, similar to the regions bordering centromeres. In contrast, the distal sequence is predominantly unique to the acrocentric short arms and is dominated by a very large inverted repeat. We show that the distal element is localized to the periphery of the nucleolus, where it appears to anchor the ribosomal gene repeats. This, combined with its complex chromatin structure and transcriptional activity, suggests that this region is involved in nucleolar organization. Our results provide a platform for investigating the role of NORs in nucleolar formation and function, and open the door for determining the role of these regions in the well-known empirical association of nucleoli with pathology.
Affiliation(s)
- Ioanna Floutsakou
- Centre for Chromosome Biology, School of Natural Sciences, National University of Ireland, Galway, Galway, Ireland
|
169
|
A novel framework for the identification and analysis of duplicons between human and chimpanzee. BIOMED RESEARCH INTERNATIONAL 2013; 2013:264532. [PMID: 23984331 PMCID: PMC3747353 DOI: 10.1155/2013/264532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2013] [Revised: 06/25/2013] [Accepted: 07/10/2013] [Indexed: 11/30/2022]
Abstract
Human and other primate genomes contain many segmental duplications (SDs) due to fixation of copy number variations (CNVs). The structure of these duplications within the human genome has been shown to be a complex mosaic composed of juxtaposed subunits (called duplicons). These duplicons are difficult to uncover from the mosaic repeat structure. In addition, the distribution and evolution of duplicons among primates are still poorly investigated. In this paper, we develop a statistical framework for discovering duplicons via integration of a Hidden Markov Model (HMM) and a permutation test. Our comparative analysis indicates that the mosaic structure of duplicons is common in CNV/SD regions of both human and chimpanzee genomes, and a subset of core duplicons is shared by the majority of CNVs/SDs. Phylogenetic analyses using duplicons suggested that most CNVs/SDs share common duplication ancestry. Many human/chimpanzee duplicons flank both ends of CNVs, which may be hotspots of nonallelic homologous recombination.
|
170
|
Hehir-Kwa JY, Pfundt R, Veltman JA, de Leeuw N. Pathogenic or not? Assessing the clinical relevance of copy number variants. Clin Genet 2013; 84:415-21. [PMID: 23895381 DOI: 10.1111/cge.12242] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 06/27/2013] [Revised: 07/24/2013] [Accepted: 07/24/2013] [Indexed: 02/04/2023]
Abstract
The availability of commercially produced genomic microarrays has resulted in their widespread implementation, often as a first-tier diagnostic test for copy number variant (CNV) screening of patients suspected of having chromosomal aberrations. Patients with intellectual disability (ID) and/or multiple congenital anomalies (MCA) were traditionally the main focus for this microarray-based CNV screening, but the application of microarrays to other (neurodevelopmental) disorders and tumor diagnostics has also been explored and implemented. The diagnostic workflow for patients with ID is now well established, relying on the identification of rare CNVs and determining their inheritance patterns. However, experience gained through screening large numbers of samples has revealed many subtleties and complexities of CNV interpretation. This has resulted in a better understanding of the contribution of CNVs to genomic disorders not only via de novo occurrence, but also via X-linked and recessive inheritance models as well as through models taking into account mosaicisms, imprinting, and digenic inheritance. In this review, we discuss CNV interpretation within the context of these different genetic disease models and common pitfalls that can occur when searching for supportive evidence that a CNV is clinically relevant.
Affiliation(s)
- J Y Hehir-Kwa
- Department of Human Genetics, Nijmegen Centre for Molecular Life Sciences, Radboud University Medical Centre, Nijmegen, The Netherlands
|
171
|
Rausell A, McLaren PJ, Telenti A. HIV and innate immunity - a genomics perspective. F1000PRIME REPORTS 2013; 5:29. [PMID: 23967380 PMCID: PMC3732074 DOI: 10.12703/p5-29] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/11/2022]
Abstract
Innate immunity is a theme of increasing interest for HIV research. However, the term is overstretched to cover biological barriers, cellular systems, soluble factors, signaling pathways, and effectors and is inconsistently applied. A clearer semantic classification of the components of innate immunity is needed, which will have direct relevance to the interpretation of human genome variation. Here, we discuss genomic approaches that can assist in re-defining the perimeter of innate immunity. We place particular emphasis on the characteristics of effectors of the intracellular defense against HIV and other pathogens.
Affiliation(s)
- Antonio Rausell
- Institute of Microbiology, University Hospital of Lausanne and University of Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Paul J. McLaren
- Institute of Microbiology, University Hospital of Lausanne and University of Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Amalio Telenti
- Institute of Microbiology, University Hospital of Lausanne and University of Lausanne, Switzerland
|
172
|
Rapid and accurate large-scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 2013; 10:903-9. [PMID: 23892896 PMCID: PMC3985568 DOI: 10.1038/nmeth.2572] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 03/04/2013] [Accepted: 06/06/2013] [Indexed: 01/17/2023]
Abstract
Over 900 genes have been annotated within duplicated regions of the human genome, yet their functions and potential roles in disease remain largely unknown. One major obstacle has been the inability to accurately and comprehensively assay genetic variation for these genes in a high-throughput manner. We developed a sequencing-based method for rapid and high-throughput genotyping of duplicated genes using molecular inversion probes designed to target unique paralogous sequence variants. We applied this method to genotype all members of two gene families, SRGAP2 and RH, among a diversity panel of 1,056 humans. The approach could accurately distinguish copy number in paralogs having up to ∼99.6% sequence identity, identify small gene-disruptive deletions, detect single-nucleotide variants, define breakpoints of unequal crossover and discover regions of interlocus gene conversion. The ability to rapidly and accurately genotype multiple gene families in thousands of individuals at low cost enables the development of genome-wide gene conversion maps and 'unlocks' many previously inaccessible duplicated genes for association with human traits.
|
173
|
Discovery of structural alterations in solid tumor oligodendroglioma by single molecule analysis. BMC Genomics 2013; 14:505. [PMID: 23885787 PMCID: PMC3727977 DOI: 10.1186/1471-2164-14-505] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 03/07/2013] [Accepted: 07/23/2013] [Indexed: 12/31/2022] Open
Abstract
Background: Solid tumors present a panoply of genomic alterations, from single base changes to the gain or loss of entire chromosomes. Although aberrations at the two extremes of this spectrum are readily defined, comprehensive discernment of the complex and disperse mutational spectrum of cancer genomes remains a significant challenge for current genome analysis platforms. In this context, high throughput, single molecule platforms like Optical Mapping offer a unique perspective. Results: Using measurements from large ensembles of individual DNA molecules, we have discovered genomic structural alterations in the solid tumor oligodendroglioma. Over a thousand structural variants were identified in each tumor sample, without any prior hypotheses, and often in genomic regions deemed intractable by other technologies. These findings were then validated by comprehensive comparisons to variants reported in external and internal databases, and by selected experimental corroborations. Alterations range in size from under 5 kb to hundreds of kilobases, and comprise insertions, deletions, inversions and compound events. Candidate mutations were scored at sub-genic resolution and unambiguously reveal structural details at aberrant loci. Conclusions: The Optical Mapping system provides a rich description of the complex genomes of solid tumors, including sequence level aberrations, structural alterations and copy number variants that power generation of functional hypotheses for oligodendroglioma genetics.
|
174
|
Ritland Politz JC, Scalzo D, Groudine M. Something silent this way forms: the functional organization of the repressive nuclear compartment. Annu Rev Cell Dev Biol 2013; 29:241-70. [PMID: 23834025 PMCID: PMC3999972 DOI: 10.1146/annurev-cellbio-101512-122317] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Indexed: 01/10/2023]
Abstract
The repressive compartment of the nucleus is comprised primarily of telomeric and centromeric regions, the silent portion of ribosomal RNA genes, the majority of transposable element repeats, and facultatively repressed genes specific to different cell types. This compartment localizes into three main regions: the peripheral heterochromatin, perinucleolar heterochromatin, and pericentromeric heterochromatin. Both chromatin remodeling proteins and transcription of noncoding RNAs are involved in maintenance of repression in these compartments. Global reorganization of the repressive compartment occurs at each cell division, during early development, and during terminal differentiation. Differential action of chromatin remodeling complexes and boundary element looping activities are involved in mediating these organizational changes. We discuss the evidence that heterochromatin formation and compartmentalization may drive nuclear organization.
Affiliation(s)
- David Scalzo
- Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
- Mark Groudine
- Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
|
175
|
Song S, Jiang F, Yuan J, Guo W, Miao Y. Exceptionally high cumulative percentage of NUMTs originating from linear mitochondrial DNA molecules in the Hydra magnipapillata genome. BMC Genomics 2013; 14:447. [PMID: 23826818 PMCID: PMC3716686 DOI: 10.1186/1471-2164-14-447] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 01/27/2013] [Accepted: 07/03/2013] [Indexed: 11/10/2022] Open
Abstract
Background: In contrast to most animal genomes, mitochondrial genomes in species belonging to the phylum Cnidaria show distinct variations in genome structure, including the mtDNA structure (linear or circular) and the presence or absence of introns in protein-coding genes. Therefore, the analysis of nuclear insertions of mitochondrial sequences (NUMTs) in cnidarians allows us to compare the NUMT content in animals with different mitochondrial genome structures. Results: NUMT identification in the Hydra magnipapillata, Nematostella vectensis and Acropora digitifera genomes showed that the NUMT density in the H. magnipapillata genome clearly exceeds that in the other two cnidarians with circular mitochondrial genomes. We found that H. magnipapillata is an exceptional ancestral metazoan with a high NUMT cumulative percentage but a large genome, and its mitochondrial genome linearisation might be responsible for the NUMT enrichment. We also detected the co-transposition of exonic and intronic fragments within NUMTs in N. vectensis and provided direct evidence that mitochondrial sequences can be transposed into the nuclear genome through DNA-mediated fragment transfer. In addition, NUMT expression analyses showed that NUMTs are co-expressed with adjacent protein-coding genes, suggesting the relevance of their biological function. Conclusions: Taken together, our results provide valuable information for understanding the impact of mitochondrial genome structure on the interaction of mitochondrial molecules and nuclear genomes.
Affiliation(s)
- Shen Song
- Faculty of Animal Science and Technology, Yunnan Agricultural University, Kunming, Yunnan 650201, China
|
176
|
Alves JM, Lopes AM, Chikhi L, Amorim A. On the structural plasticity of the human genome: chromosomal inversions revisited. Curr Genomics 2013; 13:623-32. [PMID: 23730202 PMCID: PMC3492802 DOI: 10.2174/138920212803759703] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 02/21/2012] [Revised: 09/23/2012] [Accepted: 09/24/2012] [Indexed: 01/02/2023] Open
Abstract
With the aid of novel and powerful molecular biology techniques, recent years have witnessed a dramatic increase in the number of studies reporting the involvement of complex structural variants in several genomic disorders. In fact, with the discovery of Copy Number Variants (CNVs) and other forms of unbalanced structural variation, much attention has been directed to the detection and characterization of such rearrangements, as well as the identification of the mechanisms involved in their formation. However, it has long been appreciated that chromosomes can undergo other forms of structural change (balanced rearrangements) that do not involve quantitative variation of genetic material. Indeed, a particular subtype of balanced rearrangement, the inversion, was recently found to be far more common than had been predicted from traditional cytogenetics. Chromosomal inversions alter the orientation of a specific genomic sequence and, unless involving breaks in coding or regulatory regions (and, disregarding complex trans effects, in their close vicinity), appear to be phenotypically silent. Such a surprising finding, which is difficult to reconcile with the classical interpretation of inversions as a mechanism causing subfertility (and ultimately reproductive isolation), motivated a new series of theoretical and empirical studies dedicated to understanding their role in human genome evolution and to exploring their possible association with complex genetic disorders. With this review, we attempt to describe the latest methodological improvements in genome-wide inversion detection, while exploring some of the possible implications of inversion rearrangements for the evolution of the human genome.
Affiliation(s)
- Joao M Alves
- Doctoral Program in Areas of Basic and Applied Biology (GABBA), University of Porto, Portugal; IPATIMUP - Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Porto, Portugal; Instituto Gulbenkian de Ciência (IGC), Oeiras, Portugal
|
177
|
Díaz-Castillo C. Females and males contribute in opposite ways to the evolution of gene order in Drosophila. PLoS One 2013; 8:e64491. [PMID: 23696898 PMCID: PMC3655977 DOI: 10.1371/journal.pone.0064491] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/09/2012] [Accepted: 04/16/2013] [Indexed: 11/19/2022] Open
Abstract
An intriguing association between the spatial layout of chromosomes within nuclei and the evolution of chromosome gene order was recently uncovered. Chromosome regions with conserved gene order in the Drosophila genus are larger if they interact with the inner side of the nuclear envelope in D. melanogaster somatic cells. This observation opens a new door to understand the evolution of chromosomes in the light of the dynamics of the spatial layout of chromosomes and the way double-strand breaks are repaired in D. melanogaster germ lines. Chromosome regions at the nuclear periphery in somatic cell nuclei relocate to more internal locations of male germ line cell nuclei, which might prefer a gene order-preserving mechanism to repair double-strand breaks. Conversely, chromosome regions at the nuclear periphery in somatic cells keep their location in female germ line cell nuclei, which might be inaccessible for cellular machinery that causes gene order-disrupting chromosome rearrangements. Thus, the gene order stability for genome regions at the periphery of somatic cell nuclei might result from the active repair of double-strand breaks using conservative mechanisms in male germ line cells, and the passive inaccessibility for gene order-disrupting factors at the periphery of nuclei of female germ line cells. In the present article, I find evidence consistent with a DNA break repair-based differential contribution of both D. melanogaster germ lines to the stability/disruption of gene order. The importance of germ line differences for the layout of chromosomes and DNA break repair strategies with regard to other genomic patterns is briefly discussed.
|
178
|
Gabora L, Scott EO, Kauffman S. A quantum model of exaptation: incorporating potentiality into evolutionary theory. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2013; 113:108-16. [PMID: 23567156 DOI: 10.1016/j.pbiomolbio.2013.03.012] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
Abstract
The phenomenon of preadaptation, or exaptation (wherein a trait that originally evolved to solve one problem is co-opted to solve a new problem) presents a formidable challenge to efforts to describe biological phenomena using a classical (Kolmogorovian) mathematical framework. We develop a quantum framework for exaptation with examples from both biological and cultural evolution. The state of a trait is written as a linear superposition of a set of basis states, or possible forms the trait could evolve into, in a complex Hilbert space. These basis states are represented by mutually orthogonal unit vectors, each weighted by an amplitude term. The choice of possible forms (basis states) depends on the adaptive function of interest (e.g., ability to metabolize lactose or thermoregulate), which plays the role of the observable. Observables are represented by self-adjoint operators on the Hilbert space. The possible forms (basis states) corresponding to this adaptive function (observable) are called eigenstates. The framework incorporates key features of exaptation: potentiality, contextuality, nonseparability, and emergence of new features. However, since it requires that one enumerate all possible contexts, its predictive value is limited, consistent with the assertion that there exists no biological equivalent to "laws of motion" by which we can predict the evolution of the biosphere.
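Written out, the structure described in this abstract can be sketched as follows. The symbols $|\psi\rangle$, $|f_i\rangle$, $a_i$, $\hat{O}$ and $o_i$ are notation chosen here for illustration, and reading $|a_i|^2$ as a probability is the standard Born rule, assumed here rather than quoted from the abstract.

```latex
\begin{align}
  % state of a trait: superposition of mutually orthogonal unit basis states
  |\psi\rangle &= \sum_i a_i\,|f_i\rangle, &
  \langle f_i | f_j\rangle &= \delta_{ij}, &
  \sum_i |a_i|^2 &= 1, \\
  % adaptive function of interest: a self-adjoint observable with eigenstates |f_i>
  \hat{O} &= \hat{O}^{\dagger}, &
  \hat{O}\,|f_i\rangle &= o_i\,|f_i\rangle, &
  P(f_i) &= |a_i|^2 .
\end{align}
```

Changing the adaptive function of interest corresponds to choosing a different $\hat{O}$, and hence a different set of eigenstates in which the same trait state is expanded.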
Affiliation(s)
- Liane Gabora
- Department of Psychology, University of British Columbia, Okanagan Campus, 3333 University Way, Kelowna, British Columbia V1V 1V7, Canada.
|
179
|
Massip F, Arndt PF. Neutral evolution of duplicated DNA: an evolutionary stick-breaking process causes scale-invariant behavior. PHYSICAL REVIEW LETTERS 2013; 110:148101. [PMID: 25167038 DOI: 10.1103/physrevlett.110.148101] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 11/20/2012] [Indexed: 06/03/2023]
Abstract
Recently, an enrichment of identical matching sequences has been found in many eukaryotic genomes. Their length distribution exhibits a power law tail raising the question of what evolutionary mechanism or functional constraints would be able to shape this distribution. Here we introduce a simple and evolutionarily neutral model, which involves only point mutations and segmental duplications, and produces the same statistical features as observed for genomic data. Further, we extend a mathematical model for random stick breaking to analytically show that the exponent of the power law tail is -3 and universal as it does not depend on the microscopic details of the model.
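The exponent can be motivated with a short back-of-the-envelope calculation. This is a simplified sketch under assumptions not stated in the abstract: duplications of length $L$ arise at a constant rate $\delta$, differences between the two copies accumulate as a Poisson process at rate $2\mu$ per site per unit time, and end effects are ignored for match lengths $r \ll L$. All symbols are chosen here for illustration.

```latex
% Expected density of identical runs of length r in one duplicate pair of age t,
% when mismatches occur with density \lambda(t) per site:
n(r,t) \approx L\,\lambda(t)^2\,e^{-\lambda(t)\,r}, \qquad \lambda(t) = 2\mu t .

% Summing over pairs created at constant rate \delta (all ages t \ge 0):
N(r) \approx \delta \int_0^{\infty} L\,(2\mu t)^2\,e^{-2\mu t r}\,dt
     = \delta L\,(2\mu)^2\,\frac{2}{(2\mu r)^3}
     = \frac{\delta L}{\mu}\,r^{-3} .
```

In this sketch the parameters $\delta$, $L$ and $\mu$ enter only through the prefactor, which is in line with the abstract's statement that the exponent does not depend on the microscopic details of the model.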
Affiliation(s)
- Florian Massip
- Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
- Peter F Arndt
- Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany
|
180
|
Pfendner EG, Uitto J, Gerard GF, Terry SF. Pseudoxanthoma elasticum: genetic diagnostic markers. ACTA ACUST UNITED AC 2013; 2:63-79. [PMID: 23485117 DOI: 10.1517/17530059.2.1.63] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/05/2022]
Abstract
Pseudoxanthoma elasticum (PXE), an autosomal recessive disorder with considerable phenotypic variability, mainly affects the eyes, skin and cardiovascular system, and is characterized by ectopic mineralization of elastic fibers of connective tissues. Since the identification of the ABCC6 gene (ATP-binding cassette family C member 6), which encodes a putative transmembrane transporter (ABCC6), as the site of mutations responsible for PXE, a number of researchers have disclosed mutations spanning the entire gene. An important advance in mutation detection has been the identification of two closely related pseudogenes, together with the sequence differences between the coding gene and the pseudogenes that allow accurate sequencing. In this review, the mutation spectrum in PXE is summarized and a strategy to optimize mutation detection in this difficult disorder is outlined.
|
181
|
Marzo M, Bello X, Puig M, Maside X, Ruiz A. Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis. Mob DNA 2013; 4:6. [PMID: 23374229 PMCID: PMC3573991 DOI: 10.1186/1759-8753-4-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/21/2012] [Accepted: 11/26/2012] [Indexed: 01/25/2023] Open
Abstract
Background: Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. Results: In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies, which points to relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Conclusions: Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome.
Affiliation(s)
- Mar Marzo
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Catalunya, 08193, Spain.
|
182
|
Lorente-Galdos B, Bleyhl J, Santpere G, Vives L, Ramírez O, Hernandez J, Anglada R, Cooper GM, Navarro A, Eichler EE, Marques-Bonet T. Accelerated exon evolution within primate segmental duplications. Genome Biol 2013; 14:R9. [PMID: 23360670 PMCID: PMC3906575 DOI: 10.1186/gb-2013-14-1-r9] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/11/2012] [Revised: 12/20/2012] [Accepted: 01/29/2013] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND: The identification of signatures of natural selection has long been used as an approach to understanding the unique features of any given species. Genes within segmental duplications are overlooked in most studies of selection due to the limitations of draft nonhuman genome assemblies and to the methodological reliance on accurate gene trees, which are difficult to obtain for duplicated genes. RESULTS: In this work, we detected exons with an accumulation of high-quality nucleotide differences between the human assembly and shotgun sequencing reads from single human and macaque individuals. Comparing the observed rates of nucleotide differences between coding exons and their flanking intronic sequences with a likelihood-ratio test, we identified 74 exons with evidence for rapid coding sequence evolution during the evolution of humans and Old World monkeys. Fifty-five percent of rapidly evolving exons were either partially or totally duplicated, which is a significant enrichment over the 6% rate observed across all human coding exons. CONCLUSIONS: Our results provide a more comprehensive view of the action of selection upon segmental duplications, which are the most complex regions of our genomes. In light of these findings, we suggest that segmental duplications could be subjected to rapid evolution more frequently than previously thought.
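As a rough illustration of the kind of comparison this abstract describes, the sketch below runs a likelihood-ratio test on whether an exon and its flanking intronic sequence share one per-site difference rate. It is not the authors' model: the binomial likelihood, the chi-square approximation with one degree of freedom, the function names (`binom_loglik`, `rate_lrt`) and the example counts are all assumptions made here for illustration.

```python
import math


def binom_loglik(k, n, p):
    """Binomial log-likelihood of k differences in n sites at rate p.

    The binomial coefficient is omitted because it cancels in the ratio.
    """
    if p <= 0.0:
        return 0.0 if k == 0 else float("-inf")
    if p >= 1.0:
        return 0.0 if k == n else float("-inf")
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def rate_lrt(k_exon, n_exon, k_intron, n_intron):
    """Test H0 (one shared rate) against H1 (separate exon and intron rates).

    Returns (statistic, p_value). The p-value uses the asymptotic
    chi-square distribution with 1 degree of freedom, for which
    P(X > x) = erfc(sqrt(x / 2)).
    """
    # Maximum-likelihood rate under H0: pool both regions
    p0 = (k_exon + k_intron) / (n_exon + n_intron)
    ll0 = binom_loglik(k_exon, n_exon, p0) + binom_loglik(k_intron, n_intron, p0)
    # Maximum-likelihood rates under H1: one rate per region
    ll1 = (binom_loglik(k_exon, n_exon, k_exon / n_exon)
           + binom_loglik(k_intron, n_intron, k_intron / n_intron))
    stat = max(0.0, 2.0 * (ll1 - ll0))  # guard against tiny negative rounding
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 30 differences in 100 exonic sites
# versus 5 differences in 100 flanking intronic sites.
stat, p_value = rate_lrt(30, 100, 5, 100)
print(f"LRT statistic = {stat:.2f}, p = {p_value:.2e}")
```

A large statistic with a higher exonic rate would flag an exon as a candidate for rapid coding-sequence evolution; a real analysis would also need multiple-testing correction across exons.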
Affiliation(s)
- Belen Lorente-Galdos
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Jonathan Bleyhl
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Gabriel Santpere
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Laura Vives
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Oscar Ramírez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
| | - Jessica Hernandez
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
| | - Roger Anglada
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
| | - Gregory M Cooper
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
| | - Arcadi Navarro
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- National Institute for Bioinformatics (INB), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, Seattle, Washington 98195, USA
| | - Tomas Marques-Bonet
- IBE, Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
- Institucio Catalana de Recerca i Estudis Avançats (ICREA), PRBB, Doctor Aiguader, 88, 08003, Barcelona, Catalonia, Spain
|
183
|
Xin H, Lee D, Hormozdiari F, Yedkar S, Mutlu O, Alkan C. Accelerating read mapping with FastHASH. BMC Genomics 2013; 14 Suppl 1:S13. [PMID: 23369189 PMCID: PMC3549798 DOI: 10.1186/1471-2164-14-s1-s13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Indexed: 11/10/2022] Open
Abstract
With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, the current read mapping algorithms have difficulties in coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of the seed-and-extend type hash table based read mapping algorithms, while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program, mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
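The two techniques named in the abstract above can be sketched on a toy hash-table mapper. The code below is one plausible, heavily simplified reading and not the FastHASH or mrFAST implementation: it handles substitutions only, and the function names, the non-overlapping seed layout, and the `max_errors + 1` probing rule are assumptions made for illustration.

```python
from collections import defaultdict

def build_index(reference, k):
    """Hash table: k-mer -> list of start positions in the reference."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def candidate_locations(read, index, k, max_errors):
    """Return candidate read start positions that survive both filters."""
    # Non-overlapping k-mers of the read, paired with their read offsets.
    seeds = [(off, read[off:off + k])
             for off in range(0, len(read) - k + 1, k)]
    # Cheap k-mer selection (simplified): probe only the max_errors + 1
    # seeds with the fewest reference locations, since fewer hits mean
    # fewer candidates to verify.
    cheap = sorted(seeds, key=lambda s: len(index.get(s[1], ())))
    cheap = cheap[:max_errors + 1]
    survivors = set()
    for off, kmer in cheap:
        for loc in index.get(kmer, ()):
            start = loc - off
            if start < 0:
                continue
            # Adjacency filtering (simplified): the read's k-mers should
            # occur at the matching offsets from this candidate start.
            missing = sum(
                1 for o, s in seeds if (start + o) not in index.get(s, ())
            )
            if missing <= max_errors:
                survivors.add(start)
    return sorted(survivors)

reference = "ACGTACGTTTGACCAGTAGGCTAACGT"
index = build_index(reference, 4)
print(candidate_locations("TTGACCAGTAGG", index, 4, 0))
```

Locations that survive would still go through the usual extend (alignment) step; the filters only reduce how many candidates reach it.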
Affiliation(s)
- Hongyi Xin
- Depts. of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
184
|
Serbielle C, Dupas S, Perdereau E, Héricourt F, Dupuy C, Huguet E, Drezen JM. Evolutionary mechanisms driving the evolution of a large polydnavirus gene family coding for protein tyrosine phosphatases. BMC Evol Biol 2012; 12:253. [PMID: 23270369 PMCID: PMC3573978 DOI: 10.1186/1471-2148-12-253] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 07/18/2012] [Accepted: 12/11/2012] [Indexed: 11/20/2022] Open
Abstract
Background Gene duplications have been proposed to be the main mechanism involved in genome evolution and in acquisition of new functions. Polydnaviruses (PDVs), symbiotic viruses associated with parasitoid wasps, are ideal model systems to study mechanisms of gene duplications given that PDV genomes consist of virulence genes organized into multigene families. In these systems the viral genome is integrated in a wasp chromosome as a provirus and virus particles containing circular double-stranded DNA are injected into the parasitoids’ hosts and are essential for parasitism success. The viral virulence factors, organized in gene families, are required collectively to induce host immune suppression and developmental arrest. The gene family which encodes protein tyrosine phosphatases (PTPs) has undergone spectacular expansion in several PDV genomes with up to 42 genes. Results Here, we present strong indications that PTP gene family expansion occurred via classical mechanisms: by duplication of large segments of the chromosomally integrated form of the virus sequences (segmental duplication), by tandem duplications within this form and by dispersed duplications. We also propose a novel duplication mechanism specific to PDVs that involves viral circle reintegration into the wasp genome. The PTP copies produced were shown to undergo conservative evolution along with episodes of adaptive evolution. In particular recently produced copies have undergone positive selection in sites most likely involved in defining substrate selectivity. Conclusion The results provide evidence about the dynamic nature of polydnavirus proviral genomes. Classical and PDV-specific duplication mechanisms have been involved in the production of new gene copies. Selection pressures associated with antagonistic interactions with parasitized hosts have shaped these genes used to manipulate lepidopteran physiology with evidence for positive selection involved in adaptation to host targets.
Affiliation(s)
- Céline Serbielle
- Institut de Recherche sur la Biologie de l'Insecte, UMR CNRS 7261, Faculté des Sciences et Techniques, Université F. Rabelais, Parc de Grandmont, 37200, Tours, France
|
185
|
Abstract
Differences between individual human genomes, or between human and cancer genomes, range in scale from single nucleotide variants (SNVs) through intermediate and large-scale duplications, deletions, and rearrangements of genomic segments. The latter class, called structural variants (SVs), has received considerable attention in the past several years as these variants are a previously underappreciated source of variation in human genomes. Much of this recent attention is the result of the availability of higher-resolution technologies for measuring these variants, including both microarray-based techniques and, more recently, high-throughput DNA sequencing. We describe the genomic technologies and computational techniques currently used to measure SVs, focusing on applications in human and cancer genomics.
Affiliation(s)
- Benjamin J Raphael
- Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America.
|
186
|
Shao M, Lin Y. Approximating the edit distance for genomes with duplicate genes under DCJ, insertion and deletion. BMC Bioinformatics 2012. [PMCID: PMC3527062 DOI: 10.1186/1471-2105-13-s19-s13] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022] Open
|
187
|
Kim H, Lee T, Sung S, Lee C, Kim H. Reanalysis of Ohno's hypothesis on conservation of the size of the X chromosome in mammals. Anim Cells Syst (Seoul) 2012. [DOI: 10.1080/19768354.2012.724709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022] Open
|
188
|
Farré M, Micheletti D, Ruiz-Herrera A. Recombination rates and genomic shuffling in human and chimpanzee--a new twist in the chromosomal speciation theory. Mol Biol Evol 2012. [PMID: 23204393 PMCID: PMC3603309 DOI: 10.1093/molbev/mss272] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 01/26/2023] Open
Abstract
A long-standing question in evolutionary biology concerns the effect of recombination in shaping the genomic architecture of organisms and, in particular, how this impacts the speciation process. Despite efforts employed in the last decade, the role of chromosomal reorganizations in the human-chimpanzee speciation process remains unresolved. Through whole-genome comparisons, we have analyzed the genome-wide impact of genomic shuffling on the distribution of human recombination rates during the human-chimpanzee speciation process. We have constructed a highly refined map of the reorganizations and evolutionary breakpoint regions in the human and chimpanzee genomes based on orthologous genes and genome sequence alignments. The analysis of the most recent human and chimpanzee recombination maps inferred from genome-wide single-nucleotide polymorphism data revealed that the standardized recombination rate was significantly lower in rearranged than in collinear chromosomes. In fact, rearranged chromosomes presented significantly lower recombination rates than chromosomes that have been maintained since the ancestor of great apes, and this was related to the lineage in which they became fixed. Importantly, inverted regions had lower recombination rates than collinear and noninverted regions, independently of the effect of centromeres. Our observations have implications for the chromosomal speciation theory, providing new evidence for the contribution of inversions to suppressing recombination in mammals.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cellular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Campus UAB, Cerdanyola del Vallès, Barcelona, Spain
|
189
|
Marotta M, Chen X, Inoshita A, Stephens R, Budd GT, Crowe JP, Lyons J, Kondratova A, Tubbs R, Tanaka H. A common copy-number breakpoint of ERBB2 amplification in breast cancer colocalizes with a complex block of segmental duplications. Breast Cancer Res 2012. [PMID: 23181561 PMCID: PMC4053137 DOI: 10.1186/bcr3362] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/13/2022] Open
Abstract
Introduction Segmental duplications (low-copy repeats) are the recently duplicated genomic segments in the human genome that display nearly identical (> 90%) sequences and account for about 5% of euchromatic regions. In germline, duplicated segments mediate nonallelic homologous recombination and thus cause both non-disease-causing copy-number variants and genomic disorders. To what extent duplicated segments play a role in somatic DNA rearrangements in cancer remains elusive. Duplicated segments often cluster and form genomic blocks enriched with both direct and inverted repeats (complex genomic regions). Such complex regions could be fragile and play a mechanistic role in the amplification of the ERBB2 gene in breast tumors, because repeated sequences are known to initiate gene amplification in model systems. Methods We conducted polymerase chain reaction (PCR)-based assays for primary breast tumors and analyzed publicly available array-comparative genomic hybridization data to map a common copy-number breakpoint in ERBB2-amplified primary breast tumors. We further used molecular, bioinformatics, and population-genetics approaches to define duplication contents, structural variants, and haplotypes within the common breakpoint. Results We found a large (> 300-kb) block of duplicated segments that was colocalized with a common copy-number breakpoint for ERBB2 amplification. The breakpoint that potentially initiated ERBB2 amplification localized in a region 1.5 megabases (Mb) on the telomeric side of ERBB2. The region is very complex, with extensive duplications of KRTAP genes, structural variants, and, as a result, a paucity of single-nucleotide polymorphism (SNP) markers. Duplicated segments are varied in size and degree of sequence homology, indicating that duplications have occurred recurrently during genome evolution.
Conclusions Amplification of the ERBB2 gene in breast tumors is potentially initiated by a complex region that has unusual genomic features and thus requires rigorous, labor-intensive investigation. The haplotypes we provide could be useful to identify the potential association between the complex region and ERBB2 amplification.
|
190
|
Asrar Z, Haq F, Abbasi AA. Fourfold paralogy regions on human HOX-bearing chromosomes: role of ancient segmental duplications in the evolution of vertebrate genome. Mol Phylogenet Evol 2012; 66:737-47. [PMID: 23142696 DOI: 10.1016/j.ympev.2012.10.024] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 06/10/2012] [Revised: 10/27/2012] [Accepted: 10/29/2012] [Indexed: 01/26/2023]
Abstract
BACKGROUND Susumu Ohno's idea that modern vertebrates are degenerate polyploids (a concept referred to as the 2R hypothesis) has been the subject of intense debate for the past four decades. It was proposed that intra-genomic synteny regions (paralogons) in the human genome are remains of ancient polyploidization events that occurred early in vertebrate history. The quadruplicated paralogon centered on human HOX clusters is taken as evidence that human HOX-bearing chromosomes were structured by two rounds of whole genome duplication (WGD) events. RESULTS The evolutionary history of human HOX-bearing chromosomes (chromosomes 2/7/12/17) was evaluated by the phylogenetic analysis of multigene families with triplicated or quadruplicated distribution on these chromosomes. A topology comparison approach categorized the members of 44 families into four distinct co-duplicated groups. Distinct gene families belonging to a particular co-duplicated group exhibit similar evolutionary history and hence have duplicated simultaneously, whereas genes of two distinct co-duplicated groups do not share their evolutionary history and have not duplicated in concert with each other. CONCLUSION The recovery of co-duplicated groups suggests that "ancient segmental duplications and rearrangements" is the most rational model of evolutionary events that have generated the triplicated and quadruplicated paralogy regions seen on the human HOX-bearing chromosomes.
Affiliation(s)
- Zainab Asrar
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan
|
191
|
Giannuzzi G, Siswara P, Malig M, Marques-Bonet T, Mullikin JC, Ventura M, Eichler EE. Evolutionary dynamism of the primate LRRC37 gene family. Genome Res 2012; 23:46-59. [PMID: 23064749 PMCID: PMC3530683 DOI: 10.1101/gr.138842.112] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]
Abstract
Core duplicons in the human genome represent ancestral duplication modules shared by the majority of intrachromosomal duplication blocks within a given chromosome. These cores are associated with the emergence of novel gene families in the hominoid lineage, but their genomic organization and gene characterization among other primates are largely unknown. Here, we investigate the genomic organization and expression of the core duplicon on chromosome 17 that led to the expansion of LRRC37 during primate evolution. A comparison of the LRRC37 gene family organization in human, orangutan, macaque, marmoset, and lemur genomes shows the presence of both orthologous and species-specific gene copies in all primate lineages. Expression profiling in mouse, macaque, and human tissues reveals that the ancestral expression of LRRC37 was restricted to the testis. In the hominid lineage, the pattern of LRRC37 became increasingly ubiquitous, with significantly higher levels of expression in the cerebellum and thymus, and showed a remarkable diversity of alternative splice forms. Transfection studies in HeLa cells indicate that the human FLAG-tagged recombinant LRRC37 protein is secreted after cleavage of a transmembrane precursor and its overexpression can induce filopodia formation.
Affiliation(s)
- Giuliana Giannuzzi
- Dipartimento di Biologia, Università degli Studi di Bari Aldo Moro, Bari 70126, Italy
|
192
|
Glunčić M, Paar V. Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm. Nucleic Acids Res 2012; 41:e17. [PMID: 22977183 PMCID: PMC3592446 DOI: 10.1093/nar/gks721] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023] Open
Abstract
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of Period 3 pattern in exons, implementation of ‘magnifying glass’ effect, identification of complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes).
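The idea of reading repeats off as peaks can be illustrated with a much simpler stand-in: a histogram of distances between consecutive occurrences of every k-mer in a sequence. This sketch is an assumption-laden toy and not the GRM program; the function name `repeat_length_spectrum` and the toy sequence are hypothetical.

```python
from collections import Counter

def repeat_length_spectrum(sequence, k):
    """Histogram of distances between consecutive occurrences of each k-mer."""
    last_seen = {}
    spectrum = Counter()
    for pos in range(len(sequence) - k + 1):
        kmer = sequence[pos:pos + k]
        if kmer in last_seen:
            # Distance back to the previous occurrence of the same k-mer.
            spectrum[pos - last_seen[kmer]] += 1
        last_seen[kmer] = pos
    return spectrum

# Toy tandem repeat with an 8-bp unit; the dominant distance should be
# the repeat unit length.
toy = "ACGTTGCA" * 5
peak_length, count = repeat_length_spectrum(toy, 4).most_common(1)[0]
print(peak_length, count)
```

On real sequence, substitutions and indels would spread counts across neighbouring distances, so peaks would be broader than in this exact-repeat toy.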
Affiliation(s)
- Matko Glunčić
- Faculty of Science, University of Zagreb, Bijenička 32 and Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia.
|
193
|
Kanthaswamy S, Ng J, Ross CT, Trask JS, Smith DG, Buffalo VS, Fass JN, Lin D. Identifying human-rhesus macaque gene orthologs using heterospecific SNP probes. Genomics 2012; 101:30-7. [PMID: 22982528 DOI: 10.1016/j.ygeno.2012.09.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 03/17/2012] [Revised: 07/27/2012] [Accepted: 09/04/2012] [Indexed: 02/07/2023]
Abstract
We genotyped a Chinese and an Indian-origin rhesus macaque using the Affymetrix Genome-Wide Human SNP Array 6.0 and cataloged 85,473 uniquely mapping heterospecific SNPs. These SNPs were assigned to rhesus chromosomes according to their probe sequence alignments as displayed in the human and rhesus reference sequences. The conserved gene order (synteny) revealed by heterospecific SNP maps is in concordance with that of the published human and rhesus macaque genomes. Using these SNPs' original human rs numbers, we identified 12,328 genes annotated in humans that are associated with these SNPs, 3674 of which were found in at least one of the two rhesus macaques studied. Due to their density, the heterospecific SNPs allow fine-grained comparisons, including approximate boundaries of intra- and extra-chromosomal rearrangements involving gene orthologs, which can be used to distinguish rhesus macaque chromosomes from human chromosomes.
Affiliation(s)
- Sree Kanthaswamy
- Molecular Anthropology Lab., Dept. of Anthropology, UC Davis, CA, USA.
|
194
|
Ahn D, You KH, Kim CH. Evolution of the tbx6/16 subfamily genes in vertebrates: insights from zebrafish. Mol Biol Evol 2012; 29:3959-83. [PMID: 22915831 DOI: 10.1093/molbev/mss199] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 01/06/2023] Open
Abstract
In any comparative study striving to understand the similarities and differences of living organisms at the molecular genetic level, the crucial first step is to establish the homology (orthology and paralogy) of genes between different organisms. Determination of the homology of genes becomes complicated when the genes have undergone a rapid divergence in sequence or when the involved genes are members of a gene family that has experienced a differential gain or loss of its constituents in different taxonomic groups. Organisms with duplicated genomes such as teleost fishes might have been especially prone to these problems because the functional redundancies provided by the duplicate copies of genes would have allowed a rapid divergence or loss of genes during evolution. In this study, we will demonstrate that much of the ambiguity in determining the homology between fish and tetrapod genes that results from problems like these can be eliminated by complementing the sequence-based phylogenies with nonsequence information, such as the exon-intron structure of a gene or the composition of a gene's genomic neighbors. We will use the Tbx6/16 subfamily genes of zebrafish (tbx6, tbx16, tbx24, and mga genes), which have been well known for the ambiguity of their evolutionary relationships to the Tbx6/16 subfamily genes of tetrapods, as an illustrative example. We will show that, despite the similarity of sequence and expression to the tetrapod Tbx6 genes, the zebrafish tbx6 gene is actually a novel T-box gene more closely related to the tetrapod Tbx16 genes, whereas the zebrafish tbx24 gene, hitherto considered to be a novel gene due to the high level of sequence divergence, is actually an ortholog of tetrapod Tbx6 genes.
We will also show that, after their initial appearance by the multiplication of a common ancestral gene at the beginning of vertebrate evolution, the Tbx6/16 subfamily of vertebrate T-box genes might have experienced differential losses of member genes in different vertebrate groups and gradual pooling of member gene's functions in surviving members, which might have prevented the revelation of the true identity of member genes by way of the comparison of sequence and function.
Affiliation(s)
- Daegwon Ahn
- Department of Biology, Chungnam National University, Daejeon, Republic of Korea
|
195
|
Sipos B, Massingham T, Stütz AM, Goldman N. An improved protocol for sequencing of repetitive genomic regions and structural variations using mutagenesis and next generation sequencing. PLoS One 2012; 7:e43359. [PMID: 22912860 PMCID: PMC3422288 DOI: 10.1371/journal.pone.0043359] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/22/2011] [Accepted: 07/19/2012] [Indexed: 11/24/2022] Open
Abstract
The rise of Next Generation Sequencing (NGS) technologies has transformed de novo genome sequencing into an accessible research tool, but obtaining high quality eukaryotic genome assemblies remains a challenge, mostly due to the abundance of repetitive elements. These also make it difficult to study nucleotide polymorphism in repetitive regions, including certain types of structural variations. One solution proposed for resolving such regions is Sequence Assembly aided by Mutagenesis (SAM), which relies on the fact that introducing enough random mutations breaks the repetitive structure, making assembly possible. Sequencing many different mutated copies permits the sequence of the repetitive region to be inferred by consensus methods. However, this approach relies on molecular cloning in order to isolate and amplify individual mutant copies, making it hard to scale up the approach for use in conjunction with high-throughput sequencing technologies. To address this problem, we propose NG-SAM, a modified version of the SAM protocol that relies on PCR and dilution steps only, coupled to an NGS workflow. NG-SAM therefore has the potential to be scaled up, e.g. using emerging microfluidics technologies. We built a realistic simulation pipeline to study the feasibility of NG-SAM, and our results suggest that under appropriate experimental conditions the approach might be successfully put into practice. Moreover, our simulations suggest that NG-SAM is capable of robustly reconstructing a wide range of potential target sequences of varying lengths and repetitive structures.
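The consensus step described in the abstract above can be illustrated with a toy simulation. This is a minimal sketch under strong assumptions, not the NG-SAM pipeline: copies are taken to be already assembled and aligned, only substitutions occur, and the mutation rate, copy number, and function names are hypothetical.

```python
import random
from collections import Counter

BASES = "ACGT"

def mutate(sequence, rate, rng):
    """Replace each base, with probability `rate`, by a different random base."""
    out = []
    for base in sequence:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(copies):
    """Per-position majority vote over equal-length, pre-aligned copies."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )

rng = random.Random(0)
target = "ACGT" * 25  # a highly repetitive 100-bp toy target
# Each mutated copy differs from the others, which is what breaks the
# repeat structure for assembly; here we only model the consensus step.
copies = [mutate(target, 0.1, rng) for _ in range(30)]
recovered = consensus(copies)
print(recovered == target)
```

Because mutations land at different positions in different copies, the original base remains the majority at each position as long as the mutation rate is modest and enough copies are sequenced.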
Affiliation(s)
- Botond Sipos
- European Bioinformatics Institute, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
|
196
|
Mendivil Ramos O, Ferrier DEK. Mechanisms of Gene Duplication and Translocation and Progress towards Understanding Their Relative Contributions to Animal Genome Evolution. INTERNATIONAL JOURNAL OF EVOLUTIONARY BIOLOGY 2012; 2012:846421. [PMID: 22919542 PMCID: PMC3420103 DOI: 10.1155/2012/846421] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 03/26/2012] [Revised: 05/30/2012] [Accepted: 06/27/2012] [Indexed: 01/10/2023]
Abstract
Duplication of genetic material is clearly a major route to genetic change, with consequences for both evolution and disease. A variety of forms and mechanisms of duplication are recognised, operating across the scales of a few base pairs up to entire genomes. With the ever-increasing amounts of gene and genome sequence data that are becoming available, our understanding of the extent of duplication is greatly improving, both in terms of the scales of duplication events and their rates of occurrence. An accurate understanding of these processes is vital if we are to properly understand important events in evolution as well as mechanisms operating at the level of genome organisation. Here we will focus on duplication in animal genomes and how the duplicated sequences are distributed, with the aim of maintaining a focus on principles of evolution and organisation that are most directly applicable to the shaping of our own genome.
Affiliation(s)
- David E. K. Ferrier
- The Scottish Oceans Institute, School of Biology, University of St Andrews, East Sands, Fife KY16 8LB, UK
|
197
|
Du R, Lu C, Jiang Z, Li S, Ma R, An H, Xu M, An Y, Xia Y, Jin L, Wang X, Zhang F. Efficient typing of copy number variations in a segmental duplication-mediated rearrangement hotspot using multiplex competitive amplification. J Hum Genet 2012; 57:545-551. [PMID: 22673690 DOI: 10.1038/jhg.2012.66] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 12/17/2022]
Abstract
Local genomic architecture, such as segmental duplications (SDs), can induce copy number variation (CNV) hotspots in the human genome, many of which manifest as genomic disorders. Significant technological advances have been achieved for genome-wide CNV investigations, but these costly methods are not suitable for genotyping certain disease-associated CNVs or other loci of interest in populations. Recently, two independent studies showed that the murine meiosis expressed gene 1 (Meig1) was critical to spermatogenesis. We found that the human orthologue MEIG1 is flanked by an SD pair, between which non-allelic homologous recombination (NAHR) can cause recurrent CNVs. To study this potential CNV hotspot and its role in spermatogenesis, we developed a new CNV genotyping method, AccuCopy, based on multiplex competitive amplification to investigate 320 patients with spermatogenic impairment and 93 healthy controls. Three MEIG1 duplications (two in patients and one in controls) were identified, whereas no deletion was found. As NAHR results in more recurrent deletions than duplications at a locus, the overrepresentation of recurrent MEIG1 duplications suggests a potential purifying selection operating on this hotspot, possibly via fecundity. We also showed that AccuCopy is an efficient and reliable method for multiplex CNV genotyping.
Affiliation(s)
- Renqian Du
- MOE Key Laboratory of Contemporary Anthropology and State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, China
|
198
|
Carrigan MA, Uryasev O, Davis RP, Zhai L, Hurley TD, Benner SA. The natural history of class I primate alcohol dehydrogenases includes gene duplication, gene loss, and gene conversion. PLoS One 2012; 7:e41175. [PMID: 22859968 PMCID: PMC3409193 DOI: 10.1371/journal.pone.0041175] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 04/18/2012] [Accepted: 06/18/2012] [Indexed: 01/29/2023] Open
Abstract
BACKGROUND Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and life styles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids. METHODOLOGY/PRINCIPAL FINDINGS To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences. CONCLUSIONS/SIGNIFICANCE We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion.
These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates, followed by the loss of an ADH1 paralog in the human lineage.
Affiliation(s)
- Matthew A Carrigan
- Foundation for Applied Molecular Evolution, Gainesville, Florida, United States of America.
|
199
|
Mácha J, Teichmanová R, Sater AK, Wells DE, Tlapáková T, Zimmerman LB, Krylov V. Deep ancestry of mammalian X chromosome revealed by comparison with the basal tetrapod Xenopus tropicalis. BMC Genomics 2012; 13:315. [PMID: 22800176 PMCID: PMC3472169 DOI: 10.1186/1471-2164-13-315] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 01/30/2012] [Accepted: 06/25/2012] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND The X and Y sex chromosomes are conspicuous features of placental mammal genomes. Mammalian sex chromosomes arose from an ordinary pair of autosomes after the proto-Y acquired a male-determining gene and degenerated due to suppression of X-Y recombination. Analysis of earlier steps in X chromosome evolution has been hampered by the long interval between the origins of teleost and amniote lineages as well as scarcity of X chromosome orthologs in incomplete avian genome assemblies. RESULTS This study clarifies the genesis and remodelling of the Eutherian X chromosome by using a combination of sequence analysis, meiotic map information, and cytogenetic localization to compare amniote genome organization with that of the amphibian Xenopus tropicalis. Nearly all orthologs of human X genes localize to X. tropicalis chromosomes 2 and 8, consistent with an ancestral X-conserved region and a single X-added region precursor. This finding contradicts a previous hypothesis of three evolutionary strata in this region. Homologies between human, opossum, chicken and frog chromosomes suggest a single X-added region predecessor in therian mammals, corresponding to opossum chromosomes 4 and 7. A more ancient X-added ancestral region, currently extant as a major part of chicken chromosome 1, is likely to have been present in the progenitor of synapsids and sauropsids. Analysis of X chromosome gene content emphasizes conservation of single protein coding genes and the role of tandem arrays in formation of novel genes. CONCLUSIONS Chromosomal regions orthologous to Therian X chromosomes have been located in the genome of the frog X. tropicalis. These X chromosome ancestral components experienced a series of fusion and breakage events to give rise to avian autosomes and mammalian sex chromosomes. The early branching tetrapod X. tropicalis' simple diploid genome and robust synteny to amniotes greatly enhance studies of vertebrate chromosome evolution.
Affiliation(s)
- Jaroslav Mácha
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Radka Teichmanová
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Amy K Sater
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Dan E Wells
- Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA
- Tereza Tlapáková
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
- Lyle B Zimmerman
- Division of Developmental Biology, MRC-National Institute for Medical Research, Mill Hill, London, NW7 1AA, UK
- Vladimír Krylov
- Department of Cell Biology, Faculty of Science, Charles University in Prague, Vinicna 7, Prague 2, Czech Republic
|
200
|
Mannaert A, Downing T, Imamura H, Dujardin JC. Adaptive mechanisms in pathogens: universal aneuploidy in Leishmania. Trends Parasitol 2012; 28:370-6. [PMID: 22789456 DOI: 10.1016/j.pt.2012.06.003] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Received: 04/26/2012] [Revised: 06/14/2012] [Accepted: 06/14/2012] [Indexed: 02/07/2023]
Abstract
Genomic stability and maintenance of the correct chromosome number are assumed to be essential for normal development in eukaryotes. Aneuploidy is usually associated with severe abnormalities and decrease of cell fitness, but some organisms appear to rely on aneuploidy for rapid adaptation to changing environments. This phenomenon is mostly described in pathogenic fungi and cancer cells. However, recent genome studies highlight the importance of Leishmania as a new model for studies on aneuploidy. Several reports revealed extensive variation in chromosome copy number, indicating that aneuploidy is a constitutive feature of this protozoan parasite genus. Aneuploidy appears to be beneficial in organisms that are primarily asexual, unicellular, and that undergo sporadic epidemic expansions, including common pathogens as well as cancer.
Affiliation(s)
- An Mannaert
- Unit of Molecular Parasitology, Department of Biomedical Sciences, Institute of Tropical Medicine, Antwerp, Belgium
|