1
Wait DR, Peñalba JV. Suture zones, speciation, and evolution. Evolution 2025; 79:329-341. [PMID: 39708295 DOI: 10.1093/evolut/qpae184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Accepted: 12/19/2024] [Indexed: 12/23/2024]
Abstract
In the more than 50 years since the initial conceptualization of the suture zone, little work has been done to take full advantage of the comparative capability of these geographic regions. During this time, great advances have been made in hybrid zone research that have provided invaluable insight into speciation and evolution. Hybrid zones have long been recognized to be "windows to the evolutionary process." If a single hybrid zone provides a window, then multiple hybrid zones in a suture zone can provide a panoramic view of the evolutionary process. Here, we hope to redirect attention to suture zones, bring the advances from hybrid zone research to a comparative framework, and further expand our understanding of speciation and evolution. In this review, we recount the historical discussions surrounding suture zones, briefly review what we can learn from hybrid zones, and review the comparative studies done on suture zones thus far. We also highlight the opportunities and challenges of performing research in suture zones to help guide researchers hoping to start a research project in these regions. Lastly, we propose future directions and questions for comparative research that can be done in suture zones.
Affiliation(s)
- Daniel R Wait
- Museum of Vertebrate Zoology, Department of Integrative Biology, University of California at Berkeley, 3101 Valley Life Sciences Buildings, Berkeley, CA 94720, United States
- Joshua V Peñalba
- Museum für Naturkunde, Leibniz Institute for Evolution and Biodiversity, Center for Integrative Biodiversity Discovery, Invalidenstraße 43, Berlin 10115, Germany
2
Liang J, Zhang K, Hu X, Lv A. Comparative Genomics Analysis of the Fish Pathogen Rahnella aquatilis KCL-5 Reveals Potential Multidrug Resistance and Virulence Properties. Curr Microbiol 2025; 82:158. [PMID: 40014107 DOI: 10.1007/s00284-025-04125-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2024] [Accepted: 02/06/2025] [Indexed: 02/28/2025]
Abstract
The genus Rahnella has been isolated from human, fish, and water environments. We recently reported on the isolation and genomic identification of a novel pathogen, R. aquatilis strain KCL-5, from crucian carp Carassius auratus. To investigate the evolution of bacterial virulence and resistance properties of R. aquatilis, comparative genomics analyses were performed for genus Rahnella strains including R. aquatilis, R. variigena, R. bruchi, and R. victoriana. This analysis provides up-to-date information on genus Rahnella genomic diversity, including comparative analyses of virulence and resistance, synteny, single-nucleotide polymorphisms, average nucleotide identity, core genes, gene families, and genomic islands. R. victoriana is the sister species to R. aquatilis, which is closer to R. victoriana than to other Rahnella spp. Multiple genes encoding functions that likely contribute to antimicrobial resistance and pathogenic factors were identified by comparative genome analysis, including multidrug resistance efflux pump, adherence, invasion, and secretion systems. To our knowledge, this is the first report to provide a more detailed insight into the comparative genomic characteristics of Rahnella spp., contributing to the understanding of their diversity and evolution, as well as of the virulence of R. aquatilis.
Affiliation(s)
- Jing Liang
- Tianjin Key Lab of Aqua-Ecology and Aquaculture, College of Fisheries, Tianjin Agricultural University, Tianjin, 300392, China
- Kaiyang Zhang
- Tianjin Key Lab of Aqua-Ecology and Aquaculture, College of Fisheries, Tianjin Agricultural University, Tianjin, 300392, China
- Xiucai Hu
- Tianjin Key Lab of Aqua-Ecology and Aquaculture, College of Fisheries, Tianjin Agricultural University, Tianjin, 300392, China
- Aijun Lv
- Tianjin Key Lab of Aqua-Ecology and Aquaculture, College of Fisheries, Tianjin Agricultural University, Tianjin, 300392, China
3
Mah JC, Lohmueller KE, Garud NR. Inference of the Demographic Histories and Selective Effects of Human Gut Commensal Microbiota Over the Course of Human History. Mol Biol Evol 2025; 42:msaf010. [PMID: 39838923 PMCID: PMC11824422 DOI: 10.1093/molbev/msaf010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 11/07/2024] [Accepted: 01/07/2025] [Indexed: 01/23/2025] Open
Abstract
Despite the importance of gut commensal microbiota to human health, there is little knowledge about their evolutionary histories, including their demographic histories and distributions of fitness effects (DFEs) of mutations. Here, we infer the demographic histories and DFEs for amino acid-changing mutations of 39 of the most prevalent and abundant commensal gut microbial species found in Westernized individuals over timescales exceeding human generations. Some species display contractions in population size and others expansions, with several of these events coinciding with key moments in human history. DFEs across species vary from highly to mildly deleterious, with differences between accessory and core gene DFEs largely driven by genetic drift. Within genera, DFEs tend to be more congruent, reflective of underlying phylogenetic relationships. Together, these findings suggest that gut microbes have distinct demographic and selective histories.
Affiliation(s)
- Jonathan C Mah
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, USA
- Department of Human Genetics, University of California, Los Angeles, USA
- Nandita R Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, USA
- Department of Human Genetics, University of California, Los Angeles, USA
4
Wolff R, Garud NR. Pervasive selective sweeps across human gut microbiomes. bioRxiv 2024:2023.12.22.573162. [PMID: 38187688 PMCID: PMC10769429 DOI: 10.1101/2023.12.22.573162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The human gut microbiome is composed of a highly diverse consortium of species which are continually evolving within and across hosts. The ability to identify adaptations common to many human gut microbiomes would not only reveal shared selection pressures across hosts, but also key drivers of functional differentiation of the microbiome that may affect community structure and host traits. However, to date there has not been a systematic scan for adaptations that have spread across human gut microbiomes. Here, we develop a novel selection scan statistic named the integrated Linkage Disequilibrium Score (iLDS) that can detect the spread of adaptive haplotypes across host microbiomes via migration and horizontal gene transfer. Specifically, iLDS leverages signals of hitchhiking of deleterious variants with the beneficial variant. Application of the statistic to ~30 of the most prevalent commensal gut species from 24 populations around the world revealed more than 300 selective sweeps across species. We find an enrichment for selective sweeps at loci involved in carbohydrate metabolism (potentially indicative of adaptation to features of host diet), and we find that the targets of selection significantly differ between Westernized and non-Westernized populations. Underscoring the potential role of diet in driving selection, we find a selective sweep absent from non-Westernized populations but ubiquitous in Westernized populations at a locus known to be involved in the metabolism of maltodextrin, a synthetic starch that has recently become a widespread component of Western diets. In summary, we demonstrate that selective sweeps across host microbiomes are a common feature of the evolution of the human gut microbiome, and that targets of selection may be strongly impacted by host diet.
Affiliation(s)
- Richard Wolff
- Department of Ecology and Evolutionary Biology, UCLA
- Nandita R. Garud
- Department of Ecology and Evolutionary Biology, UCLA
- Department of Human Genetics, UCLA
5
Mah JC, Lohmueller KE, Garud N. Inference of the demographic histories and selective effects of human gut commensal microbiota over the course of human history. bioRxiv 2023:2023.11.09.566454. [PMID: 38014007 PMCID: PMC10680615 DOI: 10.1101/2023.11.09.566454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2023]
Abstract
Despite the importance of gut commensal microbiota to human health, there is little knowledge about their evolutionary histories, including their population demographic histories and their distributions of fitness effects (DFE) of new mutations. Here, we infer the demographic histories and DFEs of 27 of the most highly prevalent and abundant commensal gut microbial species in North Americans over timescales exceeding human generations using a collection of lineages inferred from a panel of healthy hosts. We find overall reductions in genetic variation among commensal gut microbes sampled from a Western population relative to an African rural population. Additionally, some species in North American microbiomes display contractions in population size and others expansions, potentially occurring at several key historical moments in human history. DFEs across species vary from highly to mildly deleterious, with accessory genes experiencing more drift compared to core genes. Within genera, DFEs tend to be more congruent, reflective of underlying phylogenetic relationships. Taken together, these findings suggest that human commensal gut microbes have distinct evolutionary histories, possibly reflecting the unique roles of individual members of the microbiome.
Affiliation(s)
- Jonathan C. Mah
- Bioinformatics Interdepartmental Program, University of California, Los Angeles
- Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles
- Department of Human Genetics, University of California, Los Angeles
- Nandita Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles
- Department of Human Genetics, University of California, Los Angeles
6
Yıldırım B, Vogl C. Purifying selection against spurious splicing signals contributes to the base composition evolution of the polypyrimidine tract. J Evol Biol 2023; 36:1295-1312. [PMID: 37564008 PMCID: PMC10946897 DOI: 10.1111/jeb.14205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/31/2023] [Accepted: 06/15/2023] [Indexed: 08/12/2023]
Abstract
Among eukaryotes, the major spliceosomal pathway is highly conserved. While long introns may contain additional regulatory sequences, the ones in short introns seem to be nearly exclusively related to splicing. Although these regulatory sequences involved in splicing are well-characterized, little is known about their evolution. At the 3' end of introns, the splice signal nearly universally contains the dimer AG, which consists of purines, and the polypyrimidine tract upstream of this 3' splice signal is characterized by over-representation of pyrimidines. If the over-representation of pyrimidines in the polypyrimidine tract is also due to avoidance of a premature splicing signal, we hypothesize that AG should be the most under-represented dimer. Through the use of DNA-strand asymmetry patterns, we confirm this prediction in fruit flies of the genus Drosophila, and by comparing the asymmetry patterns to a presumably neutrally evolving region, we quantify the selection strength acting on each motif. Moreover, our inference and simulation method revealed that the best explanation for the base composition evolution of the polypyrimidine tract is the joint action of purifying selection against a spurious 3' splice signal and the selection for pyrimidines. Patterns of asymmetry in other eukaryotes indicate that avoidance of premature splicing similarly affects the nucleotide composition in their polypyrimidine tracts.
Affiliation(s)
- Burçin Yıldırım
- Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
- Claus Vogl
- Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
7
Robinson J, Kyriazis CC, Yuan SC, Lohmueller KE. Deleterious Variation in Natural Populations and Implications for Conservation Genetics. Annu Rev Anim Biosci 2023; 11:93-114. [PMID: 36332644 PMCID: PMC9933137 DOI: 10.1146/annurev-animal-080522-093311] [Citation(s) in RCA: 44] [Impact Index Per Article: 22.0] [Indexed: 11/06/2022]
Abstract
Deleterious mutations decrease reproductive fitness and are ubiquitous in genomes. Given that many organisms face ongoing threats of extinction, there is interest in elucidating the impact of deleterious variation on extinction risk and optimizing management strategies accounting for such mutations. Quantifying deleterious variation and understanding the effects of population history on deleterious variation are complex endeavors because we do not know the strength of selection acting on each mutation. Further, the effect of demographic history on deleterious mutations depends on the strength of selection against the mutation and the degree of dominance. Here we clarify how deleterious variation can be quantified and studied in natural populations. We then discuss how different demographic factors, such as small population size, nonequilibrium population size changes, inbreeding, and gene flow, affect deleterious variation. Lastly, we provide guidance on studying deleterious variation in nonmodel populations of conservation concern.
Affiliation(s)
- Jacqueline Robinson
- Institute for Human Genetics, University of California, San Francisco, California, USA
- Christopher C Kyriazis
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Stella C Yuan
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, USA
8
Good BH. Linkage disequilibrium between rare mutations. Genetics 2022; 220:6503502. [PMID: 35100407 PMCID: PMC8982034 DOI: 10.1093/genetics/iyac004] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 10/07/2021] [Accepted: 12/21/2021] [Indexed: 01/13/2023] Open
Abstract
The statistical associations between mutations, collectively known as linkage disequilibrium, encode important information about the evolutionary forces acting within a population. Yet in contrast to single-site analogues like the site frequency spectrum, our theoretical understanding of linkage disequilibrium remains limited. In particular, little is currently known about how mutations with different ages and fitness costs contribute to expected patterns of linkage disequilibrium, even in simple settings where recombination and genetic drift are the major evolutionary forces. Here, I introduce a forward-time framework for predicting linkage disequilibrium between pairs of neutral and deleterious mutations as a function of their present-day frequencies. I show that the dynamics of linkage disequilibrium become much simpler in the limit that mutations are rare, where they admit a simple heuristic picture based on the trajectories of the underlying lineages. I use this approach to derive analytical expressions for a family of frequency-weighted linkage disequilibrium statistics as a function of the recombination rate, the frequency scale, and the additive and epistatic fitness costs of the mutations. I find that the frequency scale can have a dramatic impact on the shapes of the resulting linkage disequilibrium curves, reflecting the broad range of time scales over which these correlations arise. I also show that the differences between neutral and deleterious linkage disequilibrium are not purely driven by differences in their mutation frequencies and can instead display qualitative features that are reminiscent of epistasis. I conclude by discussing the implications of these results for recent linkage disequilibrium measurements in bacteria. This forward-time approach may provide a useful framework for predicting linkage disequilibrium across a range of evolutionary scenarios.
Affiliation(s)
- Benjamin H Good
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA. Corresponding author: Department of Applied Physics, Stanford University, Clark Center, 318 Campus Drive, Stanford, CA 94305, USA.
9
Shoemaker WR, Chen D, Garud NR. Comparative Population Genetics in the Human Gut Microbiome. Genome Biol Evol 2022; 14:evab116. [PMID: 34028530 PMCID: PMC8743038 DOI: 10.1093/gbe/evab116] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Accepted: 05/22/2021] [Indexed: 11/13/2022] Open
Abstract
Genetic variation in the human gut microbiome is responsible for conferring a number of crucial phenotypes like the ability to digest food and metabolize drugs. Yet, our understanding of how this variation arises and is maintained remains relatively poor. Thus, the microbiome remains a largely untapped resource, as the large number of coexisting species in the microbiome presents a unique opportunity to compare and contrast evolutionary processes across species to identify universal trends and deviations. Here we outline features of the human gut microbiome that, while not unique in isolation, as an assemblage make it a system with unparalleled potential for comparative population genomics studies. We consciously take a broad view of comparative population genetics, emphasizing how sampling a large number of species allows researchers to identify universal evolutionary dynamics in addition to new genes, which can then be leveraged to identify exceptional species that deviate from general patterns. To highlight the potential power of comparative population genetics in the microbiome, we reanalyze patterns of purifying selection across ∼40 prevalent species in the human gut microbiome to identify intriguing trends which highlight functional categories in the microbiome that may be under more or less constraint.
Affiliation(s)
- William R Shoemaker
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Daisy Chen
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Nandita R Garud
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Department of Human Genetics, University of California, Los Angeles, California, USA
10
Deleterious protein-coding variants in diverse cattle breeds of the world. Genet Sel Evol 2021; 53:80. [PMID: 34654372 PMCID: PMC8518297 DOI: 10.1186/s12711-021-00674-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/06/2021] [Accepted: 09/22/2021] [Indexed: 11/16/2022] Open
Abstract
The domestication of wild animals has resulted in a reduction in effective population sizes, which can affect the deleterious mutation load of domesticated breeds. In addition, artificial selection contributes to the accumulation of deleterious mutations because of an increased rate of inbreeding among domesticated animals. Since founder population sizes and artificial selection differ between cattle breeds, their deleterious mutation load can vary. We investigated this question by using whole-genome data from 432 animals belonging to 54 worldwide cattle breeds. Our analysis revealed a negative correlation between genomic heterozygosity and nonsynonymous-to-silent diversity ratio, which suggests a higher proportion of single nucleotide variants (SNVs) affecting proteins in low-diversity breeds. Our results also showed that low-diversity breeds had a larger number of high-frequency (derived allele frequency (DAF) > 0.51) deleterious SNVs than high-diversity breeds. An opposite trend was observed for the low-frequency (DAF ≤ 0.51) deleterious SNVs. Overall, the number of high-frequency deleterious SNVs was larger in the genomes of taurine cattle breeds than of indicine breeds, whereas the number of low-frequency deleterious SNVs was larger in the genomes of indicine cattle than in those of taurine cattle. Furthermore, we observed significant variation in the counts of deleterious SNVs within taurine breeds. The variations in deleterious mutation load between taurine and indicine breeds could be attributed to the population sizes of the wild progenitors before domestication, whereas the variations observed within taurine breeds could be due to differences in inbreeding level, strength of artificial selection, and/or founding population size. Our findings imply that the incidence of genetic diseases can vary between cattle breeds.
11
Fang L, Zhao T, Hu Y, Si Z, Zhu X, Han Z, Liu G, Wang S, Ju L, Guo M, Mei H, Wang L, Qi B, Wang H, Guan X, Zhang T. Divergent improvement of two cultivated allotetraploid cotton species. Plant Biotechnol J 2021; 19:1325-1336. [PMID: 33448110 PMCID: PMC8313128 DOI: 10.1111/pbi.13547] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 11/03/2020] [Revised: 12/24/2020] [Accepted: 01/03/2021] [Indexed: 05/21/2023]
Abstract
Interspecific genomic variation can provide a genetic basis for local adaptation and domestication. A series of studies have presented the role of interspecific haplotypes and introgressions in adaptive traits, but few studies have addressed their role in improving agronomic characters. Two allotetraploid Gossypium species, Gossypium barbadense (Gb) and G. hirsutum (Gh), both originating from the Americas, are cultivated independently. Here, through sequencing and the comparison of one GWAS panel in 229 Gb accessions and two GWAS panels in 491 Gh accessions, we found that most associated loci or functional haplotypes for agronomic traits were highly divergent, representing the strong divergent improvement between Gb and Gh. Using a comprehensive interspecific haplotype map, we revealed that six interspecific introgressions from Gh to Gb were significantly associated with the phenotypic performance of Gb, which could explain 5%-40% of phenotypic variation in yield and fibre qualities. In addition, three introgressions overlapped with six associated loci in Gb, indicating that these introgression regions were under further selection and stabilized during improvement. A single interspecific introgression often possessed yield-increasing potential but decreased fibre qualities, or the opposite, making it difficult to simultaneously improve yield and fibre qualities. Our study not only demonstrates the importance of interspecific functional haplotypes or introgressions in the divergent improvement of Gb and Gh, but also supports their potential value in further human-mediated hybridization or precision breeding.
Affiliation(s)
- Lei Fang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Ting Zhao
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Yan Hu
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Zhanfeng Si
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Xiefei Zhu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Zegang Han
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Guizhen Liu
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Henan Province Seed Station, Zhengzhou, China
- Sen Wang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Institute of Food Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Longzhen Ju
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China
- Menglan Guo
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Huan Mei
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Luyao Wang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Bowen Qi
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Heng Wang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Xueying Guan
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Tianzhen Zhang
- Zhejiang Provincial Key Laboratory of Crop Genetic Resources, Institute of Crop Science, Plant Precision Breeding Academy, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
12
Barreira SN, Nguyen AD, Fredriksen MT, Wolfsberg TG, Moreland RT, Baxevanis AD. AniProtDB: A Collection of Consistently Generated Metazoan Proteomes for Comparative Genomics Studies. Mol Biol Evol 2021; 38:4628-4633. [PMID: 34048573 DOI: 10.1093/molbev/msab165] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023] Open
Abstract
To address the void in the availability of high-quality proteomic data traversing the animal tree, we have implemented a pipeline for generating de novo assemblies based on publicly available data from the NCBI Sequence Read Archive, yielding a comprehensive collection of proteomes from 100 species spanning 21 animal phyla. We have also created the Animal Proteome Database (AniProtDB), a resource providing open access to this collection of high-quality metazoan proteomes, along with information on predicted proteins and protein domains for each taxonomic classification and the ability to perform sequence similarity searches against all proteomes generated using this pipeline. This solution vastly increases the utility of these data by removing the barrier to access for research groups who do not have the expertise or resources to generate these data themselves and enables the use of data from non-traditional research organisms that have the potential to address key questions in biomedicine.
Affiliation(s)
- Sofia N Barreira
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
- Anh-Dao Nguyen
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
- Mark T Fredriksen
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
- Tyra G Wolfsberg
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
- R Travis Moreland
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
- Andreas D Baxevanis
- Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, 20892, USA
13
Huber CD, Kim BY, Lohmueller KE. Population genetic models of GERP scores suggest pervasive turnover of constrained sites across mammalian evolution. PLoS Genet 2020; 16:e1008827. [PMID: 32469868 PMCID: PMC7286533 DOI: 10.1371/journal.pgen.1008827] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 01/27/2020] [Revised: 06/10/2020] [Accepted: 05/05/2020] [Indexed: 01/20/2023] Open
Abstract
Comparative genomic approaches have been used to identify sites where mutations are under purifying selection and of functional consequence by searching for sequences that are conserved across distantly related species. However, the performance of these approaches has not been rigorously evaluated under population genetic models. Further, short-lived functional elements may not leave a footprint of sequence conservation across many species. We use simulations to study how one measure of conservation, the Genomic Evolutionary Rate Profiling (GERP) score, relates to the strength of selection (N_e s). We show that the GERP score is related to the strength of purifying selection. However, changes in selection coefficients or functional elements over time (i.e. functional turnover) can strongly affect the GERP distribution, leading to unexpected relationships between GERP and N_e s. Further, we show that for functional elements that have a high turnover rate, adding more species to the analysis does not necessarily increase statistical power. Finally, we use the distribution of GERP scores across the human genome to compare models with and without turnover of sites where mutations are under purifying selection. We show that mutations in 4.51% of the noncoding human genome are under purifying selection and that most of this sequence has likely experienced changes in selection coefficients throughout mammalian evolution. Our work reveals limitations to using comparative genomic approaches to identify deleterious mutations. Commonly used GERP score thresholds miss over half of the noncoding sites in the human genome where mutations are under purifying selection.
One of the most significant and challenging tasks in modern genomics is to assess the functional consequences of a particular nucleotide change in a genome. A common approach to address this challenge prioritizes sequences that share similar nucleotides across distantly related species, with the rationale that mutations at such positions were deleterious and removed from the population by purifying natural selection. Our manuscript shows that one popular measure of sequence conservation, the GERP score, performs well at identifying selected mutations if mutations at a site were under selection across all of mammalian evolution. Changes in selection at a given site dramatically reduce the power of GERP to detect selected mutations in humans. We also combine population genetic models with the distribution of GERP scores at noncoding sites across the human genome to show that the degree of selection at individual sites has changed throughout mammalian evolution. Importantly, we demonstrate that at least 80 Mb of noncoding sequence under purifying selection in humans will not have extreme GERP scores and will likely be missed by modern comparative genomic approaches. Our work argues that new approaches, potentially based on genetic variation within species, will be required to identify deleterious mutations.
Affiliation(s)
- Christian D. Huber
- School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia
| | - Bernard Y. Kim
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
| |
|
14
|
Davis LK. Intelligent Design of 14-3-3 Docking Proteins Utilizing Synthetic Evolution Artificial Intelligence (SYN-AI). ACS Omega 2019; 4:18948-18960. [PMID: 31763516 PMCID: PMC6868599 DOI: 10.1021/acsomega.8b03100] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/06/2018] [Accepted: 07/10/2019] [Indexed: 05/13/2023]
Abstract
The ability to write DNA code from scratch will allow for the discovery of new and interesting chemistries as well as the rewiring of cell signal pathways. Herein, we have utilized synthetic evolution artificial intelligence (SYN-AI) to intelligently design a set of 14-3-3 docking genes. SYN-AI engineers synthetic genes utilizing a parental gene as an evolution template, wherein evolution is fast-forwarded by transforming template gene sequences to DNA secondary and tertiary codes based upon gene hierarchical structural levels. The DNA secondary code allows identification of genomic building blocks across an orthologous sequence space comprising multiple genomes, whereas the DNA tertiary code allows engineering of supersecondary structures. SYN-AI constructed a library of 10 million genes that was reduced to three structurally functional 14-3-3 docking genes by applying natural selection protocols. Synthetic protein identity was verified utilizing Clustal Omega sequence alignments and Phylogeny.fr phylogenetic analysis. We were able to confirm the three-dimensional structure utilizing I-TASSER and protein-ligand interactions utilizing COACH and Cofactor. The conservation of allosteric communications was confirmed utilizing elastic and anisotropic network models: we utilized elNemo and ANM2.1 to confirm conservation of the 14-3-3 ζ amphipathic groove. Notably, to the best of our knowledge, we report the first 14-3-3 docking genes to be written from scratch.
Affiliation(s)
- Leroy K. Davis
- Prairie View A&M University, Cooperative Agricultural Research Center (CARC), 700 University Drive, Prairie View, Texas 77446-0518, United States
- Gene Evolution Project, LLC, Baton Rouge, Louisiana 70835, United States
| |
|
15
|
Theys K, Feder AF, Gelbart M, Hartl M, Stern A, Pennings PS. Within-patient mutation frequencies reveal fitness costs of CpG dinucleotides and drastic amino acid changes in HIV. PLoS Genet 2018; 14:e1007420. [PMID: 29953449 PMCID: PMC6023119 DOI: 10.1371/journal.pgen.1007420] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 01/25/2018] [Accepted: 04/29/2018] [Indexed: 12/22/2022]
Abstract
HIV has a high mutation rate, which contributes to its ability to evolve quickly. However, we know little about the fitness costs of individual HIV mutations in vivo, their distribution and the different factors shaping the viral fitness landscape. We calculated the mean frequency of transition mutations at 870 sites of the pol gene in 160 patients, allowing us to determine the cost of these mutations. As expected, we found high costs for non-synonymous and nonsense mutations as compared to synonymous mutations. In addition, we found that non-synonymous mutations that lead to drastic amino acid changes are twice as costly as those that do not and mutations that create new CpG dinucleotides are also twice as costly as those that do not. We also found that G→A and C→T mutations are more costly than A→G mutations. We anticipate that our new in vivo frequency-based approach will provide insights into the fitness landscape and evolvability of not only HIV, but a variety of microbes.
Affiliation(s)
- Kristof Theys
- Clinical and Epidemiological Virology, Department of Microbiology and Immunology, Rega Institute for Medical Research, KU Leuven, University of Leuven, Leuven, Belgium
| | - Alison F. Feder
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Maoz Gelbart
- School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Marion Hartl
- Department of Biology, San Francisco State University, San Francisco, California, United States of America
| | - Adi Stern
- School of Molecular Cell Biology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Pleuni S. Pennings
- Department of Biology, San Francisco State University, San Francisco, California, United States of America
| |
|
16
|
Naidoo T, Sjödin P, Schlebusch C, Jakobsson M. Patterns of variation in cis-regulatory regions: examining evidence of purifying selection. BMC Genomics 2018; 19:95. [PMID: 29373957 PMCID: PMC5787233 DOI: 10.1186/s12864-017-4422-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 07/19/2017] [Accepted: 12/27/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND With only 2 % of the human genome consisting of protein coding genes, functionality across the rest of the genome has been the subject of much debate. This has gained further impetus in recent years due to a rapidly growing catalogue of genomic elements, based primarily on biochemical signatures (e.g. the ENCODE project). While the assessment of functionality is a complex task, the presence of selection acting on a genomic region is a strong indicator of importance. In this study, we apply population genetic methods to investigate signals overlaying several classes of regulatory elements. RESULTS We disentangle signals of purifying selection acting directly on regulatory elements from the confounding factors of demography and purifying selection linked to e.g. nearby protein coding regions. We confirm the importance of regulatory regions proximal to coding sequence, while also finding differential levels of selection at distal regions. We note differences in purifying selection among transcription factor families. Signals of constraint at some genomic classes were also strongly dependent on their physical location relative to coding sequence. In addition, levels of selection efficacy across genomic classes differed between African and non-African populations. CONCLUSIONS In order to assign a valid signal of selection to a particular class of genomic sequence, we show that it is crucial to isolate the signal by accounting for the effects of demography and linked-purifying selection. Our study highlights the intricate interplay of factors affecting signals of selection on functional elements.
Affiliation(s)
- Thijessen Naidoo
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Per Sjödin
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Carina Schlebusch
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
| | - Mattias Jakobsson
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- Science for Life Lab, Uppsala, Sweden
| |
|
17
|
Sánchez-Gracia A, Guirao-Rico S, Hinojosa-Alvarez S, Rozas J. Computational prediction of the phenotypic effects of genetic variants: basic concepts and some application examples in Drosophila nervous system genes. J Neurogenet 2017; 31:307-319. [DOI: 10.1080/01677063.2017.1398241] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
Affiliation(s)
- Alejandro Sánchez-Gracia
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
| | - Sara Guirao-Rico
- Center for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Bellaterra, Spain
| | - Silvia Hinojosa-Alvarez
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
| | - Julio Rozas
- Departament de Genètica, Microbiologia i Estadística and Institut de Recerca de la Biodiversitat (IRBio), Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain
| |
|
18
|
Abstract
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast ‘twilight zone’ in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species’ population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Affiliation(s)
- Chris P Ponting
- MRC Human Genetics Unit, The Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, EH4 2XU, UK.
| |
|
19
|
Early AM, Clark AG. Genomic signatures of local adaptation in the Drosophila immune response. Fly (Austin) 2017; 11:277-283. [PMID: 28586288 DOI: 10.1080/19336934.2017.1337612] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/19/2022]
Abstract
As environments and pathogen landscapes shift, host defenses must evolve to remain effective. Due to this selection pressure, among-species comparisons of genetic sequence data often find immune genes to be among the fastest evolving genes across the genome. The full extent and nature of these immune adaptations, however, remain largely unexplored. In a recent study, we analyzed patterns of selection within distinct components of the Drosophila melanogaster immune pathway. While we found evidence of positive selection within some immune processes, immune genes were not universally characterized by signatures of strong selection. On the contrary, we even found that some immune functions show greater than expected constraint. Overall these results highlight 2 major factors that appear to play an outsize role in determining a gene's evolutionary rate: the type of pathogen the gene targets and the gene's position within the immune network. These results join a growing body of literature that highlights the complexity of immune adaptation. Rather than there being uniformly strong selection across all immune genes, a combination of pathogen-specificity and host genetic constraints appears to play a key role in determining each immune gene's individual evolutionary trajectory.
Affiliation(s)
- Angela M Early
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY
| |
|
20
|
Szövényi P, Ullrich KK, Rensing SA, Lang D, van Gessel N, Stenøien HK, Conti E, Reski R. Selfing in Haploid Plants and Efficacy of Selection: Codon Usage Bias in the Model Moss Physcomitrella patens. Genome Biol Evol 2017; 9:1528-1546. [PMID: 28549175 PMCID: PMC5507605 DOI: 10.1093/gbe/evx098] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Accepted: 05/25/2017] [Indexed: 12/15/2022]
Abstract
A long-term reduction in effective population size will lead to a major shift in genome evolution. In particular, when effective population size is small, genetic drift becomes dominant over natural selection. The onset of self-fertilization is one evolutionary event considerably reducing effective size of populations. Theory predicts that this reduction should be more dramatic in organisms capable of haploid than of diploid selfing. Although theoretically well-grounded, this assertion has received mixed experimental support. Here, we test this hypothesis by analyzing synonymous codon usage bias of genes in the model moss Physcomitrella patens, which frequently undergoes haploid selfing. In line with population genetic theory, we found that the effect of natural selection on synonymous codon usage bias is very weak. Our conclusion is supported by four independent lines of evidence: 1) very weak or nonsignificant correlation between gene expression and codon usage bias, 2) no increased codon usage bias in more broadly expressed genes, 3) no evidence that codon usage bias would constrain synonymous and nonsynonymous divergence, and 4) predominant role of genetic drift on synonymous codon usage predicted by a model-based analysis. These findings show striking similarity to those observed in AT-rich genomes with weak selection for optimal codon usage and GC content overall. Our finding is in contrast to a previous study reporting adaptive codon usage bias in the moss P. patens.
Affiliation(s)
- Péter Szövényi
- Department of Systematic and Evolutionary Botany, University of Zurich, Switzerland
| | - Kristian K. Ullrich
- Plant Cell Biology, Faculty of Biology, University of Marburg, Germany
- Present address: Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
| | - Stefan A. Rensing
- Plant Cell Biology, Faculty of Biology, University of Marburg, Germany
- BIOSS—Centre for Biological Signalling Studies, University of Freiburg, Germany
| | - Daniel Lang
- Plant Genome and Systems Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Nico van Gessel
- Plant Biotechnology, Faculty of Biology, University of Freiburg, Germany
| | | | - Elena Conti
- Department of Systematic and Evolutionary Botany, University of Zurich, Switzerland
| | - Ralf Reski
- BIOSS—Centre for Biological Signalling Studies, University of Freiburg, Germany
- Plant Biotechnology, Faculty of Biology, University of Freiburg, Germany
| |
|
21
|
Huber CD, Kim BY, Marsden CD, Lohmueller KE. Determining the factors driving selective effects of new nonsynonymous mutations. Proc Natl Acad Sci U S A 2017; 114:4465-4470. [PMID: 28400513 PMCID: PMC5410820 DOI: 10.1073/pnas.1619508114] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Indexed: 12/21/2022]
Abstract
The distribution of fitness effects (DFE) of new mutations plays a fundamental role in evolutionary genetics. However, the extent to which the DFE differs across species has yet to be systematically investigated. Furthermore, the biological mechanisms determining the DFE in natural populations remain unclear. Here, we show that theoretical models emphasizing different biological factors in determining the DFE, such as protein stability, back-mutations, species complexity, and mutational robustness, make distinct predictions about how the DFE will differ between species. Analyzing amino acid-changing variants from natural populations in a comparative population genomic framework, we find that humans have a higher proportion of strongly deleterious mutations than Drosophila melanogaster. Furthermore, when comparing the DFE across yeast, Drosophila, mice, and humans, the average selection coefficient becomes more deleterious with increasing species complexity. Last, pleiotropic genes have a DFE that is less variable than that of nonpleiotropic genes. Comparing four categories of theoretical models, only Fisher's geometrical model (FGM) is consistent with our findings. FGM assumes that multiple phenotypes are under stabilizing selection, with the number of phenotypes defining the complexity of the organism. Our results suggest that long-term population size and cost of complexity drive the evolution of the DFE, with many implications for evolutionary and medical genomics.
Affiliation(s)
- Christian D Huber
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
| | - Bernard Y Kim
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
| | - Clare D Marsden
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
| | - Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA 90095
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA 90095
| |
|
22
|
Cheng M, Liu X, Yang M, Han L, Xu A, Huang Q. Computational analyses of type 2 diabetes-associated loci identified by genome-wide association studies. J Diabetes 2017; 9:362-377. [PMID: 27121852 DOI: 10.1111/1753-0407.12421] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 11/02/2015] [Revised: 03/31/2016] [Accepted: 04/23/2016] [Indexed: 12/18/2022]
Abstract
BACKGROUND Genome-wide association studies (GWAS) of type 2 diabetes (T2D) have discovered a number of loci that contribute to susceptibility to the disease. Future challenges include elucidation of functional mechanisms through which these GWAS-identified loci modulate T2D disease risk. The aim of the present study was to comprehensively characterize T2D associated single nucleotide polymorphisms (SNPs) and genes through computational approaches. METHODS Computational biology approaches used in the present study included comparative genomic analyses and functional annotation using GWAS3D and RegulomeDB, investigation of the effects of T2D-associated SNPs on miRNA binding and protein phosphorylation, and gene ontology, pathway enrichment, protein-protein interaction (PPI) networks and functional module analysis of T2D-associated genes from previously published GWAS. RESULTS Computational analysis identified a number of T2D GWAS-associated SNPs that were located at protein binding sites, including CCCTC-binding factor (CTCF), E1A binding protein p300 (EP300), hepatocyte nuclear factor 4alpha (HNF4A), transcription factor 7 like 2 (TCF7L2), forkhead box A1 (FOXA1) and A2 (FOXA2), and potentially affected the binding of miRNAs and protein phosphorylation. Pathway enrichment analysis confirmed two well-known maturity onset diabetes of the young and T2D pathways, whereas PPI network analysis identified highly interconnected "hub" genes, such as TCF7L2, melatonin receptor 1B (MTNR1B), and solute carrier family 30 (zinc transporter), member 8 (SLC30A8), that created two tight subnetworks. CONCLUSIONS The results provide objectives and clues for future experimental studies and further insights into the molecular pathogenesis of T2D.
Affiliation(s)
- Mengrong Cheng
- College of Life Sciences, Central China Normal University, Wuhan, China
| | - Xinhong Liu
- College of Life Sciences, Central China Normal University, Wuhan, China
| | - Mei Yang
- College of Life Sciences, Central China Normal University, Wuhan, China
| | - Lanchun Han
- College of Life Sciences, Central China Normal University, Wuhan, China
- Institute of Public Health and Molecular Medicine Analysis, Central China Normal University, Wuhan, China
| | - Aimin Xu
- Li Cha Chung Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Qingyang Huang
- College of Life Sciences, Central China Normal University, Wuhan, China
- Institute of Public Health and Molecular Medicine Analysis, Central China Normal University, Wuhan, China
| |
|
23
|
|
24
|
Joly-Lopez Z, Flowers JM, Purugganan MD. Developing maps of fitness consequences for plant genomes. Curr Opin Plant Biol 2016; 30:101-7. [PMID: 26950251 DOI: 10.1016/j.pbi.2016.02.008] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/20/2015] [Revised: 02/02/2016] [Accepted: 02/17/2016] [Indexed: 05/22/2023]
Abstract
Predicting the fitness consequences of mutations, and their concomitant impacts on molecular and cellular function as well as organismal phenotypes, is an important challenge in biology that has new relevance in an era when genomic data is readily available. The ability to construct genomewide maps of fitness consequences in plant genomes is a recent development that has profound implications for our ability to predict the fitness effects of mutations and discover functional elements. Here we highlight approaches to building fitness consequence maps to infer regions under selection. We emphasize computational methods applied primarily to the study of human disease that translate physical maps of within-species genome variation into maps of fitness effects of individual natural mutations. Maps of fitness consequences in plants, combined with traditional genetic approaches, could accelerate discovery of functional elements such as regulatory sequences in non-coding DNA and genetic polymorphisms associated with key traits, including agronomically-important traits such as yield and environmental stress responses.
Affiliation(s)
- Zoé Joly-Lopez
- Center for Genomics and Systems Biology, Department of Biology, 12 Waverly Place, New York University, New York, NY 10003, United States
| | - Jonathan M Flowers
- Center for Genomics and Systems Biology, Department of Biology, 12 Waverly Place, New York University, New York, NY 10003, United States; Center for Genomics and Systems Biology, NYU Abu Dhabi Research Institute, NYU Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates
| | - Michael D Purugganan
- Center for Genomics and Systems Biology, Department of Biology, 12 Waverly Place, New York University, New York, NY 10003, United States; Center for Genomics and Systems Biology, NYU Abu Dhabi Research Institute, NYU Abu Dhabi, Saadiyat Island, Abu Dhabi, United Arab Emirates.
| |
|
25
|
Qin L, Liu Y, Wang Y, Wu G, Chen J, Ye W, Yang J, Huang Q. Computational Characterization of Osteoporosis Associated SNPs and Genes Identified by Genome-Wide Association Studies. PLoS One 2016; 11:e0150070. [PMID: 26930606 PMCID: PMC4773152 DOI: 10.1371/journal.pone.0150070] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 08/10/2015] [Accepted: 02/09/2016] [Indexed: 11/19/2022]
Abstract
OBJECTIVES Genome-wide association studies (GWASs) have revealed many SNPs and genes associated with osteoporosis. However, the influence of these SNPs and genes on the predisposition to osteoporosis is not fully understood. We aimed to identify osteoporosis GWAS-associated SNPs potentially influencing the binding affinity of transcription factors and miRNAs, and to reveal enriched signaling pathways and "hub" genes among osteoporosis GWAS-associated genes. METHODS We conducted multiple computational analyses to explore the function and mechanisms of osteoporosis GWAS-associated SNPs and genes, including SNP conservation analysis and functional annotation (influence of SNPs on transcription factors and miRNA binding), gene ontology analysis, pathway analysis and protein-protein interaction analysis. RESULTS Our results suggested that a number of SNPs potentially influence the binding affinity of transcription factors (NFATC2, MEF2C, SOX9, RUNX2, ESR2, FOXA1 and STAT3) and miRNAs. Osteoporosis GWAS-associated genes showed enrichment of the Wnt signaling pathway, basal cell carcinoma and the Hedgehog signaling pathway. Highly interconnected "hub" genes revealed by interaction network analysis are RUNX2, SP7, TNFRSF11B, LRP5, DKK1, ESR1 and SOST. CONCLUSIONS Our results provided the targets for further experimental assessment and further insight on osteoporosis pathophysiology.
Affiliation(s)
- Longjuan Qin
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Yuyong Liu
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Ya Wang
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Guiju Wu
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Jie Chen
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Weiyuan Ye
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| | - Jiancai Yang
- College of Computer Sciences, Central China Normal University, Wuhan, 430079, China
| | - Qingyang Huang
- College of Life Sciences, Central China Normal University, Wuhan, 430079, China
| |
|
26
|
Natural Selection and Recombination Rate Variation Shape Nucleotide Polymorphism Across the Genomes of Three Related Populus Species. Genetics 2015; 202:1185-200. [PMID: 26721855 PMCID: PMC4788117 DOI: 10.1534/genetics.115.183152] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 09/25/2015] [Accepted: 12/24/2015] [Indexed: 12/30/2022]
Abstract
A central aim of evolutionary genomics is to identify the relative roles that various evolutionary forces have played in generating and shaping genetic variation within and among species. Here we use whole-genome resequencing data to characterize and compare genome-wide patterns of nucleotide polymorphism, site frequency spectrum, and population-scaled recombination rates in three species of Populus: Populus tremula, P. tremuloides, and P. trichocarpa. We find that P. tremuloides has the highest level of genome-wide variation, skewed allele frequencies, and population-scaled recombination rates, whereas P. trichocarpa harbors the lowest. Our findings highlight multiple lines of evidence suggesting that natural selection, due to both purifying and positive selection, has widely shaped patterns of nucleotide polymorphism at linked neutral sites in all three species. Differences in effective population sizes and rates of recombination largely explain the disparate magnitudes and signatures of linked selection that we observe among species. The present work provides the first phylogenetic comparative study on a genome-wide scale in forest trees. This information will also improve our ability to understand how various evolutionary forces have interacted to influence genome evolution among related species.
|
27
|
Genomics and the making of yeast biodiversity. Curr Opin Genet Dev 2015; 35:100-9. [PMID: 26649756 DOI: 10.1016/j.gde.2015.10.008] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 06/30/2015] [Revised: 10/28/2015] [Accepted: 10/29/2015] [Indexed: 12/22/2022]
Abstract
Yeasts are unicellular fungi that do not form fruiting bodies. Although the yeast lifestyle has evolved multiple times, most known species belong to the subphylum Saccharomycotina (syn. Hemiascomycota, hereafter yeasts). This diverse group includes the premier eukaryotic model system, Saccharomyces cerevisiae; the common human commensal and opportunistic pathogen, Candida albicans; and over 1000 other known species (with more continuing to be discovered). Yeasts are found in every biome and continent and are more genetically diverse than angiosperms or chordates. Ease of culture, simple life cycles, and small genomes (∼10-20Mbp) have made yeasts exceptional models for molecular genetics, biotechnology, and evolutionary genomics. Here we discuss recent developments in understanding the genomic underpinnings of the making of yeast biodiversity, comparing and contrasting natural and human-associated evolutionary processes. Only a tiny fraction of yeast biodiversity and metabolic capabilities has been tapped by industry and science. Expanding the taxonomic breadth of deep genomic investigations will further illuminate how genome function evolves to encode their diverse metabolisms and ecologies.
|
28
|
Bataillon T, Duan J, Hvilsom C, Jin X, Li Y, Skov L, Glemin S, Munch K, Jiang T, Qian Y, Hobolth A, Wang J, Mailund T, Siegismund HR, Schierup MH. Inference of purifying and positive selection in three subspecies of chimpanzees (Pan troglodytes) from exome sequencing. Genome Biol Evol 2015; 7:1122-32. [PMID: 25829516 PMCID: PMC4419804 DOI: 10.1093/gbe/evv058] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/22/2022]
Abstract
We study genome-wide nucleotide diversity in three subspecies of extant chimpanzees using exome capture. After strict filtering, Single Nucleotide Polymorphisms and indels were called and genotyped for greater than 50% of exons at a mean coverage of 35× per individual. Central chimpanzees (Pan troglodytes troglodytes) are the most polymorphic (nucleotide diversity, θw = 0.0023 per site) followed by Eastern (P. t. schweinfurthii) chimpanzees (θw = 0.0016) and Western (P. t. verus) chimpanzees (θw = 0.0008). A demographic scenario of divergence without gene flow fits the patterns of autosomal synonymous nucleotide diversity well except for a signal of recent gene flow from Western into Eastern chimpanzees. The striking contrast in X-linked versus autosomal polymorphism and divergence previously reported in Central chimpanzees is also found in Eastern and Western chimpanzees. We show that the direction of selection statistic exhibits a strong nonmonotonic relationship with the strength of purifying selection S, making it inappropriate for estimating S. We instead use counts in synonymous versus nonsynonymous frequency classes to infer the distribution of S coefficients acting on nonsynonymous mutations in each subspecies. The strength of purifying selection we infer is congruent with the differences in effective sizes of each subspecies: Central chimpanzees are undergoing the strongest purifying selection followed by Eastern and Western chimpanzees. Coding indels show stronger selection against indels changing the reading frame than observed in human populations.
Affiliation(s)
| | - Jinjie Duan
- Bioinformatics Research Centre, Aarhus University, Denmark
| | - Christina Hvilsom
- Science and Conservation, Copenhagen Zoo, Denmark; Bioinformatics, University of Copenhagen, Denmark
| | | | | | - Laurits Skov
- Bioinformatics Research Centre, Aarhus University, Denmark
| | - Sylvain Glemin
- Institut des Sciences de l'Evolution, Universite Montpellier 2, France
| | - Kasper Munch
- Bioinformatics Research Centre, Aarhus University, Denmark
| | | | - Yu Qian
- Bioinformatics Research Centre, Aarhus University, Denmark
| | - Asger Hobolth
- Bioinformatics Research Centre, Aarhus University, Denmark
| | - Jun Wang
- BGI Shenzhen, China; Section of Metabolic Genetics, The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark; The Department of Genetic Medicine, Faculty of Medicine and Princess Al Jawhara Albrahim Center of Excellence in the Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia; Department of Biology, University of Copenhagen, Denmark; Macau University of Science and Technology, China
| | - Thomas Mailund
- Bioinformatics Research Centre, Aarhus University, Denmark
|
29
|
Siepel A, Arbiza L. Cis-regulatory elements and human evolution. Curr Opin Genet Dev 2014; 29:81-9. [PMID: 25218861 PMCID: PMC4258466 DOI: 10.1016/j.gde.2014.08.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 05/28/2014] [Revised: 08/17/2014] [Accepted: 08/23/2014] [Indexed: 11/20/2022]
Abstract
Modification of gene regulation has long been considered an important force in human evolution, particularly through changes to cis-regulatory elements (CREs) that function in transcriptional regulation. For decades, however, the study of cis-regulatory evolution was severely limited by the available data. New data sets describing the locations of CREs and genetic variation within and between species have now made it possible to study CRE evolution much more directly on a genome-wide scale. Here, we review recent research on the evolution of CREs in humans based on large-scale genomic data sets. We consider inferences based on primate divergence, human polymorphism, and combinations of divergence and polymorphism. We then consider 'new frontiers' in this field stemming from recent research on transcriptional regulation.
Affiliation(s)
- Adam Siepel
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
| | - Leonardo Arbiza
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA
|
30
|
Bank C, Ewing GB, Ferrer-Admettla A, Foll M, Jensen JD. Thinking too positive? Revisiting current methods of population genetic selection inference. Trends Genet 2014; 30:540-6. [PMID: 25438719 DOI: 10.1016/j.tig.2014.09.010] [Citation(s) in RCA: 78] [Impact Index Per Article: 7.1] [Received: 08/08/2014] [Revised: 09/19/2014] [Accepted: 09/23/2014] [Indexed: 02/03/2023]
Abstract
In the age of next-generation sequencing, the availability of increasing amounts and improved quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. However, alternative forces, such as demography and background selection (BGS), obscure the footprints of positive selection that we would like to identify. In this review, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (i) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (ii) that genomic information from multiple time points will enhance the power of inference, and (iii) that results from experimental evolution should be utilized to better inform population genomic studies.
Affiliation(s)
- Claudia Bank
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
| | - Gregory B Ewing
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
| | - Anna Ferrer-Admettla
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland; Department of Biology and Biochemistry, University of Fribourg, 1700 Fribourg, Switzerland
| | - Matthieu Foll
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
| | - Jeffrey D Jensen
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), 1015 Lausanne, Switzerland
|
31
|
Approximation to the distribution of fitness effects across functional categories in human segregating polymorphisms. PLoS Genet 2014; 10:e1004697. [PMID: 25375159 PMCID: PMC4222666 DOI: 10.1371/journal.pgen.1004697] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Received: 01/06/2014] [Accepted: 08/22/2014] [Indexed: 02/03/2023] Open
Abstract
Quantifying the proportion of polymorphic mutations that are deleterious or neutral is of fundamental importance to our understanding of evolution, disease genetics and the maintenance of variation genome-wide. Here, we develop an approximation to the distribution of fitness effects (DFE) of segregating single-nucleotide mutations in humans. Unlike previous methods, we do not assume that synonymous mutations are neutral or not strongly selected, and we do not rely on fitting the DFE of all new nonsynonymous mutations to a single probability distribution, which is poorly motivated on a biological level. We rely on a previously developed method that utilizes a variety of published annotations (including conservation scores, protein deleteriousness estimates and regulatory data) to score all mutations in the human genome based on how likely they are to be affected by negative selection, controlling for mutation rate. We map this and other conservation scores to a scale of fitness coefficients via maximum likelihood using diffusion theory and a Poisson random field model on SNP data. Our method serves to approximate the deleterious DFE of mutations that are segregating, regardless of their genomic consequence. We can then compare the proportion of mutations that are negatively selected or neutral across various categories, including different types of regulatory sites. We observe that the distribution of intergenic polymorphisms is highly peaked at neutrality, while the distribution of nonsynonymous polymorphisms has a second peak at [Formula: see text]. Other types of polymorphisms have shapes that fall roughly in between these two. We find that transcriptional start sites, strong CTCF-enriched elements and enhancers are the regulatory categories with the largest proportion of deleterious polymorphisms.
|