1
Golomb R, Dahan O, Dahary D, Pilpel Y. Cell-autonomous adaptation: an overlooked avenue of adaptation in human evolution. Trends Genet 2025; 41:12-22. [PMID: 39732540 DOI: 10.1016/j.tig.2024.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2024] [Revised: 10/21/2024] [Accepted: 10/24/2024] [Indexed: 12/30/2024]
Abstract
Adaptation to environmental conditions occurs over diverse evolutionary timescales. In multi-cellular organisms, adaptive traits are often studied in tissues/organs relevant to the environmental challenge. We argue for the importance of an underappreciated layer of evolutionary adaptation manifesting at the cellular level. Cell-autonomous adaptations (CAAs) are inherited traits that boost organismal fitness by enhancing individual cell function. For instance, the cell-autonomous enhancement of mitochondrial oxygen utilization in hypoxic environments differs from an optimized erythropoiesis response, which involves multiple tissues. We explore the breadth of CAAs across challenges and highlight their counterparts in unicellular organisms. Applying these insights, we mine selection signals in Andean highlanders, revealing novel candidate CAAs. The conservation of CAAs across species may reveal valuable insights into multi-cellular evolution.
Affiliation(s)
- Ruthie Golomb
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
- Orna Dahan
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
- Dvir Dahary
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
- Yitzhak Pilpel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, 76100, Israel
2
Nielsen R, Vaughn AH, Deng Y. Inference and applications of ancestral recombination graphs. Nat Rev Genet 2025; 26:47-58. [PMID: 39349760 DOI: 10.1038/s41576-024-00772-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/16/2024] [Indexed: 12/15/2024]
Abstract
Ancestral recombination graphs (ARGs) summarize the complex genealogical relationships between individuals represented in a sample of DNA sequences. Their use is currently revolutionizing the field of population genetics and is leading to the development of powerful new methods to elucidate individual and population genetic processes, including population size history, migration, admixture, recombination, mutation and selection. In this Review, we introduce the readers to the structure of ARGs and discuss how they relate to processes such as recombination and genetic drift. We explore differences and similarities between methods of estimating ARGs and provide concrete illustrative examples of how ARGs can be used to elucidate population-level processes.
Affiliation(s)
- Rasmus Nielsen
  - Department of Integrative Biology and Department of Statistics, UC Berkeley, Berkeley, CA, USA
  - GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
  - Center for Computational Biology, UC Berkeley, Berkeley, CA, USA
- Andrew H Vaughn
  - Center for Computational Biology, UC Berkeley, Berkeley, CA, USA
- Yun Deng
  - Center for Computational Biology, UC Berkeley, Berkeley, CA, USA
3
Fine AG, Steinrücken M. A novel expectation-maximization approach to infer general diploid selection from time-series genetic data. bioRxiv 2024:2024.05.10.593575. [PMID: 38798346 PMCID: PMC11118272 DOI: 10.1101/2024.05.10.593575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Detecting and quantifying the strength of selection is a main objective in population genetics. Since selection acts over multiple generations, many approaches have been developed to detect and quantify selection using genetic data sampled at multiple points in time. Such time series genetic data is commonly analyzed using Hidden Markov Models, but in most cases, under the assumption of additive selection. However, many examples of genetic variation exhibiting non-additive mechanisms exist, making it critical to develop methods that can characterize selection in more general scenarios. Thus, we extend a previously introduced expectation-maximization algorithm for the inference of additive selection coefficients to the case of general diploid selection, in which the heterozygote and homozygote fitness are parameterized independently. We furthermore introduce a framework to identify bespoke modes of diploid selection from given data, as well as a procedure for aggregating data across linked loci to increase power and robustness. Using extensive simulation studies, we find that our method accurately and efficiently estimates selection coefficients for different modes of diploid selection across a wide range of scenarios; however, power to classify the mode of selection is low unless selection is very strong. We apply our method to ancient DNA samples from Great Britain in the last 4,450 years, and detect evidence for selection in six genomic regions, including the well-characterized LCT locus. Our work is the first genome-wide scan characterizing signals of general diploid selection.
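As a rough illustration of what "general diploid selection" means in the abstract above, the sketch below iterates the standard deterministic allele-frequency recursion with the heterozygote and homozygote selection coefficients set independently. It is not the authors' EM/HMM method. The function names `next_freq` and `trajectory`, the fitness scaling (aa: 1, Aa: 1 + s_het, AA: 1 + s_hom) and all example values are assumptions chosen for illustration.

```python
def next_freq(p, s_het, s_hom):
    """One generation of deterministic diploid selection.

    p      -- current frequency of the focal allele A
    s_het  -- selection coefficient of the Aa heterozygote (assumed scaling)
    s_hom  -- selection coefficient of the AA homozygote (assumed scaling)
    Genotype fitnesses are taken as aa: 1, Aa: 1 + s_het, AA: 1 + s_hom.
    """
    q = 1.0 - p
    # Mean fitness under random mating (Hardy-Weinberg genotype proportions).
    w_bar = p * p * (1 + s_hom) + 2 * p * q * (1 + s_het) + q * q
    # Allele A is carried by all AA individuals and half of the Aa individuals.
    return (p * p * (1 + s_hom) + p * q * (1 + s_het)) / w_bar


def trajectory(p0, s_het, s_hom, generations):
    """Allele-frequency trajectory starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_freq(freqs[-1], s_het, s_hom))
    return freqs


# Hypothetical example values for three modes of selection.
additive = trajectory(0.1, 0.01, 0.02, 200)   # s_het = s_hom / 2
dominant = trajectory(0.1, 0.02, 0.02, 200)   # s_het = s_hom
recessive = trajectory(0.1, 0.0, 0.02, 200)   # s_het = 0
```

Parameterizing `s_het` and `s_hom` separately is what lets one recursion cover additive, dominant, recessive and other modes. An inference method would fit these two coefficients to sampled data rather than fix them as done here.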
Affiliation(s)
- Adam G Fine
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, Illinois, USA
- Matthias Steinrücken
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, USA
4
Amin MR, Hasan M, DeGiorgio M. Digital Image Processing to Detect Adaptive Evolution. Mol Biol Evol 2024; 41:msae242. [PMID: 39565932 PMCID: PMC11631197 DOI: 10.1093/molbev/msae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 10/28/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024] Open
Abstract
In recent years, advances in image processing and machine learning have fueled a paradigm shift in detecting genomic regions under natural selection. Early machine learning techniques employed population-genetic summary statistics as features, which focus on specific genomic patterns expected by adaptive and neutral processes. Though such engineered features are important when training data are limited, the ease with which simulated data can now be generated has led to the recent development of approaches that take in image representations of haplotype alignments and automatically extract important features using convolutional neural networks. Digital image processing methods termed α-molecules are a class of techniques for multiscale representation of objects that can extract a diverse set of features from images. One such α-molecule method, termed wavelet decomposition, lends greater control over high-frequency components of images. Another α-molecule method, termed curvelet decomposition, is an extension of the wavelet concept that considers events occurring along curves within images. We show that application of these α-molecule techniques to extract features from image representations of haplotype alignments yields high true positive rate and accuracy to detect hard and soft selective sweep signatures from genomic data with both linear and nonlinear machine learning classifiers. Moreover, we find that such models are easy to visualize and interpret, with performance rivaling that of contemporary deep learning approaches for detecting sweeps.
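To make the wavelet idea in the abstract above concrete, the sketch below applies a single-level Haar transform to each row of a small 0/1 haplotype matrix, splitting it into low-frequency (approximation) and high-frequency (detail) coefficients. The choice of the Haar wavelet, the function names and the toy alignment are assumptions for illustration only. The abstract does not say which wavelet family or software the authors used.

```python
import math


def haar_step(signal):
    """Single-level orthonormal Haar transform of an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums capture the
    low-frequency content, scaled pairwise differences the high-frequency content.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_rows(matrix):
    """Apply haar_step to every row of a matrix (list of equal-length rows)."""
    return [haar_step(row) for row in matrix]


# Hypothetical haplotype alignment: rows are haplotypes, columns are SNPs,
# 0 = ancestral allele, 1 = derived allele.
alignment = [
    [0, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 1, 0],
]
features = haar_rows(alignment)
```

The detail coefficients are the high-frequency components that a wavelet representation lets one keep, shrink or discard separately before passing features to a classifier.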
Affiliation(s)
- Md Ruhul Amin
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
  - Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
5
Grinde KE, Browning BL, Reiner AP, Thornton TA, Browning SR. Adjusting for principal components can induce collider bias in genome-wide association studies. PLoS Genet 2024; 20:e1011242. [PMID: 39680601 PMCID: PMC11684764 DOI: 10.1371/journal.pgen.1011242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 12/30/2024] [Accepted: 11/14/2024] [Indexed: 12/18/2024] Open
Abstract
Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA. However, these suggestions are not universally implemented and the implications for GWAS are not fully understood, especially in the context of admixed populations. In this paper, we investigate the impact of pre-processing and the number of PCs included in GWAS models in African American samples from the Women's Health Initiative SNP Health Association Resource and two Trans-Omics for Precision Medicine Whole Genome Sequencing Project contributing studies (Jackson Heart Study and Genetic Epidemiology of Chronic Obstructive Pulmonary Disease Study). In all three samples, we find the first PC is highly correlated with genome-wide ancestry whereas later PCs often capture local genomic features. The pattern of which, and how many, genetic variants are highly correlated with individual PCs differs from what has been observed in prior studies focused on European populations and leads to distinct downstream consequences: adjusting for such PCs yields biased effect size estimates and elevated rates of spurious associations due to the phenomenon of collider bias. Excluding high LD regions identified in previous studies does not resolve these issues. LD pruning proves more effective, but the optimal choice of thresholds varies across datasets. 
Altogether, our work highlights unique issues that arise when using PCA to control for ancestral heterogeneity in admixed populations and demonstrates the importance of careful pre-processing and diagnostics to ensure that PCs capturing multiple local genomic features are not included in GWAS models.
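The collider bias mentioned in the abstract above can be illustrated with a toy simulation. Here a simulated "genotype" and "trait" are independent by construction, yet adjusting for a covariate that both of them influence (a collider) produces a spurious genotype effect, while adjusting for an unrelated covariate does not. This is a generic sketch of the statistical phenomenon, not the authors' analysis. The Gaussian stand-in for genotype, the effect sizes, the sample size and all function names are assumptions.

```python
import random


def ols_two_predictors(x1, x2, y):
    """Slopes (b1, b2) from least-squares regression of y on x1 and x2 with an intercept."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


def simulate(n=5000, seed=1):
    """Return the estimated genotype effect when adjusting for a collider
    and when adjusting for an unrelated covariate."""
    rng = random.Random(seed)
    genotype = [rng.gauss(0, 1) for _ in range(n)]   # continuous stand-in (assumption)
    trait = [rng.gauss(0, 1) for _ in range(n)]      # truly independent of genotype
    # Covariate influenced by both genotype and trait: a collider.
    collider = [g + t + rng.gauss(0, 1) for g, t in zip(genotype, trait)]
    # Covariate unrelated to either variable.
    unrelated = [rng.gauss(0, 1) for _ in range(n)]
    effect_with_collider, _ = ols_two_predictors(genotype, collider, trait)
    effect_with_unrelated, _ = ols_two_predictors(genotype, unrelated, trait)
    return effect_with_collider, effect_with_unrelated


biased, unbiased = simulate()
```

Under these assumed effect sizes the collider-adjusted estimate is pulled toward -0.5 even though the true effect is zero, which is the kind of spurious association the abstract warns about when a PC captures local genomic features.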
Affiliation(s)
- Kelsey E. Grinde
  - Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, Minnesota, United States of America
- Brian L. Browning
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Alexander P. Reiner
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
- Timothy A. Thornton
  - Regeneron Genetics Center, Tarrytown, New York, United States of America
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Sharon R. Browning
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
6
Fogarty L, Otto SP. Signatures of selection with cultural interference. Proc Natl Acad Sci U S A 2024; 121:e2322885121. [PMID: 39556724 PMCID: PMC11621839 DOI: 10.1073/pnas.2322885121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 08/01/2024] [Indexed: 11/20/2024] Open
Abstract
Human evolution is intricately linked with culture, which permeates almost all facets of human life from health and reproduction, to the environments in which we live. Nevertheless, our understanding of the ways in which stably transmitted, evolutionarily relevant human cultural traits might interact with the human genome is incomplete, and methods to detect such interactions are limited. Here, we describe some rules of cultural transmission which could pertain to both humans and cultural nonhuman animals that could lead to the formation and maintenance of stable associations between cultural and genetic traits. Next, we show that, in the presence of such associations, a process analogous to genetic hitchhiking is possible in gene-culture systems. These could leave signatures in the human genome similar to, and perhaps indistinguishable from, those left by selection on genetic traits. Finally, we model selective interference between cultural and genetic traits. We show that selective interference between a cultural trait under selection and a genetic trait under selection can reduce the efficacy of natural selection in the human genome, both in terms of the probability of fixation of beneficial alleles and the dynamics of selective sweeps. We then show that the efficiency of selection at genetic loci can, however, be increased in the presence of strong cultural transmission biases. This implies that the signatures of gene-culture interactions in genetic data may be complex and wide-ranging in gene-culture coevolutionary systems.
Affiliation(s)
- Laurel Fogarty
  - Department of Human Behavior, Ecology and Culture, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
- Sarah P. Otto
  - Biodiversity Centre and Department of Zoology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
7
Temple SD, Waples RK, Browning SR. Modeling recent positive selection using identity-by-descent segments. Am J Hum Genet 2024; 111:2510-2529. [PMID: 39362217 PMCID: PMC11568764 DOI: 10.1016/j.ajhg.2024.08.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 08/29/2024] [Accepted: 08/30/2024] [Indexed: 10/05/2024] Open
Abstract
Recent positive selection can result in an excess of long identity-by-descent (IBD) haplotype segments overlapping a locus. The statistical methods that we propose here address three major objectives in studying selective sweeps: scanning for regions of interest, identifying possible sweeping alleles, and estimating a selection coefficient s. First, we implement a selection scan to locate regions with excess IBD rates. Second, we estimate the allele frequency and location of an unknown sweeping allele by aggregating over variants that are more abundant in an inferred outgroup with excess IBD rate versus the rest of the sample. Third, we propose an estimator for the selection coefficient and quantify uncertainty using the parametric bootstrap. Comparing against state-of-the-art methods in extensive simulations, we show that our methods are more precise at estimating s when s≥0.015. We also show that our 95% confidence intervals contain s in nearly 95% of our simulations. We apply these methods to study positive selection in European ancestry samples from the Trans-Omics for Precision Medicine project. We analyze eight loci where IBD rates are more than four standard deviations above the genome-wide median, including LCT where the maximum IBD rate is 35 standard deviations above the genome-wide median. Overall, we present robust and accurate approaches to study recent adaptive evolution without knowing the identity of the causal allele or using time series data.
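The scanning criterion quoted in the abstract above (IBD rates more than four standard deviations above the genome-wide median) can be sketched as a simple threshold rule. This is only an illustration of the thresholding idea: the function name, the use of a plain population standard deviation and the toy rates are assumptions, and the published method may estimate the spread differently.

```python
import statistics


def excess_ibd_loci(ibd_rates, n_sd=4.0):
    """Indices of loci whose IBD rate exceeds median + n_sd * standard deviation.

    ibd_rates -- one IBD rate per locus along the genome (hypothetical input)
    n_sd      -- number of standard deviations above the median required to flag a locus
    """
    center = statistics.median(ibd_rates)
    spread = statistics.pstdev(ibd_rates)  # plain SD; an assumption of this sketch
    threshold = center + n_sd * spread
    return [i for i, rate in enumerate(ibd_rates) if rate > threshold]


# Hypothetical genome-wide IBD rates with one strongly elevated locus at the end.
rates = [9.0, 10.0, 11.0] * 20 + [60.0]
flagged = excess_ibd_loci(rates)
```

Centering on the median rather than the mean keeps the baseline from being dragged upward by the very loci the scan is trying to detect.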
Affiliation(s)
- Seth D Temple
  - Department of Statistics, University of Washington, Seattle, WA, USA
- Ryan K Waples
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Sharon R Browning
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
8
Lisi A, Campbell MC. AncestryGrapher toolkit: Python command-line pipelines to visualize global- and local- ancestry inferences from the RFMIX version 2 software. Bioinformatics 2024; 40:btae616. [PMID: 39412440 PMCID: PMC11534077 DOI: 10.1093/bioinformatics/btae616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 08/21/2024] [Accepted: 10/14/2024] [Indexed: 11/06/2024] Open
Abstract
SUMMARY: Admixture is a fundamental process that has shaped levels and patterns of genetic variation in human populations. RFMIX version 2 (RFMIX2) utilizes a robust modeling approach to identify the genetic ancestries in admixed populations. However, this software does not have a built-in method to visually summarize the results of analyses. Here, we introduce the AncestryGrapher toolkit, which converts the numerical output of RFMIX2 into graphical representations of global and local ancestry (i.e. the per-individual ancestry components and the genetic ancestry along chromosomes, respectively). RESULTS: To demonstrate the utility of our methods, we applied the AncestryGrapher toolkit to visualize the global and local ancestry of individuals in the North African Mozabite Berber population from the Human Genome Diversity Panel. Our results showed that the Mozabite Berbers derived their ancestry from the Middle East, Europe, and sub-Saharan Africa (global ancestry). We also found that the population origin of ancestry varied considerably along chromosomes (local ancestry). For example, we observed variance in local ancestry in the genomic region on Chromosome 2 containing the regulatory sequence in the MCM6 gene associated with lactase persistence, a human trait tied to the cultural development of adult milk consumption. Overall, the AncestryGrapher toolkit facilitates the exploration, interpretation, and reporting of ancestry patterns in human populations. AVAILABILITY AND IMPLEMENTATION: The AncestryGrapher toolkit is free and open source at https://github.com/alisi1989/RFmix2-Pipeline-to-plot.
Affiliation(s)
- Alessandro Lisi
  - Department of Biological Sciences (Human and Evolutionary Biology Section), University of Southern California, Los Angeles, CA 90089, United States
- Michael C Campbell
  - Department of Biological Sciences (Human and Evolutionary Biology Section), University of Southern California, Los Angeles, CA 90089, United States
9
Malyarchuk BA. Genetic aspects of lactase deficiency in indigenous populations of Siberia. Vavilovskii Zhurnal Genet Selektsii 2024; 28:650-658. [PMID: 39440313 PMCID: PMC11491482 DOI: 10.18699/vjgb-24-72] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 10/25/2024] Open
Abstract
The ability to metabolize lactose in adulthood is associated with the persistence of lactase enzyme activity. In European populations, lactase persistence is determined mainly by the presence of the rs4988235-T variant in the MCM6 gene, which increases the expression of the LCT gene, encoding lactase. The highest rates of lactase persistence are characteristic of Europeans, and the lowest rates are found in East Asian populations. Analysis of published data on the distribution of the hypolactasia-associated variant rs4988235-C in the populations of Central Asia and Siberia showed that the frequency of this variant increases in the northeastern direction. The frequency of this allele is 87% in Central Asia, 90.6% in Southern Siberia, and 92.9% in Northeastern Siberia. Consequently, the ability of the population to metabolize lactose decreases in the same geographical direction. The analysis of paleogenomic data has shown that the higher frequency of the rs4988235-T allele in populations of Central Asia and Southern Siberia is associated with the eastward spread of ancient populations of the Eastern European steppes, starting from the Bronze Age. The results of polymorphism analysis of exons and adjacent introns of the MCM6 and LCT genes in indigenous populations of Siberia indicate that polymorphic variants potentially related to lactose metabolism may exist in East Asian populations. In East Asian populations, including Siberian ethnic groups, a region of the MCM6 gene ~26.5 thousand nucleotide pairs long, including a combination of the rs4988285-A, rs2070069-G, rs3087353-T, and rs2070068-A alleles, was found. The rs4988285 and rs2070069 loci are located in the enhancer region that regulates the activity of the LCT gene. Analysis of paleogenomic sequences showed that the genomes of Denisovans and Neanderthals are characterized by the above combination of alleles of the MCM6 gene. Thus, the haplotype discovered appears to be archaic. It could have been inherited from a common ancestor of modern humans, Neanderthals, and Denisovans, or it could have been acquired by hybridization with Denisovans or Neanderthals. The data obtained indicate a possible functional significance of archaic variants of the MCM6 gene.
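The rs4988235-C frequencies quoted in the abstract above can be turned into a back-of-the-envelope estimate of expected lactase persistence. The sketch below assumes a biallelic site (so the T frequency is 1 minus the C frequency), Hardy-Weinberg proportions, and that a single T allele is sufficient for persistence. None of these assumptions are stated in the abstract, and the function name is hypothetical, so the output is a rough expectation rather than a reported result.

```python
def expected_persistence(freq_c):
    """Expected fraction of lactase-persistent individuals.

    Assumes Hardy-Weinberg proportions and a dominant T allele, so only
    C/C homozygotes (frequency freq_c squared) are non-persistent.
    """
    return 1.0 - freq_c ** 2


# rs4988235-C frequencies as quoted in the abstract.
regions = {
    "Central Asia": 0.87,
    "Southern Siberia": 0.906,
    "Northeastern Siberia": 0.929,
}

for region, freq_c in regions.items():
    freq_t = 1.0 - freq_c  # assumes a biallelic site
    print(f"{region}: T allele {freq_t:.3f}, "
          f"expected persistence {expected_persistence(freq_c):.3f}")
```

Because the non-persistent fraction is the square of the C frequency, the expected persistent fraction falls from west to east, consistent with the geographical trend the abstract describes.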
Affiliation(s)
- B A Malyarchuk
  - Institute of Biological Problems of the North of the Far Eastern Branch of the Russian Academy of Sciences, Magadan, Russia
10
Caldon M, Mutti G, Mondanaro A, Imai H, Shotake T, Oteo Garcia G, Belay G, Morata J, Trotta JR, Montinaro F, Gippoliti S, Capelli C. Gelada genomes highlight events of gene flow, hybridisation and local adaptation that track past climatic changes. Mol Ecol 2024; 33:e17514. [PMID: 39206888 DOI: 10.1111/mec.17514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 06/28/2024] [Accepted: 08/13/2024] [Indexed: 09/04/2024]
Abstract
Theropithecus gelada, the last surviving species of this genus, occupies a unique and highly specialised ecological niche in the Ethiopian highlands. A subdivision into three geographically defined populations (Northern, Central and Southern) has been tentatively proposed for this species on the basis of genetic analyses, but genomic data have been investigated only for two of these groups (Northern and Central). Here we combined newly generated whole genome sequences of individuals sampled from the population living south of the East Africa Great Rift Valley with available data from the other two gelada populations to reconstruct the evolutionary history of the species. Integrating genomic and paleoclimatic data, we found that gene flow across populations and with Papio species tracked past climate changes. The isolation and climatic conditions experienced by Southern geladas during the Holocene shaped local diversity and generated diet-related genomic signatures.
Affiliation(s)
- Matteo Caldon
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Giacomo Mutti
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
  - Barcelona Supercomputing Centre (BSC-CNS), Barcelona, Spain
  - Institute for Research in Biomedicine (IRB Barcelona), the Barcelona Institute of Science and Technology, Barcelona, Spain
- Hiroo Imai
  - Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Aichi, Japan
- Gonzalo Oteo Garcia
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
  - Centre for Palaeogenetics, Stockholm, Sweden
  - Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden
- Gurja Belay
  - Department of Microbial, Cellular and Molecular Biology, Addis Ababa University, Addis Ababa, Ethiopia
- Jordi Morata
  - Centre Nacional d'Anàlisi Genòmica, Barcelona, Spain
- Francesco Montinaro
  - Department of Biology-Genetics, University of Bari, Bari, Italy
  - Institute of Genomics, University of Tartu, Tartu, Estonia
- Spartaco Gippoliti
  - IUCN/SSC Primate Specialist Group, Rome, Italy
  - Società Italiana per la Storia Della Fauna "G. Altobello", Rome, Italy
- Cristian Capelli
  - Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
  - Department of Biology, University of Oxford, Oxford, UK
11
Cohen CE, Swallow DM, Walker C. The molecular basis of lactase persistence: Linking genetics and epigenetics. Ann Hum Genet 2024. [PMID: 39171584 DOI: 10.1111/ahg.12575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 07/24/2024] [Accepted: 07/29/2024] [Indexed: 08/23/2024]
Abstract
Lactase persistence (LP) - the genetic trait that determines the continued expression of the enzyme lactase into adulthood - has undergone recent, rapid positive selection since the advent of animal domestication and dairying in some human populations. While underlying evolutionary explanations have been widely posited and studied, the molecular basis of LP remains less so. This review considers the genetic and epigenetic bases of LP. Multiple single-nucleotide polymorphisms (SNPs) in an LCT enhancer in intron 13 of the neighbouring MCM6 gene are associated with LP. These SNPs alter binding of transcription factors (TFs) and likely prevent age-related increases in methylation in the enhancer, maintaining LCT expression into adulthood to cause LP. However, the complex relationship between the genetics and epigenetics of LP is not fully characterised, and the mode of action of methylation quantitative trait loci (meQTLs) (SNPs affecting methylation) generally remains poorly understood. Here, we examine published LP data to propose a model describing how methylation in the LCT enhancer is prevented in LP adults. We argue that this occurs through altered binding of the TF Oct-1 (encoded by the gene POU2F1) and neighbouring TFs GATA-6 (GATA6), HNF-3A (FOXA1) and c-Ets1 (ETS1) acting in concert. We therefore suggest a plausible new model for LCT downregulation in the context of LP, with wider relevance for future work on the mechanisms of other meQTLs.
Affiliation(s)
- Céleste E Cohen
  - Department of Genetics, Evolution and Environment, University College London Genetics Institute (UGI), London, UK
- Dallas M Swallow
  - Department of Genetics, Evolution and Environment, University College London Genetics Institute (UGI), London, UK
- Catherine Walker
  - Department of Genetics, Evolution and Environment, University College London Genetics Institute (UGI), London, UK
  - Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
12
Vaughn AH, Nielsen R. Fast and Accurate Estimation of Selection Coefficients and Allele Histories from Ancient and Modern DNA. Mol Biol Evol 2024; 41:msae156. [PMID: 39078618 PMCID: PMC11321360 DOI: 10.1093/molbev/msae156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 07/02/2024] [Accepted: 07/10/2024] [Indexed: 07/31/2024] Open
Abstract
We here present CLUES2, a full-likelihood method to infer natural selection from sequence data that is an extension of the method CLUES. We make several substantial improvements to the CLUES method that greatly increase both its applicability and its speed. We add the ability to use ancestral recombination graphs on ancient data as emissions to the underlying hidden Markov model, which enables CLUES2 to use both temporal and linkage information to make estimates of selection coefficients. We also fully implement the ability to estimate distinct selection coefficients in different epochs, which allows for the analysis of changes in selective pressures through time, as well as selection with dominance. In addition, we greatly increase the computational efficiency of CLUES2 over CLUES using several approximations to the forward-backward algorithms and develop a new way to reconstruct historic allele frequencies by integrating over the uncertainty in the estimation of the selection coefficients. We illustrate the accuracy of CLUES2 through extensive simulations and validate the importance sampling framework for integrating over the uncertainty in the inference of gene trees. We also show that CLUES2 is well-calibrated by showing that under the null hypothesis, the distribution of log-likelihood ratios follows a χ2 distribution with the appropriate degrees of freedom. We run CLUES2 on a set of recently published ancient human data from Western Eurasia and test for evidence of changing selection coefficients through time. We find significant evidence of changing selective pressures in several genes correlated with the introduction of agriculture to Europe and the ensuing dietary and demographic shifts of that time. In particular, our analysis supports previous hypotheses of strong selection on lactase persistence during periods of ancient famines and attenuated selection in more modern periods.
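The calibration statement in the abstract above refers to a standard likelihood-ratio test. As a minimal sketch, the function below converts a pair of log-likelihoods into a p-value using the χ2 survival function, written out in closed form for one and two degrees of freedom so that only the standard library is needed. The function name and the restriction to df of 1 or 2 are choices made for this sketch, and it does not reproduce the CLUES2 interface or its code.

```python
import math


def lrt_p_value(loglik_null, loglik_alt, df=1):
    """P-value of a likelihood-ratio test against a chi-squared reference.

    The test statistic is 2 * (loglik_alt - loglik_null).
    Closed forms of the chi-squared survival function are used:
      df = 1: erfc(sqrt(x / 2))
      df = 2: exp(-x / 2)
    """
    stat = 2.0 * (loglik_alt - loglik_null)
    if stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Hypothetical log-likelihoods for a neutral model and a one-parameter selection model.
p_one_epoch = lrt_p_value(loglik_null=-1234.0, loglik_alt=-1229.5, df=1)
```

In a multi-epoch setting the degrees of freedom would equal the number of extra selection coefficients being estimated, which is why the abstract speaks of "the appropriate degrees of freedom".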
Affiliation(s)
- Andrew H Vaughn
  - Center for Computational Biology, University of California, Berkeley, CA 94720, USA
- Rasmus Nielsen
  - Departments of Integrative Biology and Statistics, University of California, Berkeley, CA 94720, USA
  - Center for GeoGenetics, University of Copenhagen, Copenhagen DK-1350, Denmark
13
Rocha JL, Lou RN, Sudmant PH. Structural variation in humans and our primate kin in the era of telomere-to-telomere genomes and pangenomics. Curr Opin Genet Dev 2024; 87:102233. [PMID: 39042999 PMCID: PMC11695101 DOI: 10.1016/j.gde.2024.102233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 07/02/2024] [Accepted: 07/05/2024] [Indexed: 07/25/2024]
Abstract
Structural variants (SVs) account for the majority of base pair differences both within and between primate species. However, our understanding of inter- and intra-species SV has been historically hampered by the quality of draft primate genomes and the absence of genome resources for key taxa. Recently, advances in long-read sequencing and genome assembly have begun to radically reshape our understanding of SVs. Two landmark achievements include the publication of a human telomere-to-telomere (T2T) genome as well as the development of the first human pangenome reference. In this review, we first look back to the major works laying the foundation for these projects. We then examine the ways in which T2T genome assemblies and pangenomes are transforming our understanding of and approach to primate SV. Finally, we discuss what the future of primate SV research may look like in the era of T2T genomes and pangenomics.
Affiliation(s)
- Joana L Rocha
- Department of Integrative Biology, University of California, Berkeley, Berkeley, USA. https://twitter.com/@joanocha
| | - Runyang N Lou
- Department of Integrative Biology, University of California, Berkeley, Berkeley, USA. https://twitter.com/@NicolasLou10
| | - Peter H Sudmant
- Department of Integrative Biology, University of California, Berkeley, Berkeley, USA; Center for Computational Biology, University of California, Berkeley, Berkeley, USA.
| |
|
14
|
Chotai M, Wei X, Messer PW. Signatures of selective sweeps in continuous-space populations. bioRxiv 2024:2024.07.26.605365. [PMID: 39091822 PMCID: PMC11291165 DOI: 10.1101/2024.07.26.605365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Selective sweeps describe the process by which an adaptive mutation arises and rapidly fixes in the population, thereby removing genetic variation in its genomic vicinity. The expected signatures of selective sweeps are relatively well understood in panmictic population models, yet natural populations often extend across larger geographic ranges where individuals are more likely to mate with those born nearby. To investigate how such spatial population structure can affect sweep dynamics and signatures, we simulated selective sweeps in populations inhabiting a two-dimensional continuous landscape. The maximum dispersal distance of offspring from their parents can be varied in our simulations from an essentially panmictic population to scenarios with increasingly limited dispersal. We find that in low-dispersal populations, adaptive mutations spread more slowly than in panmictic ones, while recombination becomes less effective at breaking up genetic linkage around the sweep locus. Together, these factors result in a trough of reduced genetic diversity around the sweep locus that looks very similar across dispersal rates. We also find that the site frequency spectrum around hard sweeps in low-dispersal populations becomes enriched for intermediate-frequency variants, making these sweeps appear softer than they are. Furthermore, haplotype heterozygosity at the sweep locus tends to be elevated in low-dispersal scenarios as compared to panmixia, contrary to what we observe in neutral scenarios without sweeps. The haplotype patterns generated by these hard sweeps in low-dispersal populations can resemble soft sweeps from standing genetic variation that arose from substantially older alleles. Our results highlight the need for better accounting for spatial population structure when making inferences about selective sweeps.
Affiliation(s)
- Meera Chotai
- Department of Computational Biology, Cornell University
| | - Xinzhu Wei
- Department of Computational Biology, Cornell University
| | | |
|
15
|
Alkaraki AK, Alfonso-Sánchez MA, Peña JA, Abuelezz AI. Lactase persistence in the Jordanian population: Potential effects of the Arabian Peninsula and Sahara's aridification. Heliyon 2024; 10:e33455. [PMID: 39027493 PMCID: PMC11255666 DOI: 10.1016/j.heliyon.2024.e33455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 06/19/2024] [Accepted: 06/21/2024] [Indexed: 07/20/2024] Open
Abstract
The single nucleotide polymorphism (SNP) -13910 C > T has proved a good predictor of the incidence of lactase persistence in Europe and South Asia. Yet, this is not the case in the Near East, although this region is a passageway between the two continents. Lactase persistence is associated with cattle breeding, which originated in the Fertile Crescent of the Near East and spread later during the Middle Neolithic throughout Europe. Here we analyzed five SNPs (-13915 T > G (rs41380347), -13910 C > T (rs4988235), -13907 C > G (rs41525747), -14009 T > G (rs869051967), and -14010 G > C (rs145946881)) in three Jordanian human groups, namely the Bedouins, Jordan valley farmers, and Jordanian urban people. The SNPs -14009 T > G and -14010 G > C were not detected in the sample, -13907 C > G was virtually non-existent, -13910 C > T showed low frequencies, and -13915 T > G exhibited salient frequencies. The estimated incidence of lactase persistence was lower in the urban population (16 %), intermediate in the Jordan Valley's farmer population (30 %), and higher among the Bedouins (62 %). In explaining our findings, we postulated climatic change brought about by the aridification episode of the Arabian Peninsula and the Sahara 4200 years ago. This climatic milestone caused the collapse of the Akkadian Empire and the Old Kingdom in Egypt. Also, it could have led to a drastic decline of cattle in the region, being replaced by the domestication of camels. Loss of traditional crops and increasing dependence on camel milk might have triggered local selective pressures, mainly associated with -13915 T > G and differentiated from the ones in Europe, associated with -13910 C > T.
Affiliation(s)
- Almuthanna K. Alkaraki
- Department of Biological Sciences, Faculty of Science, Yarmouk University, Irbid, 21163, Jordan
| | - Miguel A. Alfonso-Sánchez
- Departamento de Genética, Antropología Física y Fisiología Animal. Facultad de Ciencia y Tecnología. Universidad del País Vasco (UPV/EHU), Spain
| | - Jose A. Peña
- Departamento de Genética, Antropología Física y Fisiología Animal. Facultad de Ciencia y Tecnología. Universidad del País Vasco (UPV/EHU), Spain
| | - Alanoud I. Abuelezz
- Department of Biological Sciences, Faculty of Science, Yarmouk University, Irbid, 21163, Jordan
| |
|
16
|
North HL, Fu Z, Metz R, Stull MA, Johnson CD, Shirley X, Crumley K, Reisig D, Kerns DL, Gilligan T, Walsh T, Jiggins CD, Sword GA. Rapid Adaptation and Interspecific Introgression in the North American Crop Pest Helicoverpa zea. Mol Biol Evol 2024; 41:msae129. [PMID: 38941083 PMCID: PMC11259193 DOI: 10.1093/molbev/msae129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 06/12/2024] [Accepted: 06/14/2024] [Indexed: 06/29/2024] Open
Abstract
Insect crop pests threaten global food security. This threat is amplified through the spread of nonnative species and through adaptation of native pests to control measures. Adaptations such as pesticide resistance can result from selection on variation within a population, or through gene flow from another population. We investigate these processes in an economically important noctuid crop pest, Helicoverpa zea, which has evolved resistance to a wide range of pesticides. Its sister species Helicoverpa armigera, first detected as an invasive species in Brazil in 2013, introduced the pyrethroid-resistance gene CYP337B3 to South American H. zea via adaptive introgression. To understand whether this could contribute to pesticide resistance in North America, we sequenced 237 H. zea genomes across 10 sample sites. We report H. armigera introgression into the North American H. zea population. Two individuals sampled in Texas in 2019 carry H. armigera haplotypes in a 4 Mbp region containing CYP337B3. Next, we identify signatures of selection in the panmictic population of nonadmixed H. zea, identifying a selective sweep at a second cytochrome P450 gene: CYP333B3. We estimate that its derived allele conferred a ∼5% fitness advantage and show that this estimate explains independently observed rare nonsynonymous CYP333B3 mutations approaching fixation over a ∼20-year period. We also detect putative signatures of selection at a kinesin gene associated with Bt resistance. Overall, we document two mechanisms of rapid adaptation: the introduction of fitness-enhancing alleles through interspecific introgression, and selection on intraspecific variation.
Affiliation(s)
- Henry L North
- Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK
| | - Zhen Fu
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- Bioinformatics and Biostatistics Core, Van Andel Institute, Grand Rapids, MI 49503, USA
| | - Richard Metz
- AgriLife Genomics and Bioinformatics Service, Texas A&M University, College Station, TX 77843, USA
| | - Matt A Stull
- AgriLife Genomics and Bioinformatics Service, Texas A&M University, College Station, TX 77843, USA
| | - Charles D Johnson
- AgriLife Genomics and Bioinformatics Service, Texas A&M University, College Station, TX 77843, USA
| | - Xanthe Shirley
- Animal and Plant Health Inspection Service, United States Department of Agriculture, College Station, TX, USA
| | - Kate Crumley
- Agrilife Extension, Texas A&M University, Wharton, TX, USA
| | - Dominic Reisig
- Department of Entomology and Plant Pathology, North Carolina State University, Plymouth, NC, 27962, USA
| | - David L Kerns
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| | - Todd Gilligan
- Animal and Plant Health Inspection Service, United States Department of Agriculture, Fort Collins, CO, USA
| | - Tom Walsh
- Black Mountain Laboratories, Commonwealth Scientific and Industrial Research Organization, Canberra, Australia
| | - Chris D Jiggins
- Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK
| | - Gregory A Sword
- Department of Entomology, Texas A&M University, College Station, TX 77843, USA
| |
|
17
|
Borisenkov M, Gubin D, Sergey K. On the issue of adaptive fitness of chronotypes in high latitudes. Biol Rhythm Res 2024; 55:354-358. [DOI: 10.1080/09291016.2024.2363742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 05/30/2024] [Indexed: 10/07/2024]
Affiliation(s)
- Mikhail Borisenkov
- Institute of Physiology of Komi Science Centre of the Ural Branch of the Russian Academy of Sciences, Syktyvkar, Russia
| | - Denis Gubin
- Laboratory for Chronobiology and Chronomedicine, Research Institute of Biomedicine and Biomedical Technologies, Medical University, Tyumen, Russia
- Department of Biology, Medical University, Tyumen, Russia
- Tyumen Cardiology Research Center, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, Russia
| | - Kolomeichuk Sergey
- Institute of Biology, Karelian Research Centre of the Russian Academy of Sciences, Petrozavodsk, Russia
- Laboratory for Genomics, Proteomics, and Metabolomics, Research Institute of Biomedicine and Biomedical Technologies, Tyumen State Medical University, Tyumen, Russia
| |
|
18
|
Grinde KE, Browning BL, Reiner AP, Thornton TA, Browning SR. Adjusting for principal components can induce spurious associations in genome-wide association studies in admixed populations. bioRxiv 2024:2024.04.02.587682. [PMID: 38617337 PMCID: PMC11014513 DOI: 10.1101/2024.04.02.587682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/24/2024]
Abstract
Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA. However, these suggestions are not universally implemented and the implications for GWAS are not fully understood, especially in the context of admixed populations. In this paper, we investigate the impact of pre-processing and the number of PCs included in GWAS models in African American samples from the Women's Health Initiative SNP Health Association Resource and two Trans-Omics for Precision Medicine Whole Genome Sequencing Project contributing studies (Jackson Heart Study and Genetic Epidemiology of Chronic Obstructive Pulmonary Disease Study). In all three samples, we find the first PC is highly correlated with genome-wide ancestry whereas later PCs often capture local genomic features. The pattern of which, and how many, genetic variants are highly correlated with individual PCs differs from what has been observed in prior studies focused on European populations and leads to distinct downstream consequences: adjusting for such PCs yields biased effect size estimates and elevated rates of spurious associations due to the phenomenon of collider bias. Excluding high LD regions identified in previous studies does not resolve these issues. LD pruning proves more effective, but the optimal choice of thresholds varies across datasets. Altogether, our work highlights unique issues that arise when using PCA to control for ancestral heterogeneity in admixed populations and demonstrates the importance of careful pre-processing and diagnostics to ensure that PCs capturing multiple local genomic features are not included in GWAS models.
Affiliation(s)
- Kelsey E. Grinde
- Department of Mathematics, Statistics, and Computer Science, Macalester College, Saint Paul, Minnesota, 55105, USA
| | - Brian L. Browning
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, 98195, USA
| | - Alexander P. Reiner
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109, USA
- Department of Epidemiology, University of Washington, Seattle, Washington, 98195, USA
| | - Timothy A. Thornton
- Regeneron Genetics Center, Tarrytown, New York, 10591, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA
| | - Sharon R. Browning
- Department of Biostatistics, University of Washington, Seattle, Washington, 98195, USA
| |
|
19
|
Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. Genetics 2024; 226:iyae024. [PMID: 38386895 PMCID: PMC10990424 DOI: 10.1093/genetics/iyae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/24/2024] Open
Abstract
Understanding natural selection and other forms of non-neutrality is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection and other local evolutionary processes that requires relatively few selection simulations during training. We build upon a generative adversarial network trained to simulate realistic neutral data. This consists of a generator (fitted demographic model), and a discriminator (convolutional neural network) that predicts whether a genomic region is real or fake. As the generator can only generate data under neutral demographic processes, regions of real data that the discriminator recognizes as having a high probability of being "real" do not fit the neutral demographic model and are therefore candidates for targets of selection. To incentivize identification of a specific mode of selection, we fine-tune the discriminator with a small number of custom non-neutral simulations. We show that this approach has high power to detect various forms of selection in simulations, and that it finds regions under positive selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.
Affiliation(s)
- Rebecca Riley
- Department of Computer Science, Haverford College, Haverford, PA 19041, USA
| | - Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sara Mathieson
- Department of Computer Science, Haverford College, Haverford, PA 19041, USA
| |
|
20
|
Song H, Chu J, Li W, Li X, Fang L, Han J, Zhao S, Ma Y. A Novel Approach Utilizing Domain Adversarial Neural Networks for the Detection and Classification of Selective Sweeps. Adv Sci (Weinh) 2024; 11:e2304842. [PMID: 38308186 PMCID: PMC11005742 DOI: 10.1002/advs.202304842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/10/2024] [Indexed: 02/04/2024]
Abstract
The identification and classification of selective sweeps are of great significance for improving the understanding of biological evolution and exploring opportunities for precision medicine and genetic improvement. Here, a domain adaptation sweep detection and classification (DASDC) method is presented to balance the alignment of two domains and the classification performance through a domain-adversarial neural network and its adversarial learning modules. DASDC effectively addresses the issue of mismatch between training data and real genomic data in deep learning models, leading to a significant improvement in its generalization capability, prediction robustness, and accuracy. The DASDC method demonstrates improved identification performance compared to existing methods and excels in classification performance, particularly in scenarios where there is a mismatch between application data and training data. The successful implementation of DASDC in real data of three distinct species highlights its potential as a useful tool for identifying crucial functional genes and investigating adaptive evolutionary mechanisms, particularly with the increasing availability of genomic data.
Affiliation(s)
- Hui Song
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
| | - Jinyu Chu
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
| | - Wangjiao Li
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, Wuhan, 430070, China
| | - Lingzhao Fang
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus, 8000, Denmark
| | - Jianlin Han
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
- CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, 100193, China
- Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi, 00100, Kenya
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, Wuhan, 430070, China
- Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou, 510642, China
| | - Yunlong Ma
- Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, China
- Hubei Hongshan Laboratory, Wuhan, 430070, China
- Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou, 510642, China
| |
|
21
|
González A, Fullaondo A, Odriozola A. Impact of evolution on lifestyle in microbiome. Adv Genet 2024; 111:149-198. [PMID: 38908899 DOI: 10.1016/bs.adgen.2024.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
This chapter analyses the interaction between microbiota and humans from an evolutionary point of view. Long-term interactions between gut microbiota and host have been generated as a result of dietary choices through coevolutionary processes, where mutuality of advantage is essential. Likewise, the characteristics of the intestinal environment have made it possible to describe different intrahost evolutionary mechanisms affecting microbiota. For its part, the intestinal microbiota has been of great importance in the evolution of mammals, allowing the diversification of dietary niches, phenotypic plasticity and the selection of host phenotypes. Although the origin of the human intestinal microbial community is still not known with certainty, mother-offspring transmission plays a key role, and it seems that transmissibility between individuals in adulthood also has important implications. Finally, it should be noted that certain aspects inherent to modern lifestyle, including refined diets, antibiotic intake, exposure to air pollutants, microplastics, and stress, could negatively affect the diversity and composition of our gut microbiota. This chapter aims to combine current knowledge to provide a comprehensive view of the interaction between microbiota and humans throughout evolution.
Affiliation(s)
- Adriana González
- Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain.
| | - Asier Fullaondo
- Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain
| | - Adrián Odriozola
- Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country (UPV/EHU), Leioa, Spain
| |
|
22
|
Aslett LJM, Christ RR. kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R. BMC Bioinformatics 2024; 25:86. [PMID: 38418970 PMCID: PMC10900616 DOI: 10.1186/s12859-024-05688-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 02/01/2024] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND: Approximating the recent phylogeny of N phased haplotypes at a set of variants along the genome is a core problem in modern population genomics and central to performing genome-wide screens for association, selection, introgression, and other signals. The Li & Stephens (LS) model provides a simple yet powerful hidden Markov model for inferring the recent ancestry at a given variant, represented as an N × N distance matrix based on posterior decodings. RESULTS: We provide a high-performance engine to make these posterior decodings readily accessible with minimal pre-processing via an easy to use package kalis, in the statistical programming language R. kalis enables investigators to rapidly resolve the ancestry at loci of interest and developers to build a range of variant-specific ancestral inference pipelines on top. kalis exploits both multi-core parallelism and modern CPU vector instruction sets to enable scaling to hundreds of thousands of genomes. CONCLUSIONS: The resulting distance matrices accessible via kalis enable local ancestry, selection, and association studies in modern large scale genomic datasets.
Affiliation(s)
- Louis J M Aslett
- Department of Mathematical Sciences, Durham University, Stockton Road, Durham, DH1 3LE, UK.
| | - Ryan R Christ
- Department of Genetics, Yale School of Medicine, 333 Cedar Street, New Haven, CT, 06520, USA
| |
|
23
|
Gao Z. Unveiling recent and ongoing adaptive selection in human populations. PLoS Biol 2024; 22:e3002469. [PMID: 38236800 PMCID: PMC10796035 DOI: 10.1371/journal.pbio.3002469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024] Open
Abstract
Genome-wide scans for signals of selection have become a routine part of the analysis of population genomic variation datasets and have resulted in compelling evidence of selection during recent human evolution. This Essay spotlights methodological innovations that have enabled the detection of selection over very recent timescales, even in contemporary human populations. By harnessing large-scale genomic and phenotypic datasets, these new methods use different strategies to uncover connections between genotype, phenotype, and fitness. This Essay outlines the rationale and key findings of each strategy, discusses challenges in interpretation, and describes opportunities to improve detection and understanding of ongoing selection in human populations.
Affiliation(s)
- Ziyue Gao
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| |
|
24
|
Barrie W, Yang Y, Irving-Pease EK, Attfield KE, Scorrano G, Jensen LT, Armen AP, Dimopoulos EA, Stern A, Refoyo-Martinez A, Pearson A, Ramsøe A, Gaunitz C, Demeter F, Jørkov MLS, Møller SB, Springborg B, Klassen L, Hyldgård IM, Wickmann N, Vinner L, Korneliussen TS, Allentoft ME, Sikora M, Kristiansen K, Rodriguez S, Nielsen R, Iversen AKN, Lawson DJ, Fugger L, Willerslev E. Elevated genetic risk for multiple sclerosis emerged in steppe pastoralist populations. Nature 2024; 625:321-328. [PMID: 38200296 PMCID: PMC10781639 DOI: 10.1038/s41586-023-06618-z] [Citation(s) in RCA: 34] [Impact Index Per Article: 34.0] [Received: 09/21/2022] [Accepted: 09/06/2023] [Indexed: 01/12/2024]
Abstract
Multiple sclerosis (MS) is a neuro-inflammatory and neurodegenerative disease that is most prevalent in Northern Europe. Although it is known that inherited risk for MS is located within or in close proximity to immune-related genes, it is unknown when, where and how this genetic risk originated¹. Here, by using a large ancient genome dataset from the Mesolithic period to the Bronze Age², along with new Medieval and post-Medieval genomes, we show that the genetic risk for MS rose among pastoralists from the Pontic steppe and was brought into Europe by the Yamnaya-related migration approximately 5,000 years ago. We further show that these MS-associated immunogenetic variants underwent positive selection both within the steppe population and later in Europe, probably driven by pathogenic challenges coinciding with changes in diet, lifestyle and population density. This study highlights the critical importance of the Neolithic period and Bronze Age as determinants of modern immune responses and their subsequent effect on the risk of developing MS in a changing environment.
Affiliation(s)
- William Barrie
- Department of Zoology, University of Cambridge, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Yaoling Yang
- Department of Statistical Sciences, School of Mathematics, University of Bristol, Bristol, UK
- MRC Integrative Epidemiology Unit, Population Health Sciences, University of Bristol, Bristol, UK
| | - Evan K Irving-Pease
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Kathrine E Attfield
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
| | - Gabriele Scorrano
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Lise Torp Jensen
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Department of Clinical Medicine, Aarhus University Hospital, Aarhus, Denmark
| | - Angelos P Armen
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
| | | | - Aaron Stern
- Departments of Integrative Biology and Statistics, University of California, Berkeley, Berkeley, CA, USA
| | - Alba Refoyo-Martinez
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Alice Pearson
- Department of Genetics, University of Cambridge, Cambridge, UK
| | - Abigail Ramsøe
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Charleen Gaunitz
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Fabrice Demeter
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Eco-anthropologie (EA), Muséum National d'Histoire Naturelle, CNRS, Université de Paris, Musée de l'Homme, Paris, France
| | - Marie Louise S Jørkov
- Laboratory of Biological Anthropology, Department of Forensic Medicine, University of Copenhagen, Copenhagen, Denmark
| | | | | | - Lutz Klassen
- Museum Østdanmark-Djursland og Randers, Randers, Denmark
| | | | | | - Lasse Vinner
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | | | - Morten E Allentoft
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Martin Sikora
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Kristian Kristiansen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
| | - Santiago Rodriguez
- MRC Integrative Epidemiology Unit, Population Health Sciences, University of Bristol, Bristol, UK
| | - Rasmus Nielsen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Departments of Integrative Biology and Statistics, University of California, Berkeley, Berkeley, CA, USA
| | - Astrid K N Iversen
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK.
- Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK.
| | - Daniel J Lawson
- Department of Statistical Sciences, School of Mathematics, University of Bristol, Bristol, UK.
- MRC Integrative Epidemiology Unit, Population Health Sciences, University of Bristol, Bristol, UK.
| | - Lars Fugger
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK.
- Department of Clinical Medicine, Aarhus University Hospital, Aarhus, Denmark.
- MRC Human Immunology Unit, John Radcliffe Hospital, University of Oxford, Oxford, UK.
| | - Eske Willerslev
- Department of Zoology, University of Cambridge, Cambridge, UK.
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
- MARUM Center for Marine Environmental Sciences and Faculty of Geosciences, University of Bremen, Bremen, Germany.

25
Mera-Charria A, Nieto-Lopez F, Francès MP, Arbex PM, Vila-Vecilla L, Russo V, Silva CCV, De Souza GT. Genetic variant panel allows predicting both obesity risk, and efficacy of procedures and diet in weight loss. Front Nutr 2023; 10:1274662. [PMID: 38035352 PMCID: PMC10687570 DOI: 10.3389/fnut.2023.1274662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 10/24/2023] [Indexed: 12/02/2023] Open
Abstract
Purpose: Obesity is a multifactorial condition with a relevant genetic correlation. Recent advances in genomic research have identified several single nucleotide polymorphisms (SNPs) in genes such as FTO, MCM6, HLA, and MC4R, associated with obesity. This study aimed to evaluate the association of 102 SNPs with BMI and weight loss treatment response in a multi-ethnic population. Methods: The study analyzed 9,372 patients for the correlation between SNPs and BMI (dataset A). The correlation between SNPs and weight loss was assessed in 474 patients undergoing different treatments (dataset B). Patients in dataset B were further divided into 3 categories based on the type of intervention: dietary therapy, intragastric balloon procedures, or surgeries. SNP association analysis and multiple models of inheritance were performed. Results: In dataset A, ten SNPs, including rs9939609 (FTO), rs4988235 (MCM6), and rs2395182 (HLA), were significantly associated with increased BMI. Additionally, four other SNPs, rs7903146 (TCF7L2), rs6511720, rs5400 (SLC2A2), and rs7498665 (SH2B1), showed sex-specific correlation. For dataset B, SNPs rs2016520 (PPAR-Delta) and rs2419621 (ACSL5) demonstrated significant correlation with weight loss for all treatment types. In patients who adhered to dietary therapy, SNPs rs6544713 (ABCG8) and rs762551 (CYP1A2) were strongly correlated with weight loss. In patients undergoing surgical or endoscopic procedures, several SNPs, including rs1801725 (CASR) and rs12970134 (MC4R), showed differential correlations with weight loss. Conclusion: This study provides valuable insights into the genetic factors influencing BMI and weight loss response to different treatments. The findings highlight the potential for personalized weight management approaches based on individual genetic profiles.
Affiliation(s)
| | - Francisco Nieto-Lopez
- Dorsia Clinics, Madrid, Spain
- Catedra UCAM Dorsia, Catholic University San Antonio of Murcia, Guadalupe, Spain

26
Gouveia MH, Bentley AR, Leal TP, Tarazona-Santos E, Bustamante CD, Adeyemo AA, Rotimi CN, Shriner D. Unappreciated subcontinental admixture in Europeans and European Americans and implications for genetic epidemiology studies. Nat Commun 2023; 14:6802. [PMID: 37935687 PMCID: PMC10630423 DOI: 10.1038/s41467-023-42491-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 10/12/2023] [Indexed: 11/09/2023] Open
Abstract
European-ancestry populations are recognized as stratified but not as admixed, implying that residual confounding by locus-specific ancestry can affect studies of association, polygenic adaptation, and polygenic risk scores. We integrate individual-level genome-wide data from ~19,000 European-ancestry individuals across 79 European populations and five European American cohorts. We generate a new reference panel that captures ancestral diversity missed by both the 1000 Genomes and Human Genome Diversity Projects. Both Europeans and European Americans are admixed at the subcontinental level, with admixture dates differing among subgroups of European Americans. After adjustment for both genome-wide and locus-specific ancestry, associations between a highly differentiated variant in LCT (rs4988235) and height or LDL-cholesterol were confirmed to be false positives whereas the association between LCT and body mass index was genuine. We provide formal evidence of subcontinental admixture in individuals with European ancestry, which, if not properly accounted for, can produce spurious results in genetic epidemiology studies.
Affiliation(s)
- Mateus H Gouveia
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Amy R Bentley
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Thiago P Leal
- Department of Genomic Medicine, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44197, USA
| | - Eduardo Tarazona-Santos
- Departamento de Genética, Ecologia e Evolução, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, 31270-910, Brazil
| | - Carlos D Bustamante
- Center for Computational, Evolutionary and Human Genomics (CEHG), Stanford University, Stanford, CA, 94305, USA
| | - Adebowale A Adeyemo
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Charles N Rotimi
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
| | - Daniel Shriner
- Center for Research on Genomics and Global Health, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA.

27
Mo Z, Siepel A. Domain-adaptive neural networks improve supervised machine learning based on simulated population genetic data. PLoS Genet 2023; 19:e1011032. [PMID: 37934781 PMCID: PMC10655966 DOI: 10.1371/journal.pgen.1011032] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/17/2023] [Revised: 11/17/2023] [Accepted: 10/23/2023] [Indexed: 11/09/2023] Open
Abstract
Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution. By applying an established domain-adaptation technique based on a gradient reversal layer (GRL), originally introduced for image classification, we show that the effects of simulation mis-specification can be substantially mitigated. We focus our analysis on two state-of-the-art deep-learning population genetic methods: SIA, which infers positive selection from features of the ancestral recombination graph (ARG), and ReLERNN, which infers recombination rates from genotype matrices. In the case of SIA, the domain adaptive framework also compensates for ARG inference error. Using the domain-adaptive SIA (dadaSIA) model, we estimate improved selection coefficients at selected loci in the 1000 Genomes CEU population. We anticipate that domain adaptation will prove to be widely applicable in the growing use of supervised machine learning in population genetics.
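As an illustration of the gradient reversal idea mentioned in this abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a toy scalar model: a one-weight feature extractor and a logistic domain classifier. All names (`grl_forward`, `grl_backward`, `lam`, `w`, `v`) are hypothetical and chosen for illustration only. The layer passes features through unchanged on the forward pass and multiplies the incoming gradient by a negative factor on the backward pass, so the feature extractor is pushed to make the two domains harder to tell apart.

```python
import math


def grl_forward(features):
    """Forward pass of a gradient reversal layer: the identity map."""
    return list(features)


def grl_backward(upstream_grad, lam=1.0):
    """Backward pass: flip the sign of the gradient and scale it by lam."""
    return [-lam * g for g in upstream_grad]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_loss_grad_wrt_feature(feature, v, domain_label):
    """Gradient of binary cross-entropy(sigmoid(v * feature), label) w.r.t. feature."""
    p = sigmoid(v * feature)
    return (p - domain_label) * v


# Toy setup (all values are illustrative assumptions).
w, x = 0.5, 2.0        # feature extractor weight and one input
v, lam = 1.5, 1.0      # domain classifier weight and reversal strength

feature = w * x
passed = grl_forward([feature])[0]                      # unchanged by the layer
g_feature = domain_loss_grad_wrt_feature(passed, v, domain_label=1.0)
g_reversed = grl_backward([g_feature], lam)[0]

grad_w_plain = g_feature * x    # gradient w would receive without the layer
grad_w_grl = g_reversed * x     # gradient w receives with the layer
```

With the layer in place, a gradient descent step on `w` moves in the direction that increases the domain classification loss, while the domain classifier weight `v` (upstream of nothing reversed) would still be trained to decrease it.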
Affiliation(s)
- Ziyi Mo
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America

28
Luo H, Zhang P, Zhang W, Zheng Y, Hao D, Shi Y, Niu Y, Song T, Li Y, Zhao S, Chen H, Xu T, He S. Recent positive selection signatures reveal phenotypic evolution in the Han Chinese population. Sci Bull (Beijing) 2023; 68:2391-2404. [PMID: 37661541 DOI: 10.1016/j.scib.2023.08.027] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/07/2023] [Revised: 05/08/2023] [Accepted: 08/10/2023] [Indexed: 09/05/2023]
Abstract
Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in 3946 high-depth whole-genome sequences of Han Chinese. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, and PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, and ALDH3B2), and the olfactory perception gene OR4C16, of which the MHC cluster, ADH1B, and ALDH2 were also identified by TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting: the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of the autoimmune disease systemic lupus erythematosus (SLE). It is also surprising that our newly discovered ALDH3B2 evolved in the opposite direction to ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. In particular, multiple methods consistently revealed that lower blood pressure was favored by natural selection. Finally, we built a database named RePoS (recent positive selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extended our understanding of natural evolution and phenotype adaptation in Han Chinese as well as other populations.
Affiliation(s)
- Huaxia Luo
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Department of Pediatrics, Peking University First Hospital, Beijing 100034, China
| | - Peng Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Wanyu Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yu Zheng
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Di Hao
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yirong Shi
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yiwei Niu
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tingrui Song
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yanyan Li
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shilei Zhao
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China
| | - Hua Chen
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China.
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Shandong First Medical University & Shandong Academy of Medical Sciences, Taian 271016, China.
| | - Shunmin He
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.

29
Long L, Xu W, Valencia F, Paaby AB, McGrath PT. A toxin-antidote selfish element increases fitness of its host. eLife 2023; 12:e81640. [PMID: 37874324 PMCID: PMC10629817 DOI: 10.7554/elife.81640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 10/23/2023] [Indexed: 10/25/2023] Open
Abstract
Selfish genetic elements can promote their transmission at the expense of individual survival, creating conflict between the element and the rest of the genome. Recently, a large number of toxin-antidote (TA) post-segregation distorters have been identified in non-obligate outcrossing nematodes. Their origin and the evolutionary forces that keep them at intermediate population frequencies are poorly understood. Here, we study a TA element in Caenorhabditis elegans called zeel-1;peel-1. Two major haplotypes of this locus, with and without the selfish element, segregate in C. elegans. We evaluate the fitness consequences of the zeel-1;peel-1 element outside of its role in gene drive in non-outcrossing animals and demonstrate that loss of the toxin peel-1 decreased fitness of hermaphrodites and resulted in reductions in fecundity and body size. These findings suggest a biological role for peel-1 beyond toxin lethality. This work demonstrates that a TA element can provide a fitness benefit to its hosts either during their initial evolution or by being co-opted by the animals following their selfish spread. These findings guide our understanding of how TA elements can remain in a population where gene drive is minimized, helping resolve the mystery of prevalent TA elements in selfing animals.
Affiliation(s)
- Lijiang Long
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
- Interdisciplinary Graduate Program in Quantitative Biosciences, Georgia Institute of Technology, Atlanta, United States
| | - Wen Xu
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
| | - Francisco Valencia
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
| | - Annalise B Paaby
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
| | - Patrick T McGrath
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
- School of Physics, Georgia Institute of Technology, Atlanta, United States

30
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA

31
Mo Z, Siepel A. Domain-adaptive neural networks improve supervised machine learning based on simulated population genetic data. bioRxiv 2023:2023.03.01.529396. [PMID: 36909514 PMCID: PMC10002701 DOI: 10.1101/2023.03.01.529396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
Abstract
Investigators have recently introduced powerful methods for population genetic inference that rely on supervised machine learning from simulated data. Despite their performance advantages, these methods can fail when the simulated training data does not adequately resemble data from the real world. Here, we show that this "simulation mis-specification" problem can be framed as a "domain adaptation" problem, where a model learned from one data distribution is applied to a dataset drawn from a different distribution. By applying an established domain-adaptation technique based on a gradient reversal layer (GRL), originally introduced for image classification, we show that the effects of simulation mis-specification can be substantially mitigated. We focus our analysis on two state-of-the-art deep-learning population genetic methods: SIA, which infers positive selection from features of the ancestral recombination graph (ARG), and ReLERNN, which infers recombination rates from genotype matrices. In the case of SIA, the domain adaptive framework also compensates for ARG inference error. Using the domain-adaptive SIA (dadaSIA) model, we estimate improved selection coefficients at selected loci in the 1000 Genomes CEU population. We anticipate that domain adaptation will prove to be widely applicable in the growing use of supervised machine learning in population genetics.
Affiliation(s)
- Ziyi Mo
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
- School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
| | - Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
- School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

32
Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. bioRxiv 2023:2023.03.07.531546. [PMID: 36945387 PMCID: PMC10028936 DOI: 10.1101/2023.03.07.531546] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 03/10/2023]
Abstract
Understanding natural selection in humans and other species is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Mismatches between simulated training data and real test data can lead to incorrect inference. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection that requires relatively few selection simulations during training. We use a Generative Adversarial Network (GAN) trained to simulate realistic neutral data. The resulting GAN consists of a generator (fitted demographic model) and a discriminator (convolutional neural network). For a genomic region, the discriminator predicts whether it is "real" or "fake" in the sense that it could have been simulated by the generator. As the "real" training data includes regions that experienced selection and the generator cannot produce such regions, regions with a high probability of being real are likely to have experienced selection. To further incentivize this behavior, we "fine-tune" the discriminator with a small number of selection simulations. We show that this approach has high power to detect selection in simulations, and that it finds regions under selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics. In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.
Affiliation(s)
- Rebecca Riley
- Department of Computer Science, Haverford College, Haverford PA, 19041 USA
| | - Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA, 19104 USA
| | - Sara Mathieson
- Department of Computer Science, Haverford College, Haverford PA, 19041 USA

33
Arnab SP, Amin MR, DeGiorgio M. Uncovering Footprints of Natural Selection Through Spectral Analysis of Genomic Summary Statistics. Mol Biol Evol 2023; 40:msad157. [PMID: 37433019 PMCID: PMC10365025 DOI: 10.1093/molbev/msad157] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/07/2022] [Revised: 06/28/2023] [Accepted: 07/06/2023] [Indexed: 07/13/2023] Open
Abstract
Natural selection leaves a spatial pattern along the genome, with a haplotype distribution distortion near the selected locus that fades with distance. Evaluating the spatial signal of a population-genetic summary statistic across the genome allows for patterns of natural selection to be distinguished from neutrality. Considering the genomic spatial distribution of multiple summary statistics is expected to aid in uncovering subtle signatures of selection. In recent years, numerous methods have been devised that consider genomic spatial distributions across summary statistics, utilizing both classical machine learning and deep learning architectures. However, better predictions may be attainable by improving the way in which features are extracted from these summary statistics. We apply wavelet transform, multitaper spectral analysis, and S-transform to summary statistic arrays to achieve this goal. Each analysis method converts one-dimensional summary statistic arrays to two-dimensional images of spectral analysis, allowing simultaneous temporal and spectral assessment. We feed these images into convolutional neural networks and consider combining models using ensemble stacking. Our modeling framework achieves high accuracy and power across a diverse set of evolutionary settings, including population size changes and test sets of varying sweep strength, softness, and timing. A scan of central European whole-genome sequences recapitulated well-established sweep candidates and predicted novel cancer-associated genes as sweeps with high support. Given that this modeling framework is also robust to missing genomic segments, we believe that it will represent a welcome addition to the population-genomic toolkit for learning about adaptive processes from genomic data.
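To make the step of converting a one-dimensional summary statistic array into a two-dimensional image concrete, here is a minimal sketch in plain Python. It is not the wavelet, multitaper, or S-transform pipeline described in the abstract. It swaps in a much simpler Haar-like sliding-window contrast, and the function name `haar_like_image`, the toy diversity values, and the chosen scales are all assumptions made for illustration. Each row of the output corresponds to one scale and each column to one genomic window, so a localized dip in the statistic shows up as paired negative and positive responses across scales.

```python
def haar_like_image(signal, scales):
    """Return a scales-by-positions grid of Haar-like contrasts.

    For each scale s and position i, the value is the mean of up to s
    entries starting at i minus the mean of up to s entries before i.
    Positions with no entries to the left are set to 0.0.
    """
    n = len(signal)
    image = []
    for s in scales:
        row = []
        for i in range(n):
            left = signal[max(0, i - s):i]
            right = signal[i:min(n, i + s)]
            if not left:
                row.append(0.0)
            else:
                row.append(sum(right) / len(right) - sum(left) / len(left))
        image.append(row)
    return image


# Toy per-window diversity values with a dip in the middle, loosely
# mimicking the trough expected around a swept locus (made-up numbers).
diversity = [1.0] * 4 + [0.2] * 4 + [1.0] * 4
scales = [1, 2, 4]
image = haar_like_image(diversity, scales)

for s, row in zip(scales, image):
    print(s, [round(value, 2) for value in row])
```

In this toy input, the entry of the dip produces a negative contrast and the exit a positive one at every scale, which is the kind of position-by-scale structure a convolutional network could then consume as an image.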
Affiliation(s)
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA

34
Vaill M, Kawanishi K, Varki N, Gagneux P, Varki A. Comparative physiological anthropogeny: exploring molecular underpinnings of distinctly human phenotypes. Physiol Rev 2023; 103:2171-2229. [PMID: 36603157 PMCID: PMC10151058 DOI: 10.1152/physrev.00040.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2021] [Revised: 12/26/2022] [Accepted: 12/28/2022] [Indexed: 01/06/2023] Open
Abstract
Anthropogeny is a classic term encompassing transdisciplinary investigations of the origins of the human species. Comparative anthropogeny is a systematic comparison of humans and other living nonhuman hominids (so-called "great apes"), aiming to identify distinctly human features in health and disease, with the overall goal of explaining human origins. We begin with a historical perspective, briefly describing how the field progressed from the earliest evolutionary insights to the current emphasis on in-depth molecular and genomic investigations of "human-specific" biology and an increased appreciation for cultural impacts on human biology. While many such genetic differences between humans and other hominids have been revealed over the last two decades, this information remains insufficient to explain the most distinctive phenotypic traits distinguishing humans from other living hominids. Here we undertake a complementary approach of "comparative physiological anthropogeny," along the lines of the preclinical medical curriculum, i.e., beginning with anatomy and considering each physiological system and in each case considering genetic and molecular components that are relevant. What is ultimately needed is a systematic comparative approach at all levels from molecular to physiological to sociocultural, building networks of related information, drawing inferences, and generating testable hypotheses. The concluding section will touch on distinctive considerations in the study of human evolution, including the importance of gene-culture interactions.
Affiliation(s)
- Michael Vaill
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Glycobiology Research and Training Center, University of California, San Diego, La Jolla, California
| | - Kunio Kawanishi
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Department of Experimental Pathology, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
| | - Nissi Varki
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California
- Glycobiology Research and Training Center, University of California, San Diego, La Jolla, California
- Department of Pathology, University of California, San Diego, La Jolla, California
| | - Pascal Gagneux
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California
- Glycobiology Research and Training Center, University of California, San Diego, La Jolla, California
- Department of Pathology, University of California, San Diego, La Jolla, California
| | - Ajit Varki
- Center for Academic Research and Training in Anthropogeny, University of California, San Diego, La Jolla, California
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, California
- Glycobiology Research and Training Center, University of California, San Diego, La Jolla, California

35
Bľandová G, Patlevičová A, Palkovičová J, Pavlíková Š, Beňuš R, Repiská V, Baldovič M. Pilot study of correlation of selected genetic factors with cribra orbitalia in individuals from a medieval population from Slovakia. International Journal of Paleopathology 2023; 41:1-7. [PMID: 36812666 DOI: 10.1016/j.ijpp.2023.02.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/10/2022] [Revised: 01/30/2023] [Accepted: 02/06/2023] [Indexed: 06/12/2023]
Abstract
OBJECTIVE The aim of this study is to investigate the potential genetic etiology of cribra orbitalia noted on human skeletal remains. MATERIALS We obtained and analyzed ancient DNA of 43 individuals with cribra orbitalia. The analyzed set represented medieval individuals from two cemeteries in western Slovakia, Castle Devín (11th-12th century AD) and Cífer-Pác (8th-9th century AD). METHODS We performed a sequence analysis of 5 variants in 3 genes associated with anemia (HBB, G6PD, PKLR), which are the most common pathogenic variants in present-day European populations, and one variant, MCM6:c.1917+326C>T (rs4988235), associated with lactose intolerance. RESULTS DNA variants associated with anemia were not found in the samples. The allele frequency of MCM6:c.1917+326C was 0.875. This frequency was higher in individuals displaying cribra orbitalia than in individuals without the lesion, but the difference was not statistically significant. SIGNIFICANCE This study seeks to expand our knowledge of the etiology of cribra orbitalia by exploring the potential association between the lesion and the presence of alleles linked to hereditary anemias and lactose intolerance. LIMITATIONS A relatively small set of individuals was analyzed, so an unequivocal conclusion cannot be drawn. Hence, although it is unlikely, a genetic form of anemia caused by rare variants cannot be ruled out. SUGGESTIONS FOR FURTHER RESEARCH Genetic research based on larger sample sizes and in more diverse geographical regions.
Affiliation(s)
- Gabriela Bľandová
- Institute of Medical Biology, Genetics and Clinical Genetics, Faculty of Medicine, Comenius University, Sasinkova 4, 811 08 Bratislava, Slovakia
| | - Andrea Patlevičová
- Department of Biology, Faculty of Natural Sciences, University of Ss. Cyril and Methodius, Nám. J. Herdu 2, 917 01 Trnava, Slovakia
| | - Jana Palkovičová
- Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Ilkovičova 6, 842 15 Bratislava, Slovakia
| | - Štefánia Pavlíková
- Department of Anthropology, Faculty of Natural Sciences, Comenius University, Ilkovičova 6, 842 15 Bratislava, Slovakia
| | - Radoslav Beňuš
- Department of Anthropology, Faculty of Natural Sciences, Comenius University, Ilkovičova 6, 842 15 Bratislava, Slovakia
| | - Vanda Repiská
- Institute of Medical Biology, Genetics and Clinical Genetics, Faculty of Medicine, Comenius University, Sasinkova 4, 811 08 Bratislava, Slovakia
| | - Marian Baldovič
- Department of Molecular Biology, Faculty of Natural Sciences, Comenius University, Ilkovičova 6, 842 15 Bratislava, Slovakia; Laboratory of Genomic Medicine, GHC GENETICS SK, Science Park Comenius University, Ilkovičova 8, 841 04 Bratislava, Slovakia.
| |
|
36
|
Tobler R, Souilmi Y, Huber CD, Bean N, Turney CSM, Grey ST, Cooper A. The role of genetic selection and climatic factors in the dispersal of anatomically modern humans out of Africa. Proc Natl Acad Sci U S A 2023; 120:e2213061120. [PMID: 37220274 PMCID: PMC10235988 DOI: 10.1073/pnas.2213061120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Accepted: 03/14/2023] [Indexed: 05/25/2023] Open
Abstract
The evolutionarily recent dispersal of anatomically modern humans (AMH) out of Africa (OoA) and across Eurasia provides a unique opportunity to examine the impacts of genetic selection as humans adapted to multiple new environments. Analysis of ancient Eurasian genomic datasets (~1,000 to 45,000 y old) reveals signatures of strong selection, including at least 57 hard sweeps after the initial AMH movement OoA, which have been obscured in modern populations by extensive admixture during the Holocene. The spatiotemporal patterns of these hard sweeps provide a means to reconstruct early AMH population dispersals OoA. We identify a previously unsuspected extended period of genetic adaptation lasting ~30,000 y, potentially in the Arabian Peninsula area, prior to a major Neandertal genetic introgression and subsequent rapid dispersal across Eurasia as far as Australia. Consistent functional targets of selection initiated during this period, which we term the Arabian Standstill, include loci involved in the regulation of fat storage, neural development, skin physiology, and cilia function. Similar adaptive signatures are also evident in introgressed archaic hominin loci and modern Arctic human groups, and we suggest that this signal represents selection for cold adaptation. Surprisingly, many of the candidate selected loci across these groups appear to directly interact and coordinately regulate biological processes, with a number associated with major modern diseases including the ciliopathies, metabolic syndrome, and neurodegenerative disorders. This expands the potential for ancestral human adaptation to directly impact modern diseases, providing a platform for evolutionary medicine.
Affiliation(s)
- Raymond Tobler
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
| | - Yassine Souilmi
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
- Environment Institute, The University of Adelaide, Adelaide, SA5005, Australia
| | - Christian D. Huber
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
| | - Nigel Bean
- Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, SA5005, Australia
- School of Mathematical Sciences, The University of Adelaide, Adelaide, SA5005, Australia
| | - Chris S. M. Turney
- Division of Research, University of Technology Sydney, Ultimo, NSW2007, Australia
| | - Shane T. Grey
- School of Biotechnology and Biomolecular Sciences, Faculty of Science, University of New South Wales, Sydney, NSW2052, Australia
- Transplantation Immunology Group, Translation Science Pillar, Garvan Institute of Medical Research, Darlinghurst, NSW2010, Australia
| | - Alan Cooper
- Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, SA5005, Australia
- Blue Sky Genetics, Ashton, SA5137, Australia
| |
|
37
|
Gu L, Xia C, Yang S, Yang G. The adaptive evolution of cancer driver genes. BMC Genomics 2023; 24:215. [PMID: 37098512 PMCID: PMC10131384 DOI: 10.1186/s12864-023-09301-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2022] [Accepted: 04/08/2023] [Indexed: 04/27/2023] Open
Abstract
BACKGROUND Cancer is a life-threatening disease in humans; yet, cancer genes are frequently reported to be under positive selection. This suggests an evolutionary-genetic paradox in which cancer evolves as a secondary product of selection in human beings. However, systematic investigation of the evolution of cancer driver genes is sparse. RESULTS Using comparative genomics analysis, population genetics analysis and computational molecular evolutionary analysis, the evolution of 568 cancer driver genes of 66 cancer types was evaluated at two levels: selection during the early evolution of humans (long timescale selection in the human lineage during primate evolution, i.e., millions of years), and recent selection in modern human populations (~100,000 years). Results showed that eight cancer genes covering 11 cancer types were under positive selection in the human lineage (long timescale selection). In addition, 35 cancer genes covering 47 cancer types were under positive selection in modern human populations (recent selection). Moreover, SNPs associated with thyroid cancer in three thyroid cancer driver genes (CUX1, HERC2 and RGPD3) were under positive selection in East Asian and European populations, consistent with the high incidence of thyroid cancer in these populations. CONCLUSIONS These findings suggest that cancer may have evolved, in part, as a by-product of adaptive changes in humans. Different SNPs at the same locus can be under different selection pressures in different populations, and thus should be considered in precision medicine, especially for targeted medicine in specific populations.
Affiliation(s)
- Langyu Gu
- State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, Guangdong, 510275, China.
| | - Canwei Xia
- Ministry of Education Key Laboratory for Biodiversity and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
| | - Shiyu Yang
- The Affiliated Brain Hospital, Guangzhou Medical University, Guangzhou, 510180, Guangdong, China
| | - Guofen Yang
- Department of Gynecology, First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510060, Guangdong, China.
| |
|
38
|
Barroso GV, Lohmueller KE. Inferring the mode and strength of ongoing selection. Genome Res 2023; 33:632-643. [PMID: 37055196 PMCID: PMC10234300 DOI: 10.1101/gr.276386.121] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2021] [Accepted: 03/29/2023] [Indexed: 04/15/2023]
Abstract
Genome sequence data are no longer scarce. The UK Biobank alone comprises 200,000 individual genomes, with more on the way, leading the field of human genetics toward sequencing entire populations. Within the next decades, other model organisms will follow suit, especially domesticated species such as crops and livestock. Having sequences from most individuals in a population will present new challenges for using these data to improve health and agriculture in the pursuit of a sustainable future. Existing population genetic methods are designed to model hundreds of randomly sampled sequences but are not optimized for extracting the information contained in the larger and richer data sets that are beginning to emerge, with thousands of closely related individuals. Here we develop a new method called trio-based inference of dominance and selection (TIDES) that uses data from tens of thousands of family trios to make inferences about natural selection acting in a single generation. TIDES further improves on the state of the art by making no assumptions regarding demography, linkage, or dominance. We discuss how our method paves the way for studying natural selection from new angles.
Affiliation(s)
- Gustavo V Barroso
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
| | - Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
| |
|
39
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
|
40
|
Begg TJA, Schmidt A, Kocher A, Larmuseau MHD, Runfeldt G, Maier PA, Wilson JD, Barquera R, Maj C, Szolek A, Sager M, Clayton S, Peltzer A, Hui R, Ronge J, Reiter E, Freund C, Burri M, Aron F, Tiliakou A, Osborn J, Behar DM, Boecker M, Brandt G, Cleynen I, Strassburg C, Prüfer K, Kühnert D, Meredith WR, Nöthen MM, Attenborough RD, Kivisild T, Krause J. Genomic analyses of hair from Ludwig van Beethoven. Curr Biol 2023; 33:1431-1447.e22. [PMID: 36958333 DOI: 10.1016/j.cub.2023.02.041] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2022] [Revised: 10/11/2022] [Accepted: 02/13/2023] [Indexed: 03/25/2023]
Abstract
Ludwig van Beethoven (1770-1827) remains among the most influential and popular classical music composers. Health problems significantly impacted his career as a composer and pianist, including progressive hearing loss, recurring gastrointestinal complaints, and liver disease. In 1802, Beethoven requested that following his death, his disease be described and made public. Medical biographers have since proposed numerous hypotheses, including many substantially heritable conditions. Here we attempt a genomic analysis of Beethoven in order to elucidate potential underlying genetic and infectious causes of his illnesses. We incorporated improvements in ancient DNA methods into existing protocols for ancient hair samples, enabling the sequencing of high-coverage genomes from small quantities of historical hair. We analyzed eight independently sourced locks of hair attributed to Beethoven, five of which originated from a single European male. We deemed these matching samples to be almost certainly authentic and sequenced Beethoven's genome to 24-fold genomic coverage. Although we could not identify a genetic explanation for Beethoven's hearing disorder or gastrointestinal problems, we found that Beethoven had a genetic predisposition for liver disease. Metagenomic analyses revealed furthermore that Beethoven had a hepatitis B infection during at least the months prior to his death. Together with the genetic predisposition and his broadly accepted alcohol consumption, these present plausible explanations for Beethoven's severe liver disease, which culminated in his death. Unexpectedly, an analysis of Y chromosomes sequenced from five living members of the Van Beethoven patrilineage revealed the occurrence of an extra-pair paternity event in Ludwig van Beethoven's patrilineal ancestry.
Affiliation(s)
- Tristan James Alexander Begg
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK; Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany.
| | - Axel Schmidt
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany
| | - Arthur Kocher
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Transmission, Infection, Diversification and Evolution Group, Max Planck Institute for the Science of Human History, 07745 Jena, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Maarten H D Larmuseau
- Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; Laboratory of Human Genetic Genealogy, Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; ARCHES - Antwerp Cultural Heritage Sciences, Faculty of Design Sciences, University of Antwerp, 2000 Antwerp, Belgium; Histories vzw, 9000 Gent, Belgium
| | | | | | - John D Wilson
- Austrian Academy of Sciences, 1030 Vienna, Austria; University of Vienna, 1010 Vienna, Austria
| | - Rodrigo Barquera
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
| | - Carlo Maj
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany; Center for Human Genetics, University Hospital of Marburg, Marburg, Germany
| | - András Szolek
- Applied Bioinformatics, Department for Computer Science, University of Tübingen, Sand 14, 72076 Tübingen, Germany; Department of Immunology, Interfaculty Institute for Cell Biology, University of Tübingen, Tübingen, Germany
| | | | - Stephen Clayton
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Alexander Peltzer
- Quantitative Biology Center (QBiC) University of Tübingen, Tübingen, Germany
| | - Ruoyun Hui
- MacDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, UK; Alan Turing Institute, 2QR, John Dodson House, London NW1 2DB, UK
| | | | - Ella Reiter
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany
| | - Cäcilia Freund
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Marta Burri
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Franziska Aron
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Anthi Tiliakou
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Joanna Osborn
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK
| | - Doron M Behar
- Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
| | | | - Guido Brandt
- Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - Isabelle Cleynen
- Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium
| | - Christian Strassburg
- Department of Internal Medicine I, University Hospital Bonn, 53127 Bonn, Germany
| | - Kay Prüfer
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
| | - Denise Kühnert
- Transmission, Infection, Diversification and Evolution Group, Max Planck Institute for the Science of Human History, 07745 Jena, Germany; European Virus Bioinformatics Center (EVBC), Jena, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany
| | - William Rhea Meredith
- American Beethoven Society, San Jose State University, San Jose, CA 95192, USA; Ira F. Brilliant Center for Beethoven Studies, San Jose State University, San Jose, CA 95192, USA; School of Music and Dance, San Jose State University, San Jose, CA 95192, USA
| | - Markus M Nöthen
- Institute of Human Genetics, University Hospital of Bonn, Bonn 53127, Germany
| | - Robert David Attenborough
- MacDonald Institute for Archaeological Research, University of Cambridge, Cambridge CB2 3ER, UK; School of Archaeology & Anthropology, Australian National University, Canberra, ACT 0200, Australia
| | - Toomas Kivisild
- Department of Archaeology, University of Cambridge, CB2 3ER Cambridge, UK; Department of Human Genetics, Katholieke Universiteit Leuven, 3000 Leuven, Belgium; Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu 51010, Estonia.
| | - Johannes Krause
- Institute for Archaeological Sciences, University of Tübingen, 72070 Tübingen, Germany; Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany; Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745 Jena, Germany.
| |
|
41
|
Skov L, Coll Macià M, Lucotte EA, Cavassim MIA, Castellano D, Schierup MH, Munch K. Extraordinary selection on the human X chromosome associated with archaic admixture. Cell Genomics 2023; 3:100274. [PMID: 36950386 PMCID: PMC10025451 DOI: 10.1016/j.xgen.2023.100274] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Revised: 09/15/2022] [Accepted: 01/26/2023] [Indexed: 03/04/2023]
Abstract
The X chromosome in non-African humans shows less diversity and less Neanderthal introgression than expected from neutral evolution. Analyzing 162 human male X chromosomes worldwide, we identified fourteen chromosomal regions where nearly identical haplotypes spanning several hundred kilobases are found at high frequencies in non-Africans. Genetic drift alone cannot explain the existence of these haplotypes, which must have been associated with strong positive selection in partial selective sweeps. Moreover, the swept haplotypes are entirely devoid of archaic ancestry, as opposed to the non-swept haplotypes in the same genomic regions. The ancient Ust'-Ishim male, dated at 45,000 years before the present (BP), also carries the swept haplotypes, implying that selection on the haplotypes must have occurred between 45,000 and 55,000 years ago. Finally, we find that the chromosomal positions of sweeps overlap previously reported hotspots of selective sweeps in great ape evolution, suggesting a mechanism of selection unique to X chromosomes.
Affiliation(s)
- Laurits Skov
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-5800, USA
| | - Moisès Coll Macià
- Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark
| | - Elise Anne Lucotte
- Ecologie Systématique Evolution, Univ. Paris-Sud, AgroParisTech, CNRS, Université Paris-Saclay, Gif-sur-Yvette, France
| | | | - David Castellano
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ 85721, USA
| | | | - Kasper Munch
- Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark
- Corresponding author
| |
|
42
|
Wu T, Xu S. Understanding the contemporary high obesity rate from an evolutionary genetic perspective. Hereditas 2023; 160:5. [PMID: 36750916 PMCID: PMC9903520 DOI: 10.1186/s41065-023-00268-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2022] [Accepted: 01/31/2023] [Indexed: 02/09/2023] Open
Abstract
The topic of obesity is attracting increasing attention globally. From an evolutionary genetic perspective, it is believed that the main cause of the high obesity rate is the mismatch between environment and genes after people shifted toward a modern high-calorie diet. However, how obesity-related genes became prevalent all over the world has been debated for over 60 years. Here, we review the three most influential hypotheses or viewpoints, i.e., the thrifty gene hypothesis, the drifty gene hypothesis, and the maladaptation viewpoint. In particular, genome-wide association studies over the past 10 years have provided rich findings and evidence to be considered for a better understanding of the evolutionary genetic mechanisms of obesity. We anticipate that this brief review will direct further studies and inspire the future application of precision medicine in obesity treatment.
Affiliation(s)
- Tong Wu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China
| | - Shuhua Xu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center of Genetics and Development, School of Life Sciences, Fudan University, Shanghai, 200438, China; Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai, 201203, China; Department of Liver Surgery and Transplantation, Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, 200032, China.
| |
|
43
|
Heianza Y, Xue Q, Rood J, Bray GA, Sacks FM, Qi L. Circulating thrifty microRNA is related to insulin sensitivity, adiposity, and energy metabolism in adults with overweight and obesity: the POUNDS Lost trial. Am J Clin Nutr 2023; 117:121-129. [PMID: 36789931 PMCID: PMC10196610 DOI: 10.1016/j.ajcnut.2022.10.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2022] [Revised: 09/27/2022] [Accepted: 10/28/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND MicroRNA 128-1 (miR-128-1) was recently linked to the evolutionary adaptation to famine and identified as a thrifty microRNA that controls energy expenditure, contributing to obesity and impaired glucose metabolism. OBJECTIVES We investigated whether circulating miR-128-1-5p and its temporal changes in response to weight-loss diet interventions were related to regulating insulin resistance, adiposity, and energy expenditure in adults with overweight and obesity. We also examined whether habitual physical activity (PA) and different macronutrient intakes modified associations of changes in miR-128-1-5p with improved metabolic outcomes. METHODS This study included 495 adults who consumed weight-loss diets with different macronutrient intakes. Circulating levels of miR-128-1-5p were assessed at baseline and 6 mo after the interventions. Outcome measurements included changes in insulin resistance (HOMA-IR), adiposity, and resting energy expenditure (REE). RESULTS We observed significant relations between circulating miR-128-1-5p and the positive selection signals at the 2q21.3 locus assessed by the single nucleotide polymorphisms rs1446585 and rs4988235. Higher miR-128-1-5p levels were associated with greater HOMA-IR (β per 1 SD: 0.08 [SE 0.03]; P = 0.009), waist circumference (β, 1.16 [0.55]; P = 0.036), whole-body total % fat mass (β, 0.75 [0.30]; P = 0.013), and REE (β, 23 [11]; P = 0.037). In addition, higher miR-128-1-5p level was related to lower total PA index (β, -0.23 [0.07]; P = 0.001) and interacted with PA (P-interaction < 0.05) on changes in HOMA-IR and adiposity. We found that greater increases in miR-128-1-5p levels after the interventions were associated with lesser improvements in HOMA-IR and adiposity in participants with no change/decreases in PA. Furthermore, we found that dietary fat (P-interaction = 0.027) and protein (P-interaction = 0.055) intakes modified relations between changes in miR-128-1-5p and REE. CONCLUSIONS Circulating thrifty miRNA was linked to regulating body fat, insulin resistance, and energy metabolism. Temporal changes in circulating miR-128-1-5p were associated with better weight-loss outcomes during the interventions; habitual PA and dietary macronutrient intake may modify such relations. This trial was registered at clinicaltrials.gov as NCT00072995.
Affiliation(s)
- Yoriko Heianza
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA.
| | - Qiaochu Xue
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
| | - Jennifer Rood
- Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, LA, USA
| | - George A Bray
- Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, LA, USA
| | - Frank M Sacks
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Lu Qi
- Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA; Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| |
|
44
|
Muktupavela RA, Petr M, Ségurel L, Korneliussen T, Novembre J, Racimo F. Modeling the spatiotemporal spread of beneficial alleles using ancient genomes. eLife 2022; 11:e73767. [PMID: 36537881 PMCID: PMC9767474 DOI: 10.7554/elife.73767] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2021] [Accepted: 11/21/2022] [Indexed: 12/24/2022] Open
Abstract
Ancient genome sequencing technologies now provide the opportunity to study natural selection in unprecedented detail. Rather than making inferences from indirect footprints left by selection in present-day genomes, we can directly observe whether a given allele was present or absent in a particular region of the world at almost any period of human history within the last 10,000 years. Methods for studying selection using ancient genomes often rely on partitioning individuals into discrete time periods or regions of the world. However, a complete understanding of natural selection requires more nuanced statistical methods which can explicitly model allele frequency changes in a continuum across space and time. Here we introduce a method for inferring the spread of a beneficial allele across a landscape using two-dimensional partial differential equations. Unlike previous approaches, our framework can handle time-stamped ancient samples, as well as genotype likelihoods and pseudohaploid sequences from low-coverage genomes. We apply the method to a panel of published ancient West Eurasian genomes to produce dynamic maps showcasing the inferred spread of candidate beneficial alleles over time and space. We also provide estimates for the strength of selection and diffusion rate for each of these alleles. Finally, we highlight possible avenues of improvement for accurately tracing the spread of beneficial alleles in more complex scenarios.
Affiliation(s)
- Rasa A Muktupavela
- Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
| | - Martin Petr
- Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
| | - Laure Ségurel
- UMR5558 Biométrie et Biologie Evolutive, CNRS - Université Lyon 1, Villeurbanne, France
| | | | - John Novembre
- Department of Human Genetics, University of Chicago, Chicago, United States
| | - Fernando Racimo
- Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
| |
|
45
|
Olive Oil in the Mediterranean Diet and Its Biochemical and Molecular Effects on Cardiovascular Health through an Analysis of Genetics and Epigenetics. Int J Mol Sci 2022; 23:ijms232416002. [PMID: 36555645 PMCID: PMC9782563 DOI: 10.3390/ijms232416002] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2022] [Revised: 11/27/2022] [Accepted: 12/05/2022] [Indexed: 12/23/2022] Open
Abstract
Human nutrition is a relatively new science based on biochemistry and the effects of food constituents. Ancient medicine considered many foods to be remedies for improving physical performance or treating disease. Since ancient times, Greek, Asian and pre-Christian cultures in particular have held that certain foods benefit health, while others believed some foods could cause illness. Hippocrates described food as a form of medicine and stated that a balanced diet could help individuals stay healthy. Understanding molecular nutrition, the interaction between nutrients and DNA, and obtaining specific biomarkers could help formulate a diet in which food acts not only as nourishment but also as a drug. Therefore, this study aims to analyze the role of the Mediterranean diet and olive oil in cardiovascular risk and to identify their influence from the genetic and epigenetic point of view to understand their possible protective effects.
|
46
|
Zhang Z, Blumenfeld J, Ramnauth A, Barash I, Zhou P, Levine D, Parker T, Rennert H. A common intronic single nucleotide variant modifies PKD1 expression level. Clin Genet 2022; 102:483-493. [PMID: 36029107 PMCID: PMC10947153 DOI: 10.1111/cge.14214] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/17/2022] [Revised: 08/20/2022] [Accepted: 08/22/2022] [Indexed: 11/26/2022]
Abstract
Autosomal dominant polycystic kidney disease (ADPKD), caused by mutations in PKD1 and PKD2 (PKD1/2), has unexplained phenotypic variability likely affected by environmental and other genetic factors. Approximately 10% of individuals with ADPKD phenotype have no causal mutation detected, possibly due to unrecognized risk variants of PKD1/2. This study was designed to identify risk variants of PKD genes through population genetic analyses. We used Wright's F-statistics (Fst) to evaluate common single nucleotide variants (SNVs) potentially favored by positive natural selection in PKD1 from 1000 Genomes Project (1KG) and genotyped 388 subjects from the Rogosin Institute ADPKD Data Repository. The variants with >90th percentile Fst scores underwent further investigation by in silico analysis and molecular genetics analyses. We identified a deep intronic SNV, rs3874648G>A, located in a conserved binding site of the splicing regulator Tra2-β in PKD1 intron 30. Reverse-transcription PCR (RT-PCR) of peripheral blood leukocytes (PBL) from an ADPKD patient homozygous for rs3874648-A identified an atypical PKD1 splice form. Functional analyses demonstrated that the rs3874648-A allele increased Tra2-β binding affinity and activated a cryptic acceptor splice-site, causing a frameshift that introduced a premature stop codon in mRNA, thereby decreasing PKD1 full-length transcript level. PKD1 transcript levels were lower in PBL from rs3874648-G/A carriers than in rs3874648-G/G homozygotes in a small cohort of normal individuals and patients with PKD2 inactivating mutations. Our findings indicate that rs3874648G>A is a PKD1 expression modifier attenuating PKD1 expression through Tra2-β, while the derived G allele advantageously maintains PKD1 expression and is predominant in all subpopulations.
Affiliation(s)
- Zhengmao Zhang
- Departments of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Jon Blumenfeld
- Department of Medicine, Weill Cornell Medicine, New York, NY
- The Rogosin Institute, New York, NY
| | - Andrew Ramnauth
- Departments of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Irina Barash
- Department of Medicine, Weill Cornell Medicine, New York, NY
- The Rogosin Institute, New York, NY
| | - Pengbo Zhou
- Departments of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| | - Daniel Levine
- Department of Biochemistry, Weill Cornell Medicine, New York, NY
- The Rogosin Institute, New York, NY
| | - Thomas Parker
- Department of Biochemistry, Weill Cornell Medicine, New York, NY
- The Rogosin Institute, New York, NY
| | - Hanna Rennert
- Departments of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY
| |
|
47
|
Campagna L, Mo Z, Siepel A, Uy JAC. Selective sweeps on different pigmentation genes mediate convergent evolution of island melanism in two incipient bird species. PLoS Genet 2022; 18:e1010474. [PMID: 36318577 PMCID: PMC9624418 DOI: 10.1371/journal.pgen.1010474] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/07/2022] [Accepted: 10/12/2022] [Indexed: 11/19/2022] Open
Abstract
Insular organisms often evolve predictable phenotypes, like flightlessness, extreme body sizes, or increased melanin deposition. The evolutionary forces and molecular targets mediating these patterns remain mostly unknown. Here we study the Chestnut-bellied Monarch (Monarcha castaneiventris) from the Solomon Islands, a complex of closely related subspecies in the early stages of speciation. On the large island of Makira M. c. megarhynchus has a chestnut belly, whereas on the small satellite islands of Ugi, and Santa Ana and Santa Catalina (SA/SC) M. c. ugiensis is entirely iridescent blue-black (i.e., melanic). Melanism has likely evolved twice, as the Ugi and SA/SC populations were established independently. To investigate the genetic basis of melanism on each island we generated whole genome sequence data from all three populations. Non-synonymous mutations at the MC1R pigmentation gene are associated with melanism on SA/SC, while ASIP, an antagonistic ligand of MC1R, is associated with melanism on Ugi. Both genes show evidence of selective sweeps in traditional summary statistics and statistics derived from the ancestral recombination graph (ARG). Using the ARG in combination with machine learning, we inferred selection strength, timing of onset and allele frequency trajectories. MC1R shows evidence of a recent, strong, soft selective sweep. The region including ASIP shows more complex signatures; however, we find evidence for sweeps in mutations near ASIP, which are comparatively older than those on MC1R and have been under relatively strong selection. Overall, our study shows convergent melanism results from selective sweeps at independent molecular targets, evolving in taxa where coloration likely mediates reproductive isolation with the neighboring chestnut-bellied subspecies. 
Chestnut-bellied Monarchs (Monarcha castaneiventris ugiensis) from two archipelagos in the Solomon Islands have evolved entirely black plumage from a chestnut ancestor (Monarcha castaneiventris megarhynchus), a phenomenon known as island melanism. We obtain and analyze whole genome sequences using traditional summary statistics and new methods that combine inference of the ancestral recombination graph with machine learning. We find multiple lines of evidence for independent selective sweeps on the MC1R and ASIP genes, a receptor/ligand pair which regulates the production of melanin. Melanism on each archipelago is mediated by mutations in one of these two genes. Mutations in and around MC1R underwent a recent soft sweep experiencing strong selection on the islands of Santa Ana and Santa Catalina, whereas selection was also strong but comparatively older for ASIP on the island of Ugi. We show how melanism originated under positive selection on independent molecular targets, evolving convergently in taxa where coloration mediates reproductive isolation.
Affiliation(s)
- Leonardo Campagna
- Fuller Evolutionary Biology Program, Cornell Lab of Ornithology, Ithaca, New York, United States of America
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, United States of America
| | - Ziyi Mo
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - J. Albert C. Uy
- Department of Biology, University of Rochester, Rochester, New York, United States of America
| |
|
48
|
Lieberman TD. Detecting bacterial adaptation within individual microbiomes. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210243. [PMID: 35989602 PMCID: PMC9393564 DOI: 10.1098/rstb.2021.0243] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/19/2021] [Accepted: 04/17/2022] [Indexed: 12/11/2022] Open
Abstract
The human microbiome harbours a large capacity for within-person adaptive mutations. Commensal bacterial strains can stably colonize a person for decades, and billions of mutations are generated daily within each person's microbiome. Adaptive mutations emerging during health might be driven by selective forces that vary across individuals, vary within an individual, or are completely novel to the human population. Mutations emerging within individual microbiomes might impact the immune system, the metabolism of nutrients or drugs, and the stability of the community to perturbations. Despite this potential, relatively little attention has been paid to the possibility of adaptive evolution within complex human-associated microbiomes. This review discusses the promise of studying within-microbiome adaptation, the conceptual and technical limitations that may have contributed to an underappreciation of adaptive de novo mutations occurring within microbiomes to date, and methods for detecting recent adaptive evolution. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Affiliation(s)
- Tami D. Lieberman
- Department of Civil and Environmental Engineering, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Broad Institute, Cambridge, MA, USA
- Ragon Institute, Cambridge, MA, USA
| |
|
49
|
Sezgin E, Kaplan E. Diverse selection pressures shaping the genetic architecture of Behçet disease susceptibility. Front Genet 2022; 13:983646. [PMID: 36246630 PMCID: PMC9561091 DOI: 10.3389/fgene.2022.983646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 08/30/2022] [Indexed: 11/13/2022] Open
Abstract
Behçet disease (BD) is a polygenic, multifactorial, multisystem inflammatory condition with unknown etiology. Global distribution of BD is geographically structured, with the highest prevalence observed among East Asian, Middle Eastern, and Mediterranean populations. Although adaptive selection on a few BD susceptibility loci is speculated, a thorough evolutionary analysis of the genetic architecture of BD is lacking. We aimed to understand whether increased BD risk in the human populations with high prevalence is due to past selection on BD associated genes. We performed population genetics analyses with East Asian (high BD prevalence), European (low/very low BD prevalence), and African (very low/no BD prevalence) populations. Comparison of ancestral and derived alleles' frequencies versus their reported susceptible or protective effect on BD showed that both derived and ancestral alleles are associated with increased BD risk. Variants showing higher risk to and more significant association with BD had smaller allele frequency differences, and showed less population differentiation compared to variants that showed smaller risk and less significant association with BD. Results suggest BD alleles are not unique to East Asians but are also found in other world populations at appreciable frequencies, and argue against selection favoring these variants only in populations with high BD prevalence. BD associated gene analyses showed similar evolutionary histories driven by neutral processes for many genes or balancing selection for HLA (Human Leukocyte Antigen) genes in all three populations studied. However, nucleotide diversity in several HLA region genes was much higher in East Asians, suggesting selection for high nucleotide and haplotype diversity in East Asians. A recent selective sweep for genes involved in antigen recognition, peptide processing, immune and cellular differentiation regulation was observed only in East Asians.
We conclude that the evolutionary processes shaping the genetic diversity in BD risk genes are diverse, and elucidating the underlying specific selection mechanisms is complex. Several of the genes examined in this study are risk factors (such as ERAP1, IL23R, HLA-G) for other inflammatory diseases. Thus, our conclusions are not only limited to BD but may have broader implications for other inflammatory diseases.
Affiliation(s)
- Efe Sezgin
- Department of Food Engineering, Izmir Institute of Technology, Izmir, Turkey
- Biotechnology Interdisciplinary Program, Izmir Institute of Technology, Izmir, Turkey
| | - Elif Kaplan
- Biotechnology Interdisciplinary Program, Izmir Institute of Technology, Izmir, Turkey
| |
|
50
|
Oh CH, Kim JW, Park YM, Kim GA, Jang JY, Chang YW, Yang JO, Kho HY, Park JK. Benefits of Flavored Lactose-Free Milk for Korean Adults with Lactose Intolerance. J Med Food 2022; 25:1003-1010. [PMID: 36179067 DOI: 10.1089/jmf.2022.k.0010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Although lactose-free dairy products for the clinical management of lactose intolerance (LI) are widely available, scientific evidence on their efficacy is still lacking. This study comparatively analyzed the efficacy of flavored lactose-free milk (LFM) and whole milk (WM) in reducing symptoms in South Korean adults with LI. This prospective study was conducted in adults suspected of LI. All screened participants underwent the hydrogen breath test (HBT) using 570 mL of chocolate-flavored WM (20 g of lactose) and responded to a symptom questionnaire. LI was confirmed when the ΔH2 peak exceeded 16 ppm above baseline values and with the occurrence of symptoms after WM consumption. The participants who were diagnosed with LI underwent the HBT again with 570 mL of chocolate-flavored LFM (0 g of lactose), followed by the symptom questionnaire survey after 1 week. After excluding 40 participants who did not meet the diagnostic criteria for LI and 2 who were lost to follow-up, a total of 28 lactose-intolerant individuals were enrolled in the study. The ΔH2 values in the first HBT were significantly higher than those in the second HBT (33.3 ± 21.6 ppm vs. 8.6 ± 6.3 ppm, P < .001). Similarly, there was a significant reduction in the total symptom score in the second HBT (4.18 ± 1.51 vs. 0.61 ± 0.98, P < .001). Flavored LFM is well tolerated in South Korean adults diagnosed with LI based on the HBT and symptom questionnaire results. Therefore, LFM may be a viable alternative to WM.
Affiliation(s)
- Chi Hyuk Oh
- Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University School of Medicine, Seoul, Korea
| | - Jung-Wook Kim
- Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University School of Medicine, Seoul, Korea
| | - Yoo Min Park
- Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University School of Medicine, Seoul, Korea
| | - Gi-Ae Kim
- Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University School of Medicine, Seoul, Korea
| | - Jae-Young Jang
- Division of Gastroenterology, Department of Internal Medicine, Kyung Hee University School of Medicine, Seoul, Korea
| | - Young Woon Chang
- Department of Internal Medicine, Cheonan-Woori Hospital, Cheonan, Korea
| | - Jin Oh Yang
- Maeil Innovation Center, Maeil Dairies Co. Ltd., Pyeongtaek, Korea
| | - Ho Young Kho
- Maeil Innovation Center, Maeil Dairies Co. Ltd., Pyeongtaek, Korea
| | - Jun-Kyu Park
- Maeil Innovation Center, Maeil Dairies Co. Ltd., Pyeongtaek, Korea
| |
|