1
Anderson NW, Kirk L, Schraiber JG, Ragsdale AP. A path integral approach for allele frequency dynamics under polygenic selection. Genetics 2025; 229:1-63. [PMID: 39531638 DOI: 10.1093/genetics/iyae182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 10/11/2024] [Accepted: 10/16/2024] [Indexed: 11/16/2024] Open
Abstract
Many phenotypic traits have a polygenic genetic basis, making it challenging to learn their genetic architectures and predict individual phenotypes. One promising avenue to resolve the genetic basis of complex traits is through evolve-and-resequence (E&R) experiments, in which laboratory populations are exposed to some selective pressure and trait-contributing loci are identified by extreme frequency changes over the course of the experiment. However, small laboratory populations will experience substantial random genetic drift, and it is difficult to determine whether selection played a role in a given allele frequency change (AFC). Predicting AFCs under drift and selection, even for alleles contributing to simple, monogenic traits, has remained a challenging problem. Recently, there have been efforts to apply the path integral, a method borrowed from physics, to solve this problem. So far, this approach has been limited to genic selection, and is therefore inadequate to capture the complexity of quantitative, highly polygenic traits that are commonly studied. Here, we extend one of these path integral methods, the perturbation approximation, to selection scenarios that are of interest to quantitative genetics. We derive analytic expressions for the transition probability (i.e. the probability that an allele will change in frequency from x to y in time t) of an allele contributing to a trait subject to stabilizing selection, as well as that of an allele contributing to a trait rapidly adapting to a new phenotypic optimum. We use these expressions to characterize the use of AFC to test for selection, as well as explore optimal design choices for E&R experiments to uncover the genetic architecture of polygenic traits under selection.
Affiliation(s)
- Nathan W Anderson
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI 53706, USA
- Lloyd Kirk
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI 53706, USA
- Joshua G Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Aaron P Ragsdale
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI 53706, USA

2
Wolff R, Garud NR. Pervasive selective sweeps across human gut microbiomes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.12.22.573162. [PMID: 38187688 PMCID: PMC10769429 DOI: 10.1101/2023.12.22.573162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
The human gut microbiome is composed of a highly diverse consortium of species that are continually evolving within and across hosts. The ability to identify adaptations common to many human gut microbiomes would not only reveal shared selection pressures across hosts, but also key drivers of functional differentiation of the microbiome that may affect community structure and host traits. However, to date there has not been a systematic scan for adaptations that have spread across human gut microbiomes. Here, we develop a novel selection scan statistic named the integrated Linkage Disequilibrium Score (iLDS) that can detect the spread of adaptive haplotypes across host microbiomes via migration and horizontal gene transfer. Specifically, iLDS leverages signals of hitchhiking of deleterious variants with the beneficial variant. Application of the statistic to ~30 of the most prevalent commensal gut species from 24 populations around the world revealed more than 300 selective sweeps across species. We find an enrichment for selective sweeps at loci involved in carbohydrate metabolism, potentially indicative of adaptation to features of host diet, and we find that the targets of selection significantly differ between Westernized and non-Westernized populations. Underscoring the potential role of diet in driving selection, we find a selective sweep absent from non-Westernized populations but ubiquitous in Westernized populations at a locus known to be involved in the metabolism of maltodextrin, a synthetic starch that has recently become a widespread component of Western diets. In summary, we demonstrate that selective sweeps across host microbiomes are a common feature of the evolution of the human gut microbiome, and that targets of selection may be strongly impacted by host diet.
Affiliation(s)
- Richard Wolff
  - Department of Ecology and Evolutionary Biology, UCLA
- Nandita R. Garud
  - Department of Ecology and Evolutionary Biology, UCLA
  - Department of Human Genetics, UCLA

3
Lyulina AS, Liu Z, Good BH. Linkage equilibrium between rare mutations. Genetics 2024; 228:iyae145. [PMID: 39222343 PMCID: PMC11538400 DOI: 10.1093/genetics/iyae145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 08/20/2024] [Indexed: 09/04/2024] Open
Abstract
Recombination breaks down genetic linkage by reshuffling existing variants onto new genetic backgrounds. These dynamics are traditionally quantified by examining the correlations between alleles, and how they decay as a function of the recombination rate. However, the magnitudes of these correlations are strongly influenced by other evolutionary forces like natural selection and genetic drift, making it difficult to tease out the effects of recombination. Here, we introduce a theoretical framework for analyzing an alternative family of statistics that measure the homoplasy produced by recombination. We derive analytical expressions that predict how these statistics depend on the rates of recombination and recurrent mutation, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. We find that the degree of homoplasy can strongly depend on this frequency scale, which reflects the underlying timescales over which these mutations occurred. We show how these scaling properties can be used to isolate the effects of recombination and discuss their implications for the rates of horizontal gene transfer in bacteria.
Affiliation(s)
- Anastasia S Lyulina
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Zhiru Liu
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Benjamin H Good
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub – San Francisco, San Francisco, CA 94158, USA

4
Di C, Lohmueller KE. Revisiting Dominance in Population Genetics. Genome Biol Evol 2024; 16:evae147. [PMID: 39114967 PMCID: PMC11306932 DOI: 10.1093/gbe/evae147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/24/2024] [Indexed: 08/11/2024] Open
Abstract
Dominance refers to the effect of a heterozygous genotype relative to that of the two homozygous genotypes. The degree of dominance of mutations for fitness can have a profound impact on how deleterious and beneficial mutations change in frequency over time as well as on the patterns of linked neutral genetic variation surrounding such selected alleles. Since dominance is such a fundamental concept, it has received immense attention throughout the history of population genetics. Early work from Fisher, Wright, and Haldane focused on understanding the conceptual basis for why dominance exists. More recent work has attempted to test these theories and conceptual models by estimating dominance effects of mutations. However, estimating dominance coefficients has been notoriously challenging and has only been done in a few species in a limited number of studies. In this review, we first describe some of the early theoretical and conceptual models for understanding the mechanisms for the existence of dominance. Second, we discuss several approaches used to estimate dominance coefficients and summarize estimates of dominance coefficients. We note trends that have been observed across species, types of mutations, and functional categories of genes. By comparing estimates of dominance coefficients for different types of genes, we test several hypotheses for the existence of dominance. Lastly, we discuss how dominance influences the dynamics of beneficial and deleterious mutations in populations and how the degree of dominance of deleterious mutations influences the impact of inbreeding on fitness.
Affiliation(s)
- Chenlu Di
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, Los Angeles, CA, USA

5
Anderson NW, Kirk L, Schraiber JG, Ragsdale AP. A Path Integral Approach for Allele Frequency Dynamics Under Polygenic Selection. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.14.599114. [PMID: 38915613 PMCID: PMC11195211 DOI: 10.1101/2024.06.14.599114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Many phenotypic traits have a polygenic genetic basis, making it challenging to learn their genetic architectures and predict individual phenotypes. One promising avenue to resolve the genetic basis of complex traits is through evolve-and-resequence experiments, in which laboratory populations are exposed to some selective pressure and trait-contributing loci are identified by extreme frequency changes over the course of the experiment. However, small laboratory populations will experience substantial random genetic drift, and it is difficult to determine whether selection played a role in a given allele frequency change. Predicting how much allele frequencies change under drift and selection had remained an open problem well into the 21st century, even for alleles contributing to simple, monogenic traits. Recently, there have been efforts to apply the path integral, a method borrowed from physics, to solve this problem. So far, this approach has been limited to genic selection, and is therefore inadequate to capture the complexity of quantitative, highly polygenic traits that are commonly studied. Here we extend one of these path integral methods, the perturbation approximation, to selection scenarios that are of interest to quantitative genetics. In particular, we derive analytic expressions for the transition probability (i.e., the probability that an allele will change in frequency from x to y in time t) of an allele contributing to a trait subject to stabilizing selection, as well as that of an allele contributing to a trait rapidly adapting to a new phenotypic optimum. We use these expressions to characterize the use of allele frequency change to test for selection, as well as explore optimal design choices for evolve-and-resequence experiments to uncover the genetic architecture of polygenic traits under selection.
Affiliation(s)
- Nathan W. Anderson
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Lloyd Kirk
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Joshua G. Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Aaron P. Ragsdale
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, 53706, USA

6
Lyulina AS, Liu Z, Good BH. Linkage equilibrium between rare mutations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.28.587282. [PMID: 38617331 PMCID: PMC11014483 DOI: 10.1101/2024.03.28.587282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Recombination breaks down genetic linkage by reshuffling existing variants onto new genetic backgrounds. These dynamics are traditionally quantified by examining the correlations between alleles, and how they decay as a function of the recombination rate. However, the magnitudes of these correlations are strongly influenced by other evolutionary forces like natural selection and genetic drift, making it difficult to tease out the effects of recombination. Here we introduce a theoretical framework for analyzing an alternative family of statistics that measure the homoplasy produced by recombination. We derive analytical expressions that predict how these statistics depend on the rates of recombination and recurrent mutation, the strength of negative selection and genetic drift, and the present-day frequencies of the mutant alleles. We find that the degree of homoplasy can strongly depend on this frequency scale, which reflects the underlying timescales over which these mutations occurred. We show how these scaling properties can be used to isolate the effects of recombination, and discuss their implications for the rates of horizontal gene transfer in bacteria.
Affiliation(s)
- Anastasia S Lyulina
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Zhiru Liu
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
- Benjamin H Good
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub - San Francisco, San Francisco, CA 94158, USA

7
Kyriazis CC, Lohmueller KE. Constraining models of dominance for nonsynonymous mutations in the human genome. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.25.582010. [PMID: 38463985 PMCID: PMC10925099 DOI: 10.1101/2024.02.25.582010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Dominance is a fundamental parameter in genetics, determining the dynamics of natural selection on deleterious and beneficial mutations, the patterns of genetic variation in natural populations, and the severity of inbreeding depression in a population. Despite this importance, dominance parameters remain poorly known, particularly in humans or other non-model organisms. A key reason for this lack of information about dominance is that it is extremely challenging to disentangle the selection coefficient (s) of a mutation from its dominance coefficient (h). Here, we explore dominance and selection parameters in humans by fitting models to the site frequency spectrum (SFS) for nonsynonymous mutations. When assuming a single dominance coefficient for all nonsynonymous mutations, we find that numerous h values can fit the data, so long as h is greater than ~0.15. Moreover, we also observe that theoretically-predicted models with a negative relationship between h and s can also fit the data well, including models with h=0.05 for strongly deleterious mutations. Finally, we use our estimated dominance and selection parameters to inform simulations revisiting the question of whether the out-of-Africa bottleneck has led to differences in genetic load between African and non-African human populations. These simulations suggest that the relative burden of genetic load in non-African populations depends on the dominance model assumed, with slight increases for more weakly recessive models and slight decreases shown for more strongly recessive models. Moreover, these results also demonstrate that models of partially recessive nonsynonymous mutations can explain the observed severity of inbreeding depression in humans, bridging the gap between molecular population genetics and direct measures of fitness in humans. 
Our work represents a comprehensive assessment of dominance and deleterious variation in humans, with implications for parameterizing models of deleterious variation in humans and other mammalian species.
Affiliation(s)
- Kirk E. Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, USA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, USA
  - Department of Human Genetics, David Geffen School of Medicine, Los Angeles, USA

8
Zhang MJ, Durvasula A, Chiang C, Koch EM, Strober BJ, Shi H, Barton AR, Kim SS, Weissbrod O, Loh PR, Gazal S, Sunyaev S, Price AL. Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection. RESEARCH SQUARE 2023:rs.3.rs-3707248. [PMID: 38168385 PMCID: PMC10760228 DOI: 10.21203/rs.3.rs-3707248/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
Affiliation(s)
- Martin Jinye Zhang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Arun Durvasula
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Colby Chiang
  - Department of Pediatrics, Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA
- Evan M. Koch
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Benjamin J. Strober
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alison R. Barton
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Samuel S. Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Omer Weissbrod
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Steven Gazal
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
- Shamil Sunyaev
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Alkes L. Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

9
Zhang MJ, Durvasula A, Chiang C, Koch EM, Strober BJ, Shi H, Barton AR, Kim SS, Weissbrod O, Loh PR, Gazal S, Sunyaev S, Price AL. Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.12.04.23299391. [PMID: 38106023 PMCID: PMC10723494 DOI: 10.1101/2023.12.04.23299391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
Affiliation(s)
- Martin Jinye Zhang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Arun Durvasula
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Colby Chiang
  - Department of Pediatrics, Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Evan M Koch
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Benjamin J Strober
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alison R Barton
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Samuel S Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Omer Weissbrod
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Steven Gazal
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
- Shamil Sunyaev
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Alkes L Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

10
Medina-Muñoz SG, Ortega-Del Vecchyo D, Cruz-Hervert LP, Ferreyra-Reyes L, García-García L, Moreno-Estrada A, Ragsdale AP. Demographic modeling of admixed Latin American populations from whole genomes. Am J Hum Genet 2023; 110:1804-1816. [PMID: 37725976 PMCID: PMC10577084 DOI: 10.1016/j.ajhg.2023.08.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2023] [Revised: 08/17/2023] [Accepted: 08/23/2023] [Indexed: 09/21/2023] Open
Abstract
Demographic models of Latin American populations often fail to fully capture their complex evolutionary history, which has been shaped by both recent admixture and deeper-in-time demographic events. To address this gap, we used high-coverage whole-genome data from Indigenous American ancestries in present-day Mexico and existing genomes from across Latin America to infer multiple demographic models that capture the impact of different timescales on genetic diversity. Our approach, which combines analyses of allele frequencies and ancestry tract length distributions, represents a significant improvement over current models in predicting patterns of genetic variation in admixed Latin American populations. We jointly modeled the contribution of European, African, East Asian, and Indigenous American ancestries into present-day Latin American populations. We infer that the ancestors of Indigenous Americans and East Asians diverged ∼30 thousand years ago, and we characterize genetic contributions of recent migrations from East and Southeast Asia to Peru and Mexico. Our inferred demographic histories are consistent across different genomic regions and annotations, suggesting that our inferences are robust to the potential effects of linked selection. In conjunction with published distributions of fitness effects for new nonsynonymous mutations in humans, we show in large-scale simulations that our models recover important features of both neutral and deleterious variation. By providing a more realistic framework for understanding the evolutionary history of Latin American populations, our models can help address the historical under-representation of admixed groups in genomics research and can be a valuable resource for future studies of populations with complex admixture and demographic histories.
Affiliation(s)
- Santiago G Medina-Muñoz
  - National Laboratory of Genomics for Biodiversity (LANGEBIO), Advanced Genomics Unit (UGA), CINVESTAV, Irapuato, Guanajuato 36824, Mexico
- Diego Ortega-Del Vecchyo
  - Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de Mexico, Juriquilla, Querétaro 76230, Mexico
- Andrés Moreno-Estrada
  - National Laboratory of Genomics for Biodiversity (LANGEBIO), Advanced Genomics Unit (UGA), CINVESTAV, Irapuato, Guanajuato 36824, Mexico
- Aaron P Ragsdale
  - National Laboratory of Genomics for Biodiversity (LANGEBIO), Advanced Genomics Unit (UGA), CINVESTAV, Irapuato, Guanajuato 36824, Mexico
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI 53706, USA

11
Wade EE, Kyriazis CC, Cavassim MIA, Lohmueller KE. Quantifying the fraction of new mutations that are recessive lethal. Evolution 2023; 77:1539-1549. [PMID: 37074880 PMCID: PMC10309970 DOI: 10.1093/evolut/qpad061] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 04/14/2023] [Indexed: 04/20/2023]
Abstract
The presence and impact of recessive lethal mutations have been widely documented in diploid outcrossing species. However, precise estimates of the proportion of new mutations that are recessive lethal remain limited. Here, we evaluate the performance of Fit∂a∂i, a commonly used method for inferring the distribution of fitness effects (DFE), in the presence of lethal mutations. Using simulations, we demonstrate that in both additive and recessive cases, inference of the deleterious nonlethal portion of the DFE is minimally affected by a small proportion (<10%) of lethal mutations. Additionally, we demonstrate that while Fit∂a∂i cannot estimate the fraction of recessive lethal mutations, Fit∂a∂i can accurately infer the fraction of additive lethal mutations. Finally, as an alternative approach to estimate the proportion of mutations that are recessive lethal, we employ models of mutation-selection-drift balance using existing genomic parameters and estimates of segregating recessive lethals for humans and Drosophila melanogaster. In both species, the segregating recessive lethal load can be explained by a very small fraction (<1%) of new nonsynonymous mutations being recessive lethal. Our results refute recent assertions of a much higher proportion of mutations being recessive lethal (4%-5%), while highlighting the need for additional information on the joint distribution of selection and dominance coefficients.
Affiliation(s)
- Emma E Wade
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
  - Department of Computer Science and Engineering, Mississippi State University, Starkville, MS, United States
- Christopher C Kyriazis
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
- Maria Izabel A Cavassim
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California–Los Angeles, Los Angeles, CA, United States
  - Interdepartmental Program in Bioinformatics, University of California–Los Angeles, Los Angeles, CA, United States
  - Department of Human Genetics, David Geffen School of Medicine, University of California–Los Angeles, Los Angeles, CA, United States