1
Song H, Chu J, Li W, Li X, Fang L, Han J, Zhao S, Ma Y. A Novel Approach Utilizing Domain Adversarial Neural Networks for the Detection and Classification of Selective Sweeps. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2304842. [PMID: 38308186 PMCID: PMC11005742 DOI: 10.1002/advs.202304842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 01/10/2024] [Indexed: 02/04/2024]
Abstract
The identification and classification of selective sweeps are of great significance for improving the understanding of biological evolution and exploring opportunities for precision medicine and genetic improvement. Here, a domain adaptation sweep detection and classification (DASDC) method is presented to balance the alignment of two domains and the classification performance through a domain-adversarial neural network and its adversarial learning modules. DASDC effectively addresses the issue of mismatch between training data and real genomic data in deep learning models, leading to a significant improvement in its generalization capability, prediction robustness, and accuracy. The DASDC method demonstrates improved identification performance compared to existing methods and excels in classification performance, particularly in scenarios where there is a mismatch between application data and training data. The successful implementation of DASDC in real data of three distinct species highlights its potential as a useful tool for identifying crucial functional genes and investigating adaptive evolutionary mechanisms, particularly with the increasing availability of genomic data.
Affiliation(s)
- Hui Song
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Jinyu Chu
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Wangjiao Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
- Xinyun Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
- Lingzhao Fang
  - Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8000, Denmark
- Jianlin Han
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - CAAS‐ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, China
  - Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi 00100, Kenya
- Shuhong Zhao
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
- Yunlong Ma
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Wuhan 430070, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou 510642, China
2
Iwasaki RL, Satta Y. Spatial and temporal diversity of positive selection on shared haplotypes at the PSCA locus among worldwide human populations. Heredity (Edinb) 2023:10.1038/s41437-023-00631-8. [PMID: 37353592 PMCID: PMC10382566 DOI: 10.1038/s41437-023-00631-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 05/16/2023] [Accepted: 05/17/2023] [Indexed: 06/25/2023] Open
Abstract
Selection on standing genetic variation is important for rapid local genetic adaptation when the environment changes. We report that, for the prostate stem cell antigen (PSCA) gene, different populations have different target haplotypes, even though haplotypes are shared among populations. The C-C-A haplotype, whereby the first C is located at rs2294008 of PSCA and is a low risk allele for gastric cancer, has become a target of positive selection in Asia. Conversely, the C-A-G haplotype carrying the same C allele has become a selection target mainly in Africa. However, Asian and African share both haplotypes, consistent with the haplotype divergence time (170 kya) prior to the out-of-Africa dispersal. The frequency of C-C-A/C-A-G is 0.344/0.278 in Asia and 0.209/0.416 in Africa. Two-dimensional site frequency spectrum analysis revealed that the extent of intra-allelic variability of the target haplotype is extremely small in each local population, suggesting that C-C-A or C-A-G is under ongoing hard sweeps in local populations. From the time to the most recent common ancestor (TMRCA) of selected haplotypes, the onset times of positive selection were recent (3-55 kya), concurrently with population subdivision from a common ancestor. Additionally, estimated selection coefficients from ABC analysis were up to ~3%, similar to those at other loci under recent positive selection. Phylogeny of local populations and TMRCA of selected haplotypes revealed that spatial and temporal switching of positive selection targets is a unique and novel feature of ongoing selection at PSCA. This switching may reflect the potential of rapid adaptability to distinct environments.
Affiliation(s)
- Risa L Iwasaki
  - Department of Evolutionary Studies of Biosystems, School of Advanced Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, 240-0193, Japan
  - Research Center for Integrative Evolutionary Science, SOKENDAI, Hayama, Kanagawa, 240-0193, Japan
- Yoko Satta
  - Department of Evolutionary Studies of Biosystems, School of Advanced Science, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Kanagawa, 240-0193, Japan
  - Research Center for Integrative Evolutionary Science, SOKENDAI, Hayama, Kanagawa, 240-0193, Japan
3
Muktupavela RA, Petr M, Ségurel L, Korneliussen T, Novembre J, Racimo F. Modeling the spatiotemporal spread of beneficial alleles using ancient genomes. eLife 2022; 11:e73767. [PMID: 36537881 PMCID: PMC9767474 DOI: 10.7554/elife.73767] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/09/2021] [Accepted: 11/21/2022] [Indexed: 12/24/2022] Open
Abstract
Ancient genome sequencing technologies now provide the opportunity to study natural selection in unprecedented detail. Rather than making inferences from indirect footprints left by selection in present-day genomes, we can directly observe whether a given allele was present or absent in a particular region of the world at almost any period of human history within the last 10,000 years. Methods for studying selection using ancient genomes often rely on partitioning individuals into discrete time periods or regions of the world. However, a complete understanding of natural selection requires more nuanced statistical methods which can explicitly model allele frequency changes in a continuum across space and time. Here we introduce a method for inferring the spread of a beneficial allele across a landscape using two-dimensional partial differential equations. Unlike previous approaches, our framework can handle time-stamped ancient samples, as well as genotype likelihoods and pseudohaploid sequences from low-coverage genomes. We apply the method to a panel of published ancient West Eurasian genomes to produce dynamic maps showcasing the inferred spread of candidate beneficial alleles over time and space. We also provide estimates for the strength of selection and diffusion rate for each of these alleles. Finally, we highlight possible avenues of improvement for accurately tracing the spread of beneficial alleles in more complex scenarios.
Affiliation(s)
- Rasa A Muktupavela
  - Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
- Martin Petr
  - Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
- Laure Ségurel
  - UMR5558 Biométrie et Biologie Evolutive, CNRS - Université Lyon 1, Villeurbanne, France
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, United States
- Fernando Racimo
  - Lundbeck GeoGenetics Centre, GLOBE Institute, Faculty of Health, Copenhagen, Denmark
4
Souilmi Y, Tobler R, Johar A, Williams M, Grey ST, Schmidt J, Teixeira JC, Rohrlach A, Tuke J, Johnson O, Gower G, Turney C, Cox M, Cooper A, Huber CD. Admixture has obscured signals of historical hard sweeps in humans. Nat Ecol Evol 2022; 6:2003-2015. [PMID: 36316412 PMCID: PMC9715430 DOI: 10.1038/s41559-022-01914-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/21/2021] [Accepted: 09/16/2022] [Indexed: 11/06/2022]
Abstract
The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.
Affiliation(s)
- Yassine Souilmi
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Raymond Tobler
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
  - Evolution of Cultural Diversity Initiative, Australian National University, Canberra, Australian Capital Territory, Australia
- Angad Johar
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
  - Department of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
- Matthew Williams
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Shane T Grey
  - Transplantation Immunology Group, Immunology Division, Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
  - St Vincent's Clinical School, Faculty of Medicine, UNSW, Darlinghurst, New South Wales, Australia
- Joshua Schmidt
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- João C Teixeira
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Adam Rohrlach
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
  - Department of Archaeogenetics, Max Planck Institute for the Science of Human History, Jena, Germany
- Jonathan Tuke
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, The University of Adelaide, Adelaide, South Australia, Australia
  - School of Mathematical Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Olivia Johnson
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Graham Gower
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
- Chris Turney
  - Chronos 14Carbon-Cycle Facility and Earth and Sustainability Science Research Centre, University of New South Wales, Sydney, New South Wales, Australia
- Murray Cox
  - Statistics and Bioinformatics Group, School of Fundamental Sciences, Massey University, Palmerston North, New Zealand
- Alan Cooper
  - South Australian Museum, Adelaide, South Australia, Australia
  - BlueSky Genetics, Ashton, South Australia, Australia
- Christian D Huber
  - Australian Centre for Ancient DNA, The University of Adelaide, Adelaide, South Australia, Australia
  - Department of Biology, Penn State University, University Park, PA, USA
5
Nikolakis ZL, Adams RH, Wade KJ, Lund AJ, Carlton EJ, Castoe TA, Pollock DD. Prospects for genomic surveillance for selection in schistosome parasites. FRONTIERS IN EPIDEMIOLOGY 2022; 2:932021. [PMID: 38455290 PMCID: PMC10910990 DOI: 10.3389/fepid.2022.932021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/12/2022] [Indexed: 03/09/2024]
Abstract
Schistosomiasis is a neglected tropical disease caused by multiple parasitic Schistosoma species, and which impacts over 200 million people globally, mainly in low- and middle-income countries. Genomic surveillance to detect evidence for natural selection in schistosome populations represents an emerging and promising approach to identify and interpret schistosome responses to ongoing control efforts or other environmental factors. Here we review how genomic variation is used to detect selection, how these approaches have been applied to schistosomes, and how future studies to detect selection may be improved. We discuss the theory of genomic analyses to detect selection, identify experimental designs for such analyses, and review studies that have applied these approaches to schistosomes. We then consider the biological characteristics of schistosomes that are expected to respond to selection, particularly those that may be impacted by control programs. Examples include drug resistance, host specificity, and life history traits, and we review our current understanding of specific genes that underlie them in schistosomes. We also discuss how inherent features of schistosome reproduction and demography pose substantial challenges for effective identification of these traits and their genomic bases. We conclude by discussing how genomic surveillance for selection should be designed to improve understanding of schistosome biology, and how the parasite changes in response to selection.
Affiliation(s)
- Zachary L. Nikolakis
  - Department of Biology, University of Texas at Arlington, Arlington, TX, United States
- Richard H. Adams
  - Department of Biological and Environmental Sciences, Georgia College and State University, Milledgeville, GA, United States
- Kristen J. Wade
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, United States
- Andrea J. Lund
  - Department of Environmental and Occupational Health, Colorado School of Public Health, University of Colorado, Anschutz, Aurora, CO, United States
- Elizabeth J. Carlton
  - Department of Environmental and Occupational Health, Colorado School of Public Health, University of Colorado, Anschutz, Aurora, CO, United States
- Todd A. Castoe
  - Department of Biology, University of Texas at Arlington, Arlington, TX, United States
- David D. Pollock
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, United States
6
Saitou M, Resendez S, Pradhan AJ, Wu F, Lie NC, Hall NJ, Zhu Q, Reinholdt L, Satta Y, Speidel L, Nakagome S, Hanchard NA, Churchill G, Lee C, Atilla-Gokcumen GE, Mu X, Gokcumen O. Sex-specific phenotypic effects and evolutionary history of an ancient polymorphic deletion of the human growth hormone receptor. SCIENCE ADVANCES 2021; 7:eabi4476. [PMID: 34559564 PMCID: PMC8462886 DOI: 10.1126/sciadv.abi4476] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/10/2021] [Accepted: 08/04/2021] [Indexed: 06/13/2023]
Abstract
The common deletion of the third exon of the growth hormone receptor gene (GHRd3) in humans is associated with birth weight, growth after birth, and time of puberty. However, its evolutionary history and the molecular mechanisms through which it affects phenotypes remain unresolved. We present evidence that this deletion was nearly fixed in the ancestral population of anatomically modern humans and Neanderthals but underwent a recent adaptive reduction in frequency in East Asia. We documented that GHRd3 is associated with protection from severe malnutrition. Using a novel mouse model, we found that, under calorie restriction, Ghrd3 leads to the female-like gene expression in male livers and the disappearance of sexual dimorphism in weight. The sex- and diet-dependent effects of GHRd3 in our mouse model are consistent with a model in which the allele frequency of GHRd3 varies throughout human evolution as a response to fluctuations in resource availability.
Affiliation(s)
- Marie Saitou
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Skyler Resendez
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Fuguo Wu
  - Department of Ophthalmology, Ross Eye Institute, Jacobs School of Medicine and Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Natasha C. Lie
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Nancy J. Hall
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Qihui Zhu
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Yoko Satta
  - Department of Evolutionary Studies of Biosystems, SOKENDAI (Graduate University for Advanced Studies), Kanagawa Prefecture, Japan
- Leo Speidel
  - University College London, Genetics Institute, London, UK
  - The Francis Crick Institute, London, UK
- Neil A. Hanchard
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Precision Medicine Center, The First Affiliated Hospital of Xi’an Jiaotong University, Shaanxi, People’s Republic of China
- Xiuqian Mu
  - Department of Ophthalmology, Ross Eye Institute, Jacobs School of Medicine and Biological Sciences, University at Buffalo, Buffalo, NY, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, Buffalo, NY, USA
7
Sellinger TPP, Abu-Awad D, Tellier A. Limits and convergence properties of the sequentially Markovian coalescent. Mol Ecol Resour 2021; 21:2231-2248. [PMID: 33978324 DOI: 10.1111/1755-0998.13416] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/04/2020] [Revised: 04/19/2021] [Accepted: 04/29/2021] [Indexed: 02/07/2023]
Abstract
Several methods based on the sequentially Markovian coalescent (SMC) make use of full genome sequence data from samples to infer population demographic history including past changes in population size, admixture, migration events and population structure. More recently, the original theoretical framework has been extended to allow the simultaneous estimation of population size changes along with other life history traits such as selfing or seed banking. The latter developments enhance the applicability of SMC methods to nonmodel species. Although convergence proofs have been given using simulated data in a few specific cases, an in-depth investigation of the limitations of SMC methods is lacking. In order to explore such limits, we first develop a tool inferring the best case convergence of SMC methods assuming the true underlying coalescent genealogies are known. This tool can be used to quantify the amount and type of information that can be confidently retrieved from given data sets prior to the analysis of the real data. Second, we assess the inference accuracy when the assumptions of SMC approaches are violated due to departures from the model, namely the presence of transposable elements, variable recombination and mutation rates along the sequence, and SNP calling errors. Third, we deliver a new interpretation of SMC methods by highlighting the importance of the transition matrix, which we argue can be used as a set of summary statistics in other statistical inference methods, uncoupling the SMC from hidden Markov models (HMMs). We finally offer recommendations to better apply SMC methods and build adequate data sets under budget constraints.
Affiliation(s)
- Diala Abu-Awad
  - Department of Life Science Systems, Technical University of Munich, Munchen, Germany
- Aurélien Tellier
  - Department of Life Science Systems, Technical University of Munich, Munchen, Germany