1
Li J, Li W, Peng X, Li X, Zhao S, Wang H, Ma Y. Genetic basis of phenotypic convergence in pig terminal sires using pathway-based selection signature detection methods. Anim Genet 2024; 55:664-669. [PMID: 38830632 DOI: 10.1111/age.13454] [Received: 08/06/2023] [Revised: 04/10/2024] [Accepted: 05/21/2024] [Indexed: 06/05/2024]
Abstract
The primary purpose of genetic improvement in lean pig breeds is to enhance production performance. Owing to their similar breeding directions, Duroc and Pietrain pigs are ideal models for investigating the phenotypic convergence underlying artificial selection. However, most important economic traits are controlled by a polygenic basis, so traditional strategies for detecting selection signatures may not fully reveal the genetic basis of complex traits. The pathway-based gene network analysis method utilizes each pathway as a unit, overcoming the limitations of traditional strategies for detecting selection signatures by revealing the selection of complex biological processes. Here, we utilized 13 122 398 high-quality SNPs from whole-genome sequencing data of 48 Pietrain pigs, 156 Duroc pigs and 36 European wild boars to detect selective signatures. After calculating FST and iHS scores, we integrated the pathway information and utilized the R/Bioconductor graphite and signet packages to construct gene networks, identify subnets and uncover candidate genes underlying selection. Using the traditional strategy, a total of 47 genomic regions exhibiting parallel selection were identified. The enriched genes, including INO80, FZR1, LEPR and FAF1, may be associated with reproduction, fat deposition and skeletal development. Using the pathway-based selection signature detection method, we identified two significant biological pathways and eight potential candidate genes underlying parallel selection, such as VTN, FN1 and ITGAV. This study presents a novel strategy for investigating the genetic basis of complex traits and elucidating the phenotypic convergence underlying artificial selection, by integrating traditional selection signature methods with pathway-based gene network analysis.
Affiliation(s)
- Jinhua Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Wangjiao Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Xia Peng
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Xinyun Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Shuhong Zhao
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou, China
- Haiyan Wang
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Yunlong Ma
  - Key Laboratory of Agricultural Animal Genetics, Breeding, and Reproduction of the Ministry of Education & Key Laboratory of Swine Genetics and Breeding of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
  - Lingnan Modern Agricultural Science and Technology Guangdong Laboratory, Guangzhou, China
2
Blanc J, Berg JJ. Testing for differences in polygenic scores in the presence of confounding. bioRxiv 2024:2023.03.12.532301. [PMID: 36993707 PMCID: PMC10055004 DOI: 10.1101/2023.03.12.532301] [Indexed: 06/19/2023]
Abstract
Polygenic scores have become an important tool in human genetics, enabling the prediction of individuals' phenotypes from their genotypes. Understanding how the pattern of differences in polygenic score predictions across individuals intersects with variation in ancestry can provide insights into the evolutionary forces acting on the trait in question, and is important for understanding health disparities. However, because most polygenic scores are computed using effect estimates from population samples, they are susceptible to confounding by both genetic and environmental effects that are correlated with ancestry. The extent to which this confounding drives patterns in the distribution of polygenic scores depends on patterns of population structure in both the original estimation panel and in the prediction/test panel. Here, we use theory from population and statistical genetics, together with simulations, to study the procedure of testing for an association between polygenic scores and axes of ancestry variation in the presence of confounding. We use a general model of genetic relatedness to describe how confounding in the estimation panel biases the distribution of polygenic scores in a way that depends on the degree of overlap in population structure between panels. We then show how this confounding can bias tests for associations between polygenic scores and important axes of ancestry variation in the test panel. Specifically, for any given test, there exists a single axis of population structure in the GWAS panel that needs to be controlled for in order to protect the test. Based on this result, we propose a new approach for directly estimating this axis of population structure in the GWAS panel. We then use simulations to compare the performance of this approach to the standard approach in which the principal components of the GWAS panel genotypes are used to control for stratification. 
Author Summary
Complex traits are influenced by both genetics and the environment. Human geneticists increasingly use polygenic scores, calculated as the weighted sum of trait-associated alleles, to predict genetic effects on a phenotype. Differences in polygenic scores across groups would therefore seem to indicate differences in the genetic basis of the trait, which are of interest to researchers across disciplines. However, because polygenic scores are usually computed using effect sizes estimated using population samples, they are susceptible to confounding due to both the genetic background and the environment. Here, we use theory from population and statistical genetics, together with simulations, to study how environmental and background genetic effects can confound tests for association between polygenic scores and axes of ancestry variation. We then develop a simple method to protect these tests from confounding, which we evaluate, alongside standard methods, across a range of possible situations. Our work helps clarify how bias in the distribution of polygenic scores is produced and provides insight to researchers wishing to protect their analyses from confounding.
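The weighted-sum definition of a polygenic score mentioned in the summary above can be illustrated with a minimal sketch. The function name `polygenic_score`, the effect sizes, and the genotype dosages are hypothetical values chosen for illustration; none of them come from the preprint.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Return the weighted sum of allele dosages.

    dosages: copies (0, 1 or 2) of the trait-associated allele at each variant.
    effect_sizes: per-variant effect estimates, e.g. from a GWAS (assumed given).
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(g * b for g, b in zip(dosages, effect_sizes))


# Hypothetical example: three variants, two individuals.
effects = [0.5, -0.25, 1.0]
individual_a = [2, 0, 1]
individual_b = [0, 2, 1]

score_a = polygenic_score(individual_a, effects)
score_b = polygenic_score(individual_b, effects)
print(score_a, score_b)
```

Because the effect sizes are themselves estimates, any bias in them (for example from population structure in the estimation panel) carries directly into the score, which is the concern the abstract raises.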
3
Cooke NP, Murray M, Cassidy LM, Mattiangeli V, Okazaki K, Kasai K, Gakuhari T, Bradley DG, Nakagome S. Genomic imputation of ancient Asian populations contrasts local adaptation in pre- and post-agricultural Japan. iScience 2024; 27:110050. [PMID: 38883821 PMCID: PMC11176660 DOI: 10.1016/j.isci.2024.110050] [Received: 10/07/2023] [Revised: 03/25/2024] [Accepted: 05/17/2024] [Indexed: 06/18/2024] [Open access]
Abstract
Early modern humans lived as hunter-gatherers for millennia before agriculture, yet the genetic adaptations of these populations remain a mystery. Here, we investigate selection in the ancient hunter-gatherer-fisher Jomon and contrast pre- and post-agricultural adaptation in the Japanese archipelago. Building on the successful validation of imputation with ancient Asian genomes, we identify selection signatures in the Jomon, particularly robust signals from KITLG variants, which may have influenced dark pigmentation evolution. The Jomon lacks well-known adaptive variants (EDAR, ADH1B, and ALDH2), marking their emergence after the advent of farming in the archipelago. Notably, the EDAR and ADH1B variants were prevalent in the archipelago 1,300 years ago, whereas the ALDH2 variant could have emerged later due to its absence in other ancient genomes. Overall, our study underpins local adaptation unique to the Jomon population, which in turn sheds light on post-farming selection that continues to shape contemporary Asian populations.
Affiliation(s)
- Niall P Cooke
  - School of Medicine, Trinity College Dublin, Dublin, Ireland
- Lara M Cassidy
  - Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Kenji Okazaki
  - Department of Anatomy, Faculty of Medicine, Tottori University, Yonago, Japan
- Kenji Kasai
  - Toyama Prefectural Center for Archaeological Operations, Toyama, Japan
- Takashi Gakuhari
  - Institute for the Study of Ancient Civilizations and Cultural Resources, Kanazawa University, Kanazawa, Japan
- Daniel G Bradley
  - Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
- Shigeki Nakagome
  - School of Medicine, Trinity College Dublin, Dublin, Ireland
  - Institute for the Study of Ancient Civilizations and Cultural Resources, Kanazawa University, Kanazawa, Japan
4
Nava A, Lugli F, Lemmers S, Cerrito P, Mahoney P, Bondioli L, Müller W. Reading children's teeth to reconstruct life history and the evolution of human cooperation and cognition: The role of dental enamel microstructure and chemistry. Neurosci Biobehav Rev 2024; 163:105745. [PMID: 38825260 DOI: 10.1016/j.neubiorev.2024.105745] [Received: 02/09/2024] [Revised: 05/25/2024] [Accepted: 05/30/2024] [Indexed: 06/04/2024]
Abstract
Studying infants in the past is crucial for understanding the evolution of human life history and the evolution of cooperation, cognition, and communication. An infant's growth, health, and mortality can provide information about the dynamics and structure of a population, their cultural practices, and the adaptive capacity of a community. Skeletal remains provide one way of accessing this information for humans recovered prior to the historical periods. Teeth in particular, are retrospective archives of information that can be accessed through morphological, micromorphological, and biogeochemical methods. This review discusses how the microanatomy and formation of teeth, and particularly enamel, serve as archives of somatic growth, stress, and the environment. Examining their role in the broader context of human evolution, we discuss dental biogeochemistry and emphasize how the incremental growth of tooth microstructure facilitates the reconstruction of temporal data related to health, diet, mobility, and stress in past societies. The review concludes by considering tooth microstructure as a biomarker and the potential clinical applications.
Affiliation(s)
- Alessia Nava
  - Department of Odontostomatological and Maxillofacial Sciences, Sapienza University of Rome, via Caserta 6, Rome 00161, Italy
- Federico Lugli
  - Institute of Geosciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
  - Frankfurt Isotope and Element Research Center (FIERCE), Goethe University Frankfurt, Frankfurt am Main, Germany
  - Department of Chemical and Geological Science, University of Modena and Reggio Emilia, via Giuseppe Campi, 103, Modena 41125, Italy
- Simone Lemmers
  - Elettra Sincrotrone Trieste S.C.p.A., AREA Science Park, s.s. 14 km 163,500, Basovizza, Trieste, Italy
  - Department of Psychiatry, Harvard Medical School, 401 Park Drive, Boston, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA, USA
- Paola Cerrito
  - Department of Evolutionary Anthropology, University of Zürich, Zürich, Switzerland
- Patrick Mahoney
  - School of Anthropology and Conservation, University of Kent, Giles Ln, Canterbury CT2 7NZ, UK
- Luca Bondioli
  - Department of Cultural Heritage, University of Padua, Piazza Capitaniato, 7, Padua 35139, Italy
- Wolfgang Müller
  - Institute of Geosciences, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
  - Frankfurt Isotope and Element Research Center (FIERCE), Goethe University Frankfurt, Frankfurt am Main, Germany
5
Peng D, Mulder OJ, Edge MD. Evaluating ARG-estimation methods in the context of estimating population-mean polygenic score histories. bioRxiv 2024:2024.05.24.595829. [PMID: 38854009 PMCID: PMC11160635 DOI: 10.1101/2024.05.24.595829] [Indexed: 06/11/2024]
Abstract
Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ARG may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ancestral recombination graph (ARG). Here we examine the performance in simulation of six ARG estimation methods: ARGweaver, RENT+, Relate, tsinfer+tsdate, ARG-Needle/ASMC-clust, and SINGER, using their estimated coalescent trees and examining bias, mean squared error (MSE), confidence interval coverage, and Type I and II error rates of the downstream methods. Although it does not scale to the sample sizes attainable by other new methods, SINGER produced the most accurate estimated PGS histories in many instances, even when Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust used samples ten times as large as those used by SINGER. In general, the best choice of method depends on the number of samples available and the historical time period of interest. In particular, the unprecedented sample sizes allowed by Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust are of greatest importance when the recent past is of interest; further back in time, most of the tree has coalesced, and differences in contemporary sample size are less salient.
6
Fine AG, Steinrücken M. A novel expectation-maximization approach to infer general diploid selection from time-series genetic data. bioRxiv 2024:2024.05.10.593575. [PMID: 38798346 PMCID: PMC11118272 DOI: 10.1101/2024.05.10.593575] [Indexed: 05/29/2024]
Abstract
Detecting and quantifying the strength of selection is a main objective in population genetics. Since selection acts over multiple generations, many approaches have been developed to detect and quantify selection using genetic data sampled at multiple points in time. Such time series genetic data is commonly analyzed using Hidden Markov Models, but in most cases, under the assumption of additive selection. However, many examples of genetic variation exhibiting non-additive mechanisms exist, making it critical to develop methods that can characterize selection in more general scenarios. Thus, we extend a previously introduced expectation-maximization algorithm for the inference of additive selection coefficients to the case of general diploid selection, in which heterozygote and homozygote fitnesses are parameterized independently. We furthermore introduce a framework to identify bespoke modes of diploid selection from given data, as well as a procedure for aggregating data across linked loci to increase power and robustness. Using extensive simulation studies, we find that our method accurately and efficiently estimates selection coefficients for different modes of diploid selection across a wide range of scenarios; however, power to classify the mode of selection is low unless selection is very strong. We apply our method to ancient DNA samples from Great Britain in the last 4,450 years, and detect evidence for selection in six genomic regions, including the well-characterized LCT locus. Our work is the first genome-wide scan characterizing signals of general diploid selection.
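To make "general diploid selection" concrete, the sketch below applies the standard textbook one-generation deterministic update of an allele frequency when heterozygote and homozygote fitnesses are set independently. It is not the authors' expectation-maximization or hidden Markov model procedure; the function name and the parameters `s_het` and `s_hom` are illustrative assumptions, not the preprint's notation.

```python
def next_allele_frequency(p: float, s_het: float, s_hom: float) -> float:
    """Deterministic one-generation update of the frequency p of allele A.

    Assumed fitness parameterization (illustrative only):
        aa: 1, Aa: 1 + s_het, AA: 1 + s_hom
    Additive selection is the special case s_het = s_hom / 2.
    """
    q = 1.0 - p
    w_AA = 1.0 + s_hom
    w_Aa = 1.0 + s_het
    w_aa = 1.0
    # Mean fitness under Hardy-Weinberg genotype proportions.
    mean_fitness = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
    # Frequency of A after selection: marginal fitness of A over mean fitness.
    return p * (p * w_AA + q * w_Aa) / mean_fitness


# Additive selection favouring A: the frequency rises.
print(next_allele_frequency(0.3, s_het=0.05, s_hom=0.1))
# Heterozygote advantage with equal homozygote fitnesses: 0.5 is an equilibrium.
print(next_allele_frequency(0.5, s_het=0.1, s_hom=0.0))
```

Setting the two coefficients independently is what allows dominance, recessivity and heterozygote advantage to be expressed within one model, which an additive-only parameterization cannot do.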
Affiliation(s)
- Adam G Fine
  - Department of Ecology and Evolution, University of Chicago
  - Graduate Program in Biophysical Sciences, University of Chicago
- Matthias Steinrücken
  - Department of Ecology and Evolution, University of Chicago
  - Department of Human Genetics, University of Chicago
7
Li J, Amado A, Bank C. Rapid adaptation of recombining populations on tunable fitness landscapes. Mol Ecol 2024; 33:e16900. [PMID: 36855836 DOI: 10.1111/mec.16900] [Received: 11/14/2022] [Revised: 01/28/2023] [Accepted: 02/01/2023] [Indexed: 03/02/2023]
Abstract
How does standing genetic variation affect polygenic adaptation in recombining populations? Despite a large body of work in quantitative genetics, epistatic and weak additive fitness effects among simultaneously segregating genetic variants are difficult to capture experimentally or to predict theoretically. In this study, we simulated adaptation on fitness landscapes with tunable ruggedness driven by standing genetic variation in recombining populations. We confirmed that recombination hinders the movement of a population through a rugged fitness landscape. When surveying the effect of epistasis on the fixation of alleles, we found that the combined effects of high ruggedness and high recombination probabilities lead to preferential fixation of alleles that had a high initial frequency. This indicates that positive epistatic alleles escape from being broken down by recombination when they start at high frequency. We further extract direct selection coefficients and pairwise epistasis along the adaptive path. When taking the final fixed genotype as the reference genetic background, we observe that, along the adaptive path, beneficial direct selection appears stronger and pairwise epistasis weaker than in the underlying fitness landscape. Quantitatively, the ratio of epistasis and direct selection is smaller along the adaptive path (≈ 1) than expected. Thus, adaptation on a rugged fitness landscape may lead to spurious signals of direct selection generated through epistasis. Our study highlights how the interplay of epistasis and recombination constrains the adaptation of a diverse population to a new environment.
Affiliation(s)
- Juan Li
  - Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
  - Swiss Institute for Bioinformatics, Lausanne, Switzerland
- André Amado
  - Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
  - Swiss Institute for Bioinformatics, Lausanne, Switzerland
- Claudia Bank
  - Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
  - Swiss Institute for Bioinformatics, Lausanne, Switzerland
8
Liu X, Koyama S, Tomizuka K, Takata S, Ishikawa Y, Ito S, Kosugi S, Suzuki K, Hikino K, Koido M, Koike Y, Horikoshi M, Gakuhari T, Ikegawa S, Matsuda K, Momozawa Y, Ito K, Kamatani Y, Terao C. Decoding triancestral origins, archaic introgression, and natural selection in the Japanese population by whole-genome sequencing. Sci Adv 2024; 10:eadi8419. [PMID: 38630824 PMCID: PMC11023554 DOI: 10.1126/sciadv.adi8419] [Received: 05/21/2023] [Accepted: 03/07/2024] [Indexed: 04/19/2024]
Abstract
We generated Japanese Encyclopedia of Whole-Genome/Exome Sequencing Library (JEWEL), a high-depth whole-genome sequencing dataset comprising 3256 individuals from across Japan. Analysis of JEWEL revealed genetic characteristics of the Japanese population that were not discernible using microarray data. First, rare variant-based analysis revealed an unprecedented fine-scale genetic structure. Together with population genetics analysis, the present-day Japanese can be decomposed into three ancestral components. Second, we identified unreported loss-of-function (LoF) variants and observed that for specific genes, LoF variants appeared to be restricted to a more limited set of transcripts than would be expected by chance, with PTPRD as a notable example. Third, we identified 44 archaic segments linked to complex traits, including a Denisovan-derived segment at NKX6-1 associated with type 2 diabetes. Most of these segments are specific to East Asians. Fourth, we identified candidate genetic loci under recent natural selection. Overall, our work provided insights into genetic characteristics of the Japanese population.
Affiliation(s)
- Xiaoxi Liu
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Satoshi Koyama
  - Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Medical and Population Genetics and Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Boston, MA, USA
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
- Kohei Tomizuka
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Sadaaki Takata
  - Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yuki Ishikawa
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Shuji Ito
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
  - Department of Orthopedic Surgery, Faculty of Medicine, Shimane University, Izumo, Japan
- Shunichi Kosugi
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kunihiko Suzuki
  - Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Keiko Hikino
  - Laboratory for Pharmacogenomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yoshinao Koike
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
  - Department of Orthopedic Surgery, Hokkaido University Graduate School of Medicine, Sapporo, Japan
- Momoko Horikoshi
  - Laboratory for Genomics of Diabetes and Metabolism, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takashi Gakuhari
  - Institute for the Study of Ancient Civilizations and Cultural Resources, College of Human and Social Sciences, Kanazawa University, Kanazawa, Japan
- Shiro Ikegawa
  - Laboratory for Bone and Joint Diseases, RIKEN Center for Medical Sciences, Tokyo, Japan
- Koichi Matsuda
  - Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
  - Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Yukihide Momozawa
  - Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kaoru Ito
  - Laboratory for Cardiovascular Genomics and Informatics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Chikashi Terao
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
  - The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
9
Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. Genetics 2024; 226:iyae024. [PMID: 38386895 PMCID: PMC10990424 DOI: 10.1093/genetics/iyae024] [Received: 11/10/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/24/2024] [Open access]
Abstract
Understanding natural selection and other forms of non-neutrality is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection and other local evolutionary processes that requires relatively few selection simulations during training. We build upon a generative adversarial network trained to simulate realistic neutral data. This consists of a generator (fitted demographic model), and a discriminator (convolutional neural network) that predicts whether a genomic region is real or fake. As the generator can only generate data under neutral demographic processes, regions of real data that the discriminator recognizes as having a high probability of being "real" do not fit the neutral demographic model and are therefore candidates for targets of selection. To incentivize identification of a specific mode of selection, we fine-tune the discriminator with a small number of custom non-neutral simulations. We show that this approach has high power to detect various forms of selection in simulations, and that it finds regions under positive selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.
Affiliation(s)
- Rebecca Riley
  - Department of Computer Science, Haverford College, Haverford, PA 19041, USA
- Iain Mathieson
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Sara Mathieson
  - Department of Computer Science, Haverford College, Haverford, PA 19041, USA
10
Robert A. Building references for nature conservation. Conserv Biol 2024; 38:e14202. [PMID: 37811723 DOI: 10.1111/cobi.14202] [Received: 04/05/2023] [Accepted: 09/28/2023] [Indexed: 10/10/2023]
Abstract
Conservation references have long been used in conservation biology to compare current biodiversity processes and states with past conditions. However, beyond the paucity of data for the construction of ancient, even prehuman, references, the relevance of these ancient references for studying ecosystems radically modified by human activities is questionable, particularly when the notions of conservation references and conservation objectives are confused and when several conservation ethics coexist that require distinct references. Because of this implicit heterogeneity in the nature of the references and their temporal baseline, conservation references not only have different meanings, but also deliver different messages. I propose establishing a common framework for conservation references to approach past biological systems and build comparable references between studies and projects. The selection of these references (distinct from conservation objectives) should be an early, explicit, standardized, and transparent milestone in any conservation process and these references should be based on state, pressure, or process dynamics, rather than fixed states. Finally, the importance of the diversity of temporal baselines used to build conservation references and to measure anthropogenic impacts should be recognized to understand the biodiversity crisis in its entirety.
Affiliation(s)
- Alexandre Robert
  - Centre d'Ecologie et des Sciences de la Conservation (CESCO), Muséum national d'Histoire naturelle, Centre National de la Recherche Scientifique, Sorbonne Université, Paris, France
11
Venkatesh SS, Wittemans LBL, Palmer DS, Baya NA, Ferreira T, Hill B, Lassen FH, Parker MJ, Reibe S, Elhakeem A, Banasik K, Bruun MT, Erikstrup C, Jensen BA, Juul A, Mikkelsen C, Nielsen HS, Ostrowski SR, Pedersen OB, Rohde PD, Sorensen E, Ullum H, Westergaard D, Haraldsson A, Holm H, Jonsdottir I, Olafsson I, Steingrimsdottir T, Steinthorsdottir V, Thorleifsson G, Figueredo J, Karjalainen MK, Pasanen A, Jacobs BM, Hubers N, Lippincott M, Fraser A, Lawlor DA, Timpson NJ, Nyegaard M, Stefansson K, Magi R, Laivuori H, van Heel DA, Boomsma DI, Balasubramanian R, Seminara SB, Chan YM, Laisk T, Lindgren CM. Genome-wide analyses identify 21 infertility loci and over 400 reproductive hormone loci across the allele frequency spectrum. medRxiv 2024:2024.03.19.24304530. [PMID: 38562841 PMCID: PMC10984039 DOI: 10.1101/2024.03.19.24304530] [Indexed: 04/04/2024]
Abstract
Genome-wide association studies (GWASs) may help inform treatments for infertility, whose causes remain unknown in many cases. Here we present GWAS meta-analyses across six cohorts for male and female infertility in up to 41,200 cases and 687,005 controls. We identified 21 genetic risk loci for infertility (P≤5E-08), of which 12 have not been reported for any reproductive condition. We found positive genetic correlations between endometriosis and all-cause female infertility (rg=0.585, P=8.98E-14), and between polycystic ovary syndrome and anovulatory infertility (rg=0.403, P=2.16E-03). The evolutionary persistence of female infertility-risk alleles in EBAG9 may be explained by recent directional selection. We additionally identified up to 269 genetic loci associated with follicle-stimulating hormone (FSH), luteinising hormone, oestradiol, and testosterone through sex-specific GWAS meta-analyses (N=6,095-246,862). While hormone-associated variants near FSHB and ARL14EP colocalised with signals for anovulatory infertility, we found no rg between female infertility and reproductive hormones (P>0.05). Exome sequencing analyses in the UK Biobank (N=197,340) revealed that women carrying testosterone-lowering rare variants in GPC2 were at higher risk of infertility (OR=2.63, P=1.25E-03). Taken together, our results suggest that while individual genes associated with hormone regulation may be relevant for fertility, there is limited genetic evidence for correlation between reproductive hormones and infertility at the population level. We provide the first comprehensive view of the genetic architecture of infertility across multiple diagnostic criteria in men and women, and characterise its relationship to other health conditions.
Affiliation(s)
- Samvida S Venkatesh
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
- Laura B L Wittemans
- Novo Nordisk Research Centre Oxford, Oxford, United Kingdom
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, United Kingdom
- Duncan S Palmer
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
- Nikolas A Baya
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
- Teresa Ferreira
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Barney Hill
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
- Frederik Heymann Lassen
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
- Melody J Parker
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Clinical Medicine, University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Saskia Reibe
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, United Kingdom
- Ahmed Elhakeem
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Karina Banasik
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre, Copenhagen, Denmark
- Mie T Bruun
- Department of Clinical Immunology, Odense University Hospital, Odense, Denmark
- Christian Erikstrup
- Department of Clinical Immunology, Aarhus University Hospital, Aarhus, Denmark
- Department of Clinical Medicine, Health, Aarhus University, Aarhus, Denmark
- Bitten A Jensen
- Department of Clinical Immunology, Aalborg University Hospital, Aalborg, Denmark
- Anders Juul
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Growth and Reproduction, Copenhagen University Hospital-Rigshospitalet, Copenhagen, Denmark
- Christina Mikkelsen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
- Henriette S Nielsen
- Department of Obstetrics and Gynecology, The Fertility Clinic, Hvidovre University Hospital, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Sisse R Ostrowski
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Ole B Pedersen
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Department of Clinical Immunology, Zealand University Hospital, Køge, Denmark
- Palle D Rohde
- Genomic Medicine, Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
- Erik Sorensen
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- David Westergaard
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre, Copenhagen, Denmark
- Asgeir Haraldsson
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Children's Hospital Iceland, Landspitali University Hospital, Reykjavik, Iceland
- Hilma Holm
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Ingileif Jonsdottir
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Isleifur Olafsson
- Department of Clinical Biochemistry, Landspitali University Hospital, Reykjavik, Iceland
- Thora Steingrimsdottir
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Department of Obstetrics and Gynecology, Landspitali University Hospital, Reykjavik, Iceland
- Jessica Figueredo
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Minna K Karjalainen
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Research Unit of Population Health, Faculty of Medicine, University of Oulu, Finland
- Northern Finland Birth Cohorts, Arctic Biobank, Infrastructure for Population Studies, Faculty of Medicine, University of Oulu, Oulu, Finland
- Anu Pasanen
- Research Unit of Clinical Medicine, Medical Research Center Oulu, University of Oulu, and Department of Children and Adolescents, Oulu University Hospital, Oulu, Finland
- Benjamin M Jacobs
- Centre for Preventive Neurology, Wolfson Institute of Population Health, Queen Mary University London, London, EC1M 6BQ, United Kingdom
- Nikki Hubers
- Department of Biological Psychology, Netherlands Twin Register, Vrije Universiteit, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development Institute, Amsterdam, The Netherlands
- Margaret Lippincott
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Abigail Fraser
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Deborah A Lawlor
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Nicholas J Timpson
- MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom
- Population Health Science, Bristol Medical School, University of Bristol, Bristol, United Kingdom
- Mette Nyegaard
- Genomic Medicine, Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
- Kari Stefansson
- Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- deCODE genetics/Amgen, Inc., Reykjavik, Iceland
- Reedik Magi
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Hannele Laivuori
- Institute for Molecular Medicine Finland, Helsinki Institute of Life Science, University of Helsinki, Helsinki, Finland
- Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Department of Obstetrics and Gynecology, Tampere University Hospital, Finland
- Center for Child, Adolescent, and Maternal Health Research, Faculty of Medicine and Health Technology, Tampere University, Finland
- David A van Heel
- Blizard Institute, Queen Mary University London, London, E1 2AT, United Kingdom
- Dorret I Boomsma
- Department of Biological Psychology, Netherlands Twin Register, Vrije Universiteit, Amsterdam, The Netherlands
- Amsterdam Reproduction and Development Institute, Amsterdam, The Netherlands
- Ravikumar Balasubramanian
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Stephanie B Seminara
- Harvard Reproductive Sciences Center and Reproductive Endocrine Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Yee-Ming Chan
- Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Endocrinology, Department of Pediatrics, Boston Children's Hospital, Boston, Massachusetts, United States of America
- Triin Laisk
- Estonian Genome Centre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Cecilia M Lindgren
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7LF, United Kingdom
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, United Kingdom
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, United Kingdom
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
|
12
|
Schneider H, Krizanac AM, Falker-Gieske C, Heise J, Tetens J, Thaller G, Bennewitz J. Genomic dissection of the correlation between milk yield and various health traits using functional and evolutionary information about imputed sequence variants of 34,497 German Holstein cows. BMC Genomics 2024; 25:265. [PMID: 38461236 DOI: 10.1186/s12864-024-10115-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 02/13/2024] [Indexed: 03/11/2024]
Abstract
BACKGROUND Over the last decades, it was subject of many studies to investigate the genomic connection of milk production and health traits in dairy cattle. Thereby, incorporating functional information in genomic analyses has been shown to improve the understanding of biological and molecular mechanisms shaping complex traits and the accuracies of genomic prediction, especially in small populations and across-breed settings. Still, little is known about the contribution of different functional and evolutionary genome partitioning subsets to milk production and dairy health. Thus, we performed a uni- and a bivariate analysis of milk yield (MY) and eight health traits using a set of ~34,497 German Holstein cows with 50K chip genotypes and ~17 million imputed sequence variants divided into 27 subsets depending on their functional and evolutionary annotation. In the bivariate analysis, eight trait-combinations were observed that contrasted MY with each health trait. Two genomic relationship matrices (GRM) were included, one consisting of the 50K chip variants and one consisting of each set of subset variants, to obtain subset heritabilities and genetic correlations. In addition, 50K chip heritabilities and genetic correlations were estimated applying merely the 50K GRM. RESULTS In general, 50K chip heritabilities were larger than the subset heritabilities. The largest heritabilities were found for MY, which was 0.4358 for the 50K and 0.2757 for the subset heritabilities. Whereas all 50K genetic correlations were negative, subset genetic correlations were both positive and negative (ranging from -0.9324 between MY and mastitis to 0.6662 between MY and digital dermatitis). The subsets containing variants which were annotated as noncoding related, splice sites, untranslated regions, metabolic quantitative trait loci, and young variants ranked highest in terms of their contribution to the traits' genetic variance. We were able to show that linkage disequilibrium between subset variants and adjacent variants did not cause these subsets' high effect. CONCLUSION Our results confirm the connection of milk production and health traits in dairy cattle via the animals' metabolic state. In addition, they highlight the potential of including functional information in genomic analyses, which helps to dissect the extent and direction of the observed traits' connection in more detail.
Affiliation(s)
- Helen Schneider
- Institute of Animal Science, University of Hohenheim, 70599, Stuttgart, Germany.
- Ana-Marija Krizanac
- Department of Animal Sciences, University of Göttingen, 37077, Göttingen, Germany
- Johannes Heise
- Vereinigte Informationssysteme Tierhaltung w.V. (VIT), 27283, Verden, Germany
- Jens Tetens
- Department of Animal Sciences, University of Göttingen, 37077, Göttingen, Germany
- Georg Thaller
- Institute of Animal Breeding and Husbandry, Christian-Albrechts University of Kiel, 24098, Kiel, Germany
- Jörn Bennewitz
- Institute of Animal Science, University of Hohenheim, 70599, Stuttgart, Germany
|
13
|
Przeworski M. 2023 ASHG Scientific Achievement Award. Am J Hum Genet 2024; 111:425-427. [PMID: 38458164 PMCID: PMC10995464 DOI: 10.1016/j.ajhg.2023.12.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 12/12/2023] [Indexed: 03/10/2024]
Abstract
This article is based on the address given by the author at the 2023 meeting of The American Society of Human Genetics (ASHG) in Washington, D.C. A video of the original address can be found at the ASHG website.
Affiliation(s)
- Molly Przeworski
- Departments of Biological Sciences and Systems Biology, Columbia University, New York, NY, USA.
|
14
|
Liu S, Luo H, Zhang P, Li Y, Hao D, Zhang S, Song T, Xu T, He S. Adaptive Selection of Cis-regulatory Elements in the Han Chinese. Mol Biol Evol 2024; 41:msae034. [PMID: 38377343 PMCID: PMC10917166 DOI: 10.1093/molbev/msae034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 01/18/2024] [Accepted: 02/05/2024] [Indexed: 02/22/2024]
Abstract
Cis-regulatory elements have an important role in human adaptation to the living environment. However, the lag in population genomic cohort studies and epigenomic studies hinders research on the adaptive analysis of cis-regulatory elements in human populations. In this study, we collected 4,013 unrelated individuals and performed a comprehensive analysis of adaptive selection of genome-wide cis-regulatory elements in the Han Chinese. In total, 12.34% of genomic regions are under the influence of adaptive selection, where 1.00% of enhancers and 2.06% of promoters are under positive selection, and 0.06% of enhancers and 0.02% of promoters are under balancing selection. Gene ontology enrichment analysis of these cis-regulatory elements under adaptive selection reveals that many positive selections in the Han Chinese occur in pathways involved in cell-cell adhesion processes, and many balancing selections are related to immune processes. Two classes of adaptive cis-regulatory elements related to cell adhesion were analyzed in depth. One is the adaptive enhancers derived from Neanderthal introgression, which lead to lower hyaluronidase levels in skin and confer better UV-radiation resistance on the Han Chinese. The other is the cis-regulatory elements regulating wound healing; the results suggest that positive selection inhibits coagulation and promotes angiogenesis and wound healing in the Han Chinese. Finally, we found that many pathogenic alleles, such as risk alleles for type 2 diabetes or schizophrenia, remain in the population due to the hitchhiking effect of positive selection. Our findings will help deepen our understanding of the adaptive evolution of genome regulation in the Han Chinese.
Affiliation(s)
- Shuai Liu
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Huaxia Luo
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Peng Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yanyan Li
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Di Hao
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Sijia Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Tingrui Song
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan 250117, Shandong, China
- Shunmin He
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
|
15
|
Poyraz L, Colbran LL, Mathieson I. Predicting Functional Consequences of Recent Natural Selection in Britain. Mol Biol Evol 2024; 41:msae053. [PMID: 38466119 PMCID: PMC10962637 DOI: 10.1093/molbev/msae053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 02/02/2024] [Accepted: 03/01/2024] [Indexed: 03/12/2024]
Abstract
Ancient DNA can directly reveal the contribution of natural selection to human genomic variation. However, while the analysis of ancient DNA has been successful at identifying genomic signals of selection, inferring the phenotypic consequences of that selection has been more difficult. Most trait-associated variants are noncoding, so we expect that a large proportion of the phenotypic effects of selection will also act through noncoding variation. Since we cannot measure gene expression directly in ancient individuals, we used an approach (Joint-Tissue Imputation [JTI]) developed to predict gene expression from genotype data. We tested for changes in the predicted expression of 17,384 protein coding genes over a time transect of 4,500 years using 91 present-day and 616 ancient individuals from Britain. We identified 28 genes at seven genomic loci with significant (false discovery rate [FDR] < 0.05) changes in predicted expression levels in this time period. We compared the results from our transcriptome-wide scan to a genome-wide scan based on estimating per-single nucleotide polymorphism (SNP) selection coefficients from time series data. At five previously identified loci, our approach allowed us to highlight small numbers of genes with evidence for significant shifts in expression from peaks that in some cases span tens of genes. At two novel loci (SLC44A5 and NUP85), we identify selection on gene expression not captured by scans based on genomic signatures of selection. Finally, we show how classical selection statistics (iHS and SDS) can be combined with JTI models to incorporate functional information into scans that use present-day data alone. These results demonstrate the potential of this type of information to explore both the causes and consequences of natural selection.
Affiliation(s)
- Lin Poyraz
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Laura L Colbran
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
16
|
Simon A, Coop G. The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change. Proc Natl Acad Sci U S A 2024; 121:e2312377121. [PMID: 38363870 PMCID: PMC10907250 DOI: 10.1073/pnas.2312377121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 01/09/2024] [Indexed: 02/18/2024]
Abstract
Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 y, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
Affiliation(s)
- Alexis Simon
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
- Graham Coop
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
|
17
|
Simon A, Coop G. The contribution of gene flow, selection, and genetic drift to five thousand years of human allele frequency change. bioRxiv 2024:2023.07.11.548607. [PMID: 37503227 PMCID: PMC10370008 DOI: 10.1101/2023.07.11.548607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 years, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
Affiliation(s)
- Alexis Simon
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
- Graham Coop
- Center for Population Biology, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
|
18
|
Bai X, Bao Y, Bei S, Bu C, Cao R, Cao Y, Cen H, Chao J, Chen F, Chen H, Chen K, Chen M, Chen M, Chen M, Chen Q, Chen R, Chen S, Chen T, Chen X, Chen X, Cheng Y, Chu Y, Cui Q, Dong L, Du Z, Duan G, Fan S, Fan Z, Fang X, Fang Z, Feng Z, Fu S, Gao F, Gao G, Gao H, Gao W, Gao X, Gao X, Gao X, Gong J, Gong J, Gou Y, Gu S, Guo AY, Guo G, Guo X, Han C, Hao D, Hao L, He Q, He S, He S, Hu W, Huang K, Huang T, Huang X, Huang Y, Jia P, Jia Y, Jiang C, Jiang M, Jiang S, Jiang T, Jiang X, Jin E, Jin W, Kang H, Kang H, Kong D, Lan L, Lei W, Li CY, Li C, Li C, Li H, Li J, Li J, Li L, Li P, Li R, Li X, Li Y, Li Y, Li Z, Liao X, Lin S, Lin Y, Ling Y, Liu B, Liu CJ, Liu D, Liu GH, Liu L, Liu S, Liu W, Liu X, Liu X, Liu Y, Liu Y, Lu M, Lu T, Luo H, Luo H, Luo M, Luo S, Luo X, Ma L, Ma Y, Mai J, Meng J, Meng X, Meng Y, Meng Y, Miao W, Miao YR, Ni L, Nie Z, Niu G, Niu X, Niu Y, Pan R, Pan S, Peng D, Peng J, Qi J, Qi Y, Qian Q, Qin Y, Qu H, Ren J, Ren J, Sang Z, Shang K, Shen WK, Shen Y, Shi Y, Song S, Song T, Su T, Sun J, Sun Y, Sun Y, Sun Y, Tang B, Tang D, Tang Q, Tang Z, Tian D, Tian F, Tian W, Tian Z, Wang A, Wang G, Wang G, Wang J, Wang J, Wang P, Wang P, Wang W, Wang Y, Wang Y, Wang Y, Wang Y, Wang Z, Wei H, Wei Y, Wei Z, Wu D, Wu G, Wu S, Wu S, Wu W, Wu W, Wu Z, Xia Z, Xiao J, Xiao L, Xiao Y, Xie G, Xie GY, Xie J, Xie Y, Xiong J, Xiong Z, Xu D, Xu S, Xu T, Xu T, Xue Y, Xue Y, Yan C, Yang D, Yang F, Yang F, Yang H, Yang J, Yang K, Yang N, Yang QY, Yang S, Yang X, Yang X, Yang X, Yang YG, Ye W, Yu C, Yu F, Yu S, Yuan C, Yuan H, Zeng J, Zhai S, Zhang C, Zhang F, Zhang G, Zhang M, Zhang P, Zhang Q, Zhang R, Zhang S, Zhang W, Zhang W, Zhang W, Zhang X, Zhang X, Zhang Y, Zhang Y, Zhang Y, Zhang YE, Zhang Y, Zhang Z, Zhang Z, Zhao D, Zhao F, Zhao G, Zhao M, Zhao W, Zhao W, Zhao X, Zhao Y, Zhao Y, Zhao Z, Zheng X, Zheng Y, Zhou C, Zhou H, Zhou X, Zhou X, Zhou Y, Zhou Y, Zhu J, Zhu L, Zhu R, Zhu T, Zong W, Zou D, Zuo Z. 
Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2024. Nucleic Acids Res 2024; 52:D18-D32. [PMID: 38018256 PMCID: PMC10767964 DOI: 10.1093/nar/gkad1078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/12/2023] [Accepted: 10/27/2023] [Indexed: 11/30/2023]
Abstract
The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K), bacteria (NTM-DB, MPA) as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
|
19
|
Gao Z. Unveiling recent and ongoing adaptive selection in human populations. PLoS Biol 2024; 22:e3002469. [PMID: 38236800 PMCID: PMC10796035 DOI: 10.1371/journal.pbio.3002469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024]
Abstract
Genome-wide scans for signals of selection have become a routine part of the analysis of population genomic variation datasets and have resulted in compelling evidence of selection during recent human evolution. This Essay spotlights methodological innovations that have enabled the detection of selection over very recent timescales, even in contemporary human populations. By harnessing large-scale genomic and phenotypic datasets, these new methods use different strategies to uncover connections between genotype, phenotype, and fitness. This Essay outlines the rationale and key findings of each strategy, discusses challenges in interpretation, and describes opportunities to improve detection and understanding of ongoing selection in human populations.
Affiliation(s)
- Ziyue Gao
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
20
|
Irving-Pease EK, Refoyo-Martínez A, Barrie W, Ingason A, Pearson A, Fischer A, Sjögren KG, Halgren AS, Macleod R, Demeter F, Henriksen RA, Vimala T, McColl H, Vaughn AH, Speidel L, Stern AJ, Scorrano G, Ramsøe A, Schork AJ, Rosengren A, Zhao L, Kristiansen K, Iversen AKN, Fugger L, Sudmant PH, Lawson DJ, Durbin R, Korneliussen T, Werge T, Allentoft ME, Sikora M, Nielsen R, Racimo F, Willerslev E. The selection landscape and genetic legacy of ancient Eurasians. Nature 2024; 625:312-320. [PMID: 38200293 PMCID: PMC10781624 DOI: 10.1038/s41586-023-06705-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/17/2022] [Accepted: 10/03/2023] [Indexed: 01/12/2024]
Abstract
The Holocene (beginning around 12,000 years ago) encompassed some of the most significant changes in human evolution, with far-reaching consequences for the dietary, physical and mental health of present-day populations. Using a dataset of more than 1,600 imputed ancient genomes, we modelled the selection landscape during the transition from hunting and gathering, to farming and pastoralism across West Eurasia. We identify key selection signals related to metabolism, including that selection at the FADS cluster began earlier than previously reported and that selection near the LCT locus predates the emergence of the lactase persistence allele by thousands of years. We also find strong selection in the HLA region, possibly due to increased exposure to pathogens during the Bronze Age. Using ancient individuals to infer local ancestry tracts in over 400,000 samples from the UK Biobank, we identify widespread differences in the distribution of Mesolithic, Neolithic and Bronze Age ancestries across Eurasia. By calculating ancestry-specific polygenic risk scores, we show that height differences between Northern and Southern Europe are associated with differential Steppe ancestry, rather than selection, and that risk alleles for mood-related phenotypes are enriched for Neolithic farmer ancestry, whereas risk alleles for diabetes and Alzheimer's disease are enriched for Western hunter-gatherer ancestry. Our results indicate that ancient selection and migration were large contributors to the distribution of phenotypic diversity in present-day Europeans.
Affiliation(s)
- Evan K Irving-Pease
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
| | - Alba Refoyo-Martínez
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - William Barrie
- GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
| | - Andrés Ingason
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
| | - Alice Pearson
- Department of Genetics, University of Cambridge, Cambridge, UK
- Department of Zoology, University of Cambridge, Cambridge, UK
| | - Anders Fischer
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
- Sealand Archaeology, Kalundborg, Denmark
| | - Karl-Göran Sjögren
- Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
| | - Alma S Halgren
- Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
| | - Ruairidh Macleod
- GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
- UCL Genetics Institute, University College London, London, UK
| | - Fabrice Demeter
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Eco-anthropologie, Muséum national d'Histoire naturelle, CNRS, Université Paris Cité, Musée de l'Homme, Paris, France
| | - Rasmus A Henriksen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Tharsika Vimala
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Hugh McColl
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Andrew H Vaughn
- Center for Computational Biology, University of California, Berkeley, CA, USA
| | - Leo Speidel
- UCL Genetics Institute, University College London, London, UK
- Ancient Genomics Laboratory, The Francis Crick Institute, London, UK
| | - Aaron J Stern
- Center for Computational Biology, University of California, Berkeley, CA, USA
| | - Gabriele Scorrano
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Abigail Ramsøe
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Andrew J Schork
- Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
- Neurogenomics Division, The Translational Genomics Research Institute (TGEN), Phoenix, AZ, USA
| | - Anders Rosengren
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
| | - Lei Zhao
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Kristian Kristiansen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
| | - Astrid K N Iversen
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
| | - Lars Fugger
- Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Department of Clinical Medicine, Aarhus University Hospital, Aarhus, Denmark
- MRC Human Immunology Unit, John Radcliffe Hospital, University of Oxford, Oxford, UK
| | - Peter H Sudmant
- Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Center for Computational Biology, University of California, Berkeley, CA, USA
| | - Daniel J Lawson
- Institute of Statistical Sciences, School of Mathematics, University of Bristol, Bristol, UK
| | - Richard Durbin
- Department of Genetics, University of Cambridge, Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
| | - Thorfinn Korneliussen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Thomas Werge
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Institute of Biological Psychiatry, Mental Health Center Sct Hans, Copenhagen University Hospital, Copenhagen, Denmark
| | - Morten E Allentoft
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Science, Curtin University, Perth, Western Australia, Australia
| | - Martin Sikora
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Rasmus Nielsen
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
- Departments of Integrative Biology and Statistics, UC Berkeley, Berkeley, CA, USA.
| | - Fernando Racimo
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
| | - Eske Willerslev
- Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
- GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK.
- MARUM Center for Marine Environmental Sciences and Faculty of Geosciences, University of Bremen, Bremen, Germany.
|
21
|
Panigrahi M, Rajawat D, Nayak SS, Ghildiyal K, Sharma A, Jain K, Lei C, Bhushan B, Mishra BP, Dutt T. Landmarks in the history of selective sweeps. Anim Genet 2023; 54:667-688. [PMID: 37710403 DOI: 10.1111/age.13355] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/12/2023] [Accepted: 08/28/2023] [Indexed: 09/16/2023]
Abstract
Half a century ago, a seminal article on the hitchhiking effect by Smith and Haigh inaugurated the concept of the selection signature. Selective sweeps are characterised by the rapid spread of an advantageous genetic variant through a population and hence play an important role in shaping evolution and research on genetic diversity. The process by which a beneficial allele arises and becomes fixed in a population, leading to an increase in the frequency of other linked alleles, is known as genetic hitchhiking or genetic draft. Kimura's neutral theory and hitchhiking theory are complementary, with Kimura's neutral evolution as the 'null model' and positive selection as the 'signal'. Both are widely accepted in evolution, especially with genomics enabling precise measurements. Significant advances in genomic technologies, such as next-generation sequencing, high-density SNP arrays and powerful bioinformatics tools, have made it possible to systematically investigate selection signatures in a variety of species. Although the history of selection signatures is relatively recent, progress has been made in the last two decades, owing to the increasing availability of large-scale genomic data and the development of computational methods. In this review, we embark on a journey through the history of research on selective sweeps, ranging from early theoretical work to recent empirical studies that utilise genomic data.
Affiliation(s)
- Manjit Panigrahi
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | - Divya Rajawat
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | | | - Kanika Ghildiyal
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | - Anurodh Sharma
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | - Karan Jain
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | - Chuzhao Lei
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi, China
| | - Bharat Bhushan
- Division of Animal Genetics, Indian Veterinary Research Institute, Bareilly, India
| | - Bishnu Prasad Mishra
- Division of Animal Biotechnology, ICAR-National Bureau of Animal Genetic Resources, Karnal, India
| | - Triveni Dutt
- Livestock Production and Management Section, Indian Veterinary Research Institute, Bareilly, India
|
22
|
Bose A, Burch M, Chowdhury A, Paschou P, Drineas P. Structure-informed clustering for population stratification in association studies. BMC Bioinformatics 2023; 24:411. [PMID: 37907836 PMCID: PMC10619291 DOI: 10.1186/s12859-023-05511-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 10/02/2023] [Indexed: 11/02/2023] Open
Abstract
BACKGROUND: Identifying variants associated with complex traits is a challenging task in genetic association studies due to linkage disequilibrium (LD) between genetic variants and population stratification, unrelated to the disease risk. Existing methods of population structure correction use principal component analysis or linear mixed models with a random effect when modeling associations between a trait of interest and genetic markers. However, due to stringent significance thresholds and latent interactions between the markers, these methods often fail to detect genuinely associated variants. RESULTS: To overcome this, we propose CluStrat, which corrects for complex arbitrarily structured populations while leveraging the linkage disequilibrium induced distances between genetic markers. It performs an agglomerative hierarchical clustering using the Mahalanobis distance covariance matrix of the markers. In simulation studies, we show that our method outperforms existing methods in detecting true causal variants. Applying CluStrat on WTCCC2 and UK Biobank cohorts, we found biologically relevant associations in Schizophrenia and Myocardial Infarction. CluStrat was also able to correct for population structure in polygenic adaptation of height in Europeans. CONCLUSIONS: CluStrat highlights the advantages of biologically relevant distance metrics, such as the Mahalanobis distance, which captures the cryptic interactions within populations in the presence of LD better than the Euclidean distance.
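The contrast between Mahalanobis and Euclidean distance drawn in this abstract can be illustrated with a minimal sketch. It is not CluStrat's code: the 2x2 covariance matrix, the point names and both helper functions are assumptions chosen for illustration only, standing in for two correlated markers.

```python
import math


def euclidean(u, v):
    """Ordinary straight-line distance between two 2-D points."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


def mahalanobis(u, v, cov):
    """Mahalanobis distance between two 2-D points.

    `cov` is a symmetric 2x2 covariance matrix ((a, b), (b, d)).
    """
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    dx, dy = u[0] - v[0], u[1] - v[1]
    # Quadratic form with the explicit inverse [[d, -b], [-b, a]] / det.
    q = (d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


# Hypothetical covariance: two markers with unit variance and correlation 0.9.
cov = ((1.0, 0.9), (0.9, 1.0))
origin = (0.0, 0.0)
along = (1.0, 1.0)     # displacement that follows the correlation
against = (1.0, -1.0)  # displacement that goes against it

d_euc_along = euclidean(along, origin)
d_euc_against = euclidean(against, origin)
d_mah_along = mahalanobis(along, origin, cov)
d_mah_against = mahalanobis(against, origin, cov)
```

Both displacements have the same Euclidean length, but under this assumed covariance the displacement that runs against the correlation receives a much larger Mahalanobis distance, which is the kind of structure a Euclidean metric cannot see.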
Affiliation(s)
- Aritra Bose
- Computational Genomics, IBM T.J Watson Research Center, Yorktown Heights, NY, USA
| | - Myson Burch
- Computational Genomics, IBM T.J Watson Research Center, Yorktown Heights, NY, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
| | - Agniva Chowdhury
- Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Peristera Paschou
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Petros Drineas
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
|
23
|
Luo H, Zhang P, Zhang W, Zheng Y, Hao D, Shi Y, Niu Y, Song T, Li Y, Zhao S, Chen H, Xu T, He S. Recent positive selection signatures reveal phenotypic evolution in the Han Chinese population. Sci Bull (Beijing) 2023; 68:2391-2404. [PMID: 37661541 DOI: 10.1016/j.scib.2023.08.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/07/2023] [Revised: 05/08/2023] [Accepted: 08/10/2023] [Indexed: 09/05/2023]
Abstract
Characterizing natural selection signatures and relationships with phenotype spectra is important for understanding human evolution and both biological and pathological mechanisms. Here, we identified 24 genetic loci under recent selection by analyzing rare singletons in high-depth whole-genome sequencing data from 3946 Han Chinese individuals. The loci include immune-related gene regions (MHC cluster, IGH cluster, STING1, and PSG), alcohol metabolism-related gene regions (ADH1B, ALDH2, and ALDH3B2), and the olfactory perception gene OR4C16, in which the MHC cluster, ADH1B, and ALDH2 were also identified by TOPMed and WestLake Biobank. Among the signals, the IGH cluster is particularly interesting, in which the favored allele of variant 14_105737776_C_T (rs117518546, IgG1-G396R) promotes immune response, but also increases the risk of the autoimmune disease systemic lupus erythematosus (SLE). It is also surprising that our newly discovered ALDH3B2 evolved in the opposite direction to ALDH2 for alcohol metabolism. Besides monogenic traits, we found that multiple complex traits experienced polygenic adaptation. Particularly, multiple methods consistently revealed that lower blood pressure was favored in natural selection. Finally, we built a database named RePoS (recent positive selection, http://bigdata.ibp.ac.cn/RePoS/) to integrate and display multi-population selection signals. Our study extended our understanding of natural evolution and phenotype adaptation in Han Chinese as well as other populations.
Affiliation(s)
- Huaxia Luo
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Department of Pediatrics, Peking University First Hospital, Beijing 100034, China
| | - Peng Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Wanyu Zhang
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yu Zheng
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Di Hao
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yirong Shi
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yiwei Niu
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Tingrui Song
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yanyan Li
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
| | - Shilei Zhao
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China
| | - Hua Chen
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; China National Center for Bioinformation, Beijing 100101, China.
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; Shandong First Medical University & Shandong Academy of Medical Sciences, Taian 271016, China.
| | - Shunmin He
- Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
|
24
|
Poyraz L, Colbran LL, Mathieson I. Predicting functional consequences of recent natural selection in Britain. bioRxiv 2023:2023.10.16.562549. [PMID: 37904954 PMCID: PMC10614889 DOI: 10.1101/2023.10.16.562549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Ancient DNA can directly reveal the contribution of natural selection to human genomic variation. However, while the analysis of ancient DNA has been successful at identifying genomic signals of selection, inferring the phenotypic consequences of that selection has been more difficult. Most trait-associated variants are non-coding, so we expect that a large proportion of the phenotypic effects of selection will also act through non-coding variation. Since we cannot measure gene expression directly in ancient individuals, we used an approach (Joint-Tissue Imputation; JTI) developed to predict gene expression from genotype data. We tested for changes in the predicted expression of 17,384 protein coding genes over a time transect of 4500 years using 91 present-day and 616 ancient individuals from Britain. We identified 28 genes at seven genomic loci with significant (FDR < 0.05) changes in predicted expression levels in this time period. We compared the results from our transcriptome-wide scan to a genome-wide scan based on estimating per-SNP selection coefficients from time series data. At five previously identified loci, our approach allowed us to highlight small numbers of genes with evidence for significant shifts in expression from peaks that in some cases span tens of genes. At two novel loci (SLC44A5 and NUP85), we identify selection on gene expression not captured by scans based on genomic signatures of selection. Finally we show how classical selection statistics (iHS and SDS) can be combined with JTI models to incorporate functional information into scans that use present-day data alone. These results demonstrate the potential of this type of information to explore both the causes and consequences of natural selection.
Affiliation(s)
- Lin Poyraz
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
| | - Laura L. Colbran
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
25
|
He Y, Guo Y, Zheng W, Yue T, Zhang H, Wang B, Feng Z, Ouzhuluobu, Cui C, Liu K, Zhou B, Zeng X, Li L, Wang T, Wang Y, Zhang C, Xu S, Qi X, Su B. Polygenic adaptation leads to a higher reproductive fitness of native Tibetans at high altitude. Curr Biol 2023; 33:4037-4051.e5. [PMID: 37643619 DOI: 10.1016/j.cub.2023.08.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/04/2023] [Revised: 06/01/2023] [Accepted: 08/04/2023] [Indexed: 08/31/2023]
Abstract
The adaptation of Tibetans to high-altitude environments has been studied extensively. However, the direct assessment of evolutionary adaptation, i.e., the reproductive fitness of Tibetans and its genetic basis, remains elusive. Here, we conduct systematic phenotyping and genome-wide association analysis of 2,252 mother-newborn pairs of indigenous Tibetans, covering 12 reproductive traits and 76 maternal physiological traits. Compared with the lowland immigrants living at high altitudes, indigenous Tibetans show better reproductive outcomes, reflected by their lower abortion rate, higher birth weight, and better fetal development. The results of genome-wide association analyses indicate a polygenic adaptation of reproduction in Tibetans, attributed to the genomic backgrounds of both the mothers and the newborns. Furthermore, the EPAS1-edited mice display higher reproductive fitness under chronic hypoxia, mirroring the situation in Tibetans. Collectively, these results shed new light on the phenotypic pattern and the genetic mechanism of human reproductive fitness in extreme environments.
Affiliation(s)
- Yaoxi He
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China.
| | - Yongbo Guo
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
| | - Wangshan Zheng
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
| | - Tian Yue
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
| | - Hui Zhang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650000, China
| | - Bin Wang
- Fukang Obstetrics, Gynecology and Children Branch Hospital, Tibetan Fukang Hospital, Lhasa 850000, China
| | - Zhanying Feng
- CEMS, NCMIS, MDIS, Academy of Mathematics & Systems Science, Chinese Academy of Sciences, Beijing 100080, China
| | - Ouzhuluobu
- Fukang Obstetrics, Gynecology and Children Branch Hospital, Tibetan Fukang Hospital, Lhasa 850000, China; High Altitude Medical Research Center, School of Medicine, Tibetan University, Lhasa 850000, China
| | - Chaoying Cui
- Fukang Obstetrics, Gynecology and Children Branch Hospital, Tibetan Fukang Hospital, Lhasa 850000, China; High Altitude Medical Research Center, School of Medicine, Tibetan University, Lhasa 850000, China
| | - Kai Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
| | - Bin Zhou
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
| | - Xuerui Zeng
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Kunming College of Life Science, University of Chinese Academy of Sciences, Beijing 100101, China
| | - Liya Li
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
| | - Tianyun Wang
- Department of Medical Genetics, Center for Medical Genetics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing 100191, China
| | - Yong Wang
- CEMS, NCMIS, MDIS, Academy of Mathematics & Systems Science, Chinese Academy of Sciences, Beijing 100080, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
| | - Chao Zhang
- Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Shuhua Xu
- State Key Laboratory of Genetic Engineering, Center for Evolutionary Biology, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200438, China; Human Phenome Institute, Zhangjiang Fudan International Innovation Center, and Ministry of Education Key Laboratory of Contemporary Anthropology, Fudan University, Shanghai 201203, China
| | - Xuebin Qi
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Fukang Obstetrics, Gynecology and Children Branch Hospital, Tibetan Fukang Hospital, Lhasa 850000, China; State Key Laboratory of Primate Biomedical Research, Institute of Primate Translational Medicine, Kunming University of Science and Technology, Kunming 650000, China.
| | - Bing Su
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China.
|
26
|
Höllinger I, Wölfl B, Hermisson J. A theory of oligogenic adaptation of a quantitative trait. Genetics 2023; 225:iyad139. [PMID: 37550847 PMCID: PMC10550320 DOI: 10.1093/genetics/iyad139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 04/20/2023] [Accepted: 07/13/2023] [Indexed: 08/09/2023] Open
Abstract
Rapid phenotypic adaptation is widespread in nature, but the underlying genetic dynamics remain controversial. Whereas population genetics envisages sequential beneficial substitutions, quantitative genetics assumes a collective response through subtle shifts in allele frequencies. This dichotomy of a monogenic and a highly polygenic view of adaptation raises the question of a middle ground, as well as the factors controlling the transition. Here, we consider an additive quantitative trait with equal locus effects under Gaussian stabilizing selection that adapts to a new trait optimum after an environmental change. We present an analytical framework based on Yule branching processes to describe how phenotypic adaptation is achieved by collective changes in allele frequencies at the underlying loci. In particular, we derive an approximation for the joint allele-frequency distribution conditioned on the trait mean as a comprehensive descriptor of the adaptive architecture. Depending on the model parameters, this architecture reproduces the well-known patterns of sequential, monogenic sweeps, or of subtle, polygenic frequency shifts. Between these endpoints, we observe oligogenic architecture types that exhibit characteristic patterns of partial sweeps. We find that a single compound parameter, the population-scaled background mutation rate Θbg, is the most important predictor of the type of adaptation, while selection strength, the number of loci in the genetic basis, and linkage only play a minor role.
Affiliation(s)
- Ilse Höllinger
- Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria
| | - Benjamin Wölfl
- Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria
- Vienna Graduate School of Population Genetics, University of Vienna and Veterinary Medical University of Vienna, Vienna, Austria
- Vienna Doctoral School of Ecology and Evolution, University of Vienna, Vienna, Austria
| | - Joachim Hermisson
- Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria
- Max Perutz Labs, Vienna Biocenter Campus (VBC), Dr.-Bohr-Gasse 9, 1030 Vienna, Austria
|
27
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023] Open
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| | - Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
| |
|
28
|
Mikhailova SV. Problems with studying directional natural selection in humans. Vavilovskii Zhurnal Genet Selektsii 2023; 27:684-693. [PMID: 38023807 PMCID: PMC10643113 DOI: 10.18699/vjgb-23-79] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 07/03/2023] [Accepted: 07/03/2023] [Indexed: 12/01/2023] Open
Abstract
The review describes the main methods for assessing directional selection in human populations. These include bioinformatic analysis of DNA sequences via detection of linkage disequilibrium and of deviations from the random distribution of frequencies of genetic variants, demographic and anthropometric studies based on a search for a correlation between fertility and phenotypic traits, genome-wide association studies on fertility along with genetic loci and polygenic risk scores, and a comparison of allele frequencies between generations (in modern samples and in those obtained from burials). Each approach has its limitations and is applicable to different periods in the evolution of Homo sapiens. The main sources of error in such studies are thought to be sample stratification, the small number of studies on nonwhite populations, the impossibility of a complete comparison of the associations found and functionally significant causative variants, and the difficulty of taking into account all nongenetic determinants of fertility in contemporary populations. The results obtained by various methods indicate that the direction of human adaptation to new food products has not changed during evolution since the Neolithic; many variants of immunity genes associated with inflammatory and autoimmune diseases in modern populations have undergone positive selection over the past 2-3 thousand years owing to the spread of bacterial and viral infections. For some genetic variants and polygenic traits, an alteration of the direction of natural selection in Europe has been documented, e.g., for those associated with an immune response and cognitive abilities. Examination of the correlation between fertility and educational attainment yields conflicting results. In modern populations, to a greater extent than previously, there is selection for variants of genes responsible for social adaptation and behavioral phenotypes. In particular, several articles have shown a positive correlation of fertility with polygenic risk scores for attention deficit/hyperactivity disorder.
Affiliation(s)
- S V Mikhailova
- Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| |
|
29
|
González-Peñas J, de Hoyos L, Díaz-Caneja CM, Andreu-Bernabeu Á, Stella C, Gurriarán X, Fañanás L, Bobes J, González-Pinto A, Crespo-Facorro B, Martorell L, Vilella E, Muntané G, Molto MD, Gonzalez-Piqueras JC, Parellada M, Arango C, Costas J. Recent natural selection conferred protection against schizophrenia by non-antagonistic pleiotropy. Sci Rep 2023; 13:15500. [PMID: 37726359 PMCID: PMC10509162 DOI: 10.1038/s41598-023-42578-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 09/12/2023] [Indexed: 09/21/2023] Open
Abstract
Schizophrenia is a debilitating psychiatric disorder associated with reduced fertility and decreased life expectancy, yet common predisposing variation substantially contributes to the onset of the disorder, which poses an evolutionary paradox. Previous research has suggested balanced selection, a mechanism by which schizophrenia risk alleles could also provide advantages under certain environments, as a reliable explanation. However, recent studies have shown strong evidence against positive selection of predisposing loci. Furthermore, evolutionary pressures on schizophrenia risk alleles could have changed throughout human history as new environments emerged. Here, we used 1000 Genomes Project data to explore, on a genome-wide scale, the relationship between schizophrenia predisposing loci and signatures of recent natural selection (RNS) after the human diaspora out of Africa around 100,000 years ago. We found evidence for significant enrichment of RNS markers in derived alleles that arose during human evolution and confer protection against schizophrenia. Moreover, both partitioned heritability and gene set enrichment analyses of mapped genes from schizophrenia predisposing loci subject to RNS revealed a lower involvement in brain and neuronal related functions compared to those not subject to RNS. Taken together, our results suggest non-antagonistic pleiotropy as a likely mechanism behind RNS that could explain the persistence of schizophrenia common predisposing variation in human populations due to its association with other non-psychiatric phenotypes.
Affiliation(s)
- Javier González-Peñas
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain.
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain.
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain.
| | - Lucía de Hoyos
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
- Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
| | - Covadonga M Díaz-Caneja
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- School of Medicine, Universidad Complutense, Madrid, Spain
| | - Álvaro Andreu-Bernabeu
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
- School of Medicine, Universidad Complutense, Madrid, Spain
| | - Carol Stella
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
| | - Xaquín Gurriarán
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
| | - Lourdes Fañanás
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Department of Evolutionary Biology, Ecology and Environmental Sciences, Faculty of Biology, University of Barcelona, Barcelona, Spain
| | - Julio Bobes
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Faculty of Medicine and Health Sciences - Psychiatry, Universidad de Oviedo, ISPA, INEUROPA, Oviedo, Spain
| | - Ana González-Pinto
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- BIOARABA Health Research Institute, OSI Araba, University Hospital, University of the Basque Country, Vitoria, Spain
| | - Benedicto Crespo-Facorro
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Department of Psychiatry, Hospital Universitario Virgen del Rocío, Universidad de Sevilla, Seville, Spain
| | - Lourdes Martorell
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Hospital Universitari Institut Pere Mata, IISPV, Universitat Rovira I Virgili, Reus, Spain
| | - Elisabet Vilella
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Hospital Universitari Institut Pere Mata, IISPV, Universitat Rovira I Virgili, Reus, Spain
| | - Gerard Muntané
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Hospital Universitari Institut Pere Mata, IISPV, Universitat Rovira I Virgili, Reus, Spain
| | - María Dolores Molto
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Department of Genetics, University of Valencia, Campus of Burjassot, Valencia, Spain
- Department of Medicine, Universitat de València, Valencia, Spain
| | - Jose Carlos Gonzalez-Piqueras
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- Department of Medicine, Universitat de València, Valencia, Spain
- Fundación Investigación Hospital Clínico de Valencia, INCLIVA, 46010, Valencia, Spain
| | - Mara Parellada
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- School of Medicine, Universidad Complutense, Madrid, Spain
| | - Celso Arango
- Department of Child and Adolescent Psychiatry, Institute of Psychiatry and Mental Health, Hospital General Universitario Gregorio Marañón, Calle Ibiza, 43, 28009, Madrid, Spain
- Instituto de Investigación Sanitaria Gregorio Marañón (IiSGM), Madrid, Spain
- CIBERSAM, Centro Investigación Biomédica en Red Salud Mental, Madrid, Spain
- School of Medicine, Universidad Complutense, Madrid, Spain
| | - Javier Costas
- Instituto de Investigación Sanitaria (IDIS) de Santiago de Compostela, Complexo Hospitalario Universitario de Santiago de Compostela (CHUS), Servizo Galego de Saúde (SERGAS), Santiago de Compostela, Galicia, Spain
| |
|
30
|
Hawkes G, Yengo L, Vedantam S, Marouli E, Beaumont RN, Tyrrell J, Weedon MN, Hirschhorn J, Frayling TM, Wood AR. Identification and analysis of individuals who deviate from their genetically-predicted phenotype. PLoS Genet 2023; 19:e1010934. [PMID: 37733769 PMCID: PMC10564121 DOI: 10.1371/journal.pgen.1010934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 10/10/2023] [Accepted: 08/22/2023] [Indexed: 09/23/2023] Open
Abstract
Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Stratifying individuals misaligned to a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant or deviating from two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned to their genetically predicted height. We show that these individuals are enriched for self-reporting being shorter or taller than average at age 10, diagnosed congenital malformations, and rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Secondly, we apply our methods to LDL cholesterol (LDL-C). We classified 156 (0.12%) individuals as misaligned to their genetically predicted LDL-C and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type-two diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent when performing sensitivity analysis based on a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions.
Affiliation(s)
- Gareth Hawkes
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| | - Loic Yengo
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
| | - Sailaja Vedantam
- Endocrinology, Boston Children’s Hospital, Sharon, Massachusetts, United States of America
| | - Eirini Marouli
- William Harvey Research Institute, Barts and The London School of Medicine and Dentistry Queen Mary University of London, London, United Kingdom
| | - Robin N. Beaumont
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| | | | - Jessica Tyrrell
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| | - Michael N. Weedon
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| | - Joel Hirschhorn
- Boston Children’s Hospital/Broad Institute, Boston, Massachusetts, United States of America
| | - Timothy M. Frayling
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| | - Andrew R. Wood
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, United Kingdom
| |
|
31
|
Riley R, Mathieson I, Mathieson S. Interpreting generative adversarial networks to infer natural selection from genetic data. bioRxiv: the preprint server for biology 2023:2023.03.07.531546. [PMID: 36945387 PMCID: PMC10028936 DOI: 10.1101/2023.03.07.531546] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/10/2023]
Abstract
Understanding natural selection in humans and other species is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Mismatches between simulated training data and real test data can lead to incorrect inference. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection that requires relatively few selection simulations during training. We use a Generative Adversarial Network (GAN) trained to simulate realistic neutral data. The resulting GAN consists of a generator (fitted demographic model) and a discriminator (convolutional neural network). For a genomic region, the discriminator predicts whether it is "real" or "fake" in the sense that it could have been simulated by the generator. As the "real" training data includes regions that experienced selection and the generator cannot produce such regions, regions with a high probability of being real are likely to have experienced selection. To further incentivize this behavior, we "fine-tune" the discriminator with a small number of selection simulations. We show that this approach has high power to detect selection in simulations, and that it finds regions under selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics. In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.
Affiliation(s)
- Rebecca Riley
- Department of Computer Science, Haverford College, Haverford PA, 19041 USA
| | - Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA, 19104 USA
| | - Sara Mathieson
- Department of Computer Science, Haverford College, Haverford PA, 19041 USA
| |
|
32
|
Colbran LL, Ramos-Almodovar FC, Mathieson I. A gene-level test for directional selection on gene expression. Genetics 2023; 224:iyad060. [PMID: 37036411 PMCID: PMC10213495 DOI: 10.1093/genetics/iyad060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 01/11/2023] [Accepted: 03/31/2023] [Indexed: 04/11/2023] Open
Abstract
Most variants identified in human genome-wide association studies and scans for selection are noncoding. Interpretation of their effects and the way in which they contribute to phenotypic variation and adaptation in human populations is therefore limited by our understanding of gene regulation and the difficulty of confidently linking noncoding variants to genes. To overcome this, we developed a gene-wise test for population-specific selection based on combinations of regulatory variants. Specifically, we use the QX statistic to test for polygenic selection on cis-regulatory variants based on whether the variance across populations in the predicted expression of a particular gene is higher than expected under neutrality. We then applied this approach to human data, testing for selection on 17,388 protein-coding genes in 26 populations from the Thousand Genomes Project. We identified 45 genes with significant evidence (FDR<0.1) for selection, including FADS1, KHK, SULT1A2, ITGAM, and several genes in the HLA region. We further confirm that these signals correspond to plausible population-level differences in predicted expression. While the small number of significant genes (0.2%) is consistent with most cis-regulatory variation evolving under genetic drift or stabilizing selection, it remains possible that there are effects not captured in this study. Our gene-level QX score is independent of standard genomic tests for selection, and may therefore be useful in combination with traditional selection scans to specifically identify selection on regulatory variation. Overall, our results demonstrate the utility of combining population-level genomic data with functional data to understand the evolution of gene expression.
Affiliation(s)
- Laura L Colbran
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | | | | |
|
33
|
Caro-Consuegra R, Lucas-Sánchez M, Comas D, Bosch E. Identifying signatures of positive selection in human populations from North Africa. Sci Rep 2023; 13:8166. [PMID: 37210386 DOI: 10.1038/s41598-023-35312-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 05/16/2023] [Indexed: 05/22/2023] Open
Abstract
Because of its location, North Africa (NA) has witnessed continuous demographic movements with an impact on the genomes of present-day human populations. Genomic data describe a complex scenario with varying proportions of at least four main ancestry components: Maghrebi, Middle Eastern-, European-, and West-and-East-African-like. However, the footprint of positive selection in NA has not been studied. Here, we compile genome-wide genotyping data from 190 North Africans and individuals from surrounding populations, investigate signatures of positive selection using allele frequency- and linkage disequilibrium-based methods, and infer ancestry proportions to discern adaptive admixture from post-admixture selection events. Our results show private candidate genes for selection in NA involved in insulin processing (KIF5A), immune function (KIF5A, IL1RN, TLR3), and haemoglobin phenotypes (BCL11A). We also detect signatures of positive selection related to skin pigmentation (SLC24A5, KITLG) and immune function (IL1R1, CD44, JAK1) shared with European populations, and candidate genes associated with haemoglobin phenotypes (HPSE2, HBE1, HBG2), other immune-related (DOCK2) traits, and insulin processing (GLIS3) traits shared with West and East African populations. Finally, the SLC8A1 gene, which encodes a sodium-calcium exchanger, was the only candidate identified under post-admixture selection in Western NA.
Affiliation(s)
- Rocio Caro-Consuegra
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Medicina i Ciències de la Vida, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, 08003, Barcelona, Spain
| | - Marcel Lucas-Sánchez
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Medicina i Ciències de la Vida, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, 08003, Barcelona, Spain
| | - David Comas
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Medicina i Ciències de la Vida, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, 08003, Barcelona, Spain
| | - Elena Bosch
- Institut de Biologia Evolutiva (UPF-CSIC), Departament de Medicina i Ciències de la Vida, Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, 08003, Barcelona, Spain.
- Centro de Investigación Biomédica en Red de Salud Mental, Instituto de Salud Carlos III, 28029, Madrid, Spain.
| |
|
34
|
Ahlquist KD, Sugden LA, Ramachandran S. Enabling interpretable machine learning for biological data with reliability scores. PLoS Comput Biol 2023; 19:e1011175. [PMID: 37235578 PMCID: PMC10249903 DOI: 10.1371/journal.pcbi.1011175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 06/08/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
Machine learning tools have proven useful across biological disciplines, allowing researchers to draw conclusions from large datasets, and opening up new opportunities for interpreting complex and heterogeneous biological data. Alongside the rapid growth of machine learning, there have also been growing pains: some models that appear to perform well have later been revealed to rely on features of the data that are artifactual or biased; this feeds into the general criticism that machine learning models are designed to optimize model performance over the creation of new biological insights. A natural question arises: how do we develop machine learning models that are inherently interpretable or explainable? In this manuscript, we describe the SWIF(r) reliability score (SRS), a method building on the SWIF(r) generative framework that reflects the trustworthiness of the classification of a specific instance. The concept of the reliability score has the potential to generalize to other machine learning methods. We demonstrate the utility of the SRS when faced with common challenges in machine learning including: 1) an unknown class present in testing data that was not present in training data, 2) systemic mismatch between training and testing data, and 3) instances of testing data that have missing values for some attributes. We explore these applications of the SRS using a range of biological datasets, from agricultural data on seed morphology, to 22 quantitative traits in the UK Biobank, and population genetic simulations and 1000 Genomes Project data. With each of these examples, we demonstrate how the SRS can allow researchers to interrogate their data and training approach thoroughly, and to pair their domain-specific knowledge with powerful machine-learning frameworks. We also compare the SRS to related tools for outlier and novelty detection, and find that it has comparable performance, with the advantage of being able to operate when some data are missing. The SRS, and the broader discussion of interpretable scientific machine learning, will aid researchers in the biological machine learning space as they seek to harness the power of machine learning without sacrificing rigor and biological insight.
Affiliation(s)
- K. D. Ahlquist
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, Rhode Island, United States of America
| | - Lauren A. Sugden
- Department of Mathematics and Computer Science, Duquesne University, Pittsburgh, Pennsylvania, United States of America
| | - Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Department of Ecology, Evolution and Organismal Biology, Brown University, Providence, Rhode Island, United States of America
- Data Science Initiative, Brown University, Providence, Rhode Island, United States of America
| |
|
35
|
Mathieson I, Day FR, Barban N, Tropf FC, Brazel DM, Vaez A, van Zuydam N, Bitarello BD, Gardner EJ, Akimova ET, Azad A, Bergmann S, Bielak LF, Boomsma DI, Bosak K, Brumat M, Buring JE, Cesarini D, Chasman DI, Chavarro JE, Cocca M, Concas MP, Davey Smith G, Davies G, Deary IJ, Esko T, Faul JD, Franco O, Ganna A, Gaskins AJ, Gelemanovic A, de Geus EJC, Gieger C, Girotto G, Gopinath B, Grabe HJ, Gunderson EP, Hayward C, He C, van Heemst D, Hill WD, Hoffmann ER, Homuth G, Hottenga JJ, Huang H, Hyppӧnen E, Ikram MA, Jansen R, Johannesson M, Kamali Z, Kardia SLR, Kavousi M, Kifley A, Kiiskinen T, Kraft P, Kühnel B, Langenberg C, Liew G, Lind PA, Luan J, Mägi R, Magnusson PKE, Mahajan A, Martin NG, Mbarek H, McCarthy MI, McMahon G, Medland SE, Meitinger T, Metspalu A, Mihailov E, Milani L, Missmer SA, Mitchell P, Møllegaard S, Mook-Kanamori DO, Morgan A, van der Most PJ, de Mutsert R, Nauck M, Nolte IM, Noordam R, Penninx BWJH, Peters A, Peyser PA, Polašek O, Power C, Pribisalic A, Redmond P, Rich-Edwards JW, Ridker PM, Rietveld CA, Ring SM, Rose LM, Rueedi R, Shukla V, Smith JA, Stankovic S, Stefánsson K, Stöckl D, Strauch K, Swertz MA, Teumer A, Thorleifsson G, Thorsteinsdottir U, Thurik AR, Timpson NJ, Turman C, Uitterlinden AG, Waldenberger M, Wareham NJ, Weir DR, Willemsen G, Zhao JH, Zhao W, Zhao Y, Snieder H, den Hoed M, Ong KK, Mills MC, Perry JRB. Genome-wide analysis identifies genetic effects on reproductive success and ongoing natural selection at the FADS locus. Nat Hum Behav 2023; 7:790-801. [PMID: 36864135 DOI: 10.1038/s41562-023-01528-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/30/2021] [Accepted: 01/12/2023] [Indexed: 03/04/2023]
Abstract
Identifying genetic determinants of reproductive success may highlight mechanisms underlying fertility and identify alleles under present-day selection. Using data in 785,604 individuals of European ancestry, we identified 43 genomic loci associated with either number of children ever born (NEB) or childlessness. These loci span diverse aspects of reproductive biology, including puberty timing, age at first birth, sex hormone regulation, endometriosis and age at menopause. Missense variants in ARHGAP27 were associated with higher NEB but shorter reproductive lifespan, suggesting a trade-off at this locus between reproductive ageing and intensity. Other genes implicated by coding variants include PIK3IP1, ZFP82 and LRP4, and our results suggest a new role for the melanocortin 1 receptor (MC1R) in reproductive biology. As NEB is one component of evolutionary fitness, our identified associations indicate loci under present-day natural selection. Integration with data from historical selection scans highlighted an allele in the FADS1/2 gene locus that has been under selection for thousands of years and remains so today. Collectively, our findings demonstrate that a broad range of biological mechanisms contribute to reproductive success.
Affiliation(s)
- Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
| | - Felix R Day
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Nicola Barban
- Alma Mater Studiorum University of Bologna, Bologna, Italy
| | - Felix C Tropf
- Nuffield College, University of Oxford, Oxford, UK
- École Nationale de la Statistique et de L'administration Économique (ENSAE), Paris, France
- Center for Research in Economics and Statistics (CREST), Paris, France
| | - David M Brazel
- Nuffield College, University of Oxford, Oxford, UK
- Leverhulme Centre for Demographic Science, University of Oxford, Oxford, UK
| | - Ahmad Vaez
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Bioinformatics, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Natalie van Zuydam
- Beijer Laboratory and Department of Immunology, Genetics and Pathology, Uppsala University and SciLifeLab, Uppsala, Sweden
| | - Bárbara D Bitarello
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Eugene J Gardner
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Evelina T Akimova
- Nuffield College, University of Oxford, Oxford, UK
- Leverhulme Centre for Demographic Science, University of Oxford, Oxford, UK
| | - Ajuna Azad
- DNRF Center for Chromosome Stability, Department of Cellular and Molecular Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Sven Bergmann
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa
| | - Lawrence F Bielak
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Dorret I Boomsma
- Department of Biological Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Amsterdam Reproduction and Development (AR&D) Research Institute, Amsterdam, the Netherlands
| | | | - Marco Brumat
- Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste, Italy
| | - Julie E Buring
- Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - David Cesarini
- Department of Economics, New York University, New York, NY, USA
- Research Institute for Industrial Economics, Stockholm, Sweden
- National Bureau of Economic Research, Cambridge, MA, USA
| | - Daniel I Chasman
- Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Jorge E Chavarro
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Massimiliano Cocca
- Institute for Maternal and Child Health, IRCCS 'Burlo Garofolo', Trieste, Italy
| | - Maria Pina Concas
- Institute for Maternal and Child Health, IRCCS 'Burlo Garofolo', Trieste, Italy
| | | | - Gail Davies
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Ian J Deary
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Tõnu Esko
- Estonian Genome Center, University of Tartu, Tartu, Estonia
- Broad Institute of the Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
| | - Jessica D Faul
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Oscar Franco
- Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands
| | - Andrea Ganna
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Audrey J Gaskins
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
| | | | - Eco J C de Geus
- Department of Biological Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| | - Christian Gieger
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Giorgia Girotto
- Department of Medical, Surgical and Health Sciences, University of Trieste, Trieste, Italy
- Institute for Maternal and Child Health, IRCCS 'Burlo Garofolo', Trieste, Italy
| | - Bamini Gopinath
- Centre for Vision Research, Westmead Institute for Medical Research and Department of Ophthalmology, University of Sydney, Sydney, New South Wales, Australia
| | - Hans Jörgen Grabe
- Department of Psychiatry and Psychotherapy, University Medicine Greifswald, Greifswald, Germany
| | - Erica P Gunderson
- Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
| | - Caroline Hayward
- Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
| | - Chunyan He
- Markey Cancer Center, University of Kentucky, Lexington, KY, USA
- Department of Internal Medicine, Division of Medical Oncology, University of Kentucky College of Medicine, Lexington, KY, USA
| | - Diana van Heemst
- Department of Internal Medicine, Section of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands
| | - W David Hill
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Eva R Hoffmann
- DNRF Center for Chromosome Stability, Department of Cellular and Molecular Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Georg Homuth
- Interfaculty Institute for Genetics and Functional Genomics, University of Greifswald, Greifswald, Germany
| | - Jouke Jan Hottenga
- Department of Biological Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| | - Hongyang Huang
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Elina Hyppönen
- Australian Centre for Precision Health, University of South Australia Cancer Research Institute, Adelaide, South Australia, Australia
- South Australian Health and Medical Research Institute, Adelaide, South Australia, Australia
| | - M Arfan Ikram
- Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Rick Jansen
- Department of Psychiatry, Amsterdam Public Health and Amsterdam Neuroscience, Amsterdam UMC, Vrije Universiteit, Amsterdam, the Netherlands
| | - Magnus Johannesson
- Department of Economics, Stockholm School of Economics, Stockholm, Sweden
| | - Zoha Kamali
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Bioinformatics, Isfahan University of Medical Sciences, Isfahan, Iran
| | - Sharon L R Kardia
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Maryam Kavousi
- Department of Epidemiology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Annette Kifley
- Centre for Vision Research, Westmead Institute for Medical Research and Department of Ophthalmology, University of Sydney, Sydney, New South Wales, Australia
| | - Tuomo Kiiskinen
- Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, Finland
- National Institute for Health and Welfare, Helsinki, Finland
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Brigitte Kühnel
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Claudia Langenberg
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Gerald Liew
- Centre for Vision Research, Westmead Institute for Medical Research and Department of Ophthalmology, University of Sydney, Sydney, New South Wales, Australia
| | - Penelope A Lind
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Jian'an Luan
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu, Estonia
| | - Patrik K E Magnusson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Anubha Mahajan
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Nicholas G Martin
- Genetic Epidemiology, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Hamdi Mbarek
- Department of Biological Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Qatar Genome Programme, Qatar Foundation Research, Development and Innovation, Qatar Foundation, Doha, Qatar
| | - Mark I McCarthy
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - George McMahon
- School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Sarah E Medland
- Psychiatric Genetics, QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Thomas Meitinger
- Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Andres Metspalu
- Estonian Genome Center, University of Tartu, Tartu, Estonia
- Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
| | | | - Lili Milani
- Estonian Genome Center, University of Tartu, Tartu, Estonia
| | - Stacey A Missmer
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Division of Adolescent and Young Adult Medicine, Department of Medicine, Boston Children's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Obstetrics, Gynecology, and Reproductive Biology, College of Human Medicine, Michigan State University, Grand Rapids, MI, USA
| | - Paul Mitchell
- Centre for Vision Research, Westmead Institute for Medical Research and Department of Ophthalmology, University of Sydney, Sydney, New South Wales, Australia
| | - Stine Møllegaard
- Department of Sociology, University of Copenhagen, Copenhagen, Denmark
| | - Dennis O Mook-Kanamori
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
- Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, the Netherlands
| | - Anna Morgan
- Institute for Maternal and Child Health, IRCCS 'Burlo Garofolo', Trieste, Italy
| | - Peter J van der Most
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Renée de Mutsert
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
| | - Matthias Nauck
- Institute of Clinical Chemistry and Laboratory Medicine, University Medicine Greifswald, Greifswald, Germany
| | - Ilja M Nolte
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Raymond Noordam
- Department of Internal Medicine, Section of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands
| | - Brenda W J H Penninx
- Department of Psychiatry, EMGO Institute for Health and Care Research and Neuroscience Campus Amsterdam, VU University Medical Center/GGZ inGeest, Amsterdam, the Netherlands
| | - Annette Peters
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Patricia A Peyser
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Ozren Polašek
- University of Split School of Medicine, Split, Croatia
- Algebra University College, Zagreb, Croatia
| | - Chris Power
- Population, Policy and Practice Research and Teaching Department, UCL Great Ormond Street Institute of Child Health, London, UK
| | | | - Paul Redmond
- Lothian Birth Cohorts, Department of Psychology, University of Edinburgh, Edinburgh, UK
| | - Janet W Rich-Edwards
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Women's Health, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Paul M Ridker
- Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Cornelius A Rietveld
- Erasmus University Rotterdam Institute for Behavior and Biology, Rotterdam, the Netherlands
- Department of Applied Economics, Erasmus School of Economics, Rotterdam, the Netherlands
| | - Susan M Ring
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
| | | | - Rico Rueedi
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Vallari Shukla
- DNRF Center for Chromosome Stability, Department of Cellular and Molecular Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Jennifer A Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Stasa Stankovic
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | | | - Doris Stöckl
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Konstantin Strauch
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
- Institute of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU, Munich, Germany
| | - Morris A Swertz
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Alexander Teumer
- Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany
| | | | | | - A Roy Thurik
- Erasmus University Rotterdam Institute for Behavior and Biology, Rotterdam, the Netherlands
- Department of Applied Economics, Erasmus School of Economics, Rotterdam, the Netherlands
- Montpellier Business School, Montpellier, France
| | | | - Constance Turman
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - André G Uitterlinden
- Erasmus University Rotterdam Institute for Behavior and Biology, Rotterdam, the Netherlands
- Department of Internal Medicine, Erasmus MC, University Medical Center Rotterdam, Rotterdam, the Netherlands
| | - Melanie Waldenberger
- Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Nicholas J Wareham
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - David R Weir
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Gonneke Willemsen
- Department of Biological Psychology, Amsterdam Public Health Research Institute, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| | - Jing Hau Zhao
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Wei Zhao
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, USA
| | - Yajie Zhao
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Harold Snieder
- Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
| | - Marcel den Hoed
- Beijer Laboratory and Department of Immunology, Genetics and Pathology, Uppsala University and SciLifeLab, Uppsala, Sweden
| | - Ken K Ong
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Melinda C Mills
- Nuffield College, University of Oxford, Oxford, UK.
- Leverhulme Centre for Demographic Science, University of Oxford, Oxford, UK.
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
- Department of Economics, Econometrics and Finance, University of Groningen, Groningen, the Netherlands.
| | - John R B Perry
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK.
|
36
|
Liao K, Carlson J, Zöllner S. The effect of mutation subtypes on the allele frequency spectrum and population genetics inference. G3 (Bethesda) 2023; 13:jkad035. [PMID: 36759699 PMCID: PMC10085755 DOI: 10.1093/g3journal/jkad035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/23/2023] [Accepted: 01/26/2023] [Indexed: 02/11/2023]
Abstract
Population genetics has adapted as technological advances in next-generation sequencing have resulted in an exponential increase of genetic data. A common approach to efficiently analyze genetic variation present in large sequencing data is through the allele frequency spectrum (AFS), defined as the distribution of allele frequencies in a sample. While the frequency spectrum serves to summarize patterns of genetic variation, it implicitly treats mutation types (e.g. A→C vs C→T) as interchangeable. However, mutations of different types arise and spread due to spatial and temporal variation in forces such as mutation rate and biased gene conversion that result in heterogeneity in the distribution of allele frequencies across sites. In this work, we explore the impact of this simplification on multiple aspects of population genetic modeling. As a site's mutation rate is strongly affected by flanking nucleotides, we defined a mutation subtype by the base pair change and adjacent nucleotides (e.g. AAA→ATA) and systematically assessed the heterogeneity in the frequency spectrum across 96 distinct 3-mer mutation subtypes using n = 3556 whole-genome sequenced individuals of European ancestry. We observed substantial variation across the subtype-specific frequency spectra, with some of the variation being influenced by molecular factors previously identified for single base mutation types. Estimates of model parameters from demographic inference performed for each mutation subtype's AFS individually varied drastically across the 96 subtypes. In local patterns of variation, a combination of regional subtype composition and local genomic factors shaped the regional frequency spectrum across genomic regions. Our results illustrate how treating variants in large sequencing samples as interchangeable may confound population genetic frameworks and encourage us to consider the unique evolutionary mechanisms of analyzed polymorphisms.
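The subtype-specific frequency spectrum described in this abstract can be illustrated with a minimal Python sketch. It assumes each site is given as a pair of a 3-mer subtype label and a derived allele count; the function name, the label format ("AAA>ATA") and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def frequency_spectra_by_subtype(sites, n_chromosomes):
    """Build one unfolded allele frequency spectrum per mutation subtype.

    sites: iterable of (subtype_label, derived_allele_count) pairs (hypothetical format).
    Returns {subtype: spectrum}, where spectrum[k - 1] is the number of
    segregating sites carrying exactly k derived alleles.
    """
    spectra = defaultdict(lambda: [0] * (n_chromosomes - 1))
    for subtype, derived_count in sites:
        # Only segregating sites (0 < count < n) contribute to the spectrum.
        if 0 < derived_count < n_chromosomes:
            spectra[subtype][derived_count - 1] += 1
    return dict(spectra)


# Toy input: five sites observed in a sample of four chromosomes.
toy_sites = [
    ("AAA>ATA", 1),
    ("AAA>ATA", 1),
    ("AAA>ATA", 3),
    ("ACG>ATG", 2),
    ("ACG>ATG", 0),  # monomorphic in the sample, so it is skipped
]
toy_spectra = frequency_spectra_by_subtype(toy_sites, n_chromosomes=4)
```

Pooling all subtypes into a single spectrum would correspond to the interchangeability assumption that the abstract questions.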
Affiliation(s)
- Kevin Liao
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jedidiah Carlson
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712, USA
- Department of Population Health, University of Texas at Austin, Austin, TX 78712, USA
| | - Sebastian Zöllner
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109, USA
|
37
|
Vilgalys TP, Klunk J, Demeure CE, Cheng X, Shiratori M, Madej J, Beau R, Elli D, Patino MI, Redfern R, DeWitte SN, Gamble JA, Boldsen JL, Carmichael A, Varlik N, Eaton K, Grenier JC, Golding GB, Devault A, Rouillard JM, Yotova V, Sindeaux R, Ye CJ, Bikaran M, Dumaine A, Brinkworth JF, Missiakas D, Rouleau GA, Steinrücken M, Pizarro-Cerdá J, Poinar HN, Barreiro LB. Reply to Barton et al: signatures of natural selection during the Black Death. bioRxiv 2023:2023.04.06.535944. [PMID: 37066254 PMCID: PMC10104142 DOI: 10.1101/2023.04.06.535944] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 04/18/2023]
Abstract
Barton et al. [1] raise several statistical concerns regarding our original analyses [2] that highlight the challenge of inferring natural selection using ancient genomic data. We show here that these concerns have limited impact on our original conclusions. Specifically, we recover the same signature of enrichment for high FST values at the immune loci relative to putatively neutral sites after switching the allele frequency estimation method to a maximum likelihood approach, filtering to only consider known human variants, and down-sampling our data to the same mean coverage across sites. Furthermore, using permutations, we show that the rs2549794 variant near ERAP2 continues to emerge as the strongest candidate for selection (p = 1.2 × 10⁻⁵), falling below the Bonferroni-corrected significance threshold recommended by Barton et al. Importantly, the evidence for selection on ERAP2 is further supported by functional data demonstrating the impact of the ERAP2 genotype on the immune response to Y. pestis and by epidemiological data from an independent group showing that the putatively selected allele during the Black Death protects against severe respiratory infection in contemporary populations.
Affiliation(s)
- Tauras P Vilgalys
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Jennifer Klunk
- McMaster Ancient DNA Centre, Departments of Anthropology, Biology and Biochemistry, McMaster University, Hamilton, Ontario, Canada L8S4L9
- Daicel Arbor Biosciences, Ann Arbor, MI, USA
| | - Christian E Demeure
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Yersinia Research Unit, Microbiology Department, F-75015 Paris, France
| | - Xiaoheng Cheng
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
| | - Mari Shiratori
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Julien Madej
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Yersinia Research Unit, Microbiology Department, F-75015 Paris, France
| | - Rémi Beau
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Yersinia Research Unit, Microbiology Department, F-75015 Paris, France
| | - Derek Elli
- Department of Microbiology, Ricketts Laboratory, University of Chicago, Lemont, IL, USA
| | - Maria I Patino
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Rebecca Redfern
- Centre for Human Bioarchaeology, Museum of London, London, UK, EC2Y 5HN
| | - Sharon N DeWitte
- Department of Anthropology, University of South Carolina, Columbia, SC, USA
| | - Julia A Gamble
- Department of Anthropology, University of Manitoba, Winnipeg, Manitoba, R3T2N2
| | - Jesper L Boldsen
- Department of Forensic Medicine, Unit of Anthropology (ADBOU), University of Southern Denmark, Odense S, 5260, Denmark
| | - Ann Carmichael
- History Department, Indiana University, Bloomington, IN, USA
| | - Nükhet Varlik
- Department of History, Rutgers University-Newark, NJ, USA
| | - Katherine Eaton
- McMaster Ancient DNA Centre, Departments of Anthropology, Biology and Biochemistry, McMaster University, Hamilton, Ontario, Canada L8S4L9
| | - Jean-Christophe Grenier
- Montreal Heart Institute, Faculty of Medicine, Université de Montréal, Montréal, Quebec, Canada, H1T 1C7
| | - G Brian Golding
- McMaster Ancient DNA Centre, Departments of Anthropology, Biology and Biochemistry, McMaster University, Hamilton, Ontario, Canada L8S4L9
| | | | - Jean-Marie Rouillard
- Daicel Arbor Biosciences, Ann Arbor, MI, USA
- Department of Chemical Engineering, University of Michigan Ann Arbor, Ann Arbor, MI, USA
| | - Vania Yotova
- Centre Hospitalier Universitaire Sainte-Justine, Montréal, Quebec, Canada, H3T 1C5
| | - Renata Sindeaux
- Centre Hospitalier Universitaire Sainte-Justine, Montréal, Quebec, Canada, H3T 1C5
| | - Chun Jimmie Ye
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Matin Bikaran
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Anne Dumaine
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Jessica F Brinkworth
- Department of Anthropology, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Carl R Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Dominique Missiakas
- Department of Microbiology, Ricketts Laboratory, University of Chicago, Lemont, IL, USA
| | - Guy A Rouleau
- Montreal Neurological Institute-Hospital, McGill University, Montréal, Quebec, Canada, H3A 2B4
| | - Matthias Steinrücken
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Javier Pizarro-Cerdá
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Yersinia Research Unit, Microbiology Department, F-75015 Paris, France
| | - Hendrik N Poinar
- McMaster Ancient DNA Centre, Departments of Anthropology, Biology and Biochemistry, McMaster University, Hamilton, Ontario, Canada L8S4L9
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Department of Microbiology, Ricketts Laboratory, University of Chicago, Lemont, IL, USA
| | - Luis B Barreiro
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Centre for Human Bioarchaeology, Museum of London, London, UK, EC2Y 5HN
- Department of Anthropology, University of South Carolina, Columbia, SC, USA
- Department of Anthropology, University of Manitoba, Winnipeg, Manitoba, R3T2N2
|
38
|
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
|
39
|
Barton AR, Santander CG, Skoglund P, Moltke I, Reich D, Mathieson I. Insufficient evidence for natural selection associated with the Black Death. bioRxiv 2023:2023.03.14.532615. [PMID: 36993413 PMCID: PMC10055098 DOI: 10.1101/2023.03.14.532615] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 05/07/2023]
Abstract
Klunk et al. analyzed ancient DNA data from individuals in London and Denmark before, during and after the Black Death [1], and argued that allele frequency changes at immune genes were too large to be produced by random genetic drift and thus must reflect natural selection. They also identified four specific variants that they claimed show evidence of selection, including at ERAP2, for which they estimate a selection coefficient of 0.39, several times larger than any selection coefficient on a common human variant reported to date. Here we show that these claims are unsupported for four reasons. First, the signal of enrichment of large allele frequency changes in immune genes comparing people in London before and after the Black Death disappears after an appropriate randomization test is carried out: the P value increases by ten orders of magnitude and is no longer significant. Second, a technical error in the estimation of allele frequencies means that none of the four originally reported loci actually pass the filtering thresholds. Third, the filtering thresholds do not adequately correct for multiple testing. Finally, in the case of the ERAP2 variant rs2549794, which Klunk et al. show experimentally may be associated with a host interaction with Y. pestis, we find no evidence of significant frequency change either in the data that Klunk et al. report, or in published data spanning 2,000 years. While it remains plausible that immune genes were subject to natural selection during the Black Death, the magnitude of this selection and which specific genes may have been affected remain unknown.
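As a generic illustration of the two statistical ideas raised in this abstract, the Python sketch below shows a label-permutation test for enrichment of extreme values and a Bonferroni threshold. It is not the procedure used by either group: the function names, the choice of statistic (the fraction of candidate sites above a quantile of the neutral values) and all parameters are assumptions made for illustration.

```python
import random


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def enrichment_permutation_p(candidate, neutral, quantile=0.99, n_perm=10000, seed=0):
    """Permutation p-value for enrichment of extreme values among candidate sites.

    The statistic is the fraction of candidate values above the chosen
    quantile of the neutral values. Class labels are shuffled to build the null.
    """
    rng = random.Random(seed)
    neutral_sorted = sorted(neutral)
    threshold = neutral_sorted[int(quantile * (len(neutral_sorted) - 1))]

    def fraction_above(values):
        return sum(v > threshold for v in values) / len(values)

    observed = fraction_above(candidate)
    pooled = list(candidate) + list(neutral)
    n_candidate = len(candidate)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # The first n_candidate entries play the role of the candidate class.
        if fraction_above(pooled[:n_candidate]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

A p-value from such a test would be compared against `bonferroni_threshold(alpha, n_tests)` when many variants are tested at once.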
Affiliation(s)
- Alison R. Barton
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Cindy G. Santander
- Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
| | - Pontus Skoglund
- Ancient Genomics Laboratory, The Francis Crick Institute, London NW1 1AT, UK
| | - Ida Moltke
- Department of Biology, University of Copenhagen, Copenhagen, DK-2200, Denmark
| | - David Reich
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA 19104, USA
| |
Collapse
|
40
|
Davy T, Ju D, Mathieson I, Skoglund P. Hunter-gatherer admixture facilitated natural selection in Neolithic European farmers. Curr Biol 2023; 33:1365-1371.e3. [PMID: 36963383 PMCID: PMC10153476 DOI: 10.1016/j.cub.2023.02.049] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2022] [Revised: 11/17/2022] [Accepted: 02/15/2023] [Indexed: 03/26/2023]
Abstract
Ancient DNA has revealed multiple episodes of admixture in human prehistory during geographic expansions associated with cultural innovations. One important example is the expansion of Neolithic agricultural groups out of the Near East into Europe and their consequent admixture with Mesolithic hunter-gatherers [1,2,3,4]. Ancient genomes from this period provide an opportunity to study the role of admixture in providing new genetic variation for selection to act upon, and also to identify genomic regions that resisted hunter-gatherer introgression and may thus have contributed to agricultural adaptations. We used genome-wide DNA from 677 individuals spanning Mesolithic and Neolithic Europe to infer ancestry deviations in the genomes of admixed individuals and to test for natural selection after admixture by testing for deviations from a genome-wide null distribution. We find that the region around the pigmentation-associated gene SLC24A5 shows the greatest overrepresentation of Neolithic local ancestry in the genome (|Z| = 3.46). In contrast, we find the greatest overrepresentation of Mesolithic ancestry across the major histocompatibility complex (MHC; |Z| = 4.21), a major immunity locus, which also shows allele frequency deviations indicative of selection following admixture (p = 1 × 10^-56). This could reflect negative frequency-dependent selection on MHC alleles common in Neolithic populations, or it could indicate that Mesolithic alleles were positively selected for and facilitated adaptation in Neolithic populations to pathogens or other environmental factors. Our study extends previous results that highlight immune function and pigmentation as targets of adaptation in more recent populations to selection processes in the Stone Age.
Affiliation(s)
- Tom Davy
- Ancient Genomics Laboratory, Francis Crick Institute, 1 Midland Road, NW1 1AT London, UK
- Dan Ju
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Philadelphia, PA 19104, USA
- Iain Mathieson
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Philadelphia, PA 19104, USA
- Pontus Skoglund
- Ancient Genomics Laboratory, Francis Crick Institute, 1 Midland Road, NW1 1AT London, UK
|
41
|
Fuhrmann N, Prakash C, Kaiser TS. Polygenic adaptation from standing genetic variation allows rapid ecotype formation. eLife 2023; 12:82824. [PMID: 36852484 PMCID: PMC9977305 DOI: 10.7554/elife.82824] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 02/07/2023] [Indexed: 03/01/2023] Open
Abstract
Adaptive ecotype formation can be the first step to speciation, but the genetic underpinnings of this process are poorly understood. Marine midges of the genus Clunio (Diptera) have recolonized Northern European shore areas after the last glaciation. In response to local tide conditions they have formed different ecotypes with respect to timing of adult emergence, oviposition behavior and larval habitat. Genomic analysis confirms the recent establishment of these ecotypes, reflected in massive haplotype sharing between ecotypes, irrespective of whether there is ongoing gene flow or geographic isolation. QTL mapping and genome screens reveal patterns of polygenic adaptation from standing genetic variation. Ecotype-associated loci prominently include circadian clock genes, as well as genes affecting sensory perception and nervous system development, hinting at a central role of these processes in ecotype formation. Our data show that adaptive ecotype formation can occur rapidly, with ongoing gene flow and largely based on a re-assortment of existing alleles.
Affiliation(s)
- Nico Fuhrmann
- Max Planck Institute for Evolutionary Biology, Plön, Germany
- Celine Prakash
- Max Planck Institute for Evolutionary Biology, Plön, Germany
|
42
|
Hawkes G, Yengo L, Vedantam S, Marouli E, Beaumont RN, Tyrrell J, Weedon MN, Hirschhorn J, Frayling TM, Wood AR. Identification and analysis of individuals who deviate from their genetically-predicted phenotype. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.10.528019. [PMID: 36798175 PMCID: PMC9934696 DOI: 10.1101/2023.02.10.528019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 02/13/2023]
Abstract
Findings from genome-wide association studies have facilitated the generation of genetic predictors for many common human phenotypes. Stratifying individuals misaligned to a genetic predictor based on common variants may be important for follow-up studies that aim to identify alternative causal factors. Using genome-wide imputed genetic data, we aimed to classify 158,951 unrelated individuals from the UK Biobank as either concordant or deviating from two well-measured phenotypes. We first applied our methods to standing height: our primary analysis classified 244 individuals (0.15%) as misaligned to their genetically predicted height. We show that these individuals are enriched for self-reporting being shorter or taller than average at age 10, diagnosed congenital malformations, and rare loss-of-function variants in genes previously catalogued as causal for growth disorders. Secondly, we apply our methods to LDL cholesterol. We classified 156 (0.12%) individuals as misaligned to their genetically predicted LDL cholesterol and show that these individuals were enriched for both clinically actionable cardiovascular risk factors and rare genetic variants in genes previously shown to be involved in metabolic processes. Individuals whose LDL-C was higher than expected based on the genetic predictor were also at higher risk of developing coronary artery disease and type-two diabetes, even after adjustment for measured LDL-C, BMI and age, suggesting upward deviation from genetically predicted LDL-C is indicative of generally poor health. Our results remained broadly consistent when performing sensitivity analysis based on a variety of parametric and non-parametric methods to define individuals deviating from polygenic expectation. Our analyses demonstrate the potential importance of quantitatively identifying individuals for further follow-up based on deviation from genetic predictions. 
Author Summary: Human genetics is becoming increasingly useful to help predict human traits across a population owing to findings from large-scale genetic association studies and advances in the power of genetic predictors. This provides an opportunity to potentially identify individuals that deviate from genetic predictions for a common phenotype under investigation. For example, an individual may be genetically predicted to be tall, but be shorter than expected. It is potentially important to identify individuals who deviate from genetic predictions as this can facilitate further follow-up to assess likely causes. Using 158,951 unrelated individuals from the UK Biobank, with height and LDL cholesterol as exemplar traits, we demonstrate that approximately 0.15% and 0.12% of individuals, respectively, deviate from their genetically predicted phenotypes. We observed these individuals to be enriched for a range of rare clinical diagnoses, as well as rare genetic factors that may be causal. Our analyses also demonstrate several methods for detecting individuals who deviate from genetic predictions that can be applied to a range of continuous human phenotypes.
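The idea of flagging individuals who deviate from a genetic prediction can be illustrated with a small sketch. It is not the authors' procedure: it assumes a simple rule in which the phenotype is regressed on a polygenic score by ordinary least squares and individuals whose standardized residual exceeds a threshold are flagged. The function name `flag_deviators`, the threshold of 3 and the toy data are all hypothetical.

```python
import statistics

def flag_deviators(scores, phenotypes, z_threshold=3.0):
    """Return indices of individuals whose phenotype is far from the
    value predicted by a least-squares fit on the polygenic score.

    Assumption: a single linear predictor and a residual z-score rule;
    the paper compares several parametric and non-parametric definitions.
    """
    mean_s = statistics.fmean(scores)
    mean_y = statistics.fmean(phenotypes)
    sxx = sum((s - mean_s) ** 2 for s in scores)
    sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(scores, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_s
    residuals = [y - (intercept + slope * s) for s, y in zip(scores, phenotypes)]
    resid_sd = statistics.pstdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r / resid_sd) > z_threshold]

# Toy data (hypothetical): 100 individuals, height in cm.
scores = [i / 10 for i in range(-50, 50)]
noise = [((i * 37) % 11 - 5) / 5 for i in range(100)]  # deterministic, within [-1, 1]
phenotypes = [170 + 5 * s + e for s, e in zip(scores, noise)]
phenotypes[10] -= 15  # one individual much shorter than predicted

flagged = flag_deviators(scores, phenotypes)
```

In this toy example only the individual at index 10 is flagged. A real analysis would also need to adjust for covariates such as age, sex and ancestry before interpreting a residual as a deviation.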
Affiliation(s)
- Gareth Hawkes
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
- Loic Yengo
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
- Eirini Marouli
- William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London
- Robin N Beaumont
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
- Jessica Tyrrell
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
- Michael N Weedon
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
- Timothy M Frayling
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
- Andrew R Wood
- Genetics of Complex Traits, College of Medicine and Health, University of Exeter, Exeter, Devon, UK
|
43
|
Mahmoud M, Tost M, Ha NT, Simianer H, Beissinger T. Ghat: an R package for identifying adaptive polygenic traits. G3 (BETHESDA, MD.) 2023; 13:jkac319. [PMID: 36454082 PMCID: PMC9911052 DOI: 10.1093/g3journal/jkac319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Revised: 01/21/2022] [Accepted: 11/14/2022] [Indexed: 12/03/2022]
Abstract
Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. Quantitative trait loci (QTL) that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that captures this relationship, followed by implementing a linkage disequilibrium-aware permutation test to evaluate if the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe "Ghat," an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub.
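The composite statistic described above can be sketched in a few lines. The example below is written in Python rather than R and does not use the Ghat package or its API; it assumes the statistic is the sum over markers of effect size times allele frequency change, and it replaces the linkage disequilibrium-aware permutation test with a plain shuffle that treats markers as independent. The function names and the toy data are hypothetical.

```python
import random

def composite_statistic(effects, freq_changes):
    """Sum over markers of (estimated effect) x (allele frequency change)."""
    return sum(b * d for b, d in zip(effects, freq_changes))

def permutation_p_value(effects, freq_changes, n_perm=2000, seed=0):
    """Two-sided permutation p-value obtained by shuffling effects across markers.

    Assumption: markers are exchangeable (no linkage disequilibrium).
    An LD-aware test, as described in the abstract, would have to account
    for correlation among nearby markers.
    """
    rng = random.Random(seed)
    observed = composite_statistic(effects, freq_changes)
    shuffled = list(effects)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(composite_statistic(shuffled, freq_changes)) >= abs(observed):
            n_extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (n_extreme + 1) / (n_perm + 1)

# Toy data (hypothetical): frequency changes track effect sizes,
# as would be expected under directional selection on the trait.
rng = random.Random(1)
effects = [rng.gauss(0, 1) for _ in range(200)]
freq_changes = [0.02 * b + rng.gauss(0, 0.01) for b in effects]

observed, p_value = permutation_p_value(effects, freq_changes)
```

A positive statistic with a small p-value suggests that alleles increasing the trait rose in frequency more than expected by chance. Ignoring linkage disequilibrium, as this sketch does, would tend to make the test anticonservative on real genotype data.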
Affiliation(s)
- Medhat Mahmoud
- Department of Crop Science, University of Goettingen, Goettingen 37075, Germany
- Center for Integrated Breeding Research, University of Goettingen, Goettingen 37075, Germany
- Mila Tost
- Department of Crop Science, University of Goettingen, Goettingen 37075, Germany
- Center for Integrated Breeding Research, University of Goettingen, Goettingen 37075, Germany
- Ngoc-Thuy Ha
- Department of Animal Sciences, University of Goettingen, Goettingen 37075, Germany
- Henner Simianer
- Center for Integrated Breeding Research, University of Goettingen, Goettingen 37075, Germany
- Department of Animal Sciences, University of Goettingen, Goettingen 37075, Germany
- Timothy Beissinger
- Department of Crop Science, University of Goettingen, Goettingen 37075, Germany
- Center for Integrated Breeding Research, University of Goettingen, Goettingen 37075, Germany
|
44
|
Yuan S, Shi Y, Zhou BF, Liang YY, Chen XY, An QQ, Fan YR, Shen Z, Ingvarsson PK, Wang B. Genomic vulnerability to climate change in Quercus acutissima, a dominant tree species in East Asian deciduous forests. Mol Ecol 2023; 32:1639-1655. [PMID: 36626136 DOI: 10.1111/mec.16843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2022] [Revised: 12/30/2022] [Accepted: 01/05/2023] [Indexed: 01/11/2023]
Abstract
Understanding the evolutionary processes that shape the landscape of genetic variation and influence the response of species to future climate change is critical for biodiversity conservation. Here, we sampled 27 populations across the distribution range of a dominant forest tree, Quercus acutissima, in East Asia, and applied genome-wide analyses to track the evolutionary history and predict the fate of populations under future climate. We found two genetic groups (East and West) in Q. acutissima that diverged during the Pliocene. We also found a heterogeneous landscape of genomic variation in this species, which may have been shaped by population demography and linked selection. Using genotype-environment association analyses, we identified climate-associated SNPs in a diverse set of genes and functional categories, indicating a model of polygenic adaptation in Q. acutissima. We further estimated three genetic offset metrics to quantify genomic vulnerability of this species to climate change due to the complex interplay between local adaptation and migration. We found that marginal populations are at higher risk of local extinction because of future climate change, and may not be able to track suitable habitats to maintain the gene-environment relationships observed under the current climate. We also detected higher reverse genetic offsets in northern China, indicating that genetic variation currently present in the whole range of Q. acutissima may not adapt to future climate conditions in this area. Overall, this study illustrates how evolutionary processes have shaped the landscape of genomic variation, and provides a comprehensive genome-wide view of climate maladaptation in Q. acutissima.
Affiliation(s)
- Shuai Yuan
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Yong Shi
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Biao-Feng Zhou
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Yi-Ye Liang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Xue-Yan Chen
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Qing-Qing An
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Yan-Ru Fan
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Zhao Shen
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
- Pär K Ingvarsson
- Department of Plant Biology, Linnean Center for Plant Biology, Uppsala BioCenter, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Baosheng Wang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Guangdong Provincial Key Laboratory of Applied Botany, Guangzhou, China; South China National Botanical Garden, Guangzhou, China
|
45
|
Genetic footprints of assortative mating in the Japanese population. Nat Hum Behav 2023; 7:65-73. [PMID: 36138222 PMCID: PMC9883156 DOI: 10.1038/s41562-022-01438-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2021] [Accepted: 07/20/2022] [Indexed: 02/03/2023]
Abstract
Assortative mating (AM) is a pattern characterized by phenotypic similarities between mating partners. Detecting evidence of AM has been challenging due to the lack of large-scale datasets that include phenotypic data on both partners, especially in populations of non-European ancestries. Gametic phase disequilibrium between trait-associated alleles is a signature of parental AM on a polygenic trait, which can be detected even without partner data. Here, using polygenic scores for 81 traits in the Japanese population, derived from BioBank Japan Project genome-wide association study data (n = 172,270), we found evidence of AM on the liability to type 2 diabetes and coronary artery disease, as well as on dietary habits. In a cross-population comparison using United Kingdom Biobank data (n = 337,139), we found shared but heterogeneous impacts of AM between populations.
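One common way to look for gametic phase disequilibrium without partner data is to correlate, across individuals, a polygenic score built from odd-numbered chromosomes with one built from even-numbered chromosomes. In the absence of AM (and of confounding such as population structure) the two scores should be uncorrelated, because the chromosomes segregate independently. The sketch below illustrates that check on simulated scores; it is not taken from the paper, and the shared component used to induce the correlation is a stand-in assumption for the effect of parental AM.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical): per-individual polygenic scores computed
# separately from odd- and even-numbered chromosomes. The shared term
# mimics the cross-chromosome correlation that parental AM would induce.
rng = random.Random(42)
odd_scores, even_scores = [], []
for _ in range(5000):
    shared = rng.gauss(0, 1)
    odd_scores.append(0.3 * shared + rng.gauss(0, 1))
    even_scores.append(0.3 * shared + rng.gauss(0, 1))

r_odd_even = pearson_r(odd_scores, even_scores)
```

A clearly positive `r_odd_even` is the kind of signal interpreted as a footprint of AM. On real data the estimate would need adjustment for ancestry and other covariates before drawing that conclusion.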
|
46
|
Valette T, Leitwein M, Lascaux JM, Desmarais E, Berrebi P, Guinand B. Redundancy analysis, genome-wide association studies and the pigmentation of brown trout (Salmo trutta L.). JOURNAL OF FISH BIOLOGY 2023; 102:96-118. [PMID: 36218076 DOI: 10.1111/jfb.15243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/26/2021] [Accepted: 10/04/2022] [Indexed: 06/16/2023]
Abstract
The association of molecular variants with phenotypic variation is a central issue in biology, often tackled with genome-wide association studies (GWAS). GWAS are challenging, with increasing, but still limited, use in evolutionary biology. We used redundancy analysis (RDA) as a complementary ordination approach to single- and multitrait GWAS to explore the molecular basis of pigmentation variation in brown trout (Salmo trutta) belonging to wild populations impacted by hatchery fish. Based on 75,684 single nucleotide polymorphic (SNP) markers, RDA, single- and multitrait GWAS allowed the extraction of 337 independent colour patterning loci (CPLs) associated with trout pigmentation traits, such as the number of red and black spots on flanks. Collectively, these CPLs (i) mapped onto 35 out of 40 brown trout linkage groups indicating a polygenic genomic architecture of pigmentation, (ii) were found to be associated with 218 candidate genes, including 197 genes formerly mentioned in the literature associated with skin pigmentation, skin patterning, differentiation or structure notably in a close relative, the rainbow trout (Oncorhynchus mykiss), and (iii) related to functions relevant to pigmentation variation (e.g., calcium- and ion-binding, cell adhesion). Annotated CPLs include genes with well-known pigmentation effects (e.g., PMEL, SLC45A2, SOX10), but also markers associated with genes formerly found expressed in rainbow or brown trout skins. RDA was also shown to be useful to investigate management issues, especially the dynamics of trout pigmentation submitted to several generations of hatchery introgression.
|
47
|
Liu L, Khan A, Sanchez-Rodriguez E, Zanoni F, Li Y, Steers N, Balderes O, Zhang J, Krithivasan P, LeDesma RA, Fischman C, Hebbring SJ, Harley JB, Moncrieffe H, Kottyan LC, Namjou-Khales B, Walunas TL, Knevel R, Raychaudhuri S, Karlson EW, Denny JC, Stanaway IB, Crosslin D, Rauen T, Floege J, Eitner F, Moldoveanu Z, Reily C, Knoppova B, Hall S, Sheff JT, Julian BA, Wyatt RJ, Suzuki H, Xie J, Chen N, Zhou X, Zhang H, Hammarström L, Viktorin A, Magnusson PKE, Shang N, Hripcsak G, Weng C, Rundek T, Elkind MSV, Oelsner EC, Barr RG, Ionita-Laza I, Novak J, Gharavi AG, Kiryluk K. Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits. Nat Commun 2022; 13:6859. [PMID: 36369178 PMCID: PMC9651905 DOI: 10.1038/s41467-022-34456-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2021] [Accepted: 10/25/2022] [Indexed: 11/13/2022] Open
Abstract
Immunoglobulin A (IgA) mediates mucosal responses to food antigens and the intestinal microbiome and is involved in susceptibility to mucosal pathogens, celiac disease, inflammatory bowel disease, and IgA nephropathy. We performed a genome-wide association study of serum IgA levels in 41,263 individuals of diverse ancestries and identified 20 genome-wide significant loci, including 9 known and 11 novel loci. Co-localization analyses with expression QTLs prioritized candidate genes for 14 of 20 significant loci. Most loci encoded genes that produced immune defects and IgA abnormalities when genetically manipulated in mice. We also observed positive genetic correlations of serum IgA levels with IgA nephropathy, type 2 diabetes, and body mass index, and negative correlations with celiac disease, inflammatory bowel disease, and several infections. Mendelian randomization supported elevated serum IgA as a causal factor in IgA nephropathy. African ancestry was consistently associated with higher serum IgA levels and greater frequency of IgA-increasing alleles compared to other ancestries. Our findings provide novel insights into the genetic regulation of IgA levels and its potential role in human disease.
Collapse
Affiliation(s)
- Lili Liu
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Atlas Khan
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Elena Sanchez-Rodriguez
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Francesca Zanoni
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Yifu Li
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Nicholas Steers
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Olivia Balderes
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Junying Zhang
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Priya Krithivasan
- grid.21729.3f0000000419368729Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY USA
| | - Robert A. LeDesma
- grid.16750.350000 0001 2097 5006Lewis Thomas Laboratory, Department of Molecular Biology, Princeton University, Princeton, NJ USA
| | - Clara Fischman
- grid.25879.310000 0004 1936 8972Department of Medicine, University of Pennsylvania, Philadelphia, PA USA
| | - Scott J. Hebbring
- grid.280718.40000 0000 9274 7048Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI USA
| | - John B. Harley
- grid.239573.90000 0000 9025 8099Center of Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital, Cincinnati, OH USA ,grid.24827.3b0000 0001 2179 9593Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH USA ,grid.413848.20000 0004 0420 2128US Department of Veterans Affairs Medical Center, Cincinnati, OH USA
| | - Halima Moncrieffe
- grid.239573.90000 0000 9025 8099Center of Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital, Cincinnati, OH USA ,grid.24827.3b0000 0001 2179 9593Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH USA
| | - Leah C. Kottyan
- grid.239573.90000 0000 9025 8099Center of Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital, Cincinnati, OH USA ,grid.24827.3b0000 0001 2179 9593Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH USA
| | - Bahram Namjou-Khales
- grid.239573.90000 0000 9025 8099Center of Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital, Cincinnati, OH USA ,grid.24827.3b0000 0001 2179 9593Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH USA
| | - Theresa L. Walunas
- grid.16753.360000 0001 2299 3507Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL USA
| | - Rachel Knevel
- grid.62560.370000 0004 0378 8294Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA USA
| | - Soumya Raychaudhuri
- grid.62560.370000 0004 0378 8294Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA USA
| | - Elizabeth W. Karlson
- grid.62560.370000 0004 0378 8294Division of Rheumatology, Immunology and Allergy, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA USA
| | - Joshua C. Denny
- grid.152326.10000 0001 2264 7217Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN USA
| | - Ian B. Stanaway
- grid.34477.330000000122986657Kidney Research Institute, Division of Nephrology, Department of Medicine, University of Washington, Seattle, WA USA
| | - David Crosslin
- grid.34477.330000000122986657Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Seattle, WA USA
| | - Thomas Rauen
- grid.1957.a0000 0001 0728 696XDepartment of Nephrology, RWTH University of Aachen, Aachen, Germany
| | - Jürgen Floege
- grid.1957.a0000 0001 0728 696XDepartment of Nephrology, RWTH University of Aachen, Aachen, Germany
| | - Frank Eitner
- grid.1957.a0000 0001 0728 696XDepartment of Nephrology, RWTH University of Aachen, Aachen, Germany ,grid.420044.60000 0004 0374 4101Kidney Diseases Research, Bayer Pharma AG, Wuppertal, Germany
| | - Zina Moldoveanu
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Colin Reily
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Barbora Knoppova
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Stacy Hall
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Justin T. Sheff
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Bruce A. Julian
- grid.265892.20000000106344187Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL USA
| | - Robert J. Wyatt
- grid.267301.10000 0004 0386 9246Division of Pediatric Nephrology, University of Tennessee Health Sciences Center, Memphis, TN USA
| | - Hitoshi Suzuki
- grid.258269.20000 0004 1762 2738Department of Nephrology, Juntendo University Faculty of Medicine, Tokyo, Japan
| | - Jingyuan Xie
- grid.16821.3c0000 0004 0368 8293Department of Nephrology, Institute of Nephrology, Shanghai Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Nan Chen
- grid.16821.3c0000 0004 0368 8293Department of Nephrology, Institute of Nephrology, Shanghai Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Xujie Zhou
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, China
| | - Hong Zhang
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, China
| | - Lennart Hammarström
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
| | - Alexander Viktorin
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Patrik K. E. Magnusson
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Ning Shang
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Chunhua Weng
- Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Tatjana Rundek
- Department of Neurology, University of Miami, Miami, FL, USA; Evelyn F. McKnight Brain Institute, University of Miami, Miami, FL, USA
| | - Mitchell S. V. Elkind
- Department of Neurology, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Elizabeth C. Oelsner
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - R. Graham Barr
- Division of General Medicine, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA; Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - Iuliana Ionita-Laza
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - Jan Novak
- Department of Microbiology and Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Ali G. Gharavi
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA
| | - Krzysztof Kiryluk
- Division of Nephrology, Department of Medicine, Vagelos College of Physicians & Surgeons, Columbia University, New York, NY, USA.
| |
|
48
|
Abraham A, LaBella AL, Capra JA, Rokas A. Mosaic patterns of selection in genomic regions associated with diverse human traits. PLoS Genet 2022; 18:e1010494. [PMID: 36342969 PMCID: PMC9671423 DOI: 10.1371/journal.pgen.1010494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/08/2022] [Revised: 11/17/2022] [Accepted: 10/21/2022] [Indexed: 11/09/2022]
Abstract
Natural selection shapes the genetic architecture of many human traits. However, the prevalence of different modes of selection on genomic regions associated with variation in traits remains poorly understood. To address this, we developed an efficient computational framework to calculate positive and negative enrichment of different evolutionary measures among regions associated with complex traits. We applied the framework to summary statistics from >900 genome-wide association studies (GWASs) and 11 evolutionary measures of sequence constraint, population differentiation, and allele age while accounting for linkage disequilibrium, allele frequency, and other potential confounders. We demonstrate that this framework yields consistent results across GWASs with variable sample sizes, numbers of trait-associated SNPs, and analytical approaches. The resulting evolutionary atlas maps diverse signatures of selection on genomic regions associated with complex human traits on an unprecedented scale. We detected positive enrichment for sequence conservation among trait-associated regions for the majority of traits (>77% of 290 high power GWASs), which included reproductive traits. Many traits also exhibited substantial positive enrichment for population differentiation, especially among hair, skin, and pigmentation traits. In contrast, we detected widespread negative enrichment for signatures of balancing selection (51% of GWASs) and absence of enrichment for evolutionary signals in regions associated with late-onset Alzheimer's disease. These results support a pervasive role for negative selection on regions of the human genome that contribute to variation in complex traits, but also demonstrate that diverse modes of evolution are likely to have shaped trait-associated loci. 
This atlas of evolutionary signatures across the diversity of available GWASs will enable exploration of the relationship between the genetic architecture and evolutionary processes in the human genome.
Affiliation(s)
- Abin Abraham
- Vanderbilt University Medical Center, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Abigail L. LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina, United States of America
- North Carolina Research Center, Kannapolis, North Carolina, United States of America
| | - John A. Capra
- Bakar Computational Health Sciences Institute, University of California, San Francisco, California, United States of America
- Department of Epidemiology and Biostatistics, University of California, San Francisco, California, United States of America
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Evolutionary Studies Initiative, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee, United States of America
| |
|
49
|
Yang F, Crossley MS, Schrader L, Dubovskiy IM, Wei SJ, Zhang R. Polygenic adaptation contributes to the invasive success of the Colorado potato beetle. Mol Ecol 2022; 31:5568-5580. [PMID: 35984732 DOI: 10.1111/mec.16666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/03/2021] [Revised: 07/03/2022] [Accepted: 08/15/2022] [Indexed: 12/24/2022]
Abstract
How invasive species cope with novel selective pressures with limited genetic variation is a fundamental question in molecular ecology. Several mechanisms have been proposed, but they can lack generality. Here, we addressed an alternative solution, polygenic adaptation, wherein traits that arise from multiple combinations of loci may be less sensitive to loss of variation during invasion. We tested the polygenic signal of environmental adaptation of Colorado potato beetle (CPB) introduced in Eurasia. Population genomic analyses showed declining genetic diversity in the eastward expansion of Eurasian populations, and weak population genetic structure (except for the invasion fronts in Asia). Demographic history showed that all populations shared a strong bottleneck about 100 years ago when CPB was introduced to Europe. Genome scans revealed a suite of genes involved in activity regulation functions that are plausibly related to cold stress, including some well-founded functions (e.g., the activity of phosphodiesterase, the G-protein regulator) and discrete functions. Such polygenic architecture supports the hypothesis that polygenic adaptation and potentially genetic redundancy can fuel the adaptation of CPB despite strong genetic depletion, thus representing a promising general mechanism for resolving the genetic paradox of invasion. More broadly, most complex traits based on polygenes may be less sensitive to invasive bottlenecks, thus ensuring the evolutionary success of invasive species in novel environments.
Affiliation(s)
- Fangyuan Yang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; Beijing Academy of Agriculture and Forestry Sciences, Institute of Plant and Environmental Protection, Beijing, China
| | - Michael S Crossley
- Department of Entomology and Wildlife Ecology, University of Delaware, Newark, Delaware, USA
| | - Lukas Schrader
- Institute for Evolution & Biodiversity, University of Münster, Münster, Germany
| | - Ivan M Dubovskiy
- Laboratory of Biological Plant Protection and Biotechnology, Novosibirsk State Agrarian University, Novosibirsk, Russia
| | - Shu-Jun Wei
- Beijing Academy of Agriculture and Forestry Sciences, Institute of Plant and Environmental Protection, Beijing, China
| | - Runzhi Zhang
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China; College of Life Science, University of Chinese Academy of Sciences, Beijing, China
| |
|
50
|
Abstract
The human microbiome harbours a large capacity for within-person adaptive mutations. Commensal bacterial strains can stably colonize a person for decades, and billions of mutations are generated daily within each person's microbiome. Adaptive mutations emerging during health might be driven by selective forces that vary across individuals, vary within an individual, or are completely novel to the human population. Mutations emerging within individual microbiomes might impact the immune system, the metabolism of nutrients or drugs, and the stability of the community to perturbations. Despite this potential, relatively little attention has been paid to the possibility of adaptive evolution within complex human-associated microbiomes. This review discusses the promise of studying within-microbiome adaptation, the conceptual and technical limitations that may have contributed to an underappreciation of adaptive de novo mutations occurring within microbiomes to date, and methods for detecting recent adaptive evolution. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Affiliation(s)
- Tami D Lieberman
- Department of Civil and Environmental Engineering, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA; Broad Institute, Cambridge, MA, USA; Ragon Institute, Cambridge, MA, USA
| |
|