1
Amin MR, Hasan M, DeGiorgio M. Digital Image Processing to Detect Adaptive Evolution. Mol Biol Evol 2024; 41:msae242. [PMID: 39565932 PMCID: PMC11631197 DOI: 10.1093/molbev/msae242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 10/28/2024] [Accepted: 11/13/2024] [Indexed: 11/22/2024]
Abstract
In recent years, advances in image processing and machine learning have fueled a paradigm shift in detecting genomic regions under natural selection. Early machine learning techniques employed population-genetic summary statistics as features, which focus on specific genomic patterns expected under adaptive and neutral processes. Though such engineered features are important when training data are limited, the ease with which simulated data can now be generated has led to the recent development of approaches that take in image representations of haplotype alignments and automatically extract important features using convolutional neural networks. Digital image processing methods termed α-molecules are a class of techniques for multiscale representation of objects that can extract a diverse set of features from images. One such α-molecule method, termed wavelet decomposition, lends greater control over high-frequency components of images. Another α-molecule method, termed curvelet decomposition, is an extension of the wavelet concept that considers events occurring along curves within images. We show that applying these α-molecule techniques to extract features from image representations of haplotype alignments yields high true positive rates and accuracy in detecting hard and soft selective sweep signatures from genomic data with both linear and nonlinear machine learning classifiers. Moreover, we find that such models are easy to visualize and interpret, with performance rivaling that of contemporary deep learning approaches for detecting sweeps.
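To make the idea of wavelet features from a haplotype image concrete, the following is a minimal sketch of one level of a two-dimensional Haar decomposition, the simplest wavelet. It is an illustration only: the abstract does not say which wavelet family or software the authors used, the function name `haar2d_level` and the toy 0/1 alignment are hypothetical, and curvelets are not shown.

```python
def haar2d_level(image):
    """One level of an unnormalized 2-D Haar decomposition.

    image: list of equal-length rows with an even number of rows and columns
    (here, rows are haplotypes and columns are sites; an assumed encoding).
    Returns four half-size matrices: local averages, row differences,
    column differences, and diagonal differences.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    approx, row_diff, col_diff, diag_diff = [], [], [], []
    for r in range(0, rows, 2):
        a_row, r_row, c_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block of pixels handled in this step.
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # low-frequency content
            r_row.append((p + q - s - t) / 4)  # change between adjacent rows
            c_row.append((p - q + s - t) / 4)  # change between adjacent columns
            d_row.append((p - q - s + t) / 4)  # diagonal change
        approx.append(a_row)
        row_diff.append(r_row)
        col_diff.append(c_row)
        diag_diff.append(d_row)
    return approx, row_diff, col_diff, diag_diff


# Toy alignment (hypothetical data): 4 haplotypes by 4 sites, 1 = derived allele.
toy = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
]
approx, row_diff, col_diff, diag_diff = haar2d_level(toy)
```

Flattening the four coefficient matrices into one vector would give a feature vector that a linear or nonlinear classifier could consume; repeating the step on `approx` yields coarser scales.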
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
2
Zhang MY, Cao RD, Chen Y, Ma JC, Shi CM, Zhang YF, Zhang JX, Zhang YH. Genomic and Phenotypic Adaptations of Rattus tanezumi to Cold Limit Its Further Northward Expansion and Range Overlap with R. norvegicus. Mol Biol Evol 2024; 41:msae106. [PMID: 38829799 PMCID: PMC11184353 DOI: 10.1093/molbev/msae106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 05/19/2024] [Accepted: 05/28/2024] [Indexed: 06/05/2024]
Abstract
Global climate change has led to shifts in the distribution ranges of many terrestrial species, promoting their migration from lower altitudes or latitudes to higher ones. Meanwhile, successful invaders have developed genetic adaptations enabling the colonization of new environments. Over the past 40 years, Rattus tanezumi (RT) has expanded into northern China (Northwest and North China) from its southern origins. We studied the cold adaptation of RT and its potential for northward expansion by comparing it with sympatric Rattus norvegicus (RN), which is well adapted to cold regions. Through population genomic analysis, we revealed that the invading RT rats have split into three distinct populations: the North, Northwest, and Tibetan populations. The first two populations exhibited high genetic diversity, while the latter population showed remarkably low genetic diversity. These rats have developed various genetic adaptations to cold, arid, hypoxic, and high-UV conditions. Cold acclimation tests revealed divergent thermoregulation between RT and RN. Specifically, RT exhibited higher brown adipose tissue activity and metabolic rates than did RN. Transcriptome analysis highlighted changes in genes regulating triglyceride catabolic processes in RT, including Apoa1 and Apoa4, which were upregulated, under selection and associated with local adaptation. In contrast, RN showed changes in carbohydrate metabolism genes. Despite the cold adaptation of RT, we observed genotypic and phenotypic constraints that may limit its ability to cope with severe low temperatures farther north. Consequently, it is less likely that RT rats will invade and overlap with RN rats in farther northern regions.
Affiliation(s)
- Ming-Yu Zhang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Rui-Dong Cao
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Yi Chen
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Jian-Cang Ma
- Zhangye Maize Stock Production Base, Zhangye 734024, Gansu, China
- Cheng-Min Shi
- College of Plant Protection, Hebei Agricultural University, Baoding 071001, Hebei, China
- Yun-Feng Zhang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Jian-Xu Zhang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Yao-Hua Zhang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- School of Resources and Environmental Engineering, Anhui University, Hefei 230601, Anhui, China
3
Shpak M, Lawrence KN, Pool JE. The Precision and Power of Population Branch Statistics in Identifying the Genomic Signatures of Local Adaptation. bioRxiv 2024:2024.05.14.594139. [PMID: 38798330 PMCID: PMC11118325 DOI: 10.1101/2024.05.14.594139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Population branch statistics, which estimate the branch lengths of focal populations with respect to two outgroups, have been used as an alternative to FST-based genome-wide scans for identifying loci associated with local selective sweeps. In addition to the original population branch statistic (PBS), there are subsequently proposed branch rescalings: normalized population branch statistic (PBSn1), which adjusts focal branch length with respect to outgroup branch lengths at the same locus, and population branch excess (PBE), which also incorporates median branch lengths at other loci. PBSn1 and PBE have been proposed to be less sensitive to allele frequency divergence generated by background selection or geographically ubiquitous positive selection rather than local selective sweeps. However, the accuracy and statistical power of branch statistics have not been systematically assessed. To do so, we simulate genomes in representative large and small populations with varying proportions of sites evolving under genetic drift or background selection (approximated using variable Ne), local selective sweeps, and geographically parallel selective sweeps. We then assess the probability that local selective sweep loci are correctly identified as outliers by FST and by each of the branch statistics. We find that branch statistics consistently outperform FST at identifying local sweeps. When background selection and/or parallel sweeps are introduced, PBSn1 and especially PBE correctly identify local sweeps among their top outliers at a higher frequency than PBS. These results validate the greater specificity of rescaled branch statistics such as PBE to detect population-specific positive selection, supporting their use in genomic studies focused on local adaptation.
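The three statistics named in this abstract can be sketched in a few lines. The code below assumes the commonly cited definitions: pairwise FST is transformed to a branch length T = -log(1 - FST), PBS is the focal branch length in the three-population tree, PBSn1 divides PBS by one plus the sum of all three branch lengths, and PBE subtracts an expected PBS obtained by scaling the outgroup distance at the locus by the ratio of genome-wide medians. The function names and the toy FST values are hypothetical, and the preprint's exact implementation may differ.

```python
import math
from statistics import median


def branch_length(fst):
    """Transform pairwise FST into an additive length T = -log(1 - FST)."""
    fst = min(max(fst, 0.0), 0.999999)  # clamp so the logarithm stays finite
    return -math.log(1.0 - fst)


def pbs(fst_ab, fst_ac, fst_bc):
    """PBS for focal population A, given outgroups B and C."""
    t_ab, t_ac, t_bc = (branch_length(f) for f in (fst_ab, fst_ac, fst_bc))
    return (t_ab + t_ac - t_bc) / 2.0


def pbsn1(fst_ab, fst_ac, fst_bc):
    """PBS of A rescaled by the total branch length at the same locus."""
    pbs_a = pbs(fst_ab, fst_ac, fst_bc)
    pbs_b = pbs(fst_ab, fst_bc, fst_ac)  # branch leading to B
    pbs_c = pbs(fst_ac, fst_bc, fst_ab)  # branch leading to C
    return pbs_a / (1.0 + pbs_a + pbs_b + pbs_c)


def pbe(loci):
    """PBE per locus; loci is a list of (fst_ab, fst_ac, fst_bc) tuples.

    Assumed form: PBE = PBS - T_BC * median(PBS) / median(T_BC).
    Raises ZeroDivisionError if the median T_BC is zero.
    """
    pbs_vals = [pbs(*locus) for locus in loci]
    t_bc_vals = [branch_length(locus[2]) for locus in loci]
    scale = median(pbs_vals) / median(t_bc_vals)
    return [p - t * scale for p, t in zip(pbs_vals, t_bc_vals)]


# Toy data (hypothetical): two background loci and one locus diverged only in A.
loci = [(0.1, 0.1, 0.1), (0.1, 0.1, 0.1), (0.5, 0.5, 0.1)]
pbe_values = pbe(loci)
```

In this toy input the two background loci receive a PBE of zero and the third locus stands out, which is the outlier behaviour the abstract describes for local sweeps.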
Affiliation(s)
- Max Shpak
- Laboratory of Genetics, University of Wisconsin–Madison, Madison, WI, USA
- Kadee N. Lawrence
- Laboratory of Genetics, University of Wisconsin–Madison, Madison, WI, USA
- John E. Pool
- Laboratory of Genetics, University of Wisconsin–Madison, Madison, WI, USA
4
Pless E, Eckburg AM, Henn BM. Predicting Environmental and Ecological Drivers of Human Population Structure. Mol Biol Evol 2023; 40:msad094. [PMID: 37146165 PMCID: PMC10172848 DOI: 10.1093/molbev/msad094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 03/30/2023] [Accepted: 04/03/2023] [Indexed: 05/07/2023]
Abstract
Landscape, climate, and culture can all structure human populations, but few existing methods are designed to simultaneously disentangle among a large number of variables in explaining genetic patterns. We developed a machine learning method for identifying the variables which best explain migration rates, as measured by the coalescent-based program MAPS that uses shared identical by descent tracts to infer spatial migration across a region of interest. We applied our method to 30 human populations in eastern Africa with high-density single nucleotide polymorphism array data. The remarkable diversity of ethnicities, languages, and environments in this region offers a unique opportunity to explore the variables that shape migration and genetic structure. We explored more than 20 spatial variables relating to landscape, climate, and presence of tsetse flies. The full model explained ∼40% of the variance in migration rate over the past 56 generations. Precipitation, minimum temperature of the coldest month, and elevation were the variables with the highest impact. Among the three groups of tsetse flies, the most impactful was fusca which transmits livestock trypanosomiasis. We also tested for adaptation to high elevation among Ethiopian populations. We did not identify well-known genes related to high elevation, but we did find signatures of positive selection related to metabolism and disease. We conclude that the environment has influenced the migration and adaptation of human populations in eastern Africa; the remaining variance in structure is likely due in part to cultural or other factors not captured in our model.
Affiliation(s)
- Evlyn Pless
- Department of Anthropology, Center for Population Biology, University of California, Davis, CA
- Anders M Eckburg
- Department of Anthropology, Center for Population Biology, University of California, Davis, CA
- Brenna M Henn
- Department of Anthropology, Center for Population Biology, University of California, Davis, CA
- UC Davis Genome Center, University of California, Davis, CA
5
Youm DJ, Ko BJ, Kim D, Park M, Won S, Lee YH, Kim B, Seol D, Chai HH, Lim D, Jeong C, Kim H. The idiosyncratic genome of Korean long-tailed chicken as a valuable genetic resource. iScience 2023; 26:106236. [PMID: 36915682 PMCID: PMC10006692 DOI: 10.1016/j.isci.2023.106236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 11/28/2022] [Accepted: 02/15/2023] [Indexed: 02/19/2023]
Abstract
Today, breeds with ornamental traits such as exceptionally long tail feathers are economically valuable. However, the genetic basis of long-tail feathers is yet to be understood. To provide better understanding of long tail feathers, we sequenced Korean long-tailed chicken (KLC) genomes and compared them with genomes of other chicken breeds. We first analyzed the genome structure of KLC and its genomic relationship with other chickens and observed unique characteristics. Subsequently, we searched for genomic regions under selection. Feather keratin 1-like enriched region and several genes were found to have novel putative functions and effects on the long tail trait in KLC. Our findings support the value of KLC as a unique genetic resource and cast light on the genetic basis of long tail traits in avian species. We expect this novel knowledge to provide new genomic evidence and options for designing and implementing genetic improvements of ornamental chicken productivity through precision crossbreeding aids.
Affiliation(s)
- Dong-Jae Youm
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Byung June Ko
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Donghee Kim
- School of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Myeongkyu Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- Sohyoung Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- eGnome, Inc, Seoul 05836, Republic of Korea
- Young Ho Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- Bongsang Kim
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- eGnome, Inc, Seoul 05836, Republic of Korea
- Donghyeok Seol
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Han-Ha Chai
- Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA 1500, Wanju 55365, Republic of Korea
- Dajeong Lim
- Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA 1500, Wanju 55365, Republic of Korea
- Choongwon Jeong
- School of Biological Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Corresponding author
- Heebal Kim
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul 08826, Republic of Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Republic of Korea
- eGnome, Inc, Seoul 05836, Republic of Korea
- Corresponding author
6
van Eeden G, Uren C, Pless E, Mastoras M, van der Spuy GD, Tromp G, Henn BM, Möller M. The recombination landscape of the Khoe-San likely represents the upper limits of recombination divergence in humans. Genome Biol 2022; 23:172. [PMID: 35945619 PMCID: PMC9361568 DOI: 10.1186/s13059-022-02744-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/19/2021] [Accepted: 08/01/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND Recombination maps are important resources for epidemiological and evolutionary analyses; however, there are currently no recombination maps representing any African population outside of those with West African ancestry. We infer the demographic history for the Nama, an indigenous Khoe-San population of southern Africa, and derive a novel, population-specific recombination map from the whole genome sequencing of 54 Nama individuals. We hypothesise that there are no publicly available recombination maps representative of the Nama, considering the deep population divergence and subsequent isolation of the Khoe-San from other African groups. RESULTS We show that the recombination landscape of the Nama does not cluster with any continental groups with publicly available representative recombination maps. Finally, we use selection scans as an example of how fine-scale differences between the Nama recombination map and the combined Phase II HapMap recombination map can impact the outcome of selection scans. CONCLUSIONS Fine-scale differences in recombination can meaningfully alter the results of a selection scan. The recombination map we infer likely represents an upper bound on the extent of divergence we expect to see for a recombination map in humans and would be of interest to any researcher that wants to test the sensitivity of population genetic or GWAS analysis to recombination map input.
Affiliation(s)
- Gerald van Eeden
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Caitlin Uren
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7602 South Africa
- Evlyn Pless
- Department of Anthropology, Center for Population Biology and the Genome Center, University of California (UC) Davis, Davis, CA USA
- Mira Mastoras
- Department of Anthropology, Center for Population Biology and the Genome Center, University of California (UC) Davis, Davis, CA USA
- Gian D. van der Spuy
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7602 South Africa
- SAMRC-SHIP South African Tuberculosis Bioinformatics Initiative (SATBBI), Center for Bioinformatics and Computational Biology, Cape Town, South Africa
- Gerard Tromp
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7602 South Africa
- SAMRC-SHIP South African Tuberculosis Bioinformatics Initiative (SATBBI), Center for Bioinformatics and Computational Biology, Cape Town, South Africa
- Brenna M. Henn
- Department of Anthropology, Center for Population Biology and the Genome Center, University of California (UC) Davis, Davis, CA USA
- Marlo Möller
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7602 South Africa
7
Gabián M, Morán P, Saura M, Carvajal-Rodríguez A. Detecting Local Adaptation between North and South European Atlantic Salmon Populations. Biology 2022; 11:933. [PMID: 35741456 PMCID: PMC9219887 DOI: 10.3390/biology11060933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 06/09/2022] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
Pollution and other anthropogenic effects have driven a decrease in Atlantic salmon (Salmo salar) in the Iberian Peninsula. The restocking effort carried out in the 1980s, with salmon from northern latitudes with the aim of mitigating the decline of native populations, failed, probably due to the deficiency in adaptation of foreign salmon from northern Europe to the warm waters of the Iberian Peninsula. This result would imply that the Iberian populations of Atlantic salmon have experienced local adaptation in their past evolutionary history, as has been described for other populations of this species and other salmonids. Local adaptation can occur by divergent selections between environments, favoring the fixation of alleles that increase the fitness of a population in the environment it inhabits relative to other alleles favored in another population. In this work, we compared the genomes of different populations from the Iberian Peninsula (Atlantic and Cantabric basins) and Scotland in order to provide tentative evidence of candidate SNPs responsible for the adaptive differences between populations, which may explain the failures of restocking carried out during the 1980s. For this purpose, the samples were genotyped with a 220,000 high-density SNP array (Affymetrix) specific to Atlantic salmon. Our results revealed potential evidence of local adaptation for North Spanish and Scottish populations. As expected, most differences concerned the comparison of the Iberian Peninsula with Scotland, although there were also differences between Atlantic and Cantabric populations. A high proportion of the genes identified are related to development and cellular metabolism, DNA transcription and anatomical structure. A particular SNP was identified within the NADP-dependent malic enzyme-2 (mMEP-2*), previously reported by independent studies as a candidate for local adaptation in salmon from the Iberian Peninsula. 
Interestingly, the corresponding SNP within the mMEP-2* region was consistent with a genomic pattern of divergent selection.
Affiliation(s)
- María Gabián
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
- Paloma Morán
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
- María Saura
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), 28040 Madrid, Spain
- Antonio Carvajal-Rodríguez
- Centro de Investigación Mariña (CIM), Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
8
Maiorano AM, Cardoso DF, Carvalheiro R, Júnior GAF, de Albuquerque LG, de Oliveira HN. Signatures of selection in Nelore cattle revealed by whole-genome sequencing data. Genomics 2022; 114:110304. [DOI: 10.1016/j.ygeno.2022.110304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2021] [Revised: 01/07/2022] [Accepted: 02/01/2022] [Indexed: 11/04/2022]
9
Laval G, Patin E, Boutillier P, Quintana-Murci L. Sporadic occurrence of recent selective sweeps from standing variation in humans as revealed by an approximate Bayesian computation approach. Genetics 2021; 219:6377789. [PMID: 34849862 DOI: 10.1093/genetics/iyab161] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/20/2021] [Accepted: 09/01/2021] [Indexed: 12/14/2022]
Abstract
During their dispersals over the last 100,000 years, modern humans have been exposed to a large variety of environments, resulting in genetic adaptation. While genome-wide scans for the footprints of positive Darwinian selection have increased knowledge of genes and functions potentially involved in human local adaptation, they have globally produced evidence of a limited contribution of selective sweeps in humans. Conversely, studies based on machine learning algorithms suggest that recent sweeps from standing variation are widespread in humans, an observation that has been recently questioned. Here, we sought to formally quantify the number of recent selective sweeps in humans, by leveraging approximate Bayesian computation and whole-genome sequence data. Our computer simulations revealed suitable ABC estimations, regardless of the frequency of the selected alleles at the onset of selection and the completion of sweeps. Under a model of recent selection from standing variation, we inferred that an average of 68 (from 56 to 79) and 140 (from 94 to 198) sweeps occurred over the last 100,000 years of human history, in African and Eurasian populations, respectively. The former estimation is compatible with human adaptation rates estimated since divergence with chimps, and reveals numbers of sweeps per generation per site in the range of values estimated in Drosophila. Our results confirm the rarity of selective sweeps in humans and show a low contribution of sweeps from standing variation to recent human adaptation.
Affiliation(s)
- Guillaume Laval
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Pierre Boutillier
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, Paris 75015, France
- Human Genomics and Evolution, Collège de France, 75005 Paris, France
10
Guiblet WM, DeGiorgio M, Cheng X, Chiaromonte F, Eckert KA, Huang YF, Makova KD. Selection and thermostability suggest G-quadruplexes are novel functional elements of the human genome. Genome Res 2021; 31:1136-1149. [PMID: 34187812 PMCID: PMC8256861 DOI: 10.1101/gr.269589.120] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/30/2020] [Accepted: 05/24/2021] [Indexed: 12/11/2022]
Abstract
Approximately 1% of the human genome has the ability to fold into G-quadruplexes (G4s), noncanonical strand-specific DNA structures forming at G-rich motifs. G4s regulate several key cellular processes (e.g., transcription) and have been hypothesized to participate in others (e.g., firing of replication origins). Moreover, G4s differ in their thermostability, and this may affect their function. Yet, G4s may also hinder replication, transcription, and translation and may increase genome instability and mutation rates. Therefore, depending on their genomic location, thermostability, and functionality, G4 loci might evolve under different selective pressures, which has never been investigated. Here we conducted the first genome-wide analysis of G4 distribution, thermostability, and selection. We found an overrepresentation, high thermostability, and purifying selection for G4s within genic components in which they are expected to be functional: promoters, CpG islands, and 5' and 3' UTRs. A similar pattern was observed for G4s within replication origins, enhancers, eQTLs, and TAD boundary regions, strongly suggesting their functionality. In contrast, G4s on the nontranscribed strand of exons were underrepresented, were unstable, and evolved neutrally. In general, G4s on the nontranscribed strand of genic components had lower density and were less stable than those on the transcribed strand, suggesting that the former are avoided at the RNA level. Across the genome, purifying selection was stronger at stable G4s. Our results suggest that purifying selection preserves the sequences of functional G4s, whereas nonfunctional G4s are too costly to be tolerated in the genome. Thus, G4s are emerging as fundamental, functional genomic elements.
Affiliation(s)
- Wilfried M Guiblet
- Bioinformatics and Genomics Graduate Program, Penn State University, University Park, Pennsylvania 16802, USA
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431, USA
- Xiaoheng Cheng
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
- Francesca Chiaromonte
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
- Sant'Anna School of Advanced Studies, 56127 Pisa, Italy
- Kristin A Eckert
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
- Department of Pathology, Penn State University, College of Medicine, Hershey, Pennsylvania 17033, USA
- Yi-Fei Huang
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
- Kateryna D Makova
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
- Center for Medical Genomics, Penn State University, University Park and Hershey, Pennsylvania 16802, USA
11
Harris AM, DeGiorgio M. A Likelihood Approach for Uncovering Selective Sweep Signatures from Haplotype Data. Mol Biol Evol 2021; 37:3023-3046. [PMID: 32392293 PMCID: PMC7530616 DOI: 10.1093/molbev/msaa115] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]
Abstract
Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
Affiliation(s)
- Alexandre M Harris
  - Department of Biology, Pennsylvania State University, University Park, PA
  - Molecular, Cellular, and Integrative Biosciences, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
12
Abstract
As human populations spread across the world, they adapted genetically to local conditions. So too did the resident microorganism communities that everyone carries with them. However, the collective influence of the diverse and dynamic community of resident microbes on host evolution is poorly understood. The taxonomic composition of the microbiota varies among individuals and displays a range of sometimes redundant functions that modify the physicochemical environment of the host and may alter selection pressures. Here we review known human traits and genes for which the microbiota may have contributed or responded to changes in host diet, climate, or pathogen exposure. Integrating host–microbiota interactions in human adaptation could offer new approaches to improve our understanding of human health and evolution.
Affiliation(s)
- Taichi A. Suzuki
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Ruth E. Ley
  - Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
13
Walsh S, Pagani L, Xue Y, Laayouni H, Tyler-Smith C, Bertranpetit J. Positive selection in admixed populations from Ethiopia. BMC Genet 2020; 21:108. [PMID: 33092534 PMCID: PMC7580818 DOI: 10.1186/s12863-020-00908-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/28/2020] [Accepted: 08/27/2020] [Indexed: 02/06/2023]
Abstract
BACKGROUND In the process of adaptation of humans to their environment, positive or adaptive selection has played a main role. Positive selection has, however, been under-studied in African populations, despite their diversity and importance for understanding human history. RESULTS Here, we have used 119 available whole-genome sequences from five Ethiopian populations (Amhara, Oromo, Somali, Wolayta and Gumuz) to investigate the modes and targets of positive selection in this part of the world. The site frequency spectrum-based test SFselect was applied to identify a wide range of events of selection (old and recent), and the haplotype-based statistic integrated haplotype score to detect more recent events, in each case with evaluation of the significance of candidate signals by extensive simulations. Additional insights were provided by considering admixture proportions and functional categories of genes. We identified both individual loci that are likely targets of classic sweeps and groups of genes that may have experienced polygenic adaptation. We found population-specific as well as shared signals of selection, with folate metabolism and the related ultraviolet response and skin pigmentation standing out as a shared pathway, perhaps as a response to the high levels of ultraviolet irradiation, and in addition strong signals in genes such as IFNA, MRC1, immunoglobulins and T-cell receptors, which contribute to defense against pathogens. CONCLUSIONS Signals of positive selection were detected in Ethiopian populations, revealing novel adaptations in East Africa and abundant targets for functional follow-up.
Affiliation(s)
- Sandra Walsh
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader, 88 08003, Barcelona, Catalonia, Spain
- Luca Pagani
  - Estonian Biocentre, Institute of Genomics, University of Tartu, 51010, Tartu, Estonia
  - Department of Biology, University of Padova, 35131, Padova, Italy
- Yali Xue
  - The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Hafid Laayouni
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader, 88 08003, Barcelona, Catalonia, Spain
  - Bioinformatics Studies, ESCI-UPF, Barcelona, Catalonia, Spain
- Chris Tyler-Smith
  - The Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Jaume Bertranpetit
  - Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Dr. Aiguader, 88 08003, Barcelona, Catalonia, Spain
14
Harris AM, DeGiorgio M. Identifying and Classifying Shared Selective Sweeps from Multilocus Data. Genetics 2020; 215:143-171. [PMID: 32152048 PMCID: PMC7198270 DOI: 10.1534/genetics.120.303137] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/22/2019] [Accepted: 02/29/2020] [Indexed: 11/18/2022]
Abstract
Positive selection causes beneficial alleles to rise to high frequency, resulting in a selective sweep of the diversity surrounding the selected sites. Accordingly, the signature of a selective sweep in an ancestral population may still remain in its descendants. Identifying signatures of selection in the ancestor that are shared among its descendants is important to contextualize the timing of a sweep, but few methods exist for this purpose. We introduce the statistic SS-H12, which can identify genomic regions under shared positive selection across populations and is based on the theory of the expected haplotype homozygosity statistic H12, which detects recent hard and soft sweeps from the presence of high-frequency haplotypes. SS-H12 is distinct from comparable statistics because it requires a minimum of only two populations, and properly identifies and differentiates between independent convergent sweeps and true ancestral sweeps, with high power and robustness to a variety of demographic models. Furthermore, we can apply SS-H12 in conjunction with the ratio of statistics we term [Formula: see text] and [Formula: see text] to further classify identified shared sweeps as hard or soft. Finally, we identified both previously reported and novel shared sweep candidates from human whole-genome sequences. Previously reported candidates include the well-characterized ancestral sweeps at LCT and SLC24A5 in Indo-Europeans, as well as GPHN worldwide. Novel candidates include an ancestral sweep at RGS18 in sub-Saharan Africans involved in regulating the platelet response and implicated in sudden cardiac death, and a convergent sweep at C2CD5 between European and East Asian populations that may explain their different insulin responses.
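For readers unfamiliar with the H12 statistic on which SS-H12 builds, the following is a minimal sketch of single-population haplotype homozygosity. It assumes the commonly used definitions H1 = Σ p_i² and H12 = (p_1 + p_2)² + Σ_{i>2} p_i², where p_i are haplotype frequencies sorted in decreasing order, and reports the ratio H2/H1 with H2 = H1 − p_1². The function name `haplotype_homozygosity` and the toy input are hypothetical; this is not the SS-H12 implementation, which additionally contrasts pooled and per-population samples.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Return (H1, H12, H2/H1) for a list of hashable haplotype labels.

    Assumed definitions (not taken from the abstract above):
      H1  = sum of squared haplotype frequencies
      H12 = H1 with the two most frequent haplotypes pooled into one
      H2  = H1 minus the squared frequency of the most frequent haplotype
    """
    if not haplotypes:
        raise ValueError("need at least one haplotype")

    n = len(haplotypes)
    # Frequencies sorted from most to least common.
    freqs = sorted((c / n for c in Counter(haplotypes).values()), reverse=True)

    h1 = sum(p * p for p in freqs)
    if len(freqs) >= 2:
        h12 = (freqs[0] + freqs[1]) ** 2 + sum(p * p for p in freqs[2:])
    else:
        h12 = h1  # only one distinct haplotype in the sample
    h2 = h1 - freqs[0] ** 2
    return h1, h12, h2 / h1


# Toy window: two haplotypes at equal high frequency plus one rarer haplotype.
sample = ["A"] * 4 + ["B"] * 4 + ["C"] * 2
h1, h12, h2_h1 = haplotype_homozygosity(sample)
print(f"H1={h1:.3f}  H12={h12:.3f}  H2/H1={h2_h1:.3f}")
```

In this toy window, pooling the two most common haplotypes raises H12 well above H1, which is the pattern that lets H12 retain sensitivity to soft sweeps; a larger H2/H1 is typically read as more consistent with a soft than a hard sweep.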
Affiliation(s)
- Alexandre M Harris
  - Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802
  - Molecular, Cellular, and Integrative Biosciences at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania 16802
- Michael DeGiorgio
  - Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida 33431
15
Khrunin AV, Khvorykh GV, Fedorov AN, Limborska SA. Genomic landscape of the signals of positive natural selection in populations of Northern Eurasia: A view from Northern Russia. PLoS One 2020; 15:e0228778. [PMID: 32023328 PMCID: PMC7001972 DOI: 10.1371/journal.pone.0228778] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/04/2019] [Accepted: 01/23/2020] [Indexed: 12/15/2022]
Abstract
Natural selection of beneficial genetic variants played a critical role in human adaptation to a wide range of environmental conditions. Northern Eurasia, despite its severe climate, is home to many ethnically diverse populations. The genetic variants associated with the survival of these populations have hardly been analyzed. We searched for the genomic signatures of positive selection in (1) the genome-wide microarray data of 432 people from eight different northern Russian populations and (2) the whole-genome sequences of 250 people from Northern Eurasia from a public repository through testing the extended haplotype homozygosity (EHH) and direct comparison of allele frequency, respectively. The 20 loci with the strongest selection signals were characterized in detail. Among the top EHH hits were the NRG3 and NBEA genes, which are involved in the development and functioning of the neural system, the PTPRM gene, which mediates cell-cell interactions and adhesion, and a region on chromosome 4 (chr4:28.7-28.9 Mb) that contained several loci affiliated with different classes of non-coding RNAs (RN7SL101P, MIR4275, MESTP3, and LINC02364). NBEA and the region on chromosome 4 were novel selection targets that were identified for the first time in Western Siberian populations. Cross-population comparisons of EHH profiles suggested a particular role for the chr4:28.7-28.9 Mb region in the local adaptation of Western Siberians. The strongest selection signal identified in Siberian sequenced genomes was formed by six SNPs on chromosome 11 (chr11:124.9-125.2 Mb). This region included well-known genes SLC37A2 and PKNOX2. SLC37A2 is most highly expressed in the gut. Its expression is regulated by vitamin D, which is often deficient in northern regions. The PKNOX2 gene is a transcription factor of the homeobox family that is expressed in the brain and many other tissues. This gene is associated with alcohol addiction, which is widespread in many Northern Eurasian populations.
Affiliation(s)
- Andrey V. Khrunin
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Moscow, Russia
- Gennady V. Khvorykh
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Moscow, Russia
- Alexei N. Fedorov
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Moscow, Russia
  - Department of Medicine, University of Toledo, Toledo, Ohio, United States of America
- Svetlana A. Limborska
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Moscow, Russia
16
Bhati M, Kadri NK, Crysnanto D, Pausch H. Assessing genomic diversity and signatures of selection in Original Braunvieh cattle using whole-genome sequencing data. BMC Genomics 2020; 21:27. [PMID: 31914939 PMCID: PMC6950892 DOI: 10.1186/s12864-020-6446-y] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 07/14/2019] [Accepted: 12/31/2019] [Indexed: 02/07/2023]
Abstract
Background Autochthonous cattle breeds are an important source of genetic variation because they might carry alleles that enable them to adapt to local environment and food conditions. Original Braunvieh (OB) is a local cattle breed of Switzerland used for beef and milk production in alpine areas. Using whole-genome sequencing (WGS) data of 49 key ancestors, we characterize genomic diversity, genomic inbreeding, and signatures of selection in Swiss OB cattle at nucleotide resolution. Results We annotated 15,722,811 SNPs and 1,580,878 Indels including 10,738 and 2763 missense deleterious and high impact variants, respectively, that were discovered in 49 OB key ancestors. Six Mendelian trait-associated variants that were previously detected in breeds other than OB, segregated in the sequenced key ancestors including variants causal for recessive xanthinuria and albinism. The average nucleotide diversity (1.6 × 10⁻³) was higher in OB than many mainstream European cattle breeds. Accordingly, the average genomic inbreeding derived from runs of homozygosity (ROH) was relatively low (F_ROH = 0.14) in the 49 OB key ancestor animals. However, genomic inbreeding was higher in OB cattle of more recent generations (F_ROH = 0.16) due to a higher number of long (> 1 Mb) runs of homozygosity. Using two complementary approaches, composite likelihood ratio test and integrated haplotype score, we identified 95 and 162 genomic regions encompassing 136 and 157 protein-coding genes, respectively, that showed evidence (P < 0.005) of past and ongoing selection. These selection signals were enriched for quantitative trait loci related to beef traits including meat quality, feed efficiency and body weight and pathways related to blood coagulation, nervous and sensory stimulus. Conclusions We provide a comprehensive overview of sequence variation in Swiss OB cattle genomes. With WGS data, we observe higher genomic diversity and less inbreeding in OB than many European mainstream cattle breeds. Footprints of selection were detected in genomic regions that are possibly relevant for meat quality and adaptation to local environmental conditions. Considering that the population size is low and genomic inbreeding increased in the past generations, the implementation of optimal mating strategies seems warranted to maintain genetic diversity in the Swiss OB cattle population.
Affiliation(s)
- Meenu Bhati
  - Animal Genomics, ETH Zürich, Zürich, Switzerland
17
Fan S, Kelly DE, Beltrame MH, Hansen MEB, Mallick S, Ranciaro A, Hirbo J, Thompson S, Beggs W, Nyambo T, Omar SA, Meskel DW, Belay G, Froment A, Patterson N, Reich D, Tishkoff SA. African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations. Genome Biol 2019; 20:82. [PMID: 31023338 PMCID: PMC6485071 DOI: 10.1186/s13059-019-1679-2] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Received: 08/07/2018] [Accepted: 03/22/2019] [Indexed: 12/20/2022]
Abstract
BACKGROUND Africa is the origin of modern humans within the past 300 thousand years. To infer the complex demographic history of African populations and adaptation to diverse environments, we sequenced the genomes of 92 individuals from 44 indigenous African populations. RESULTS Genetic structure analyses indicate that among Africans, genetic ancestry is largely partitioned by geography and language, though we observe mixed ancestry in many individuals, consistent with both short- and long-range migration events followed by admixture. Phylogenetic analysis indicates that the San genetic lineage is basal to all modern human lineages. The San and Niger-Congo, Afroasiatic, and Nilo-Saharan lineages were substantially diverged by 160 kya (thousand years ago). In contrast, the San and Central African rainforest hunter-gatherer (CRHG), Hadza hunter-gatherer, and Sandawe hunter-gatherer lineages were diverged by ~ 120-100 kya. Niger-Congo, Nilo-Saharan, and Afroasiatic lineages diverged more recently by ~ 54-16 kya. Eastern and western CRHG lineages diverged by ~ 50-31 kya, and the western CRHG lineages diverged by ~ 18-12 kya. The San and CRHG populations maintained the largest effective population size compared to other populations prior to 60 kya. Further, we observed signatures of positive selection at genes involved in muscle development, bone synthesis, reproduction, immune function, energy metabolism, and cell signaling, which may contribute to local adaptation of African populations. CONCLUSIONS We observe high levels of genomic variation between ethnically diverse Africans which is largely correlated with geography and language. Our study indicates ancient population substructure and local adaptation of Africans.
Affiliation(s)
- Shaohua Fan
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Present Address: State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences, Fudan University, 2005 Songhu Road, Shanghai, China
- Derek E Kelly
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Marcia H Beltrame
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Matthew E B Hansen
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Swapan Mallick
  - Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, 02115, USA
- Alessia Ranciaro
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jibril Hirbo
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Present Address: Division of Genetic Medicine, Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, 37232, USA
- Simon Thompson
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- William Beggs
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Thomas Nyambo
  - Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
- Sabah A Omar
  - Center for Biotechnology Research and Development, Kenya Medical Research Institute, Nairobi, Kenya
- Gurja Belay
  - Department of Biology, Addis Ababa University, Addis Ababa, Ethiopia
- Nick Patterson
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
- David Reich
  - Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, 02115, USA
- Sarah A Tishkoff
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, 19104, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, 19104, USA
18
Gouveia MH, Bergen AW, Borda V, Nunes K, Leal TP, Ogwang MD, Yeboah ED, Mensah JE, Kinyera T, Otim I, Nabalende H, Legason ID, Mpoloka SW, Mokone GG, Kerchan P, Bhatia K, Reynolds SJ, Birtwum RB, Adjei AA, Tettey Y, Tay E, Hoover R, Pfeiffer RM, Biggar RJ, Goedert JJ, Prokunina-Olsson L, Dean M, Yeager M, Lima-Costa MF, Hsing AW, Tishkoff SA, Chanock SJ, Tarazona-Santos E, Mbulaiteye SM. Genetic signatures of gene flow and malaria-driven natural selection in sub-Saharan populations of the "endemic Burkitt Lymphoma belt". PLoS Genet 2019; 15:e1008027. [PMID: 30849090 PMCID: PMC6426263 DOI: 10.1371/journal.pgen.1008027] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/21/2018] [Revised: 03/20/2019] [Accepted: 02/17/2019] [Indexed: 12/13/2022]
Abstract
Populations in sub-Saharan Africa have historically been exposed to intense selection from chronic infection with falciparum malaria. Interestingly, populations with the highest malaria intensity can be identified by the increased occurrence of endemic Burkitt Lymphoma (eBL), a pediatric cancer that affects populations with intense malaria exposure, in the so called "eBL belt" in sub-Saharan Africa. However, the effects of intense malaria exposure and sub-Saharan populations' genetic histories remain poorly explored. To determine if historical migrations and intense malaria exposure have shaped the genetic composition of the eBL belt populations, we genotyped ~4.3 million SNPs in 1,708 individuals from Ghana and Northern Uganda, located on opposite sides of eBL belt and with ≥ 7 months/year of intense malaria exposure and published evidence of high incidence of BL. Among 35 Ghanaian tribes, we showed a predominantly West-Central African ancestry and genomic footprints of gene flow from Gambian and East African populations. In Uganda, the North West population showed a predominantly Nilotic ancestry, and the North Central population was a mixture of Nilotic and Southern Bantu ancestry, while the Southwest Ugandan population showed a predominant Southern Bantu ancestry. Our results support the hypothesis of diverse ancestral origins of the Ugandan, Kenyan and Tanzanian Great Lakes African populations, reflecting a confluence of Nilotic, Cushitic and Bantu migrations in the last 3000 years. Natural selection analyses suggest, for the first time, a strong positive selection signal in the ATP2B4 gene (rs10900588) in Northern Ugandan populations. These findings provide important baseline genomic data to facilitate disease association studies, including of eBL, in eBL belt populations.
Affiliation(s)
- Mateus H. Gouveia
  - Instituto de Pesquisa René Rachou, Fundação Oswaldo Cruz, Belo Horizonte, Minas Gerais, Brazil
  - Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Center for Research on Genomics & Global Health, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Andrew W. Bergen
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Victor Borda
  - Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Kelly Nunes
  - Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil
- Thiago P. Leal
  - Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Martin D. Ogwang
  - EMBLEM Study, African Field Epidemiology Network, Kampala, Uganda
- Tobias Kinyera
  - EMBLEM Study, African Field Epidemiology Network, Kampala, Uganda
- Isaac Otim
  - EMBLEM Study, African Field Epidemiology Network, Kampala, Uganda
- Gaonyadiwe George Mokone
  - Department of Biomedical Sciences, University of Botswana School of Medicine, Gaborone, Botswana
- Patrick Kerchan
  - EMBLEM Study, African Field Epidemiology Network, Kampala, Uganda
- Kishor Bhatia
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Steven J. Reynolds
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Yao Tettey
  - University of Ghana Medical School, Accra, Ghana
- Evelyn Tay
  - University of Ghana Medical School, Accra, Ghana
- Robert Hoover
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Ruth M. Pfeiffer
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Robert J. Biggar
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- James J. Goedert
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Ludmila Prokunina-Olsson
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Michael Dean
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Meredith Yeager
  - Cancer Genomics Research Laboratory, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, US Department of Health and Human Services, Frederick, Maryland, United States of America
- M. Fernanda Lima-Costa
  - Instituto de Pesquisa René Rachou, Fundação Oswaldo Cruz, Belo Horizonte, Minas Gerais, Brazil
- Ann W. Hsing
  - Stanford Cancer Institute, Stanford University, Stanford, California, United States of America
- Sarah A. Tishkoff
  - Department of Genetics and Biology, University of Pennsylvania, Philadelphia, United States of America
- Stephen J. Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
- Eduardo Tarazona-Santos
  - Departamento de Biologia Geral, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Sam M. Mbulaiteye
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, US Department of Health and Human Services, Bethesda, Maryland, United States of America
19
Genomic evidence for shared common ancestry of East African hunting-gathering populations and insights into local adaptation. Proc Natl Acad Sci U S A 2019; 116:4166-4175. [PMID: 30782801 DOI: 10.1073/pnas.1817678116] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 12/18/2022]
Abstract
Anatomically modern humans arose in Africa ∼300,000 years ago, but the demographic and adaptive histories of African populations are not well-characterized. Here, we have generated a genome-wide dataset from 840 Africans, residing in western, eastern, southern, and northern Africa, belonging to 50 ethnicities, and speaking languages belonging to four language families. In addition to agriculturalists and pastoralists, our study includes 16 populations that practice, or until recently have practiced, a hunting-gathering (HG) lifestyle. We observe that genetic structure in Africa is broadly correlated not only with geography, but to a lesser extent, with linguistic affiliation and subsistence strategy. Four East African HG (EHG) populations that are geographically distant from each other show evidence of common ancestry: the Hadza and Sandawe in Tanzania, who speak languages with clicks classified as Khoisan; the Dahalo in Kenya, whose language has remnant clicks; and the Sabue in Ethiopia, who speak an unclassified language. Additionally, we observed common ancestry between central African rainforest HGs and southern African San, the latter of whom speak languages with clicks classified as Khoisan. With the exception of the EHG, central African rainforest HGs, and San, other HG groups in Africa appear genetically similar to neighboring agriculturalist or pastoralist populations. We additionally demonstrate that infectious disease, immune response, and diet have played important roles in the adaptive landscape of African history. However, while the broad biological processes involved in recent human adaptation in Africa are often consistent across populations, the specific loci affected by selective pressures more often vary across populations.
20
Hallmark B, Karafet TM, Hsieh P, Osipova LP, Watkins JC, Hammer MF. Genomic Evidence of Local Adaptation to Climate and Diet in Indigenous Siberians. Mol Biol Evol 2018; 36:315-327. [DOI: 10.1093/molbev/msy211] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/17/2022]
Affiliation(s)
- Brian Hallmark
  - Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ
- PingHsun Hsieh
  - Department of Genome Sciences, University of Washington, Seattle, WA
- Ludmila P Osipova
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Joseph C Watkins
  - Interdisciplinary Program in Statistics, University of Arizona, Tucson, AZ
- Michael F Hammer
  - ARL Division of Biotechnology, University of Arizona, Tucson, AZ
  - Department of Genome Sciences, University of Washington, Seattle, WA
21
Torres R, Szpiech ZA, Hernandez RD. Human demographic history has amplified the effects of background selection across the genome. PLoS Genet 2018; 14:e1007387. [PMID: 29912945 PMCID: PMC6056204 DOI: 10.1371/journal.pgen.1007387] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Received: 08/29/2017] [Revised: 07/23/2018] [Accepted: 04/30/2018] [Indexed: 01/22/2023]
Abstract
Natural populations often grow, shrink, and migrate over time. Such demographic processes can affect genome-wide levels of genetic diversity. Additionally, genetic variation in functional regions of the genome can be altered by natural selection, which drives adaptive mutations to higher frequencies or purges deleterious ones. Such selective processes affect not only the sites directly under selection but also nearby neutral variation through genetic linkage via processes referred to as genetic hitchhiking in the context of positive selection and background selection (BGS) in the context of purifying selection. While there is extensive literature examining the consequences of selection at linked sites at demographic equilibrium, less is known about how non-equilibrium demographic processes influence the effects of hitchhiking and BGS. Utilizing a global sample of human whole-genome sequences from the Thousand Genomes Project and extensive simulations, we investigate how non-equilibrium demographic processes magnify and dampen the consequences of selection at linked sites across the human genome. When binning the genome by inferred strength of BGS, we observe that, compared to Africans, non-African populations have experienced larger proportional decreases in neutral genetic diversity in strong BGS regions. We replicate these findings in admixed populations by showing that non-African ancestral components of the genome have also been affected more severely in these regions. We attribute these differences to the strong, sustained/recurrent population bottlenecks that non-Africans experienced as they migrated out of Africa and throughout the globe. Furthermore, we observe a strong correlation between FST and the inferred strength of BGS, suggesting a stronger rate of genetic drift. Forward simulations of human demographic history with a model of BGS support these observations. Our results show that non-equilibrium demography significantly alters the consequences of selection at linked sites and support the need for more work investigating the dynamic process of multiple evolutionary forces operating in concert.
Affiliation(s)
- Raul Torres
  - Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA, United States of America
- Zachary A. Szpiech
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
- Ryan D. Hernandez
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, United States of America
  - Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA, United States of America
  - Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA, United States of America
|
22
|
Austerlitz F, Heyer E. Neutral Theory: From Complex Population History to Natural Selection and Sociocultural Phenomena in Human Populations. Mol Biol Evol 2018; 35:1304-1307. [PMID: 29659992 DOI: 10.1093/molbev/msy067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open
Abstract
Here, we present a synthetic view of how Kimura's Neutral theory has helped us gain insight into the different evolutionary forces that shape human evolution. We frame this perspective in terms of recent emerging challenges: the use of whole-genome data for reconstructing population histories, natural selection on complex polygenic traits, and the integration of cultural processes into human evolution.
Affiliation(s)
- Frédéric Austerlitz
  - UMR 7206 Eco-Anthropologie et Ethnobiologie, CNRS, MNHN, Université Paris Diderot, Paris, France
- Evelyne Heyer
  - UMR 7206 Eco-Anthropologie et Ethnobiologie, CNRS, MNHN, Université Paris Diderot, Paris, France
|
23
|
Sugden LA, Atkinson EG, Fischer AP, Rong S, Henn BM, Ramachandran S. Localization of adaptive variants in human genomes using averaged one-dependence estimation. Nat Commun 2018; 9:703. [PMID: 29459739 PMCID: PMC5818606 DOI: 10.1038/s41467-018-03100-7] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 06/19/2017] [Accepted: 01/19/2018] [Indexed: 12/19/2022] Open
Abstract
Statistical methods for identifying adaptive mutations from population genetic data face several obstacles: assessing the significance of genomic outliers, integrating correlated measures of selection into one analytic framework, and distinguishing adaptive variants from hitchhiking neutral variants. Here, we introduce SWIF(r), a probabilistic method that detects selective sweeps by learning the distributions of multiple selection statistics under different evolutionary scenarios and calculating the posterior probability of a sweep at each genomic site. SWIF(r) is trained using simulations from a user-specified demographic model and explicitly models the joint distributions of selection statistics, thereby increasing its power to both identify regions undergoing sweeps and localize adaptive mutations. Using array and exome data from 45 ǂKhomani San hunter-gatherers of southern Africa, we identify an enrichment of adaptive signals in genes associated with metabolism and obesity. SWIF(r) provides a transparent probabilistic framework for localizing beneficial mutations that is extensible to a variety of evolutionary scenarios.
Affiliation(s)
- Lauren Alpert Sugden
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
- Elizabeth G Atkinson
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
- Annie P Fischer
  - Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA
- Stephen Rong
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI, 02912, USA
- Brenna M Henn
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
- Sohini Ramachandran
  - Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
  - Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, 02912, USA
|
24
|
Byars SG, Huang QQ, Gray LA, Bakshi A, Ripatti S, Abraham G, Stearns SC, Inouye M. Genetic loci associated with coronary artery disease harbor evidence of selection and antagonistic pleiotropy. PLoS Genet 2017; 13:e1006328. [PMID: 28640878 PMCID: PMC5480811 DOI: 10.1371/journal.pgen.1006328] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 08/31/2016] [Accepted: 05/02/2017] [Indexed: 12/18/2022] Open
Abstract
Traditional genome-wide scans for positive selection have mainly uncovered selective sweeps associated with monogenic traits. While selection on quantitative traits is much more common, very few signals have been detected because of their polygenic nature. We searched for positive selection signals underlying coronary artery disease (CAD) in worldwide populations, using novel approaches to quantify relationships between polygenic selection signals and CAD genetic risk. We identified new candidate adaptive loci that appear to have been directly modified by disease pressures given their significant associations with CAD genetic risk. These candidates were all uniquely and consistently associated with many different male and female reproductive traits suggesting selection may have also targeted these because of their direct effects on fitness. We found that CAD loci are significantly enriched for lifetime reproductive success relative to the rest of the human genome, with evidence that the relationship between CAD and lifetime reproductive success is antagonistic. This supports the presence of antagonistic-pleiotropic tradeoffs on CAD loci and provides a novel explanation for the maintenance and high prevalence of CAD in modern humans. Lastly, we found that positive selection more often targeted CAD gene regulatory variants using HapMap3 lymphoblastoid cell lines, which further highlights the unique biological significance of candidate adaptive loci underlying CAD. Our study provides a novel approach for detecting selection on polygenic traits and evidence that modern human genomes have evolved in response to CAD-induced selection pressures and other early-life traits sharing pleiotropic links with CAD.
Affiliation(s)
- Sean G. Byars
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
  - Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia
- Qin Qin Huang
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
  - Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Lesley-Ann Gray
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
  - Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia
- Andrew Bakshi
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
- Samuli Ripatti
  - Institute of Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
  - Department of Public Health, University of Helsinki, Helsinki, Finland
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom
- Gad Abraham
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
  - Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Stephen C. Stearns
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, United States of America
- Michael Inouye
  - Centre for Systems Genomics, School of BioSciences, The University of Melbourne, Parkville, Victoria, Australia
  - Department of Pathology, The University of Melbourne, Parkville, Victoria, Australia
  - Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
|
25
|
Kim SJ, Ka S, Ha JW, Kim J, Yoo D, Kim K, Lee HK, Lim D, Cho S, Hanotte O, Mwai OA, Dessie T, Kemp S, Oh SJ, Kim H. Cattle genome-wide analysis reveals genetic signatures in trypanotolerant N'Dama. BMC Genomics 2017; 18:371. [PMID: 28499406 PMCID: PMC5427609 DOI: 10.1186/s12864-017-3742-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/12/2016] [Accepted: 04/27/2017] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Indigenous cattle in Africa have adapted to various local environments to acquire superior phenotypes that enhance their survival under harsh conditions. While many studies have investigated the adaptation of African cattle overall, the genetic characteristics of each breed have been poorly studied. RESULTS: We performed a comparative genome-wide analysis to assess evidence for subspeciation within species at the genetic level in trypanotolerant N'Dama cattle. We analysed genetic variation patterns in N'Dama from the genomes of 101 cattle breeds including 48 samples of five indigenous African cattle breeds and 53 samples of various commercial breeds. Analysis of SNP variances between cattle breeds using wMI, XP-CLR, and XP-EHH detected genes containing N'Dama-specific genetic variants and their potential associations. Functional annotation analysis revealed that these genes are associated with ossification and with the neurological and immune systems. In particular, the genes involved in bone formation indicate that local adaptation of N'Dama may involve skeletal growth as well as immune systems. CONCLUSIONS: Our results imply that N'Dama might have acquired distinct genotypes associated with growth and regulation of regional diseases including trypanosomiasis. Moreover, this study offers significant insights into identifying genetic signatures for natural and artificial selection of diverse African cattle breeds.
Affiliation(s)
- Soo-Jin Kim
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Sojeong Ka
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
- Jung-Woo Ha
  - Clova, NAVER Corp., Seongnam, 13561, Republic of Korea
- Jaemin Kim
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- DongAhn Yoo
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
- Kwondo Kim
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
- Hak-Kyo Lee
  - Department of Animal Biotechnology, Chonbuk National University, Jeonju, 66414, Republic of Korea
- Dajeong Lim
  - Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, RDA, Jeonju, 55365, Republic of Korea
- Seoae Cho
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Olivier Hanotte
  - University of Nottingham, School of Life Sciences, Nottingham, NG7 2RD, UK
  - International Livestock Research Institute, Addis Ababa, Ethiopia
- Okeyo Ally Mwai
  - International Livestock Research Institute, Box 30709-00100, Nairobi, Kenya
- Tadelle Dessie
  - International Livestock Research Institute, Addis Ababa, Ethiopia
- Stephen Kemp
  - International Livestock Research Institute, Box 30709-00100, Nairobi, Kenya
  - The Centre for Tropical Livestock Genetics and Health, The Roslin Institute, University of Edinburgh, Easter Bush Campus, Edinburgh, Scotland, UK
- Sung Jong Oh
  - National Institute of Animal Science, RDA, Wanju, 55365, Republic of Korea
- Heebal Kim
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 08826, Republic of Korea
  - C&K Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
|
26
|
Pavlidis P, Alachiotis N. A survey of methods and tools to detect recent and strong positive selection. J Biol Res (Thessalon) 2017; 24:7. [PMID: 28405579 PMCID: PMC5385031 DOI: 10.1186/s40709-017-0064-0] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 08/11/2016] [Accepted: 03/29/2017] [Indexed: 01/25/2023]
Abstract
Positive selection occurs when an allele is favored by natural selection. The frequency of the favored allele increases in the population and due to genetic hitchhiking the neighboring linked variation diminishes, creating so-called selective sweeps. Detecting traces of positive selection in genomes is achieved by searching for signatures introduced by selective sweeps, such as regions of reduced variation, a specific shift of the site frequency spectrum, and particular LD patterns in the region. A variety of methods and tools can be used for detecting sweeps, ranging from simple implementations that compute summary statistics such as Tajima's D, to more advanced statistical approaches that use combinations of statistics, maximum likelihood, machine learning etc. In this survey, we present and discuss summary statistics and software tools, and classify them based on the selective sweep signature they detect, i.e., SFS-based vs. LD-based, as well as their capacity to analyze whole genomes or just subgenomic regions. Additionally, we summarize the results of comparisons among four open-source software releases (SweeD, SweepFinder, SweepFinder2, and OmegaPlus) regarding sensitivity, specificity, and execution times. In equilibrium neutral models or mild bottlenecks, both SFS- and LD-based methods are able to detect selective sweeps accurately. Methods and tools that rely on LD exhibit higher true positive rates than SFS-based ones under the model of a single sweep or recurrent hitchhiking. However, their false positive rate is elevated when a misspecified demographic model is used to represent the null hypothesis. When the correct (or similar to the correct) demographic model is used instead, the false positive rates are considerably reduced. The accuracy of detecting the true target of selection is decreased in bottleneck scenarios. In terms of execution time, LD-based methods are typically faster than SFS-based methods, due to the nature of required arithmetic.
Affiliation(s)
- Pavlos Pavlidis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
- Nikolaos Alachiotis
  - Institute of Computer Science, Foundation for Research and Technology-Hellas, 70013 Crete, Greece
|
27
|
Kim J, Hanotte O, Mwai OA, Dessie T, Bashir S, Diallo B, Agaba M, Kim K, Kwak W, Sung S, Seo M, Jeong H, Kwon T, Taye M, Song KD, Lim D, Cho S, Lee HJ, Yoon D, Oh SJ, Kemp S, Lee HK, Kim H. The genome landscape of indigenous African cattle. Genome Biol 2017; 18:34. [PMID: 28219390 PMCID: PMC5319050 DOI: 10.1186/s13059-017-1153-y] [Citation(s) in RCA: 148] [Impact Index Per Article: 18.5] [Received: 10/05/2016] [Accepted: 01/11/2017] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: The history of African indigenous cattle and their adaptation to environmental and human selection pressure is at the root of their remarkable diversity. Characterization of this diversity is an essential step towards understanding the genomic basis of productivity and adaptation to survival under African farming systems. RESULTS: We analyze patterns of African cattle genetic variation by sequencing 48 genomes from five indigenous populations and comparing them to the genomes of 53 commercial taurine breeds. We find the highest genetic diversity among African zebu and sanga cattle. Our search for genomic regions under selection reveals signatures of selection for environmental adaptive traits. In particular, we identify signatures of selection including genes and/or pathways controlling anemia and feeding behavior in the trypanotolerant N'Dama, coat color and horn development in Ankole, and heat tolerance and tick resistance across African cattle, especially in zebu breeds. CONCLUSIONS: Our findings unravel, at the genome-wide level, the unique adaptive diversity of African cattle while emphasizing the opportunities for sustainable improvement of livestock productivity on the continent.
Affiliation(s)
- Jaemin Kim
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Olivier Hanotte
  - The University of Nottingham, School of Life Sciences, Nottingham, NG7 2RD, UK
  - International Livestock Research institute (ILRI), P. O. Box 5689, Addis Ababa, Ethiopia
- Okeyo Ally Mwai
  - International Livestock Research Institute (ILRI), Box 30709 -00100, Nairobi, Kenya
- Tadelle Dessie
  - International Livestock Research institute (ILRI), P. O. Box 5689, Addis Ababa, Ethiopia
- Salim Bashir
  - Department of Parasitology, Faculty of Veterinary Medicine, University of Khartoum, 13314, Khartoum North, Sudan
- Boubacar Diallo
  - National Coordinateur RGA, Ministère Elevage - Productions Animales, B.P. 559, Conakry, Guinea
- Morris Agaba
  - Nelson Mandela African Institution of Science and Technology, Nelson Mandela Road. P. O. Box 447, Arusha, Tanzania
- Kwondo Kim
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-741, Republic of Korea
- Woori Kwak
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Samsun Sung
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Minseok Seo
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Hyeonsoo Jeong
  - Department of Animal Sciences, University of Illinois, Urbana, IL, 61801, USA
- Taehyung Kwon
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 151-742, Republic of Korea
- Mengistie Taye
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 151-742, Republic of Korea
  - College of Agriculture and Environmental Sciences, Bahir Dar University, P. O. Box 79, Bahir Dar, Ethiopia
- Ki-Duk Song
  - The Animal Molecular Genetics and Breeding Center, Chonbuk National University, Jeonju, 54896, Republic of Korea
  - Department of Animal Biotechnology, Chonbuk National University, Jeonju, 561-756, Republic of Korea
- Dajeong Lim
  - Division of Animal Genomics & Bioinformatics, National Institute of Animal Science, RDA, Jeonju, 565-851, Republic of Korea
- Seoae Cho
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Hyun-Jeong Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-741, Republic of Korea
  - Animal Nutritional & Physiology Team, National Institute of Animal Science, RDA, Jeonju, 565-851, Republic of Korea
- Duhak Yoon
  - Department of Animal Science, Kyungpook National University, Sangju, 742-711, Republic of Korea
- Sung Jong Oh
  - National Institute of Animal Science, RDA, Jeonju, 565-851, Republic of Korea
- Stephen Kemp
  - International Livestock Research Institute (ILRI), Box 30709 -00100, Nairobi, Kenya
  - The Centre for Tropical Livestock Genetics and Health, The Roslin Institute, The University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Hak-Kyo Lee
  - The Animal Molecular Genetics and Breeding Center, Chonbuk National University, Jeonju, 54896, Republic of Korea
  - Department of Animal Biotechnology, Chonbuk National University, Jeonju, 561-756, Republic of Korea
- Heebal Kim
  - C&K genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, 151-742, Republic of Korea
  - Institute for Biomedical Sciences, Shinshu University, Nagano, Japan
|
28
|
Badouin H, Gladieux P, Gouzy J, Siguenza S, Aguileta G, Snirc A, Le Prieur S, Jeziorski C, Branca A, Giraud T. Widespread selective sweeps throughout the genome of model plant pathogenic fungi and identification of effector candidates. Mol Ecol 2017; 26:2041-2062. [DOI: 10.1111/mec.13976] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Received: 05/28/2016] [Revised: 12/15/2016] [Accepted: 12/19/2016] [Indexed: 12/11/2022]
Affiliation(s)
- H. Badouin
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
- P. Gladieux
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
  - UMR BGPI; Campus International de Baillarguet; INRA; 34398 Montpellier France
- J. Gouzy
  - Laboratoire des Interactions Plantes-Microorganismes (LIPM); UMR441; INRA; 31326 Castanet-Tolosan France
  - Laboratoire des Interactions Plantes-Microorganismes (LIPM); UMR2594; CNRS; 31326 Castanet-Tolosan France
- S. Siguenza
  - Laboratoire des Interactions Plantes-Microorganismes (LIPM); UMR441; INRA; 31326 Castanet-Tolosan France
  - Laboratoire des Interactions Plantes-Microorganismes (LIPM); UMR2594; CNRS; 31326 Castanet-Tolosan France
- G. Aguileta
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
- A. Snirc
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
- S. Le Prieur
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
- C. Jeziorski
  - Genotoul; GeT-PlaGe; INRA Auzeville 31326 Castanet-Tolosan France
  - UAR1209; INRA Auzeville 31326 Castanet-Tolosan France
- A. Branca
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
- T. Giraud
  - Ecologie Systématique Evolution, Univ. Paris-Sud, CNRS, AgroParisTech; Université Paris-Saclay; 91400 Orsay France
|
29
|
Park L. Evidence of Recent Intricate Adaptation in Human Populations. PLoS One 2016; 11:e0165870. [PMID: 27992444 PMCID: PMC5167553 DOI: 10.1371/journal.pone.0165870] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/10/2016] [Accepted: 10/19/2016] [Indexed: 11/18/2022] Open
Abstract
Recent human adaptations have shaped population differentiation in genomic regions containing putative functional variants, mostly located in predicted regulatory elements. However, their actual functionalities and the underlying mechanism of recent adaptation remain poorly understood. In the current study, regions of genes and repeats were investigated for functionality depending on the degree of population differentiation, FST or ΔDAF (a difference in derived allele frequency). The high FST in the 5′ or 3′ untranslated regions (UTRs), in particular, confirmed that population differences arose mainly from differences in regulation. Expression quantitative trait loci (eQTL) analyses using lymphoblastoid cell lines indicated that the majority of the highly population-specific regions represented cis- and/or trans-eQTL. However, groups having the highest ΔDAFs did not necessarily have higher proportions of eQTL variants; in these groups, the patterns were complex, indicating recent intricate adaptations. The results indicated that East Asian (EAS) and European populations (EUR) experienced mutual selection pressures. The mean derived allele frequency of the high ΔDAF groups suggested that EAS and EUR underwent strong adaptation; however, the African population in Africa (AFR) experienced slight, yet broad, adaptation. The DAF distributions of variants in the gene regions showed clear selective pressure in each population, which implies the existence of more recent regulatory adaptations in cells other than lymphoblastoid cell lines. In-depth analysis of population-differentiated regions indicated that the coding gene, RNF135, represented a trans-regulation hotspot via cis-regulation by the population-specific variants in the region of selective sweep. Together, the results provide strong evidence of actual intricate adaptation of human populations via regulatory manipulation.
Affiliation(s)
- Leeyoung Park
  - Natural Science Research Institute, Yonsei University, Seoul, Korea
|
30
|
Genetic surfing in human populations: from genes to genomes. Curr Opin Genet Dev 2016; 41:53-61. [DOI: 10.1016/j.gde.2016.08.003] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 05/18/2016] [Revised: 07/06/2016] [Accepted: 08/02/2016] [Indexed: 12/20/2022]
|
31
|
Abstract
The wealth of available genetic information is allowing the reconstruction of human demographic and adaptive history. Demography and purifying selection affect the purge of rare, deleterious mutations from the human population, whereas positive and balancing selection can increase the frequency of advantageous variants, improving survival and reproduction in specific environmental conditions. In this review, I discuss how theoretical and empirical population genetics studies, using both modern and ancient DNA data, are a powerful tool for obtaining new insight into the genetic basis of severe disorders and complex disease phenotypes, rare and common, focusing particularly on infectious disease risk.
Affiliation(s)
- Lluis Quintana-Murci
  - Human Evolutionary Genetics Unit, Department of Genomes & Genetics, Institut Pasteur, Paris, 75015, France
  - Centre National de la Recherche Scientifique, URA3012, Paris, 75015, France
  - Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, 75015, France
|
32
|
Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW. A Meta-Assembly of Selection Signatures in Cattle. PLoS One 2016; 11:e0153013. [PMID: 27045296 PMCID: PMC4821596 DOI: 10.1371/journal.pone.0153013] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 10/29/2015] [Accepted: 03/22/2016] [Indexed: 12/31/2022] Open
Abstract
Since domestication, significant genetic improvement has been achieved for many traits of commercial importance in cattle, including adaptation, appearance and production. In response to such intense selection pressures, the bovine genome has undergone changes at the underlying regions of functional genetic variants, which are termed “selection signatures”. This article reviews 64 recent (2009–2015) investigations testing genomic diversity for departure from neutrality in worldwide cattle populations. In particular, we constructed a meta-assembly of 16,158 selection signatures for individual breeds and their archetype groups (European, African, Zebu and composite) from 56 genome-wide scans representing 70,743 animals of 90 pure and crossbred cattle breeds. Meta-selection-scores (MSS) were computed by combining published results at every given locus, within a sliding window span. MSS were adjusted for common samples across studies and were weighted for significance thresholds across and within studies. Published selection signatures show extensive coverage across the bovine genome, however, the meta-assembly provides a consensus profile of 263 genomic regions of which 141 were unique (113 were breed-specific) and 122 were shared across cattle archetypes. The most prominent peaks of MSS represent regions under selection across multiple populations and harboured genes of known major effects (coat color, polledness and muscle hypertrophy) and genes known to influence polygenic traits (stature, adaptation, feed efficiency, immunity, behaviour, reproduction, beef and dairy production). As the first meta-assembly of selection signatures, it offers novel insights about the hotspots of selective sweeps in the bovine genome, and this method could equally be applied to other species.
Affiliation(s)
- Imtiaz A. S. Randhawa
  - Reprogen - Animal Bioscience Group, Faculty of Veterinary Science, The University of Sydney, 425 Werombi Road, Camden, 2570, NSW, Australia
- Mehar S. Khatkar
  - Reprogen - Animal Bioscience Group, Faculty of Veterinary Science, The University of Sydney, 425 Werombi Road, Camden, 2570, NSW, Australia
- Peter C. Thomson
  - Reprogen - Animal Bioscience Group, Faculty of Veterinary Science, The University of Sydney, 425 Werombi Road, Camden, 2570, NSW, Australia
- Herman W. Raadsma
  - Reprogen - Animal Bioscience Group, Faculty of Veterinary Science, The University of Sydney, 425 Werombi Road, Camden, 2570, NSW, Australia
|
33
|
DeGiorgio M, Huber CD, Hubisz MJ, Hellmann I, Nielsen R. SweepFinder2: increased sensitivity, robustness and flexibility. Bioinformatics 2016; 32:1895-7. [PMID: 27153702 DOI: 10.1093/bioinformatics/btw051] [Citation(s) in RCA: 185] [Impact Index Per Article: 20.6] [Received: 05/28/2015] [Accepted: 01/19/2016] [Indexed: 12/14/2022] Open
Abstract
SweepFinder is a widely used program that implements a powerful likelihood-based method for detecting recent positive selection, or selective sweeps. Here, we present SweepFinder2, an extension of SweepFinder with increased sensitivity and robustness to the confounding effects of mutation rate variation and background selection. Moreover, SweepFinder2 has increased flexibility that enables the user to specify test sites, set the distance between test sites and utilize a recombination map. AVAILABILITY AND IMPLEMENTATION: SweepFinder2 is a freely available (www.personal.psu.edu/mxd60/sf2.html) software package that is written in C and can be run from a Unix command line. CONTACT: mxd60@psu.edu.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology and Institute for CyberScience, Pennsylvania State University, University Park, PA, USA
- Christian D Huber
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA
- Melissa J Hubisz
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Ines Hellmann
- Department Biologie II, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, CA, USA
|
34
|
Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc Natl Acad Sci U S A 2015; 113:E440-9. [PMID: 26712023 DOI: 10.1073/pnas.1510805112] [Citation(s) in RCA: 155] [Impact Index Per Article: 15.5] [Indexed: 01/07/2023] Open
Abstract
The Out-of-Africa (OOA) dispersal ∼ 50,000 y ago is characterized by a series of founder events as modern humans expanded into multiple continents. Population genetics theory predicts an increase of mutational load in populations undergoing serial founder effects during range expansions. To test this hypothesis, we have sequenced full genomes and high-coverage exomes from seven geographically divergent human populations from Namibia, Congo, Algeria, Pakistan, Cambodia, Siberia, and Mexico. We find that individual genomes vary modestly in the overall number of predicted deleterious alleles. We show via spatially explicit simulations that the observed distribution of deleterious allele frequencies is consistent with the OOA dispersal, particularly under a model where deleterious mutations are recessive. We conclude that there is a strong signal of purifying selection at conserved genomic positions within Africa, but that many predicted deleterious mutations have evolved as if they were neutral during the expansion out of Africa. Under a model where selection is inversely related to dominance, we show that OOA populations are likely to have a higher mutation load due to increased allele frequencies of nearly neutral variants that are recessive or partially recessive.
|
35
|
Haasl RJ, Payseur BA. Fifteen years of genomewide scans for selection: trends, lessons and unaddressed genetic sources of complication. Mol Ecol 2015. [PMID: 26224644 DOI: 10.1111/mec.13339] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Indexed: 12/13/2022]
Abstract
Genomewide scans for natural selection (GWSS) have become increasingly common over the last 15 years due to increased availability of genome-scale genetic data. Here, we report a representative survey of GWSS from 1999 to present and find that (i) between 1999 and 2009, 35 of 49 (71%) GWSS focused on human, while from 2010 to present, only 38 of 83 (46%) GWSS focused on human, indicating increased focus on nonmodel organisms; (ii) the large majority of GWSS incorporate interpopulation or interspecific comparisons using, for example, F(ST), cross-population extended haplotype homozygosity or the ratio of nonsynonymous to synonymous substitutions; (iii) most GWSS focus on detection of directional selection rather than other modes such as balancing selection; and (iv) in human GWSS, there is a clear shift after 2004 from microsatellite markers to dense SNP data. A survey of GWSS meant to identify loci positively selected in response to severe hypoxic conditions supports an approach to GWSS in which a list of a priori candidate genes based on potential selective pressures is used to filter the list of significant hits a posteriori. We also discuss four frequently ignored determinants of genomic heterogeneity that complicate GWSS: mutation, recombination, selection and the genetic architecture of adaptive traits. We recommend that GWSS methodology should better incorporate aspects of genomewide heterogeneity using empirical estimates of relevant parameters and/or realistic, whole-chromosome simulations to improve interpretation of GWSS results. Finally, we argue that knowledge of potential selective agents improves interpretation of GWSS results and that new methods focused on correlations between environmental variables and genetic variation can help automate this approach.
Affiliation(s)
- Ryan J Haasl
- Department of Biology, University of Wisconsin-Platteville, 1 University Plaza, Platteville, WI, 53818, USA
- Bret A Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA
|
36
|
Ralph PL, Coop G. The Role of Standing Variation in Geographic Convergent Adaptation. Am Nat 2015; 186 Suppl 1:S5-23. [PMID: 26656217 DOI: 10.1086/682948] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Indexed: 01/12/2023]
Abstract
The extent to which populations experiencing shared selective pressures adapt through a shared genetic response is relevant to many questions in evolutionary biology. In this article, we explore how standing genetic variation contributes to convergent genetic responses in a geographically spread population. Geographically limited dispersal slows the spread of each selected allele, hence allowing other alleles to spread before any one comes to dominate the population. When selectively equivalent alleles meet, their progress is substantially slowed, dividing the species range into a random tessellation, which can be well understood by analogy to a Poisson process model of crystallization. In this framework, we derive the geographic scale over which an allele dominates and the proportion of adaptive alleles that arise from standing variation. Finally, we explore how negative pleiotropic effects of alleles can bias the subset of alleles that contribute to the species' adaptive response. We apply the results to the malaria-resistance glucose-6-phosphate dehydrogenase-deficiency alleles, where the large mutational target size makes it a likely candidate for adaptation from deleterious standing variation. Our results suggest that convergent adaptation may be common. Therefore, caution must be exercised when arguing that strongly geographically restricted alleles are the outcome of local adaptation. We close by discussing the implications of these results for ideas of species coherence and the nature of divergence between species.
Affiliation(s)
- Peter L Ralph
- Computational Biology and Bioinformatics, University of Southern California, Los Angeles, California 90089
|
37
|
Pybus M, Luisi P, Dall'Olio GM, Uzkudun M, Laayouni H, Bertranpetit J, Engelken J. Hierarchical boosting: a machine-learning framework to detect and classify hard selective sweeps in human populations. Bioinformatics 2015; 31:3946-52. [PMID: 26315912 DOI: 10.1093/bioinformatics/btv493] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Received: 05/11/2015] [Accepted: 08/17/2015] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION: Detecting positive selection in genomic regions is a recurrent topic in natural population genetic studies. However, there is little consistency among the regions detected in several genome-wide scans using different tests and/or populations. Furthermore, few methods address the challenge of classifying selective events according to specific features such as age, intensity or state (completeness). RESULTS: We have developed a machine-learning classification framework that exploits the combined ability of some selection tests to uncover different polymorphism features expected under the hard sweep model, while controlling for population-specific demography. As a result, we achieve high sensitivity toward hard selective sweeps while adding insights about their completeness (whether a selected variant is fixed or not) and age of onset. Our method also determines the relevance of the individual methods implemented so far to detect positive selection under specific selective scenarios. We calibrated and applied the method to three reference human populations from The 1000 Genomes Project to generate a genome-wide classification map of hard selective sweeps. This study improves detection of selective sweeps by overcoming the classical selection versus no-selection classification strategy, and offers an explanation for the lack of consistency observed among selection tests when applied to real data. Very few signals were observed in the African population studied, even though our method presents higher sensitivity under this population's demography. AVAILABILITY AND IMPLEMENTATION: The genome-wide results for three human populations from The 1000 Genomes Project and an R-package implementing the 'Hierarchical Boosting' framework are available at http://hsb.upf.edu/.
Affiliation(s)
- Marc Pybus
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Pierre Luisi
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Department of Biology, Stanford University, Stanford, CA 94305, USA
- Giovanni Marco Dall'Olio
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Division of Cancer Studies, King's College of London, London SE1 1UL, UK
- Manu Uzkudun
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Hafid Laayouni
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra 8193, Spain
- Jaume Bertranpetit
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
- Johannes Engelken
- Institut de Biologia Evolutiva (UPF-CSIC), Universitat Pompeu Fabra, Barcelona 08003, Spain
|
38
|
Jeong H, Song KD, Seo M, Caetano-Anollés K, Kim J, Kwak W, Oh JD, Kim E, Jeong DK, Cho S, Kim H, Lee HK. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing. BMC Genet 2015; 16:104. [PMID: 26289667 PMCID: PMC4545873 DOI: 10.1186/s12863-015-0265-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 04/19/2015] [Accepted: 08/13/2015] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Natural and artificial selection following domestication has led to the existence of more than a hundred pig breeds, as well as incredible variation in phenotypic traits. Berkshire pigs are regarded as having superior meat quality compared to other breeds. As the meat production industry seeks selective breeding approaches to improve profitable traits such as meat quality, information about genetic determinants of these traits is in high demand. However, most of the studies have been performed using trained sensory panel analysis without investigating the underlying genetic factors. Here we investigate the relationship between genomic composition and this phenotypic trait by scanning for signatures of positive selection in whole-genome sequencing data. RESULTS: We generated genomes of 10 Berkshire pigs at a total of 100.6× coverage depth, using the Illumina HiSeq 2000 platform. Along with the genomes of 11 Landrace and 13 Yorkshire pigs, we identified 18.9 million SNVs and 3.4 million indels in the mapped regions. By applying between-population statistical tests (XP-EHH and XP-CLR), we identified several genes related to lipid metabolism, intramuscular fatty acid deposition, and muscle fiber type that contribute to pork quality (TG, FABP1, AKIRIN2, GLP2R, TGFBR3, JPH3, ICAM2, and ERN1). A statistical enrichment test was also conducted to detect breed-specific genetic variation. In addition, a de novo short-read assembly strategy identified several candidate genes (SLC25A14, IGF1, PI4KA, CACNA1A) as also contributing to lipid metabolism. CONCLUSIONS: Results revealed several candidate genes involved in Berkshire meat quality; most of these genes are involved in lipid metabolism and intramuscular fat deposition. These results can provide a basis for future research on the genomic characteristics of Berkshire pigs.
Affiliation(s)
- Hyeonsoo Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Seoul, Kwan-ak Gu, 151-741, Republic of Korea.
- Ki-Duk Song
- Department of Animal Biotechnology, Chonbuk National University, Jeonju, 561-756, Republic of Korea.
- Minseok Seo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Seoul, Kwan-ak Gu, 151-741, Republic of Korea.
- Jaemin Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Seoul, Kwan-ak Gu, 151-741, Republic of Korea.
- Woori Kwak
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Seoul, Kwan-ak Gu, 151-741, Republic of Korea.
- C&K genomics, Main Bldg. #514, SNU Research Park, Seoul, 151-919, Republic of Korea.
- Jae-Don Oh
- Department of Animal Biotechnology, Chonbuk National University, Jeonju, 561-756, Republic of Korea.
- EuiSoo Kim
- Department of Animal Science, Iowa State University, Ames, IA, 50011, USA.
- Dong Kee Jeong
- Department of Animal Biotechnology, Faculty of Biotechnology, Jeju National University, Ara-1 Dong, Jeju-Do, Jeju, 690-756, Republic of Korea.
- Seoae Cho
- C&K genomics, Main Bldg. #514, SNU Research Park, Seoul, 151-919, Republic of Korea.
- Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Kwan-ak St. 599, Seoul, Kwan-ak Gu, 151-741, Republic of Korea.
- C&K genomics, Main Bldg. #514, SNU Research Park, Seoul, 151-919, Republic of Korea.
- Department of Agricultural Biotechnology, Seoul National University, Seoul, 151-742, South Korea.
- Hak-Kyo Lee
- Department of Animal Biotechnology, Chonbuk National University, Jeonju, 561-756, Republic of Korea.
|
39
|
Kim H, Song KD, Kim HJ, Park W, Kim J, Lee T, Shin DH, Kwak W, Kwon YJ, Sung S, Moon S, Lee KT, Kim N, Hong JK, Eo KY, Seo KS, Kim G, Park S, Yun CH, Kim H, Choi K, Kim J, Lee WK, Kim DK, Oh JD, Kim ES, Cho S, Lee HK, Kim TH, Kim H. Exploring the genetic signature of body size in Yucatan miniature pig. PLoS One 2015; 10:e0121732. [PMID: 25885114 PMCID: PMC4401510 DOI: 10.1371/journal.pone.0121732] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 05/21/2014] [Accepted: 02/18/2015] [Indexed: 01/09/2023] Open
Abstract
Since being domesticated about 10,000-12,000 years ago, domestic pigs (Sus scrofa domesticus) have been selected for traits of economic importance, in particular large body size. However, Yucatan miniature pigs have been selected for small body size to withstand high-temperature environments and for laboratory use. This renders the Yucatan miniature pig a valuable model for understanding the evolution of body size. We investigate the genetic signature for selection of body size in the Yucatan miniature pig. The Yucatan miniature pig was compared phylogenetically with large swine breeds (Yorkshire, Landrace, Duroc and wild boar). By estimating the XP-EHH statistic using re-sequencing data derived from 70 pigs, we were able to unravel the signatures of selection of body size. We found that selection has occurred at both the organismal and the cellular level. Selection at the organismal level includes feed intake, regulation of body weight and increase in mass, while selection at the cellular level includes cell cycle and cell proliferation. Positively selected genes probed by XP-EHH may provide insight into the docile character and innate immunity as well as body size of the Yucatan miniature pig.
Affiliation(s)
- Hyeongmin Kim
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Ki Duk Song
- Genomic Informatics Center, Hankyong National University, Anseong, 456-749, Republic of Korea
- Hyeon Jeong Kim
- CHO & KIM Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- WonCheoul Park
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Jaemin Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
- Taeheon Lee
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Dong-Hyun Shin
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Woori Kwak
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
- Young-jun Kwon
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Republic of Korea
- Samsun Sung
- CHO & KIM Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Sunjin Moon
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Kyung-Tai Lee
- National Institute of Animal Science, RDA, Suwon, 441-706, Republic of Korea
- Namshin Kim
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 305-806, Republic of Korea
- Joon Ki Hong
- Swine Science Division, National Institute of Animal Science, RDA, Cheonan, 331-801, Republic of Korea
- Kyung Yeon Eo
- Animal Research Division, Seoul Zoo, Seoul, 427-702, Republic of Korea
- Kang Seok Seo
- Department of Animal Science and Technology, College of Life Science and Natural Resources, Sunchon National University, Suncheon, 540-950, Republic of Korea
- Girak Kim
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Sungmoo Park
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Cheol-Heui Yun
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- Hyunil Kim
- Optipharm, Inc., 63, Osongsangmyeong 6-ro, Osong-eup, Chengwon-gun, Chungcheongbuk-do, 363-954, Republic of Korea
- Kimyung Choi
- Optipharm, Inc., 63, Osongsangmyeong 6-ro, Osong-eup, Chengwon-gun, Chungcheongbuk-do, 363-954, Republic of Korea
- Jiho Kim
- Optipharm, Inc., 63, Osongsangmyeong 6-ro, Osong-eup, Chengwon-gun, Chungcheongbuk-do, 363-954, Republic of Korea
- Woon Kyu Lee
- Laboratory of Developmental Genetics, College of Medicine, Inha University, Incheon, 400-103, Republic of Korea
- Duk-Kyung Kim
- Genomic Informatics Center, Hankyong National University, Anseong, 456-749, Republic of Korea
- Jae-Don Oh
- Genomic Informatics Center, Hankyong National University, Anseong, 456-749, Republic of Korea
- Eui-Soo Kim
- Department of Animal Sciences, Iowa State University, Ames, Iowa, 50011, United States of America
- Seoae Cho
- CHO & KIM Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
- Hak-Kyo Lee
- Genomic Informatics Center, Hankyong National University, Anseong, 456-749, Republic of Korea
- Tae-Hun Kim
- National Institute of Animal Science, RDA, Suwon, 441-706, Republic of Korea
- Heebal Kim
- Department of Agricultural Biotechnology, Animal Biotechnology Major, and Research Institute for Agriculture and Life Sciences, Seoul National University, Seoul, 151-921, Republic of Korea
- CHO & KIM Genomics, Seoul National University Research Park, Seoul, 151-919, Republic of Korea
|
40
|
Wollstein A, Stephan W. Inferring positive selection in humans from genomic data. Investigative Genetics 2015; 6:5. [PMID: 25834723 PMCID: PMC4381672 DOI: 10.1186/s13323-015-0023-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/21/2014] [Accepted: 02/23/2015] [Indexed: 01/06/2023]
Abstract
Adaptation can be described as an evolutionary process that leads to an adjustment of the phenotypes of a population to their environment. In the classical view, new mutations can introduce novel phenotypic features into a population that leave footprints in the genome after fixation, such as selective sweeps. Alternatively, existing genetic variants may become beneficial after an environmental change and increase in frequency. Although they may not reach fixation, they may cause a shift of the optimum of a phenotypic trait controlled by multiple loci. With the availability of polymorphism data from various organisms, including humans and chimpanzees, it has become possible to detect molecular evidence of adaptation and to estimate the strength and target of positive selection. In this review, we discuss the two competing models of adaptation and suitable approaches for detecting the footprints of positive selection on the molecular level.
Affiliation(s)
- Andreas Wollstein
- Section of Evolutionary Biology, Department of Biology II, University of Munich, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
- Wolfgang Stephan
- Section of Evolutionary Biology, Department of Biology II, University of Munich, Großhaderner Str. 2, 82152 Planegg-Martinsried, Germany
|
41
|
Kim J, Cho S, Caetano-Anolles K, Kim H, Ryu YC. Genome-wide detection and characterization of positive selection in Korean Native Black Pig from Jeju Island. BMC Genet 2015; 16:3. [PMID: 25634476 PMCID: PMC4314801 DOI: 10.1186/s12863-014-0160-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 03/26/2014] [Accepted: 12/30/2014] [Indexed: 01/01/2023] Open
Abstract
Background: In the 1980s, Korean native black pigs from Jeju Island (Jeju black pigs) served as a representative sample of Korean native black pigs, and efforts were made to help the species rebound from the brink of extinction, which occurred as a result of the introduction of Western pig breeds. Geographical separation of Jeju Island from the Korean peninsula has allowed Jeju black pigs not only to acquire unique characteristics but also to retain merits of rare Korean native black pigs. Results: To further analyze the Jeju black pig genome, we performed whole-genome re-sequencing (average read depth of 14×) of 8 Jeju black pigs and 6 Korean pigs (which live on the Korean peninsula) to compare and identify putative signatures of positive selection in Jeju black pig, the true and pure Korean native black pigs. The candidate genes potentially under positive selection in Jeju black pig support previous reports of high marbling score, rare occurrence of pale, soft, exudative (PSE) meat, but low growth rate and carcass weight compared to Western breeds. Conclusions: Several candidate genes potentially under positive selection were involved in fatty acid transport and may have contributed to the unique characteristics of meat quality in Jeju black pigs. Jeju black pigs can offer a unique opportunity to investigate the true genetic resource of once endangered Korean native black pigs. Further genome-wide analyses of Jeju black pigs on a larger population scale are required in order to define a conservation strategy and improvement of native pig resources.
Affiliation(s)
- Jaemin Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Korea.
- Seoae Cho
- CHO&KIM genomics, Main Bldg. #514, SNU Research Park, Seoul National University Mt.4-2, NakSeoungDae, Gwanakgu, Seoul, 151-919, Republic of Korea.
- Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742, Korea.
- CHO&KIM genomics, Main Bldg. #514, SNU Research Park, Seoul National University Mt.4-2, NakSeoungDae, Gwanakgu, Seoul, 151-919, Republic of Korea.
- Department of Agricultural Biotechnology and Research Institute of Population Genomics, Seoul National University, Seoul, 151-742, Republic of Korea.
- Youn-Chul Ryu
- Division of Biotechnology, The Research Institute for Subtropical Agriculture and Biotechnology, Jeju National University, Jeju, 690-756, Republic of Korea.
|
42
|
Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Van Tassell CP, Sonstegard TS, Liu GE. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol 2014; 32:711-25. [PMID: 25431480 DOI: 10.1093/molbev/msu333] [Citation(s) in RCA: 117] [Impact Index Per Article: 10.6] [Indexed: 12/30/2022] Open
Abstract
We investigated diverse genomic selections using high-density single nucleotide polymorphism data of five distinct cattle breeds. Based on allele frequency differences, we detected hundreds of candidate regions under positive selection across Holstein, Angus, Charolais, Brahman, and N'Dama. In addition to well-known genes such as KIT, MC1R, ASIP, GHR, LCORL, NCAPG, WIF1, and ABCA12, we found evidence for a variety of novel and less-known genes under selection in cattle, such as LAP3, SAR1B, LRIG3, FGF5, and NUDCD3. Selective sweeps near LAP3 were then validated by next-generation sequencing. Genome-wide association analysis involving 26,362 Holsteins confirmed that LAP3 and SAR1B were related to milk production traits, suggesting that our candidate regions were likely functional. In addition, haplotype network analyses further revealed distinct selective pressures and evolution patterns across these five cattle breeds. Our results provided a glimpse into diverse genomic selection during cattle domestication, breed formation, and recent genetic improvement. These findings will facilitate genome-assisted breeding to improve animal production and health.
Affiliation(s)
- Lingyang Xu
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Derek M Bickhart
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- John B Cole
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Steven G Schroeder
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Jiuzhou Song
- Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Curtis P Van Tassell
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- Tad S Sonstegard
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
- George E Liu
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD 20705, USA
|
43
|
Reddy UK, Abburi L, Abburi VL, Saminathan T, Cantrell R, Vajja VG, Reddy R, Tomason YR, Levi A, Wehner TC, Nimmakayala P. A genome-wide scan of selective sweeps and association mapping of fruit traits using microsatellite markers in watermelon. J Hered 2014; 106:166-76. [PMID: 25425675 DOI: 10.1093/jhered/esu077] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 12/20/2022] Open
Abstract
Our genetic diversity study uses microsatellites of known map position to estimate genome level population structure and linkage disequilibrium, and to identify genomic regions that have undergone selection during watermelon domestication and improvement. Thirty regions that showed evidence of selective sweep were scanned for the presence of candidate genes using the watermelon genome browser (www.icugi.org). We localized selective sweeps in intergenic regions, close to the promoters, and within the exons and introns of various genes. This study provided evidence of convergent evolution for the presence of diverse ecotypes with special reference to American and European ecotypes. Our search for location of linked markers in the whole-genome draft sequence revealed that BVWS00358, a GA repeat microsatellite, is the GAGA type transcription factor located in the 5' untranslated regions of a structure and insertion element that expresses a Cys2His2 Zinc finger motif, with presumed biological processes related to chitin response and transcriptional regulation. In addition, BVWS01708, an ATT repeat microsatellite, is located in the promoter of a DTW domain-containing protein (Cla002761), and 2 other simple sequence repeats are linked by association mapping to fruit length and rind thickness.
Affiliation(s)
- Umesh K Reddy
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
- Lavanya Abburi
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
- Venkata Lakshmi Abburi
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Thangasamy Saminathan
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Robert Cantrell
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Venkata Gopinath Vajja
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Rishi Reddy
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Yan R Tomason
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Amnon Levi
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Todd C Wehner
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| | - Padma Nimmakayala
- From the Department of Biology, Gus R. Douglass Institute, West Virginia State University, Institute, WV 25112-1000 (Reddy, Abburi, Saminathan, Cantrell, Vajja, Reddy, Tomason, and Nimmakayala); the U.S. Vegetable Laboratory, USDA, ARS, 2875 Savannah Highway, Charleston, SC 29414 (Levi); and the Department of Horticultural Science, North Carolina State University, Raleigh, NC 27695-7609 (Wehner)
| |
|
44
|
High-resolution genetic map for understanding the effect of genome-wide recombination rate on nucleotide diversity in watermelon. G3-GENES GENOMES GENETICS 2014; 4:2219-30. [PMID: 25227227 PMCID: PMC4232547 DOI: 10.1534/g3.114.012815] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022]
Abstract
We used genotyping by sequencing to identify a set of 10,480 single nucleotide polymorphism (SNP) markers for constructing a high-resolution genetic map of 1096 cM for watermelon. We assessed the genome-wide variation in recombination rate (GWRR) across the map and found an association between GWRR and genome-wide nucleotide diversity. Collinearity between the map and the genome-wide reference sequence for watermelon was studied to identify inconsistency and chromosome rearrangements. We assessed genome-wide nucleotide diversity, linkage disequilibrium (LD), and selective sweep for wild, semi-wild, and domesticated accessions of Citrullus lanatus var. lanatus to track signals of domestication. Principal component analysis combined with chromosome-wide phylogenetic study based on 1563 SNPs obtained after LD pruning with minor allele frequency of 0.05 resolved the differences between semi-wild and wild accessions as well as relationships among worldwide sweet watermelon. Population structure analysis revealed predominant ancestries for wild, semi-wild, and domesticated watermelons as well as admixture of various ancestries that were important for domestication. Sliding window analysis of Tajima’s D across various chromosomes was used to resolve selective sweep. LD decay was estimated for various chromosomes. We identified a strong selective sweep on chromosome 3 consisting of important genes that might have had a role in sweet watermelon domestication.
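As an illustration of the sliding-window Tajima's D scan mentioned in the abstract above, the following is a minimal Python sketch, not the authors' pipeline. It assumes phased biallelic haplotypes coded as 0/1 and uses the standard constants from Tajima's 1989 formulation; the function names `tajimas_d` and `sliding_tajimas_d`, the window size, and the step are hypothetical choices made only for this example.

```python
import math


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length 0/1 haplotypes (assumed input format).

    Returns None when D is undefined (fewer than 2 samples, no segregating
    sites, or a non-positive variance term).
    """
    n = len(haplotypes)
    if n < 2:
        return None
    n_sites = len(haplotypes[0])
    seg_sites = 0   # S: number of segregating sites
    pi = 0.0        # mean number of pairwise differences
    for j in range(n_sites):
        k = sum(h[j] for h in haplotypes)  # copies of allele 1 at site j
        if 0 < k < n:
            seg_sites += 1
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    if seg_sites == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    if variance <= 0:
        return None
    return (pi - seg_sites / a1) / math.sqrt(variance)


def sliding_tajimas_d(haplotypes, positions, window, step):
    """Compute D in windows [start, start + window) moved along the chromosome."""
    results = []
    start = min(positions)
    last = max(positions)
    while start <= last:
        idx = [j for j, p in enumerate(positions) if start <= p < start + window]
        sub = [[h[j] for j in idx] for h in haplotypes]
        results.append((start, tajimas_d(sub)))
        start += step
    return results
```

Windows with strongly negative D (an excess of rare variants) are the usual candidates for a recent sweep, although demographic events such as population expansion can produce the same pattern, so in practice the values are compared against a genome-wide or simulated baseline.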
|
45
|
Pickrell JK, Reich D. Toward a new history and geography of human genes informed by ancient DNA. Trends Genet 2014; 30:377-89. [PMID: 25168683 PMCID: PMC4163019 DOI: 10.1016/j.tig.2014.07.007] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Received: 03/21/2014] [Revised: 07/21/2014] [Accepted: 07/28/2014] [Indexed: 12/12/2022]
Abstract
Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture, and population replacement subsequent to the initial out-of-Africa expansion have altered the genetic structure of most of the world's human populations. In light of this we argue that it is time to critically reevaluate current models of the peopling of the globe, as well as the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.
Affiliation(s)
- Joseph K Pickrell
- New York Genome Center, New York, NY, USA; Department of Biological Sciences, Columbia University, New York, NY, USA.
| | - David Reich
- Department of Genetics, Harvard Medical School, Boston, MA, USA; Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA; Broad Institute of the Massachusetts Institute of Technology (MIT) and Harvard, Cambridge, MA, USA.
|
46
|
DeGiorgio M, Lohmueller KE, Nielsen R. A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genet 2014; 10:e1004561. [PMID: 25144706 PMCID: PMC4140648 DOI: 10.1371/journal.pgen.1004561] [Citation(s) in RCA: 112] [Impact Index Per Article: 10.2] [Received: 05/31/2013] [Accepted: 06/26/2014] [Indexed: 01/19/2023] Open
Abstract
While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates.
In the past, balancing selection was a topic of great theoretical interest that received much attention. However, there has been little focus toward developing methods to identify regions of the genome that are under balancing selection. In this article, we present the first set of likelihood-based methods that explicitly model the spatial distribution of polymorphism expected near a site under long-term balancing selection. Simulation results show that our methods outperform commonly-used summary statistics for identifying regions under balancing selection. Finally, we performed a scan for balancing selection in Africans and Europeans using our new methods and identified a gene called FANK1 as our top candidate outside the HLA region. We hypothesize that the maintenance of polymorphism at FANK1 is the result of segregation distortion.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, United States of America
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
47
|
Sjöstrand AE, Sjödin P, Jakobsson M. Private haplotypes can reveal local adaptation. BMC Genet 2014; 15:61. [PMID: 24885734 PMCID: PMC4040116 DOI: 10.1186/1471-2156-15-61] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 12/18/2013] [Accepted: 05/07/2014] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Genome-wide scans for regions that demonstrate deviating patterns of genetic variation have become common approaches for finding genes targeted by selection. Several genomic patterns have been utilized for this purpose, including deviations in haplotype homozygosity, frequency spectra and genetic differentiation between populations. RESULTS We describe a novel approach based on the Maximum Frequency of Private Haplotypes (MFPH) to search for signals of recent population-specific selection. The MFPH statistic is straightforward to compute for phased SNP and sequence data. Using both simulated and empirical data, we show that MFPH can be a powerful statistic to detect recent population-specific selection, that it performs at the same level as other commonly used summary statistics (e.g. FST, iHS and XP-EHH), and that MFPH in some cases captures signals of selection that are missed by other statistics. For instance, in the Maasai, MFPH reveals a strong signal of selection in a region containing the genes DOCK3, MAPKAPK3 and CISH, where other investigated statistics fail to pick up a clear signal. This region has been suggested to affect height in many populations based on phenotype-genotype association studies. It has specifically been suggested to be targeted by selection in Pygmy groups, which are on the opposite end of the human height spectrum compared to the Maasai. CONCLUSIONS From the analysis of both simulated and publicly available empirical data, we show that MFPH represents a summary statistic that can provide further insight concerning population-specific adaptation.
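To make the idea of a private-haplotype statistic concrete, here is a small Python sketch. It is an assumption-laden illustration rather than the published MFPH implementation: haplotypes are taken to be phased strings over a window, a haplotype is called "private" if it occurs in the target population and in none of the comparison samples, and the statistic is the frequency of the most common private haplotype in the target. The names `mfph` and `mfph_scan`, and the fixed-SNP-count windows, are hypothetical.

```python
from collections import Counter


def mfph(target_haps, other_haps):
    """Frequency (in the target sample) of its most common private haplotype.

    target_haps, other_haps: lists of haplotype strings for the same window.
    Returns 0.0 when the target has no private haplotype.
    """
    if not target_haps:
        return 0.0
    seen_elsewhere = set(other_haps)
    private_counts = Counter(h for h in target_haps if h not in seen_elsewhere)
    if not private_counts:
        return 0.0
    return max(private_counts.values()) / len(target_haps)


def mfph_scan(target_haps, other_haps, window_snps, step):
    """Slide a window of `window_snps` SNPs along the haplotypes (assumed scheme)."""
    n_sites = len(target_haps[0])
    results = []
    for start in range(0, n_sites - window_snps + 1, step):
        end = start + window_snps
        t = [h[start:end] for h in target_haps]
        o = [h[start:end] for h in other_haps]
        results.append((start, mfph(t, o)))
    return results
```

A window in which one private haplotype has risen to high frequency yields a value near 1, which is the pattern expected under recent selection restricted to the target population; window length and the choice of comparison populations would strongly affect the result in real data.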
Affiliation(s)
- Agnès E Sjöstrand
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- UMR 7206 Eco-anthropologie et Ethnobiologie, CNRS-MNHN-Université Paris VII, Paris, France
- Laboratoire TIMC-IMAG, Centre National de la Recherche Scientifique, Université Joseph Fourier, Grenoble, France
| | - Per Sjödin
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
| | - Mattias Jakobsson
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
48
|
Lee HJ, Kim J, Lee T, Son JK, Yoon HB, Baek KS, Jeong JY, Cho YM, Lee KT, Yang BC, Lim HJ, Cho K, Kim TH, Kwon EG, Nam J, Kwak W, Cho S, Kim H. Deciphering the genetic blueprint behind Holstein milk proteins and production. Genome Biol Evol 2014; 6:1366-74. [PMID: 24920005 PMCID: PMC4079194 DOI: 10.1093/gbe/evu102] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/02/2023] Open
Abstract
Holstein is known to provide higher milk yields than most other cattle breeds, and the dominant position of Holstein today is the result of various selection pressures. Holstein cattle have undergone intensive selection for milk production in recent decades, which has left genome-wide footprints of domestication. To further characterize the bovine genome, we performed whole-genome resequencing analysis of 10 Holstein and 11 Hanwoo cattle to identify regions containing genes as outliers in Holstein, including CSN1S1, CSN2, CSN3, and KIT whose products are likely involved in the yield and proteins of milk and their distinctive black-and-white markings. In addition, genes indicative of positive selection were associated with cardiovascular disease, which is related to simultaneous propagation of genetic defects, also known as inbreeding depression in Holstein.
Affiliation(s)
- Hyun-Jeong Lee
- Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, Suwon, Republic of Korea; Department of Agricultural Biotechnology and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
| | - Jaemin Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea; CHO&KIM Genomics, SNU Research Park, Seoul National University Mt.4-2, Seoul, Republic of Korea
| | - Taeheon Lee
- Department of Agricultural Biotechnology and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
| | - Jun Kyu Son
- Division of Dairy Science, National Institute of Animal Science, Suwon, Republic of Korea
| | - Ho-Baek Yoon
- Division of Dairy Science, National Institute of Animal Science, Suwon, Republic of Korea
| | - Kwang-Soo Baek
- Division of Dairy Science, National Institute of Animal Science, Suwon, Republic of Korea
| | - Jin Young Jeong
- Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, Suwon, Republic of Korea
| | - Yong-Min Cho
- Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, Suwon, Republic of Korea
| | - Kyung-Tai Lee
- Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, Suwon, Republic of Korea
| | - Byoung-Chul Yang
- Division of Animal Biotechnology, National Institute of Animal Science, Suwon, Republic of Korea
| | - Hyun-Joo Lim
- Division of Dairy Science, National Institute of Animal Science, Suwon, Republic of Korea
| | - Kwanghyeon Cho
- Division of Animal Breeding & Genetics, National Institute of Animal Science, Cheonan, Republic of Korea
| | - Tae-Hun Kim
- Division of Animal Genomics and Bioinformatics, National Institute of Animal Science, Suwon, Republic of Korea
| | - Eung Gi Kwon
- Division of Dairy Science, National Institute of Animal Science, Suwon, Republic of Korea
| | - Jungrye Nam
- Department of Agricultural Biotechnology and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea
| | - Woori Kwak
- CHO&KIM Genomics, SNU Research Park, Seoul National University Mt.4-2, Seoul, Republic of Korea
| | - Seoae Cho
- CHO&KIM Genomics, SNU Research Park, Seoul National University Mt.4-2, Seoul, Republic of Korea
| | - Heebal Kim
- Department of Agricultural Biotechnology and Research Institute of Population Genomics, Seoul National University, Seoul, Republic of Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
|
49
|
Abstract
Infectious pathogens are among the strongest selective forces that shape the human genome. Migrations and cultural changes in the past 100,000 years exposed populations to dangerous new pathogens. Host genetics influences susceptibility to infectious disease. Evolutionary adaptations for resistance and symbiosis may underlie common immune-mediated diseases. Signatures of selection and methods to detect them vary with the age, geographical spread and virulence of the pathogen. A history of selection on a trait adds power to association studies by driving the emergence of common alleles of strong effect. Combining selection and association metrics can further increase power. Genome-wide association studies (GWASs) of susceptibility to pathogens that are moderately old (1,000–50,000 years ago), geographically limited in history and exerted strong positive selective pressure will have the most power if GWASs can be done in the historically affected population. An understanding of host–pathogen interactions can inform the development of new therapies for both infectious diseases and common immune-mediated diseases.
The impact of various infectious agents on human survival and reproduction over thousands of years has exerted selective pressure on numerous regions of the human genome. This Review describes how such signatures of selection can be detected and integrated with data from complementary approaches, such as genome-wide association studies, to provide biological insights into host–pathogen interactions. The ancient biological 'arms race' between microbial pathogens and humans has shaped genetic variation in modern populations, and this has important implications for the growing field of medical genomics. As humans migrated throughout the world, populations encountered distinct pathogens, and natural selection increased the prevalence of alleles that are advantageous in the new ecosystems in both host and pathogens. This ancient history now influences human infectious disease susceptibility and microbiome homeostasis, and contributes to common diseases that show geographical disparities, such as autoimmune and metabolic disorders. Using new high-throughput technologies, analytical methods and expanding public data resources, the investigation of natural selection is leading to new insights into the function and dysfunction of human biology.
|
50
|
Fagny M, Patin E, Enard D, Barreiro LB, Quintana-Murci L, Laval G. Exploring the occurrence of classic selective sweeps in humans using whole-genome sequencing data sets. Mol Biol Evol 2014; 31:1850-68. [PMID: 24694833 DOI: 10.1093/molbev/msu118] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Indexed: 12/21/2022] Open
Abstract
Genome-wide scans for selection have identified multiple regions of the human genome as being targeted by positive selection. However, only a small proportion has been replicated across studies, and the prevalence of positive selection as a mechanism of adaptive change in humans remains controversial. Here we explore the power of two haplotype-based statistics, the integrated haplotype score (iHS) and the Derived Intraallelic Nucleotide Diversity (DIND) test, in the context of next-generation sequencing data, and evaluate their robustness to demography and other selection modes. We show that these statistics are both powerful for the detection of recent positive selection, regardless of population history, and robust to variation in coverage, with DIND being insensitive to very low coverage. We apply these statistics to whole-genome sequence data sets from the 1000 Genomes Project and Complete Genomics. We found that putative targets of selection were highly significantly enriched in genic and nonsynonymous single nucleotide polymorphisms, and that DIND was more powerful than iHS in the context of small sample sizes, low-quality genotype calling, or poor coverage. As we excluded genomic confounders and alternative selection models, such as background selection, the observed enrichment attests to the action of recent, strong positive selection. Further support for the adaptive significance of these genomic regions came from their enrichment in functional variants detected by genome-wide association studies, informing the relationship between past selection and current benign and disease-related phenotypic variation. Our results indicate that hard sweeps targeting low-frequency standing variation have played a moderate, albeit significant, role in recent human evolution.
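The intuition behind a DIND-style test can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: it assumes phased haplotype strings with "0" for the ancestral and "1" for the derived allele, takes the statistic to be the ratio of nucleotide diversity among haplotypes carrying the ancestral allele at a focal SNP to diversity among haplotypes carrying the derived allele, and returns None when that ratio is undefined. The names `mean_pairwise_diff` and `dind_ratio` are hypothetical.

```python
from itertools import combinations


def mean_pairwise_diff(haps):
    """Average number of differing sites over all pairs of haplotypes."""
    pairs = list(combinations(haps, 2))
    if not pairs:
        return None
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def dind_ratio(haplotypes, focal):
    """Ratio of ancestral-class to derived-class diversity around a focal SNP.

    haplotypes: list of '0'/'1' strings covering the window (assumed coding).
    focal: index of the focal SNP, which is excluded from the diversity counts.
    Returns None if either class has fewer than 2 haplotypes or if the
    derived-class diversity is zero (a simplification made in this sketch).
    """
    ancestral = [h[:focal] + h[focal + 1:] for h in haplotypes if h[focal] == "0"]
    derived = [h[:focal] + h[focal + 1:] for h in haplotypes if h[focal] == "1"]
    pi_anc = mean_pairwise_diff(ancestral)
    pi_der = mean_pairwise_diff(derived)
    if pi_anc is None or pi_der is None or pi_der == 0:
        return None
    return pi_anc / pi_der
```

A large ratio at a derived allele that is also at high frequency is the pattern such a test looks for, since a recently selected allele is expected to sit on unusually homogeneous haplotypes. Significance would normally be judged against simulations or the genome-wide distribution at matching allele frequencies.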
Affiliation(s)
- Maud Fagny
- Institut Pasteur, Human Evolutionary Genetics, Department of Genomes and Genetics, Paris, France; Centre National de la Recherche Scientifique, URA3012, Paris, France; Université Pierre et Marie Curie, Cellule Pasteur UPMC, Paris, France
| | - Etienne Patin
- Institut Pasteur, Human Evolutionary Genetics, Department of Genomes and Genetics, Paris, France; Centre National de la Recherche Scientifique, URA3012, Paris, France
| | | | - Luis B Barreiro
- Department of Pediatrics, Sainte-Justine Hospital Research Center, University of Montreal, Montreal, Quebec, Canada
| | - Lluis Quintana-Murci
- Institut Pasteur, Human Evolutionary Genetics, Department of Genomes and Genetics, Paris, France; Centre National de la Recherche Scientifique, URA3012, Paris, France
| | - Guillaume Laval
- Institut Pasteur, Human Evolutionary Genetics, Department of Genomes and Genetics, Paris, France; Centre National de la Recherche Scientifique, URA3012, Paris, France
|