1. Mychaleckyj JC, Havt A, Nayak U, Pinkerton R, Farber E, Concannon P, Lima AA, Guerrant RL. Genome-Wide Analysis in Brazilians Reveals Highly Differentiated Native American Genome Regions. Mol Biol Evol 2017; 34:559-574. [PMID: 28100790 PMCID: PMC5430616 DOI: 10.1093/molbev/msw249] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/12/2022]
Abstract
Despite its population, geographic size, and emerging economic importance, disproportionately little genome-scale research exists into genetic factors that predispose Brazilians to disease, or the population genetics of risk. After identification of suitable proxy populations and careful analysis of tri-continental admixture in 1,538 North-Eastern Brazilians to estimate individual ancestry and ancestral allele frequencies, we computed 400,000 genome-wide locus-specific branch length (LSBL) Fst statistics of Brazilian Amerindian ancestry compared to European and African; and a similar set of differentiation statistics for their Amerindian component compared with the closest Asian 1000 Genomes population (surprisingly, Bengalis in Bangladesh). After ranking SNPs by these statistics, we identified the top 10 highly differentiated SNPs in five genome regions in the LSBL tests of Brazilian Amerindian ancestry compared to European and African; and the top 10 SNPs in eight regions comparing their Amerindian component to the closest Asian 1000 Genomes population. We found SNPs within or proximal to the genes CIITA (rs6498115), SMC6 (rs1834619), and KLHL29 (rs2288697) were most differentiated in the Amerindian-specific branch, while SNPs in the genes ADAMTS9 (rs7631391), DOCK2 (rs77594147), SLC28A1 (rs28649017), ARHGAP5 (rs7151991), and CIITA (rs45601437) were most highly differentiated in the Asian comparison. These genes are known to influence immune function, metabolic and anthropometry traits, and embryonic development. These analyses have identified candidate genes for selection within Amerindian ancestry, and by comparison of the two analyses, those for which the differentiation may have arisen during the migration from Asia to the Americas.
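The locus-specific branch length (LSBL) statistic named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used three-population definition, LSBL_A = (Fst_AB + Fst_AC - Fst_BC) / 2, and a simple equal-weight Wright-style Fst computed from allele frequencies; the abstract does not say which Fst estimator the authors used, and the SNP labels and frequencies below are hypothetical.

```python
from typing import Dict, List, Tuple


def pairwise_fst(p1: float, p2: float) -> float:
    """Equal-weight Wright-style Fst for one biallelic SNP.

    Assumption: a simple (H_T - H_S) / H_T form, not necessarily the
    estimator used in the paper.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # monomorphic in both populations: no differentiation
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total


def lsbl(p_focal: float, p_b: float, p_c: float) -> float:
    """Branch length specific to the focal population in a three-way comparison."""
    f_ab = pairwise_fst(p_focal, p_b)
    f_ac = pairwise_fst(p_focal, p_c)
    f_bc = pairwise_fst(p_b, p_c)
    return (f_ab + f_ac - f_bc) / 2.0


def rank_by_lsbl(freqs: Dict[str, Tuple[float, float, float]]) -> List[Tuple[str, float]]:
    """Rank SNPs from most to least differentiated on the focal branch."""
    scores = {snp: lsbl(*p) for snp, p in freqs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical allele frequencies: (focal, comparison B, comparison C).
example = {
    "snp_a": (0.9, 0.1, 0.1),  # differentiated only on the focal branch
    "snp_b": (0.1, 0.1, 0.9),  # differentiated only on branch C
    "snp_c": (0.5, 0.4, 0.45),
}
ranking = rank_by_lsbl(example)
```

A SNP that differs only in the focal population receives a large branch length, while one that differs only in a comparison population receives a value near zero, which is why ranking by LSBL isolates ancestry-specific differentiation.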
Affiliation(s)
- Josyf C Mychaleckyj
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA; Department of Public Health Sciences, University of Virginia, Charlottesville, VA
- Alexandre Havt
- Departamento de Fisiologia e Farmacologia, Universidade Federal do Ceará, Fortaleza, Brazil; INCT-Instituto de Biomedicina Universidade Federal do Ceará, Fortaleza, Brazil
- Uma Nayak
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
- Relana Pinkerton
- Center for Global Health, University of Virginia, Charlottesville, VA
- Emily Farber
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
- Patrick Concannon
- Genetics Institute, University of Florida, Gainesville, FL; Department of Pathology Immunology and Laboratory Medicine, University of Florida, Gainesville, FL
- Aldo A Lima
- Departamento de Fisiologia e Farmacologia, Universidade Federal do Ceará, Fortaleza, Brazil; INCT-Instituto de Biomedicina Universidade Federal do Ceará, Fortaleza, Brazil
2. Haasl RJ, Payseur BA. Fifteen years of genomewide scans for selection: trends, lessons and unaddressed genetic sources of complication. Mol Ecol 2015. [PMID: 26224644 DOI: 10.1111/mec.13339] [Citation(s) in RCA: 124] [Impact Index Per Article: 13.8] [Indexed: 12/13/2022]
Abstract
Genomewide scans for natural selection (GWSS) have become increasingly common over the last 15 years due to increased availability of genome-scale genetic data. Here, we report a representative survey of GWSS from 1999 to present and find that (i) between 1999 and 2009, 35 of 49 (71%) GWSS focused on human, while from 2010 to present, only 38 of 83 (46%) of GWSS focused on human, indicating increased focus on nonmodel organisms; (ii) the large majority of GWSS incorporate interpopulation or interspecific comparisons using, for example, F_ST, cross-population extended haplotype homozygosity or the ratio of nonsynonymous to synonymous substitutions; (iii) most GWSS focus on detection of directional selection rather than other modes such as balancing selection; and (iv) in human GWSS, there is a clear shift after 2004 from microsatellite markers to dense SNP data. A survey of GWSS meant to identify loci positively selected in response to severe hypoxic conditions supports an approach to GWSS in which a list of a priori candidate genes based on potential selective pressures is used to filter the list of significant hits a posteriori. We also discuss four frequently ignored determinants of genomic heterogeneity that complicate GWSS: mutation, recombination, selection and the genetic architecture of adaptive traits. We recommend that GWSS methodology should better incorporate aspects of genomewide heterogeneity using empirical estimates of relevant parameters and/or realistic, whole-chromosome simulations to improve interpretation of GWSS results. Finally, we argue that knowledge of potential selective agents improves interpretation of GWSS results and that new methods focused on correlations between environmental variables and genetic variation can help automate this approach.
Affiliation(s)
- Ryan J Haasl
- Department of Biology, University of Wisconsin-Platteville, 1 University Plaza, Platteville, WI, 53818, USA
- Bret A Payseur
- Laboratory of Genetics, University of Wisconsin-Madison, 425 Henry Mall, Madison, WI, 53706, USA
3. Bhatia G, Patterson N, Pasaniuc B, Zaitlen N, Genovese G, Pollack S, Mallick S, Myers S, Tandon A, Spencer C, Palmer CD, Adeyemo AA, Akylbekova EL, Cupples LA, Divers J, Fornage M, Kao WHL, Lange L, Li M, Musani S, Mychaleckyj JC, Ogunniyi A, Papanicolaou G, Rotimi CN, Rotter JI, Ruczinski I, Salako B, Siscovick DS, Tayo BO, Yang Q, McCarroll S, Sabeti P, Lettre G, De Jager P, Hirschhorn J, Zhu X, Cooper R, Reich D, Wilson JG, Price AL. Genome-wide comparison of African-ancestry populations from CARe and other cohorts reveals signals of natural selection. Am J Hum Genet 2011; 89:368-81. [PMID: 21907010 DOI: 10.1016/j.ajhg.2011.07.025] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 05/19/2011] [Revised: 07/18/2011] [Accepted: 07/29/2011] [Indexed: 12/11/2022]
Abstract
The study of recent natural selection in human populations has important applications to human history and medicine. Positive natural selection drives the increase in beneficial alleles and plays a role in explaining diversity across human populations. By discovering traits subject to positive selection, we can better understand the population level response to environmental pressures including infectious disease. Our study examines unusual population differentiation between three large data sets to detect natural selection. The populations examined, African Americans, Nigerians, and Gambians, are genetically close to one another (F_ST < 0.01 for all pairs), allowing us to detect selection even with moderate changes in allele frequency. We also develop a tree-based method to pinpoint the population in which selection occurred, incorporating information across populations. Our genome-wide significant results corroborate loci previously reported to be under selection in Africans including HBB and CD36. At the HLA locus on chromosome 6, results suggest the existence of multiple, independent targets of population-specific selective pressure. In addition, we report a genome-wide significant (p = 1.36 × 10^-11) signal of selection in the prostate stem cell antigen (PSCA) gene. The most significantly differentiated marker in our analysis, rs2920283, is highly differentiated in both Africa and East Asia and has prior genome-wide significant associations to bladder and gastric cancers.
Affiliation(s)
- Gaurav Bhatia
- Harvard-Massachusetts Institute of Technology (MIT) Division of Health, Science and Technology, Cambridge, USA.
4. Carroll TP, O'Connor CA, Floyd O, McPartlin J, Kelleher DP, O'Brien G, Dimitrov BD, Morris VB, Taggart CC, McElvaney NG. The prevalence of alpha-1 antitrypsin deficiency in Ireland. Respir Res 2011; 12:91. [PMID: 21752289 PMCID: PMC3155497 DOI: 10.1186/1465-9921-12-91] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.8] [Received: 01/25/2011] [Accepted: 07/13/2011] [Indexed: 02/06/2023]
Abstract
Background: Alpha-1 antitrypsin deficiency (AATD) results from mutations in the SERPINA1 gene and classically presents with early-onset emphysema and liver disease. The most common mutation presenting with clinical evidence is the Z mutation, while the S mutation is associated with a milder plasma deficiency. AATD is an under-diagnosed condition and the World Health Organisation recommends targeted detection programmes for AATD in patients with chronic obstructive pulmonary disease (COPD), non-responsive asthma, cryptogenic liver disease and first degree relatives of known AATD patients. Methods: We present data from the first 3,000 individuals screened following ATS/ERS guidelines as part of the Irish National Targeted Detection Programme (INTDP). We also investigated a DNA collection of 1,100 individuals randomly sampled from the general population. Serum and DNA were collected from both groups and mutations in the SERPINA1 gene detected by phenotyping or genotyping. Results: The Irish National Targeted Detection Programme identified 42 ZZ, 44 SZ, 14 SS, 430 MZ, 263 MS, 20 IX and 2 rare mutations. Analysis of 1,100 randomly selected individuals identified 113 MS, 46 MZ, 2 SS and 2 SZ genotypes. Conclusion: Our findings demonstrate that AATD in Ireland is more prevalent than previously estimated with Z and S allele frequencies among the highest in the world. Furthermore, our targeted detection programme enriched the population of those carrying the Z but not the S allele, suggesting the Z allele is more important in the pathogenesis of those conditions targeted by the detection programme.
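As a worked illustration of how S and Z allele frequencies follow from the genotype counts reported for the 1,100 randomly sampled individuals, the sketch below counts alleles directly. It assumes each two-letter genotype label lists one allele per chromosome and that every individual not listed carries neither S nor Z; the function name is hypothetical and this is not the authors' own calculation.

```python
from collections import Counter
from typing import Dict, Iterable


def allele_frequencies(
    genotype_counts: Dict[str, int],
    n_individuals: int,
    alleles_of_interest: Iterable[str],
) -> Dict[str, float]:
    """Frequency of selected alleles in a diploid sample.

    Each genotype label is read as two single-letter alleles,
    e.g. "MS" carries one M allele and one S allele.
    """
    allele_counts: Counter = Counter()
    for genotype, count in genotype_counts.items():
        for allele in genotype:
            allele_counts[allele] += count
    total_chromosomes = 2 * n_individuals
    return {a: allele_counts[a] / total_chromosomes for a in alleles_of_interest}


# Genotype counts for the random population sample, as given in the abstract.
random_sample = {"MS": 113, "MZ": 46, "SS": 2, "SZ": 2}
freqs = allele_frequencies(random_sample, n_individuals=1100, alleles_of_interest="SZ")
```

Under these assumptions the sample contains 119 S alleles and 48 Z alleles among 2,200 chromosomes, giving frequencies of roughly 0.054 for S and 0.022 for Z.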
Affiliation(s)
- Tomás P Carroll
- Department of Medicine, Royal College of Surgeons in Ireland Education and Research Centre, Beaumont Hospital, Dublin 9, Ireland.
5. Tong P, Prendergast JGD, Lohan AJ, Farrington SM, Cronin S, Friel N, Bradley DG, Hardiman O, Evans A, Wilson JF, Loftus B. Sequencing and analysis of an Irish human genome. Genome Biol 2010; 11:R91. [PMID: 20822512 PMCID: PMC2965383 DOI: 10.1186/gb-2010-11-9-r91] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 04/10/2010] [Revised: 07/13/2010] [Accepted: 09/07/2010] [Indexed: 11/10/2022]
Abstract
Background: Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence. Results: Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identified novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage. Conclusions: Our findings show that there remains utility in generating whole genome sequences to illustrate both general principles and reveal specific instances of human biology. With increasing access to low cost sequencing we would predict that even armed with the resources of a small research group a number of similar initiatives geared towards answering specific biological questions will emerge.
Affiliation(s)
- Pin Tong
- Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
6. Akey JM. Constructing genomic maps of positive selection in humans: where do we go from here? Genome Res 2009; 19:711-22. [PMID: 19411596 DOI: 10.1101/gr.086652.108] [Citation(s) in RCA: 348] [Impact Index Per Article: 23.2] [Indexed: 11/24/2022]
Abstract
Identifying targets of positive selection in humans has, until recently, been frustratingly slow, relying on the analysis of individual candidate genes. Genomics, however, has provided the necessary resources to systematically interrogate the entire genome for signatures of natural selection. To date, 21 genome-wide scans for recent or ongoing positive selection have been performed in humans. A key challenge is to begin synthesizing these newly constructed maps of positive selection into a coherent narrative of human evolutionary history and derive a deeper mechanistic understanding of how natural populations evolve. Here, I chronicle the recent history of the burgeoning field of human population genomics, critically assess genome-wide scans for positive selection in humans, identify important gaps in knowledge, and discuss both short- and long-term strategies for traversing the path from the low-resolution, incomplete, and error-prone maps of selection today to the ultimate goal of a detailed molecular, mechanistic, phenotypic, and population genetics characterization of adaptive alleles.
Affiliation(s)
- Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
7. Gu J, Orr N, Park SD, Katz LM, Sulimova G, MacHugh DE, Hill EW. A genome scan for positive selection in thoroughbred horses. PLoS One 2009; 4:e5767. [PMID: 19503617 PMCID: PMC2685479 DOI: 10.1371/journal.pone.0005767] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.7] [Received: 11/21/2008] [Accepted: 01/22/2009] [Indexed: 01/10/2023]
Abstract
Thoroughbred horses have been selected for exceptional racing performance, resulting in system-wide structural and functional adaptations contributing to elite athletic phenotypes. Because selection has been recent and intense in a closed population that stems from a small number of founder animals, Thoroughbreds represent a unique population within which to identify genomic contributions to exercise-related traits. Employing a population genetics-based hitchhiking mapping approach, we performed a genome scan using 394 autosomal and X chromosome microsatellite loci and identified positively selected loci in the extreme tail-ends of the empirical distributions for (1) deviations from expected heterozygosity (Ewens-Watterson test) in Thoroughbred (n = 112) and (2) global differentiation among four geographically diverse horse populations (F_ST). We found positively selected genomic regions in Thoroughbred enriched for phosphoinositide-mediated signalling (3.2-fold enrichment; P<0.01), insulin receptor signalling (5.0-fold enrichment; P<0.01) and lipid transport (2.2-fold enrichment; P<0.05) genes. We found a significant overrepresentation of sarcoglycan complex (11.1-fold enrichment; P<0.05) and focal adhesion pathway (1.9-fold enrichment; P<0.01) genes, highlighting the role for muscle strength and integrity in the Thoroughbred athletic phenotype. We report for the first time candidate athletic-performance genes within regions targeted by selection in Thoroughbred horses that are principally responsible for fatty acid oxidation, increased insulin sensitivity and muscle strength: ACSS1 (acyl-CoA synthetase short-chain family member 1), ACTA1 (actin, alpha 1, skeletal muscle), ACTN2 (actinin, alpha 2), ADHFE1 (alcohol dehydrogenase, iron containing, 1), MTFR1 (mitochondrial fission regulator 1), PDK4 (pyruvate dehydrogenase kinase, isozyme 4) and TNC (tenascin C).
Understanding the genetic basis for exercise adaptation will be crucial for the identification of genes within the complex molecular networks underlying obesity and its consequential pathologies, such as type 2 diabetes. Therefore, we propose Thoroughbred as a novel in vivo large animal model for understanding molecular protection against metabolic disease.
Affiliation(s)
- Jingjing Gu
- Animal Genomics Laboratory, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland
- Nick Orr
- Animal Genomics Laboratory, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland
- The Breakthrough Breast Cancer Research Centre, Chester Beatty Laboratories, The Institute of Cancer Research, London, United Kingdom
- Stephen D. Park
- Animal Genomics Laboratory, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland
- Lisa M. Katz
- University Veterinary Hospital, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland
- Galina Sulimova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- David E. MacHugh
- Animal Genomics Laboratory, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland
- Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
- Emmeline W. Hill
- Animal Genomics Laboratory, School of Agriculture, Food Science and Veterinary Medicine, College of Life Sciences, University College Dublin, Belfield, Dublin, Ireland