1
Jighly A. Boosting genome-wide association power and genomic prediction accuracy for date palm fruit traits with advanced statistics. Plant Science: An International Journal of Experimental Plant Biology 2024; 344:112110. [PMID: 38704095 DOI: 10.1016/j.plantsci.2024.112110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 03/05/2024] [Accepted: 04/30/2024] [Indexed: 05/06/2024]
Abstract
The date palm is economically vital in the Middle East and North Africa, providing essential fibres, vitamins, and carbohydrates. Understanding the genetic architecture of its traits remains complex due to the tree's perennial nature and long generation times. This study aims to address these complexities by employing advanced genome-wide association (GWAS) and genomic prediction models using previously published data involving fruit acid content, sugar content, dimension, and colour traits. The multivariate GWAS model identified seven QTL, including five novel associations, that shed light on the genetic control of these traits. Furthermore, the research evaluates different genomic prediction models that considered genotype by environment and genotype by trait interactions. While colour traits demonstrate strong predictive power, other traits display moderate accuracies across different models and scenarios aligned with the expectations when using small reference populations. When designing the cross-validation to predict new individuals, the accuracy of the best multi-trait model was significantly higher than all single-trait models for dimension traits, but not for the remaining traits, which showed similar performances. However, the cross-validation strategy that masked random phenotypic records (i.e., mimicking the unbalanced phenotypic records) showed significantly higher accuracy for all traits except acid contents. The findings underscore the importance of understanding genetic architecture for informed breeding strategies. The research emphasises the need for larger population sizes and multivariate models to enhance gene tagging power and predictive accuracy to advance date palm breeding programs. These findings support more targeted breeding in date palm, improving productivity and resilience to various environments.
2
Chen J, Jia Y, Zhong J, Zhang K, Dai H, He G, Li F, Zeng L, Fan C, Xu H. Novel mutation leading to splice donor loss in a conserved site of DMD gene causes Duchenne muscular dystrophy with cryptorchidism. J Med Genet 2024:jmg-2024-109896. [PMID: 38621993 DOI: 10.1136/jmg-2024-109896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/04/2024] [Indexed: 04/17/2024]
Abstract
BACKGROUND As one of the most common congenital abnormalities in male births, cryptorchidism has been found to have a polygenic aetiology according to previous studies of common variants. However, little is known about the genetic predisposition conferred by rare variants for cryptorchidism, even though rare variants have larger effect sizes on diseases than common variants. METHODS In this study, a cohort of 115 Chinese probands with cryptorchidism was analysed using whole-genome sequencing, alongside 19 parental controls and 2136 unaffected men. Additionally, CRISPR-Cas9 editing of a conserved variant was performed in a mouse model, with MRI screening used to observe the phenotype. RESULTS In 30 of 115 patients (26.1%), we identified four novel genes (ARSH, DMD, MAGEA4 and SHROOM2) affecting at least five unrelated patients and four known genes (USP9Y, UBA1, BCORL1 and KDM6A) with the candidate rare pathogenic variants affecting at least two cases. Burden tests of rare variants revealed genome-wide significance for newly identified genes (p<2.5×10⁻⁶) under the Bonferroni correction. Surprisingly, novel and known genes were mainly found on the X chromosome (seven on X and one on Y) and all rare X-chromosomal segregating variants exhibited a maternal inheritance rather than de novo origin. CRISPR-Cas9 mouse modelling of a splice donor loss variant in DMD (NC_000023.11:g.32454661C>G), which resides in a conserved site across vertebrates, replicated bilateral cryptorchidism phenotypes, confirmed by MRI at 4 and 10 weeks. The movement tests further revealed symptoms of Duchenne muscular dystrophy (DMD) in transgenic mice. CONCLUSION Our results revealed the role of the DMD gene mutation in causing cryptorchidism. The results also suggest that maternal-X inheritance of pathogenic defects could have a predominant role in the development of cryptorchidism.
Affiliation(s)
- Jianhai Chen
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, USA
- Yangying Jia
  - Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
  - Department of Chemistry, The University of Chicago, Chicago, Illinois, USA
- Jie Zhong
  - Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Kun Zhang
  - Department of Radiology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Hongzheng Dai
  - Department of Human and Molecular Genetics, Baylor College of Medicine, Houston, Texas, USA
- Guanglin He
  - Institute of Rare Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan, People's Republic of China
- Fuping Li
  - Laboratory of Molecular Translational Medicine, Center for Translational Medicine, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Ministry of Education, Clinical Research Center for Birth Defects of Sichuan Province, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Li Zeng
  - Department of Pediatric Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Chuanzhu Fan
  - Department of Biological Sciences, Wayne State University, Detroit, Michigan, USA
- Huayan Xu
  - Department of Radiology, Key Laboratory of Birth Defects and Related Diseases of Women and Children of Ministry of Education, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
3
Žukauskaitė G, Domarkienė I, Rančelis T, Kavaliauskienė I, Baronas K, Kučinskas V, Ambrozaitytė L. Putative protective genomic variation in the Lithuanian population. Genet Mol Biol 2024; 47:e20230030. [PMID: 38626572 PMCID: PMC11021042 DOI: 10.1590/1678-4685-gmb-2023-0030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 01/01/2024] [Indexed: 04/18/2024] Open
Abstract
Genomic effect variants associated with survival and protection against complex diseases vary between populations due to microevolutionary processes. The aim of this study was to analyse the diversity and distribution of effect variants in the context of potential positive selection. In total, 475 individuals of Lithuanian origin were genotyped using high-throughput scanning and/or sequencing technologies. Allele frequency analysis for the pre-selected effect variants was performed using the catalogue of single nucleotide polymorphisms. Comparison of the pre-selected effect variants with variants in primate species was carried out to ascertain which allele was derived and potentially of protective nature. Recent positive selection analysis was performed to verify this protective effect. Four variants having significantly different frequencies compared to European populations were identified while two other variants reached borderline significance. An effect variant in the SLC30A8 gene may potentially protect against type 2 diabetes. The existing paradox of high rates of type 2 diabetes in the Lithuanian population and the relatively high frequencies of potentially protective genome variants against it indicates a lack of knowledge about the interactions between environmental factors, regulatory regions, and other genome variation. Identification of effect variants is a step towards better understanding of the microevolutionary processes, etiopathogenetic mechanisms, and personalised medicine.
Affiliation(s)
- Gabrielė Žukauskaitė
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Ingrida Domarkienė
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Tautvydas Rančelis
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Ingrida Kavaliauskienė
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Karolis Baronas
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Vaidutis Kučinskas
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
- Laima Ambrozaitytė
  - Vilnius University, Faculty of Medicine, Institute of Biomedical Sciences, Department of Human and Medical Genetics, Vilnius, Lithuania
4
Mahadevan J, Sud R, Nadella RK, Vani P, Subramaniam AG, Paul P, Ganapathy A, Mannan AU, Chandru V, Viswanath B, Purushottam M, Jain S. Targeted Sequencing Detects Variants That May Contribute to the Risk of Neuropsychiatric Disorders. Indian J Psychol Med 2022; 44:516-522. [PMID: 36157006 PMCID: PMC9460021 DOI: 10.1177/0253717621993672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Affiliation(s)
- Jayant Mahadevan
  - Dept. of Psychiatry, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Reeteka Sud
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Ravi Kumar Nadella
  - Dept. of Psychiatry, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Pulaparambil Vani
  - Dept. of Psychiatry, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Anand G Subramaniam
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Pradip Paul
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Aparna Ganapathy
  - Strand Center for Genomics and Personalized Medicine, Strand Life Sciences, Bengaluru, Karnataka, India
- Ashraf U Mannan
  - Strand Center for Genomics and Personalized Medicine, Strand Life Sciences, Bengaluru, Karnataka, India
- Vijay Chandru
  - Strand Center for Genomics and Personalized Medicine, Strand Life Sciences, Bengaluru, Karnataka, India
  - Centre for Biosystems Science and Engineering, Indian Institute of Science, Bengaluru, Karnataka, India
- Biju Viswanath
  - Dept. of Psychiatry, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Meera Purushottam
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
- Sanjeev Jain
  - Dept. of Psychiatry, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
  - Molecular Genetics Laboratory, Neurobiology Research Centre, National Institute of Mental Health and Neurosciences (NIMHANS), Bengaluru, Karnataka, India
5
NyuWa Genome resource: A deep whole-genome sequencing-based variation profile and reference panel for the Chinese population. Cell Rep 2021; 37:110017. [PMID: 34788621 DOI: 10.1016/j.celrep.2021.110017] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 05/15/2020] [Revised: 05/04/2021] [Accepted: 10/28/2021] [Indexed: 01/07/2023] Open
Abstract
The lack of haplotype reference panels and whole-genome sequencing resources specific to the Chinese population has greatly hindered genetic studies in the world's largest population. Here, we present the NyuWa genome resource, based on deep (26.2×) sequencing of 2,999 Chinese individuals, and construct a NyuWa reference panel of 5,804 haplotypes and 19.3 million variants, which is a high-quality publicly available Chinese population-specific reference panel with thousands of samples. Compared with other panels, the NyuWa reference panel reduces the Han Chinese imputation error rate by a margin ranging from 30% to 51%. Population structure and imputation simulation tests support the applicability of one integrated reference panel for northern and southern Chinese. In addition, a total of 22,504 loss-of-function variants in coding and noncoding genes are identified, including 11,493 novel variants. These results highlight the value of the NyuWa genome resource in facilitating genetic research in Chinese and Asian populations.
6
Hande SH, Krishna SM, Sahote KK, Dev N, Erl TP, Ramakrishna K, Ravidhran R, Das R. Population genetic variation of SLC6A4 gene, associated with neurophysiological development. J Genet 2021. [DOI: 10.1007/s12041-021-01266-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/07/2023]
7
Eckert MA, Harris KC, Lang H, Lewis MA, Schmiedt RA, Schulte BA, Steel KP, Vaden KI, Dubno JR. Translational and interdisciplinary insights into presbyacusis: A multidimensional disease. Hear Res 2021; 402:108109. [PMID: 33189490 PMCID: PMC7927149 DOI: 10.1016/j.heares.2020.108109] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 04/15/2020] [Revised: 10/19/2020] [Accepted: 10/25/2020] [Indexed: 12/18/2022]
Abstract
There are multiple etiologies and phenotypes of age-related hearing loss or presbyacusis. In this review we summarize findings from animal and human studies of presbyacusis, including those that provide the theoretical framework for distinct metabolic, sensory, and neural presbyacusis phenotypes. A key finding in quiet-aged animals is a decline in the endocochlear potential (EP) that results in elevated pure-tone thresholds across frequencies with greater losses at higher frequencies. In contrast, sensory presbyacusis appears to derive, in part, from acute and cumulative effects on hair cells of a lifetime of environmental exposures (e.g., noise), which often result in pronounced high frequency hearing loss. These patterns of hearing loss in animals are recognizable in the human audiogram and can be classified into metabolic and sensory presbyacusis phenotypes, as well as a mixed metabolic+sensory phenotype. However, the audiogram does not fully characterize age-related changes in auditory function. Along with the effects of peripheral auditory system declines on the auditory nerve, primary degeneration in the spiral ganglion also appears to contribute to central auditory system aging. These inner ear alterations often correlate with structural and functional changes throughout the central nervous system and may explain suprathreshold speech communication difficulties in older adults with hearing loss. Throughout this review we highlight potential methods and research directions, with the goal of advancing our understanding, prevention, diagnosis, and treatment of presbyacusis.
Affiliation(s)
- Mark A Eckert
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
- Kelly C Harris
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
- Hainan Lang
  - Medical University of South Carolina, Department of Pathology and Laboratory Medicine, Charleston, SC 29425, USA
- Morag A Lewis
  - King's College London, Wolfson Centre for Age-Related Diseases, London SE1 1UL, United Kingdom
- Richard A Schmiedt
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
- Bradley A Schulte
  - Medical University of South Carolina, Department of Pathology and Laboratory Medicine, Charleston, SC 29425, USA
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
- Karen P Steel
  - King's College London, Wolfson Centre for Age-Related Diseases, London SE1 1UL, United Kingdom
- Kenneth I Vaden
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
- Judy R Dubno
  - Medical University of South Carolina, Department of Otolaryngology - Head and Neck Surgery, Charleston, SC 29425, USA
  - Medical University of South Carolina, Department of Pathology and Laboratory Medicine, Charleston, SC 29425, USA
8
Umans BD, Battle A, Gilad Y. Where Are the Disease-Associated eQTLs? Trends Genet 2021; 37:109-124. [PMID: 32912663 PMCID: PMC8162831 DOI: 10.1016/j.tig.2020.08.009] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 05/28/2020] [Revised: 08/07/2020] [Accepted: 08/14/2020] [Indexed: 02/07/2023]
Abstract
Most disease-associated variants, although located in putatively regulatory regions, do not have detectable effects on gene expression. One explanation could be that we have not examined gene expression in the cell types or conditions that are most relevant for disease. Even large-scale efforts to study gene expression across tissues are limited to human samples obtained opportunistically or postmortem, mostly from adults. In this review we evaluate recent findings and suggest an alternative strategy, drawing on the dynamic and highly context-specific nature of gene regulation. We discuss new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states.
Affiliation(s)
- Benjamin D Umans
  - Department of Medicine, University of Chicago, Chicago, IL, USA
- Alexis Battle
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Yoav Gilad
  - Department of Medicine, University of Chicago, Chicago, IL, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
9
Spear ML, Diaz-Papkovich A, Ziv E, Yracheta JM, Gravel S, Torgerson DG, Hernandez RD. Recent shifts in the genomic ancestry of Mexican Americans may alter the genetic architecture of biomedical traits. eLife 2020; 9:e56029. [PMID: 33372659 PMCID: PMC7771964 DOI: 10.7554/elife.56029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/13/2020] [Accepted: 12/13/2020] [Indexed: 11/13/2022] Open
Abstract
People in the Americas represent a diverse continuum of populations with varying degrees of admixture among African, European, and Amerindigenous ancestries. In the United States, populations with non-European ancestry remain understudied, and thus little is known about the genetic architecture of phenotypic variation in these populations. Using genotype data from the Hispanic Community Health Study/Study of Latinos, we find that Amerindigenous ancestry increased by an average of ~20% spanning 1940s-1990s in Mexican Americans. These patterns result from complex interactions between several population and cultural factors which shaped patterns of genetic variation and influenced the genetic architecture of complex traits in Mexican Americans. We show for height how polygenic risk scores based on summary statistics from a European-based genome-wide association study perform poorly in Mexican Americans. Our findings reveal temporal changes in population structure within Hispanics/Latinos that may influence biomedical traits, demonstrating a need to improve our understanding of admixed populations.
Affiliation(s)
- Melissa L Spear
  - Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, United States
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
- Alex Diaz-Papkovich
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Quantitative Life Sciences Program, McGill University, Montreal, Canada
- Elad Ziv
  - Division of General Internal Medicine, University of California, San Francisco, San Francisco, United States
  - Department of Medicine, University of California, San Francisco, San Francisco, United States
  - Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, United States
- Joseph M Yracheta
  - Native BioData Consortium, Eagle Butte, United States
  - Bloomberg School of Public Health, Johns Hopkins University, Baltimore, United States
- Simon Gravel
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
- Dara G Torgerson
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, United States
- Ryan D Hernandez
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - McGill Genome Centre, McGill University, Montreal, Canada
  - Department of Human Genetics, McGill University, Montreal, Canada
  - Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, United States
  - Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, United States
10
Dapas M, Lin FTJ, Nadkarni GN, Sisk R, Legro RS, Urbanek M, Hayes MG, Dunaif A. Distinct subtypes of polycystic ovary syndrome with novel genetic associations: An unsupervised, phenotypic clustering analysis. PLoS Med 2020; 17:e1003132. [PMID: 32574161 PMCID: PMC7310679 DOI: 10.1371/journal.pmed.1003132] [Citation(s) in RCA: 107] [Impact Index Per Article: 26.8] [Received: 08/22/2019] [Accepted: 05/13/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Polycystic ovary syndrome (PCOS) is a common, complex genetic disorder affecting up to 15% of reproductive-age women worldwide, depending on the diagnostic criteria applied. These diagnostic criteria are based on expert opinion and have been the subject of considerable controversy. The phenotypic variation observed in PCOS is suggestive of an underlying genetic heterogeneity, but a recent meta-analysis of European ancestry PCOS cases found that the genetic architecture of PCOS defined by different diagnostic criteria was generally similar, suggesting that the criteria do not identify biologically distinct disease subtypes. We performed this study to test the hypothesis that there are biologically relevant subtypes of PCOS. METHODS AND FINDINGS Using biochemical and genotype data from a previously published PCOS genome-wide association study (GWAS), we investigated whether there were reproducible phenotypic subtypes of PCOS with subtype-specific genetic associations. Unsupervised hierarchical cluster analysis was performed on quantitative anthropometric, reproductive, and metabolic traits in a genotyped cohort of 893 PCOS cases (median and interquartile range [IQR]: age = 28 [25-32], body mass index [BMI] = 35.4 [28.2-41.5]). The clusters were replicated in an independent, ungenotyped cohort of 263 PCOS cases (median and IQR: age = 28 [24-33], BMI = 35.7 [28.4-42.3]). The clustering revealed 2 distinct PCOS subtypes: a "reproductive" group (21%-23%), characterized by higher luteinizing hormone (LH) and sex hormone binding globulin (SHBG) levels with relatively low BMI and insulin levels, and a "metabolic" group (37%-39%), characterized by higher BMI, glucose, and insulin levels with lower SHBG and LH levels. We performed a GWAS on the genotyped cohort, limiting the cases to either the reproductive or metabolic subtypes. 
We identified alleles in 4 loci that were associated with the reproductive subtype at genome-wide significance (PRDM2/KAZN, P = 2.2 × 10⁻¹⁰; IQCA1, P = 2.8 × 10⁻⁹; BMPR1B/UNC5C, P = 9.7 × 10⁻⁹; CDH10, P = 1.2 × 10⁻⁸) and one locus that was significantly associated with the metabolic subtype (KCNH7/FIGN, P = 1.0 × 10⁻⁸). We developed a predictive model to classify a separate, family-based cohort of 73 women with PCOS (median and IQR: age = 28 [25-33], BMI = 34.3 [27.8-42.3]) and found that the subtypes tended to cluster in families and that carriers of previously reported rare variants in DENND1A, a gene that regulates androgen biosynthesis, were significantly more likely to have the reproductive subtype of PCOS. Limitations of our study were that only PCOS cases of European ancestry diagnosed by National Institutes of Health (NIH) criteria were included, the sample sizes for the subtype GWAS were small, and the GWAS findings were not replicated. CONCLUSIONS In conclusion, we have found reproducible reproductive and metabolic subtypes of PCOS. Furthermore, these subtypes were associated with novel, to our knowledge, susceptibility loci. Our results suggest that these subtypes are biologically relevant because they appear to have distinct genetic architecture. This study demonstrates how phenotypic subtyping can be used to gain additional insights from GWAS data.
Affiliation(s)
- Matthew Dapas
  - Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Frederick T. J. Lin
  - Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Girish N. Nadkarni
  - Division of Nephrology, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Ryan Sisk
  - Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Richard S. Legro
  - Department of Obstetrics and Gynecology, Penn State College of Medicine, Hershey, Pennsylvania, United States of America
- Margrit Urbanek
  - Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Center for Reproductive Science, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- M. Geoffrey Hayes
  - Division of Endocrinology, Metabolism, and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
  - Department of Anthropology, Northwestern University, Evanston, Illinois, United States of America
- Andrea Dunaif
  - Division of Endocrinology, Diabetes and Bone Disease, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
11
Uricchio LH. Evolutionary perspectives on polygenic selection, missing heritability, and GWAS. Hum Genet 2020; 139:5-21. [PMID: 31201529 PMCID: PMC8059781 DOI: 10.1007/s00439-019-02040-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/29/2018] [Accepted: 06/06/2019] [Indexed: 12/26/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many trait-associated variants, but there is still much we do not know about the genetic basis of complex traits. Here, we review recent theoretical and empirical literature regarding selection on complex traits to argue that "missing heritability" is as much an evolutionary problem as it is a statistical problem. We discuss empirical findings that suggest a role for selection in shaping the effect sizes and allele frequencies of causal variation underlying complex traits, and the limitations of these studies. We then use simulations of selection, realistic genome structure, and complex human demography to illustrate the results of recent theoretical work on polygenic selection, and show that statistical inference of causal loci is sharply affected by evolutionary processes. In particular, when selection acts on causal alleles, it hampers the ability to detect causal loci and constrains the transferability of GWAS results across populations. Last, we discuss the implications of these findings for future association studies, and suggest that future statistical methods to infer causal loci for genetic traits will benefit from explicit modeling of the joint distribution of effect sizes and allele frequencies under plausible evolutionary models.
Affiliation(s)
- Lawrence H Uricchio
  - Department of Biology, Stanford University, Stanford, CA, USA
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
12
Lindeboom RGH, Vermeulen M, Lehner B, Supek F. The impact of nonsense-mediated mRNA decay on genetic disease, gene editing and cancer immunotherapy. Nat Genet 2019; 51:1645-1651. [PMID: 31659324 PMCID: PMC6858879 DOI: 10.1038/s41588-019-0517-5] [Citation(s) in RCA: 138] [Impact Index Per Article: 27.6] [Received: 03/26/2019] [Accepted: 09/23/2019] [Indexed: 12/21/2022]
Abstract
Premature termination codons (PTCs) can result in the production of truncated proteins or the degradation of messenger RNAs by nonsense-mediated mRNA decay (NMD). Which of these outcomes occurs can alter the effect of a mutation, with the engagement of NMD being dependent on a series of rules. Here, by applying these rules genome-wide to obtain a resource called NMDetective, we explore the impact of NMD on genetic disease and approaches to therapy. First, human genetic diseases differ in whether NMD typically aggravates or alleviates the effects of PTCs. Second, failure to trigger NMD is a cause of ineffective gene inactivation by CRISPR-Cas9 gene editing. Finally, NMD is a determinant of the efficacy of cancer immunotherapy, with only frameshifted transcripts that escape NMD predicting a response. These results demonstrate the importance of incorporating the rules of NMD into clinical decision-making. Moreover, they suggest that inhibiting NMD may be effective in enhancing cancer immunotherapy.
Affiliation(s)
- Rik G H Lindeboom
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
- Michiel Vermeulen
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
- Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain.
- Fran Supek
- Institució Catalana de Recerca i Estudis Avançats, Barcelona, Spain; Institut de Recerca Biomedica Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain.
|
13
|
Fragoza R, Das J, Wierbowski SD, Liang J, Tran TN, Liang S, Beltran JF, Rivera-Erick CA, Ye K, Wang TY, Yao L, Mort M, Stenson PD, Cooper DN, Wei X, Keinan A, Schimenti JC, Clark AG, Yu H. Extensive disruption of protein interactions by genetic variants across the allele frequency spectrum in human populations. Nat Commun 2019; 10:4141. [PMID: 31515488 PMCID: PMC6742646 DOI: 10.1038/s41467-019-11959-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 01/09/2019] [Accepted: 08/06/2019] [Indexed: 12/19/2022] Open
Abstract
Each human genome carries tens of thousands of coding variants. The extent to which this variation is functional and the mechanisms by which it exerts its influence remain largely unexplored. To address this gap, we leverage the ExAC database of 60,706 human exomes to investigate experimentally the impact of 2009 missense single nucleotide variants (SNVs) across 2185 protein-protein interactions, generating interaction profiles for 4797 SNV-interaction pairs, of which 421 SNVs segregate at > 1% allele frequency in human populations. We find that interaction-disruptive SNVs are prevalent at both rare and common allele frequencies. Furthermore, these results suggest that 10.5% of missense variants carried per individual are disruptive, a higher proportion than previously reported; this indicates that each individual's genetic makeup may be significantly more complex than expected. Finally, we demonstrate that candidate disease-associated mutations can be identified through shared interaction perturbations between variants of interest and known disease mutations.
Affiliation(s)
- Robert Fragoza
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Jishnu Das
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, 02139, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Shayne D Wierbowski
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Jin Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Tina N Tran
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Siqi Liang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Juan F Beltran
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Christen A Rivera-Erick
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Kaixiong Ye
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Ting-Yi Wang
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Li Yao
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA
- Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- Peter D Stenson
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
- Xiaomu Wei
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Alon Keinan
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- John C Schimenti
- Department of Biomedical Science, Cornell University, Ithaca, NY, 14853, USA
- Andrew G Clark
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14853, USA
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA.
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, 14853, USA.
|
14
|
Lin YL, Gokcumen O. Fine-Scale Characterization of Genomic Structural Variation in the Human Genome Reveals Adaptive and Biomedically Relevant Hotspots. Genome Biol Evol 2019; 11:1136-1151. [PMID: 30887040 PMCID: PMC6475128 DOI: 10.1093/gbe/evz058] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Accepted: 03/16/2019] [Indexed: 12/25/2022] Open
Abstract
Genomic structural variants (SVs) are distributed nonrandomly across the human genome. The "hotspots" of SVs have been implicated in evolutionary innovations, as well as medical conditions. However, the evolutionary and biomedical features of these hotspots remain incompletely understood. Here, we analyzed data from 2,504 genomes to construct a refined map of 1,148 SV hotspots in human genomes. We confirmed that segmental duplication-related nonallelic homologous recombination is an important mechanistic driver of SV hotspot formation. However, to our surprise, we also found that a majority of SVs in hotspots do not form through such recombination-based mechanisms, suggesting diverse mechanistic and selective forces shaping hotspots. Indeed, our evolutionary analyses showed that the majority of SV hotspots are within gene-poor regions and evolve under relaxed negative selection or neutrality. However, we still found a small subset of SV hotspots harboring genes that are enriched for anthropologically crucial functions and evolve under geography-specific and balancing adaptive forces. These include two independent hotspots on different chromosomes affecting alpha and beta hemoglobin gene clusters. Biomedically, we found that the SV hotspots coincide with breakpoints of clinically relevant, large de novo SVs, significantly more often than genome-wide expectations. For example, we showed that the breakpoints of multiple large SVs, which lead to idiopathic short stature, coincide with SV hotspots. Therefore, the mutational instability in SV hotspots likely enables chromosomal breaks that lead to pathogenic structural variation formations. Overall, our study contributes to a better understanding of the mutational and adaptive landscape of the genome.
Affiliation(s)
- Yen-Lung Lin
- Department of Biological Sciences, University at Buffalo
- Omer Gokcumen
- Department of Biological Sciences, University at Buffalo
|
15
|
The demographic and adaptive history of central African hunter-gatherers and farmers. Curr Opin Genet Dev 2018; 53:90-97. [PMID: 30103089 DOI: 10.1016/j.gde.2018.07.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 05/03/2018] [Accepted: 07/18/2018] [Indexed: 01/06/2023]
Abstract
Central Africa, a forested region that supports an exceptionally high biodiversity, hosts the world's largest group of hunter-gatherers, who live in close proximity with groups that have adopted agriculture over the past 5000 years. Our understanding of the prehistory of these populations has been dramatically hampered by the almost total absence of fossil remains in this region, a limitation that has recently been circumvented by population genomics approaches. Different studies have estimated that ancestors of rainforest hunter-gatherers and Bantu-speaking farmers separated more than 60 000 years ago, supporting the occurrence of ancient population structure in Africa since the Late Pleistocene. Conversely, the Holocene in central Africa was characterized by large-scale population migrations associated with the emergence of agriculture, and increased genetic interactions between autochthonous rainforest hunter-gatherers and expanding Bantu-speaking farmers. Genomic scans have detected numerous candidate loci for positive selection in these populations, including convergent adaptation for short stature in groups of rainforest hunter-gatherers and local adaptation to endemic malaria in western and central Africans. Furthermore, there is recent increasing evidence that adaptive variation has been acquired by various African populations through admixture, suggesting a previously unappreciated role of intraspecies gene flow in local adaptation. Ancient and modern DNA studies will greatly broaden, and probably challenge, our view on the past history of central Africa, where introgression from yet uncharacterized archaic hominins and long-term adaptation to distinct ecological niches are suspected.
|
16
|
Ioannidis NM, Davis JR, DeGorter MK, Larson NB, McDonnell SK, French AJ, Battle AJ, Hastie TJ, Thibodeau SN, Montgomery SB, Bustamante CD, Sieh W, Whittemore AS. FIRE: functional inference of genetic variants that regulate gene expression. Bioinformatics 2018; 33:3895-3901. [PMID: 28961785 DOI: 10.1093/bioinformatics/btx534] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/14/2017] [Accepted: 08/23/2017] [Indexed: 12/18/2022] Open
Abstract
Motivation: Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole-genome sequencing studies. Results: We developed FIRE (Functional Inference of Regulators of Expression), a tool to score both noncoding and coding SNVs based on their potential to regulate the expression levels of nearby genes. FIRE consists of 23 random forests trained to recognize SNVs in cis-expression quantitative trait loci (cis-eQTLs) using a set of 92 genomic annotations as predictive features. FIRE scores discriminate cis-eQTL SNVs from non-eQTL SNVs in the training set with a cross-validated area under the receiver operating characteristic curve (AUC) of 0.807, and discriminate cis-eQTL SNVs shared across six populations of different ancestry from non-eQTL SNVs with an AUC of 0.939. FIRE scores are also predictive of cis-eQTL SNVs across a variety of tissue types. Availability and implementation: FIRE scores for genome-wide SNVs in hg19/GRCh37 are available for download at https://sites.google.com/site/fireregulatoryvariation/. Contact: nilah@stanford.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marianne K DeGorter
- Department of Genetics
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Amy J French
- Department of Laboratory Medicine & Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Alexis J Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Trevor J Hastie
- Department of Statistics, Stanford University, Stanford, CA 94305, USA
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stephen N Thibodeau
- Department of Laboratory Medicine & Pathology, Mayo Clinic, Rochester, MN 55905, USA
- Stephen B Montgomery
- Department of Genetics
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Carlos D Bustamante
- Department of Genetics
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Weiva Sieh
- Department of Health Research & Policy
- Department of Population Health Science & Policy
- Department of Genetics & Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Alice S Whittemore
- Department of Health Research & Policy
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
|
17
|
Alachiotis N, Pavlidis P. RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Commun Biol 2018; 1:79. [PMID: 30271960 PMCID: PMC6123745 DOI: 10.1038/s42003-018-0085-8] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 09/29/2017] [Accepted: 06/05/2018] [Indexed: 12/16/2022] Open
Abstract
Selective sweeps leave distinct signatures locally in genomes, enabling the detection of loci that have undergone recent positive selection. Multiple signatures of a selective sweep are known, yet each neutrality test only identifies a single signature. We present RAiSD (Raised Accuracy in Sweep Detection), an open-source software that implements a novel, to our knowledge, and parameter-free detection mechanism that relies on multiple signatures of a selective sweep via the enumeration of SNP vectors. RAiSD achieves higher sensitivity and accuracy than the current state of the art, while the computational complexity is greatly reduced, allowing up to 1000 times faster processing than widely used tools, and negligible memory requirements. Nikolaos Alachiotis and Pavlos Pavlidis present RAiSD, a computational method for identifying multiple signatures of selective sweeps using single nucleotide polymorphism vectors. They show that RAiSD has higher sensitivity and accuracy with reduced computational complexity than current methods.
Affiliation(s)
- Nikolaos Alachiotis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, 70013, Heraklion, Crete, Greece.
- Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, 70013, Heraklion, Crete, Greece.
|
18
|
Torres R, Szpiech ZA, Hernandez RD. Human demographic history has amplified the effects of background selection across the genome. PLoS Genet 2018; 14:e1007387. [PMID: 29912945 PMCID: PMC6056204 DOI: 10.1371/journal.pgen.1007387] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 08/29/2017] [Revised: 07/23/2018] [Accepted: 04/30/2018] [Indexed: 01/22/2023] Open
Abstract
Natural populations often grow, shrink, and migrate over time. Such demographic processes can affect genome-wide levels of genetic diversity. Additionally, genetic variation in functional regions of the genome can be altered by natural selection, which drives adaptive mutations to higher frequencies or purges deleterious ones. Such selective processes affect not only the sites directly under selection but also nearby neutral variation through genetic linkage via processes referred to as genetic hitchhiking in the context of positive selection and background selection (BGS) in the context of purifying selection. While there is extensive literature examining the consequences of selection at linked sites at demographic equilibrium, less is known about how non-equilibrium demographic processes influence the effects of hitchhiking and BGS. Utilizing a global sample of human whole-genome sequences from the Thousand Genomes Project and extensive simulations, we investigate how non-equilibrium demographic processes magnify and dampen the consequences of selection at linked sites across the human genome. When binning the genome by inferred strength of BGS, we observe that, compared to Africans, non-African populations have experienced larger proportional decreases in neutral genetic diversity in strong BGS regions. We replicate these findings in admixed populations by showing that non-African ancestral components of the genome have also been affected more severely in these regions. We attribute these differences to the strong, sustained/recurrent population bottlenecks that non-Africans experienced as they migrated out of Africa and throughout the globe. Furthermore, we observe a strong correlation between FST and the inferred strength of BGS, suggesting a stronger rate of genetic drift. Forward simulations of human demographic history with a model of BGS support these observations. 
Our results show that non-equilibrium demography significantly alters the consequences of selection at linked sites and support the need for more work investigating the dynamic process of multiple evolutionary forces operating in concert.
Affiliation(s)
- Raul Torres
- Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA, United States of America
- Zachary A. Szpiech
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
- Ryan D. Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, United States of America
- Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA, United States of America
- Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA, United States of America
|
19
|
Vuckovic D, Mezzavilla M, Cocca M, Morgan A, Brumat M, Catamo E, Concas MP, Biino G, Franzè A, Ambrosetti U, Pirastu M, Gasparini P, Girotto G. Whole-genome sequencing reveals new insights into age-related hearing loss: cumulative effects, pleiotropy and the role of selection. Eur J Hum Genet 2018; 26:1167-1179. [PMID: 29725052 PMCID: PMC6057993 DOI: 10.1038/s41431-018-0126-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/12/2017] [Revised: 02/05/2018] [Accepted: 02/13/2018] [Indexed: 01/17/2023] Open
Abstract
Age-related hearing loss (ARHL) is the most common sensory disorder in the elderly. Although not directly life threatening, it contributes to loss of autonomy and is associated with anxiety, depression and cognitive decline. To search for genetic risk factors underlying ARHL, a large whole-genome sequencing (WGS) approach has been carried out in a cohort of 212 cases and controls, both older than 50 years, to select genes characterized by a burden of variants specific to cases or controls. Accordingly, the total variation load per gene was compared and two groups were detected: 375 genes more variable in cases and 371 more variable in controls. In both cases, Gene Ontology analysis showed that the largest enrichment for biological processes (fold > 5, p-value = 0.042) was the “sensory perception of sound”, suggesting cumulative genetic effects were involved. Replication confirmed 141 genes, while additional analysis based on natural selection led to a prioritization of 21 genes. The majority of them (20 out of 21) showed positive expression in mouse cochlea cDNA and were associated with two functional pathways. Among them, two genes were previously associated with hearing (CSMD1 and PTPRD) and re-sequenced in a large Italian cohort of ARHL patients (N = 389). Results led to the identification of six coding variants not detected in cases so far, suggesting a possible protective role, which requires investigation. In conclusion, we show that this multistep strategy (WGS, selection, expression, pathway analysis and targeted re-sequencing) can provide major insights into the molecular characterization of complex diseases such as ARHL.
Affiliation(s)
- Dragana Vuckovic
- Medical Sciences, Chirurgical and Health Department, University of Trieste, Trieste, Italy; Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy.
- Massimo Mezzavilla
- Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Massimiliano Cocca
- Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Anna Morgan
- Medical Sciences, Chirurgical and Health Department, University of Trieste, Trieste, Italy; Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Marco Brumat
- Medical Sciences, Chirurgical and Health Department, University of Trieste, Trieste, Italy
- Eulalia Catamo
- Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Maria Pina Concas
- Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Ginevra Biino
- Institute of Molecular Genetics, National Research Council of Italy, Pavia, Italy
- Annamaria Franzè
- Ceinge Advanced Biotechnology, Naples, Italy; Neuroscience, Reproductive and Odontology Sciences Department, University of Naples "Federico II", Naples, Italy
- Umberto Ambrosetti
- UO Audiology, Fondazione IRCCS Ca Granda, Ospedale Maggiore Policlinico, Mangiagalli e Regina Elena, Milan, Italy; Audiology Unit, Department of Clinical Sciences and Community Health, University of Milan, Milan, Italy
- Mario Pirastu
- Institute of Population Genetics, National Research Council of Italy, Sassari, Italy
- Paolo Gasparini
- Medical Sciences, Chirurgical and Health Department, University of Trieste, Trieste, Italy; Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
- Giorgia Girotto
- Medical Sciences, Chirurgical and Health Department, University of Trieste, Trieste, Italy; Medical Genetics, Institute for Maternal and Child Health - IRCCS "Burlo Garofolo", Trieste, Italy
|
20
|
|
21
|
Pal LR, Kundu K, Yin Y, Moult J. CAGI4 Crohn's exome challenge: Marker SNP versus exome variant models for assigning risk of Crohn disease. Hum Mutat 2017; 38:1225-1234. [PMID: 28512778 PMCID: PMC5576730 DOI: 10.1002/humu.23256] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/22/2016] [Revised: 05/09/2017] [Accepted: 05/10/2017] [Indexed: 12/18/2022]
Abstract
Understanding the basis of complex trait disease is a fundamental problem in human genetics. The CAGI Crohn's Exome challenges are providing insight into the adequacy of current disease models by requiring participants to identify which of a set of individuals has been diagnosed with the disease, given exome data. For the CAGI4 round, we developed a method that used the genotypes from exome sequencing data only to impute the status of genome wide association studies marker SNPs. We then used the imputed genotypes as input to several machine learning methods that had been trained to predict disease status from marker SNP information. We achieved the best performance using Naïve Bayes and with a consensus machine learning method, obtaining an area under the curve of 0.72, larger than other methods used in CAGI4. We also developed a model that incorporated the contribution from rare missense variants in the exome data, but this performed less well. Future progress is expected to come from the use of whole genome data rather than exomes.
Affiliation(s)
- Lipika R. Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850
- Kunal Kundu
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, MD 20742, USA
- Yizhou Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, MD 20742, USA
- John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742
|
22
|
Josephs EB, Stinchcombe JR, Wright SI. What can genome-wide association studies tell us about the evolutionary forces maintaining genetic variation for quantitative traits? THE NEW PHYTOLOGIST 2017; 214:21-33. [PMID: 28211582 DOI: 10.1111/nph.14410] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 09/30/2016] [Accepted: 11/14/2016] [Indexed: 05/27/2023]
Abstract
Understanding the evolutionary forces that shape genetic variation within species has long been a goal of evolutionary biology. Integrating data for the genetic architecture of traits from genome-wide association mapping studies (GWAS) along with the development of new population genetic methods for identifying selection in sequence data may allow us to evaluate the roles of mutation-selection balance and balancing selection in shaping genetic variation at various scales. Here, we review the theoretical predictions for genetic architecture and additional signals of selection on genomic sequence for the loci that affect traits. Next, we review how plant GWAS have tested for the signatures of various selective scenarios. Limited evidence to date suggests that within-population variation is maintained primarily by mutation-selection balance while variation across the landscape is the result of local adaptation. However, there are a number of inherent biases in these interpretations. We highlight these challenges and suggest ways forward to further understanding of the maintenance of variation.
Affiliation(s)
- Emily B Josephs
- Department of Evolution and Ecology, University of California, Davis, One Shields Avenue, Davis, CA, 95616, USA
- John R Stinchcombe
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St., Toronto, ON, M5S 3B2, Canada
- Stephen I Wright
- Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St., Toronto, ON, M5S 3B2, Canada
|
23
|
A Model of Compound Heterozygous, Loss-of-Function Alleles Is Broadly Consistent with Observations from Complex-Disease GWAS Datasets. PLoS Genet 2017; 13:e1006573. [PMID: 28103232 PMCID: PMC5289629 DOI: 10.1371/journal.pgen.1006573] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/18/2016] [Revised: 02/02/2017] [Accepted: 01/05/2017] [Indexed: 12/17/2022] Open
Abstract
The genetic component of complex disease risk in humans remains largely unexplained. A corollary is that the allelic spectrum of genetic variants contributing to complex disease risk is unknown. Theoretical models that relate population genetic processes to the maintenance of genetic variation for quantitative traits may suggest profitable avenues for future experimental design. Here we use forward simulation to model a genomic region evolving under a balance between recurrent deleterious mutation and Gaussian stabilizing selection. We consider multiple genetic and demographic models, and several different methods for identifying genomic regions harboring variants associated with complex disease risk. We demonstrate that the model of gene action, relating genotype to phenotype, has a qualitative effect on several relevant aspects of the population genetic architecture of a complex trait. In particular, the genetic model impacts genetic variance component partitioning across the allele frequency spectrum and the power of statistical tests. Models with partial recessivity closely match the minor allele frequency distribution of significant hits from empirical genome-wide association studies without requiring homozygous effect sizes to be small. We highlight a particular gene-based model of incomplete recessivity that is appealing from first principles. Under that model, deleterious mutations in a genomic region partially fail to complement one another. This model of gene-based recessivity predicts the empirically observed inconsistency between twin and SNP-based estimates of dominance heritability. Furthermore, this model predicts considerable levels of unexplained variance associated with intralocus epistasis. Our results suggest a need for improved statistical tools for region-based genetic association and heritability estimation.
Gene action determines how mutations affect phenotype. When placed in an evolutionary context, the details of the genotype-to-phenotype model can impact the maintenance of genetic variation for complex traits. Likewise, non-equilibrium demographic history may affect patterns of genetic variation. Here, we explore the impact of genetic model and population growth on distribution of genetic variance across the allele frequency spectrum underlying risk for a complex disease. Using forward-in-time population genetic simulations, we show that the genetic model has important impacts on the composition of variation for complex disease risk in a population. We explicitly simulate genome-wide association studies (GWAS) and perform heritability estimation on population samples. A particular model of gene-based partial recessivity, based on allelic non-complementation, aligns well with empirical results. This model is congruent with the dominance variance estimates from both SNPs and twins, and the minor allele frequency distribution of GWAS hits.
|
24
|
Hochberg ME, Noble RJ. A framework for how environment contributes to cancer risk. Ecol Lett 2017; 20:117-134. [DOI: 10.1111/ele.12726] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 08/27/2016] [Revised: 10/03/2016] [Accepted: 12/01/2016] [Indexed: 12/22/2022]
Affiliation(s)
- Michael E. Hochberg
- Institut des Sciences de l'Evolution de Montpellier; Université de Montpellier; Place E. Bataillon, CC065 34095 Montpellier Cedex 5 France
- Santa Fe Institute; 1399 Hyde Park Rd. Santa Fe NM 87501 USA
| | - Robert J. Noble
- Institut des Sciences de l'Evolution de Montpellier; Université de Montpellier; Place E. Bataillon, CC065 34095 Montpellier Cedex 5 France
| |
|
25
|
Field Y, Boyle EA, Telis N, Gao Z, Gaulton KJ, Golan D, Yengo L, Rocheleau G, Froguel P, McCarthy MI, Pritchard JK. Detection of human adaptation during the past 2000 years. Science 2016; 354:760-764. [PMID: 27738015 PMCID: PMC5182071 DOI: 10.1126/science.aag0776] [Citation(s) in RCA: 234] [Impact Index Per Article: 29.3] [Received: 05/06/2016] [Accepted: 10/03/2016] [Indexed: 12/22/2022]
Abstract
Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
Affiliation(s)
- Yair Field
- Department of Genetics, Stanford University, Stanford, CA 94305, USA.
- Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
| | - Evan A Boyle
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Natalie Telis
- Program in Biomedical Informatics, Stanford University, Stanford, CA 94305, USA
| | - Ziyue Gao
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
| | - Kyle J Gaulton
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Wellcome Trust Center for Human Genetics, and Oxford Center for Diabetes Endocrinology and Metabolism, University of Oxford, Oxford, UK
| | - David Golan
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Loic Yengo
- Univ. Lille, CNRS, Institut Pasteur de Lille, UMR 8199-EGID, F-59000 Lille, France
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia
| | - Ghislain Rocheleau
- Univ. Lille, CNRS, Institut Pasteur de Lille, UMR 8199-EGID, F-59000 Lille, France
| | - Philippe Froguel
- Univ. Lille, CNRS, Institut Pasteur de Lille, UMR 8199-EGID, F-59000 Lille, France
- Imperial College, Department of Genomics of Common Disease, London Hammersmith Hospital, London, UK
| | - Mark I McCarthy
- Wellcome Trust Center for Human Genetics, and Oxford Center for Diabetes Endocrinology and Metabolism, University of Oxford, Oxford, UK
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA 94305, USA.
- Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305, USA
- Department of Biology, Stanford University, Stanford, CA, USA
| |
|
26
|
Abstract
The wealth of available genetic information is allowing the reconstruction of human demographic and adaptive history. Demography and purifying selection affect the purge of rare, deleterious mutations from the human population, whereas positive and balancing selection can increase the frequency of advantageous variants, improving survival and reproduction in specific environmental conditions. In this review, I discuss how theoretical and empirical population genetics studies, using both modern and ancient DNA data, are a powerful tool for obtaining new insight into the genetic basis of severe disorders and complex disease phenotypes, rare and common, focusing particularly on infectious disease risk.
Affiliation(s)
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Department of Genomes & Genetics, Institut Pasteur, Paris, 75015, France.
- Centre National de la Recherche Scientifique, URA3012, Paris, 75015, France.
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, 75015, France.
| |
|
27
|
Uricchio LH, Zaitlen NA, Ye CJ, Witte JS, Hernandez RD. Selection and explosive growth alter genetic architecture and hamper the detection of causal rare variants. Genome Res 2016; 26:863-73. [PMID: 27197206 PMCID: PMC4937562 DOI: 10.1101/gr.202440.115] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 11/25/2015] [Accepted: 05/16/2016] [Indexed: 12/20/2022]
Abstract
The role of rare alleles in complex phenotypes has been hotly debated, but most rare variant association tests (RVATs) do not account for the evolutionary forces that affect genetic architecture. Here, we use simulation and numerical algorithms to show that explosive population growth, as experienced by human populations, can dramatically increase the impact of very rare alleles on trait variance. We then assess the ability of RVATs to detect causal loci using simulations and human RNA-seq data. Surprisingly, we find that statistical performance is worst for phenotypes in which genetic variance is due mainly to rare alleles, and explosive population growth decreases power. Although many studies have attempted to identify causal rare variants, few have reported novel associations. This has sometimes been interpreted to mean that rare variants make negligible contributions to complex trait heritability. Our work shows that RVATs are not robust to realistic human evolutionary forces, so general conclusions about the impact of rare variants on complex traits may be premature.
Affiliation(s)
- Lawrence H Uricchio
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94143, USA; Graduate Program in Bioinformatics, University of California, San Francisco, San Francisco, California 94143, USA
| | - Noah A Zaitlen
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94143, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, California 94143, USA; Institute for Quantitative Biosciences (QB3), University of California, San Francisco, San Francisco, California 94143, USA
| | - Chun Jimmie Ye
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94143, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, California 94143, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California 94143, USA
| | - John S Witte
- Institute for Human Genetics, University of California, San Francisco, San Francisco, California 94143, USA; Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California 94143, USA
| | - Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California 94143, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, California 94143, USA; Institute for Quantitative Biosciences (QB3), University of California, San Francisco, San Francisco, California 94143, USA
| |
|
28
|
Kunkle BW, Jaworski J, Barral S, Vardarajan B, Beecham GW, Martin ER, Cantwell LS, Partch A, Bird TD, Raskind WH, DeStefano AL, Carney RM, Cuccaro M, Vance JM, Farrer LA, Goate AM, Foroud T, Mayeux RP, Schellenberg GD, Haines JL, Pericak-Vance MA. Genome-wide linkage analyses of non-Hispanic white families identify novel loci for familial late-onset Alzheimer's disease. Alzheimers Dement 2016; 12:2-10. [PMID: 26365416 PMCID: PMC4717829 DOI: 10.1016/j.jalz.2015.05.020] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 12/15/2014] [Revised: 05/14/2015] [Accepted: 05/29/2015] [Indexed: 12/13/2022]
Abstract
INTRODUCTION Few high-penetrance variants that explain risk in late-onset Alzheimer's disease (LOAD) families have been found. METHODS We performed genome-wide linkage and identity-by-descent (IBD) analyses on 41 non-Hispanic white families exhibiting likely dominant inheritance of LOAD, and having no mutations at known familial Alzheimer's disease (AD) loci, and a low burden of APOE ε4 alleles. RESULTS Two-point parametric linkage analysis identified 14 significantly linked regions, including three novel linkage regions for LOAD (5q32, 11q12.2-11q14.1, and 14q13.3), one of which replicates a genome-wide association LOAD locus, the MS4A6A-MS4A4E gene cluster at 11q12.2. Five of the 14 regions (3q25.31, 4q34.1, 8q22.3, 11q12.2-14.1, and 19q13.41) are supported by strong multipoint results (logarithm of odds [LOD*] ≥1.5). Nonparametric multipoint analyses produced an additional significant locus at 14q32.2 (LOD* = 4.18). The 1-LOD confidence interval for this region contains one gene, C14orf177, and the microRNA Mir_320, whereas IBD analyses implicate an additional gene, BCL11B, a regulator of brain-derived neurotrophic signaling, a pathway associated with pathogenesis of several neurodegenerative diseases. DISCUSSION Examination of these regions after whole-genome sequencing may identify highly penetrant variants for familial LOAD.
Affiliation(s)
- Brian W Kunkle
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - James Jaworski
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Sandra Barral
- The Taub Institute of Research on Alzheimer's Disease, College of Physicians and Surgeons, Columbia University, New York, NY, USA; The Gertrude H. Sergievsky Center, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY, USA
| | - Badri Vardarajan
- The Taub Institute of Research on Alzheimer's Disease, College of Physicians and Surgeons, Columbia University, New York, NY, USA; The Gertrude H. Sergievsky Center, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA; Hope Center for Neurological Disorders, Department of Neurology, Washington University School of Medicine, St. Louis, MO, USA
| | - Gary W Beecham
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Eden R Martin
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Laura S Cantwell
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Amanda Partch
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Thomas D Bird
- Department of Neurology, University of Washington, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
| | - Wendy H Raskind
- Department of Medicine, University of Washington, Seattle, WA, USA; Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA
| | - Anita L DeStefano
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Regina M Carney
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA; Department of Psychiatry and Behavioral Sciences, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Michael Cuccaro
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA; Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Jeffrey M Vance
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA; Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL, USA
| | - Lindsay A Farrer
- Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA; Department of Medicine (Biomedical Genetics), Boston University School of Medicine and Public Health, MA, USA; Department of Neurology, Boston University School of Medicine and Public Health, MA, USA; Department of Ophthalmology, Boston University School of Medicine and Public Health, MA, USA; Department of Epidemiology, Boston University School of Public Health, MA, USA
| | - Alison M Goate
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA; Hope Center for Neurological Disorders, Department of Neurology, Washington University School of Medicine, St. Louis, MO, USA
| | - Tatiana Foroud
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
| | - Richard P Mayeux
- The Taub Institute of Research on Alzheimer's Disease, College of Physicians and Surgeons, Columbia University, New York, NY, USA; The Gertrude H. Sergievsky Center, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY, USA; Department of Psychiatry, College of Physicians and Surgeons, Columbia University, New York, NY, USA; The Department of Epidemiology, School of Public Health, Columbia University, New York, NY, USA
| | - Gerard D Schellenberg
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Jonathan L Haines
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
| | - Margaret A Pericak-Vance
- John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA; Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, FL, USA.
| |
|
29
|
Puig M, Castellano D, Pantano L, Giner-Delgado C, Izquierdo D, Gayà-Vidal M, Lucas-Lledó JI, Esko T, Terao C, Matsuda F, Cáceres M. Functional Impact and Evolution of a Novel Human Polymorphic Inversion That Disrupts a Gene and Creates a Fusion Transcript. PLoS Genet 2015; 11:e1005495. [PMID: 26427027 PMCID: PMC4591017 DOI: 10.1371/journal.pgen.1005495] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 12/12/2014] [Accepted: 08/12/2015] [Indexed: 11/18/2022] Open
Abstract
Despite many years of study into inversions, very little is known about their functional consequences, especially in humans. A common hypothesis is that the selective value of inversions stems in part from their effects on nearby genes, although evidence of this in natural populations is almost nonexistent. Here we present a global analysis of a new 415-kb polymorphic inversion that is among the longest ones found in humans and is the first with clear position effects. This inversion is located on chromosome 19 and has been generated by non-homologous end joining between blocks of transposable elements with low identity. PCR genotyping in 541 individuals from eight different human populations allowed the detection of tag SNPs and inversion genotyping in multiple populations worldwide, showing that the inverted allele is mainly found in East Asia with an average frequency of 4.7%. Interestingly, one of the breakpoints disrupts the transcription factor gene ZNF257, causing a significant reduction in the total expression level of this gene in lymphoblastoid cell lines. RNA-Seq analysis of the effects of this expression change in standard homozygotes and inversion heterozygotes revealed distinct expression patterns that were validated by quantitative RT-PCR. Moreover, we have found a new fusion transcript that is generated exclusively from inverted chromosomes around one of the breakpoints. Finally, by the analysis of the associated nucleotide variation, we have estimated that the inversion was generated ~40,000–50,000 years ago and, while a neutral evolution cannot be ruled out, its current frequencies are more consistent with those expected for a deleterious variant, although no significant association with phenotypic traits has been found so far.
Since the discovery of chromosomal inversions almost 100 years ago, how they are maintained in natural populations has been a highly debated issue. One of the hypotheses is that inversion breakpoints could affect genes and modify gene expression levels, although evidence of this came only from laboratory mutants. In humans, a few inversions have been shown to associate with expression differences, but in all cases the molecular causes have remained elusive. Here, we have carried out a complete characterization of a new human polymorphic inversion and determined that it is specific to East Asian populations. In addition, we demonstrate that it disrupts the ZNF257 gene and, through the translocation of the first exon and regulatory sequences, creates a previously nonexistent fusion transcript, which together are associated with expression changes in several other genes. Finally, we investigate the potential evolutionary and phenotypic consequences of the inversion, and suggest that it is probably deleterious. This is therefore the first example of a natural polymorphic inversion that has position effects and creates a new chimeric gene, helping to answer an old question in evolutionary biology.
Affiliation(s)
- Marta Puig
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - David Castellano
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - Lorena Pantano
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - Carla Giner-Delgado
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - David Izquierdo
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - Magdalena Gayà-Vidal
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - José Ignacio Lucas-Lledó
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
| | - Tõnu Esko
- Estonian Biobank, Estonian Genome Center, University of Tartu, Tartu, Estonia
- Boston Children's Hospital, Harvard Medical School, and Broad Institute of Harvard and MIT, Boston, Massachusetts, United States of America
| | - Chikashi Terao
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Mario Cáceres
- Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| |
|
30
|
Pal LR, Yu CH, Mount SM, Moult J. Insights from GWAS: emerging landscape of mechanisms underlying complex trait disease. BMC Genomics 2015; 16 Suppl 8:S4. [PMID: 26110739 PMCID: PMC4480957 DOI: 10.1186/1471-2164-16-s8-s4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND There are now over 2000 loci in the human genome where genome-wide association studies (GWAS) have found one or more SNPs to be associated with altered risk of a complex trait disease. At each of these loci, there must be some molecular level mechanism relevant to the disease. What are these mechanisms and how do they contribute to disease? RESULTS Here we consider the roles of three primary mechanism classes: changes that directly alter protein function (missense SNPs), changes that alter transcript abundance as a consequence of variants close-by in sequence, and changes that affect splicing. Missense SNPs are divided into those predicted to have a high impact on in vivo protein function, and those with a low impact. Splicing is divided into SNPs with a direct impact on splice sites, and those with a predicted effect on auxiliary splicing signals. The analysis was based on associations found for seven complex trait diseases in the classic Wellcome Trust Case Control Consortium (WTCCC1) GWA study and subsequent studies and meta-analyses, collected from the GWAS catalog. Linkage disequilibrium information was used to identify possible candidate SNPs for involvement in disease mechanism in each of the 356 loci associated with these seven diseases. With the parameters used, we find that 76% of loci have at least one of these mechanisms. Overall, except for the low incidence of direct impact on splice sites, the mechanisms are found at similar frequencies, with changes in transcript abundance the most common. But the distribution of mechanisms over diseases varies markedly, as does the fraction of loci with assigned mechanisms. Many of the implicated proteins have previously been suggested as relevant, but the specific mechanism assignments are new. In addition, a number of new disease relevant proteins are proposed. CONCLUSIONS The high fraction of GWAS loci with proposed mechanisms suggests that these classes of mechanism play a major role. Other mechanism types, such as variants affecting expression of genes remote in the DNA sequence, will contribute in other loci. Each of the identified putative mechanisms provides a hypothesis for further investigation.
Affiliation(s)
- Lipika R Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, USA
| | - Chen-Hsin Yu
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, USA
- Molecular and Cellular Biology Program, University of Maryland, College Park, MD, USA
| | - Stephen M Mount
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
- Center for Bioinformatics and Computational Biology, University of Maryland at College Park, College Park, MD, USA
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| |
|
31
|
Abstract
Next-generation sequencing technology has facilitated the discovery of millions of genetic variants in human genomes. A sizeable fraction of these variants are predicted to be deleterious. Here, we review the pattern of deleterious alleles as ascertained in genome sequencing data sets and ask whether human populations differ in their predicted burden of deleterious alleles - a phenomenon known as mutation load. We discuss three demographic models that are predicted to affect mutation load and relate these models to the evidence (or the lack thereof) for variation in the efficacy of purifying selection in diverse human genomes. We also emphasize why accurate estimation of mutation load depends on assumptions regarding the distribution of dominance and selection coefficients - quantities that remain poorly characterized for current genomic data sets.
|
32
|
Uricchio LH, Torres R, Witte JS, Hernandez RD. Population genetic simulations of complex phenotypes with implications for rare variant association tests. Genet Epidemiol 2014; 39:35-44. [PMID: 25417809 DOI: 10.1002/gepi.21866] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 07/17/2014] [Revised: 09/09/2014] [Accepted: 09/26/2014] [Indexed: 12/12/2022]
Abstract
Demographic events and natural selection alter patterns of genetic variation within populations and may play a substantial role in shaping the genetic architecture of complex phenotypes and disease. However, the joint impact of these basic evolutionary forces is often ignored in the assessment of statistical tests of association. Here, we provide a simulation-based framework for generating DNA sequences that incorporates selection and demography with flexible models for simulating phenotypic variation (sfs_coder). This tool also allows the user to perform locus-specific simulations by automatically querying annotated genomic functional elements and genetic maps. We demonstrate the effects of evolutionary forces on patterns of genetic variation by simulating recently inferred models of human selection and demography. We use these simulations to show that the demographic model and locus-specific features, such as the proportion of sites under selection, may have practical implications for estimating the statistical power of sequencing-based rare variant association tests. In particular, for some phenotype models, there may be higher power to detect rare variant associations in African populations compared to non-Africans, but power is considerably reduced in regions of the genome with rampant negative selection. Furthermore, we show that existing methods for simulating large samples based on resampling from a small set of observed haplotypes fail to recapitulate the distribution of rare variants in the presence of rapid population growth (as has been observed in several human populations).
Affiliation(s)
- Lawrence H Uricchio
- Graduate Program in Bioinformatics, University of California, San Francisco, California, United States of America
| |
|
33
|
Chen HS, Hutter CM, Mechanic LE, Amos CI, Bafna V, Hauser ER, Hernandez RD, Li C, Liberles DA, McAllister K, Moore JH, Paltoo DN, Papanicolaou GJ, Peng B, Ritchie MD, Rosenfeld G, Witte JS, Gillanders EM, Feuer EJ. Genetic simulation tools for post-genome wide association studies of complex diseases. Genet Epidemiol 2014; 39:11-19. [PMID: 25371374 DOI: 10.1002/gepi.21870] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 06/15/2014] [Revised: 09/02/2014] [Accepted: 09/26/2014] [Indexed: 01/12/2023]
Abstract
Genetic simulation programs are used to model data under specified assumptions to facilitate the understanding and study of complex genetic systems. Standardized data sets generated using genetic simulation are essential for the development and application of novel analytical tools in genetic epidemiology studies. With continuing advances in high-throughput genomic technologies and generation and analysis of larger, more complex data sets, there is a need for updating current approaches in genetic simulation modeling. To provide a forum to address current and emerging challenges in this area, the National Cancer Institute (NCI) sponsored a workshop, entitled "Genetic Simulation Tools for Post-Genome Wide Association Studies of Complex Diseases" at the National Institutes of Health (NIH) in Bethesda, Maryland on March 11-12, 2014. The goals of the workshop were to (1) identify opportunities, challenges, and resource needs for the development and application of genetic simulation models; (2) improve the integration of tools for modeling and analysis of simulated data; and (3) foster collaborations to facilitate development and applications of genetic simulation. During the course of the meeting, the group identified challenges and opportunities for the science of simulation, software and methods development, and collaboration. This paper summarizes key discussions at the meeting, and highlights important challenges and opportunities to advance the field of genetic simulation.
Affiliation(s)
- Huann-Sheng Chen
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, MD 20892
| | - Carolyn M Hutter
- Division of Genomic Medicine, National Human Genome Research Institute, NIH, Bethesda, MD 20892
| | - Leah E Mechanic
- Epidemiology and Genomics Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, MD 20892
| | - Christopher I Amos
- Division of Community, Family Medicine, Dartmouth College, Lebanon, NH 03755
| | - Vineet Bafna
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093
| | - Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94143
| | - Chun Li
- Department of Biostatistics, Vanderbilt University, Nashville, TN 37235
| | - David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071
| | - Kimberly McAllister
- Susceptibility and Population Health Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC 27709
| | - Jason H Moore
- Department of Genetics, Dartmouth College, Lebanon, NH 03755
| | - Dina N Paltoo
- Office of Director, National Institutes of Health, Bethesda, MD 20892
| | - George J Papanicolaou
- Division of Cardiovascular Sciences, Prevention and Population Sciences Program, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892
| | - Bo Peng
- Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center, Houston, TX 77030
| | - Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802
| | - Gabriel Rosenfeld
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, MD 20892
| | - John S Witte
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94107
| | - Elizabeth M Gillanders
- Epidemiology and Genomics Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, MD 20892
| | - Eric J Feuer
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, NIH, Bethesda, MD 20892
| |
|
34
|
Fu W, Gittelman R, Bamshad M, Akey J. Characteristics of neutral and deleterious protein-coding variation among individuals and populations. Am J Hum Genet 2014; 95:421-36. [PMID: 25279984 DOI: 10.1016/j.ajhg.2014.09.006] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Received: 06/02/2014] [Accepted: 09/11/2014] [Indexed: 01/27/2023]
Abstract
Whole-genome and exome data sets continue to be produced at a frenetic pace, resulting in massively large catalogs of human genomic variation. However, a clear picture of the characteristics and patterns of neutral and deleterious variation within and between populations has yet to emerge, given that recent large-scale sequencing studies have often emphasized different aspects of the data and sometimes appear to have conflicting conclusions. Here, we comprehensively studied characteristics of protein-coding variation in high-coverage exome sequence data from 6,515 European American (EA) and African American (AA) individuals. We developed an unbiased approach to identify putatively deleterious variants and investigated patterns of neutral and deleterious single-nucleotide variants and alleles between individuals and populations. We show that there are substantial differences in the composition of genotypes between EA and AA populations and that small but statistically significant differences exist in the average number of deleterious alleles carried by EA and AA individuals. Furthermore, we performed extensive simulations to delineate the temporal dynamics of deleterious alleles for a broad range of demographic models and use these data to inform the interpretation of empirical patterns of deleterious variation. Finally, we illustrate that the effects of demographic perturbations, such as bottlenecks and expansions, often manifest in opposing patterns of neutral and deleterious variation depending on whether the focus is on populations or individuals. Our results clarify seemingly disparate empirical characteristics of protein-coding variation and provide substantial insights into how natural selection and demographic history have patterned neutral and deleterious variation within and between populations.
|
35
|
Impact of range expansions on current human genomic diversity. Curr Opin Genet Dev 2014; 29:22-30. [PMID: 25156518 DOI: 10.1016/j.gde.2014.07.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 06/07/2014] [Revised: 07/09/2014] [Accepted: 07/25/2014] [Indexed: 12/19/2022]
Abstract
The patterns of population genetic diversity depend to a large extent on past demographic history. Most human populations are known to have gone recently through a series of range expansions within and out of Africa, but these spatial expansions are rarely taken into account when interpreting observed genomic diversity, possibly because they are difficult to model. Here we review available evidence in favour of range expansions out of Africa, and we discuss several of their consequences on neutral and selected diversity, including some recent observations on an excess of rare neutral and selected variants in large samples. We further show that in spatially subdivided populations, the sampling strategy can severely impact the resulting genetic diversity and be confounded by past demography. We conclude that ignoring the spatial structure of human population can lead to some misinterpretations of extant genetic diversity.
|
36
|
Lohmueller KE. The impact of population demography and selection on the genetic architecture of complex traits. PLoS Genet 2014; 10:e1004379. [PMID: 24875776 PMCID: PMC4038606 DOI: 10.1371/journal.pgen.1004379] [Citation(s) in RCA: 98] [Impact Index Per Article: 9.8] [Received: 06/21/2013] [Accepted: 03/28/2014] [Indexed: 02/06/2023]
Abstract
Population genetic studies have found evidence for dramatic population growth in recent human history. It is unclear how this recent population growth, combined with the effects of negative natural selection, has affected patterns of deleterious variation, as well as the number, frequency, and effect sizes of mutations that contribute risk to complex traits. Because researchers are performing exome sequencing studies aimed at uncovering the role of low-frequency variants in the risk of complex traits, this topic is of critical importance. Here I use simulations under population genetic models where a proportion of the heritability of the trait is accounted for by mutations in a subset of the exome. I show that recent population growth increases the proportion of nonsynonymous variants segregating in the population, but does not affect the genetic load relative to a population that did not expand. Under a model where a mutation's effect on a trait is correlated with its effect on fitness, rare variants explain a greater portion of the additive genetic variance of the trait in a population that has recently expanded than in a population that did not recently expand. Further, when using a single-marker test, for a given false-positive rate and sample size, recent population growth decreases the expected number of significant associations with the trait relative to the number detected in a population that did not expand. However, in a model where there is no correlation between a mutation's effect on fitness and the effect on the trait, common variants account for much of the additive genetic variance, regardless of demography. Moreover, here demography does not affect the number of significant associations detected. These findings suggest recent population history may be an important factor influencing the power of association tests and in accounting for the missing heritability of certain complex traits.
Affiliation(s)
- Kirk E Lohmueller, Department of Ecology and Evolutionary Biology, Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
|
37
|
Abstract
Evolutionary forces shape patterns of genetic diversity within populations and contribute to phenotypic variation. In particular, recurrent positive selection has attracted significant interest in both theoretical and empirical studies. However, most existing theoretical models of recurrent positive selection cannot easily incorporate realistic confounding effects such as interference between selected sites, arbitrary selection schemes, and complicated demographic processes. It is possible to quantify the effects of arbitrarily complex evolutionary models by performing forward population genetic simulations, but forward simulations can be computationally prohibitive for large population sizes (>10^5). A common approach for overcoming these computational limitations is rescaling of the most computationally expensive parameters, especially population size. Here, we show that ad hoc approaches to parameter rescaling under the recurrent hitchhiking model do not always provide sufficiently accurate dynamics, potentially skewing patterns of diversity in simulated DNA sequences. We derive an extension of the recurrent hitchhiking model that is appropriate for strong selection in small population sizes and use it to develop a method for parameter rescaling that provides the best possible computational performance for a given error tolerance. We perform a detailed theoretical analysis of the robustness of rescaling across the parameter space. Finally, we apply our rescaling algorithms to parameters that were previously inferred for Drosophila and discuss practical considerations such as interference between selected sites.
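To make the idea of parameter rescaling concrete, the sketch below shows the conventional ad hoc scheme that the abstract cautions about: population size is divided by a factor while selection, mutation, and recombination rates are multiplied so that the products N·s, N·μ, and N·r stay fixed. It is not the error-controlled method derived in the paper; the `SimParams` container, the `rescale` function, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Per-generation parameters of a forward simulation (hypothetical container)."""
    pop_size: int          # N, number of individuals
    selection: float       # s, selection coefficient
    mutation: float        # mu, per-site mutation rate
    recombination: float   # r, per-site recombination rate


def rescale(params: SimParams, factor: float) -> SimParams:
    """Shrink N by `factor` and scale s, mu, r up so N*s, N*mu, N*r are unchanged."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    new_n = max(1, round(params.pop_size / factor))
    # Use the realised ratio so the products are preserved even after rounding N.
    ratio = params.pop_size / new_n
    return SimParams(
        pop_size=new_n,
        selection=params.selection * ratio,
        mutation=params.mutation * ratio,
        recombination=params.recombination * ratio,
    )


# Illustrative values only (assumptions, not taken from the paper).
original = SimParams(pop_size=1_000_000, selection=0.001,
                     mutation=1e-9, recombination=1e-8)
small = rescale(original, factor=100)
print(small.pop_size, small.selection)
```

With these illustrative numbers the rescaled selection coefficient is about 0.1 in a population of 10,000, which is the strong-selection, small-population regime where the abstract reports that naive rescaling can distort the dynamics.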
|
38
|
Abstract
Proteins are not monolithic entities; rather, they can contain multiple domains that mediate distinct interactions, and their functionality can be regulated through post-translational modifications at multiple distinct sites. Traditionally, network biology has ignored such properties of proteins and has instead examined either the physical interactions of whole proteins or the consequences of removing entire genes. In this Review, we discuss experimental and computational methods to increase the resolution of protein-protein, genetic and drug-gene interaction studies to the domain and residue levels. Such work will be crucial for using interaction networks to connect sequence and structural information, and to understand the biological consequences of disease-associated mutations, which will hopefully lead to more effective therapeutic strategies.
|
39
|
Hochreiter S. HapFABIA: identification of very short segments of identity by descent characterized by rare variants in large sequencing data. Nucleic Acids Res 2013; 41:e202. [PMID: 24174545 PMCID: PMC3905877 DOI: 10.1093/nar/gkt1013] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022]
Abstract
Identity by descent (IBD) can be reliably detected for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies biclustering to identify very short IBD segments characterized by rare variants. HapFABIA is designed to detect short IBD segments in genotype data that were obtained from next-generation sequencing, but can also be applied to DNA microarray data. Especially in next-generation sequencing data, HapFABIA exploits rare variants for IBD detection. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. HapFABIA identified 160 588 different short IBD segments characterized by rare variants with a median length of 23 kb (mean 24 kb) in data for chromosome 1 of the 1000 Genomes Project. These short IBD segments contain 752 000 single nucleotide variants (SNVs), which account for 39% of the rare variants and 23.5% of all variants. The vast majority—152 000 IBD segments—are shared by Africans, while only 19 000 and 11 000 are shared by Europeans and Asians, respectively. 
IBD segments that match the Denisova or the Neandertal genome are found significantly more often in Asians and Europeans but also, in some cases exclusively, in Africans. The lengths of IBD segments and their sharing between continental populations indicate that many short IBD segments from chromosome 1 existed before humans migrated out of Africa. Thus, rare variants that tag these short IBD segments predate human migration from Africa. The software package HapFABIA is available from Bioconductor. All data sets, result files and programs for data simulation, preprocessing and evaluation are supplied at http://www.bioinf.jku.at/research/short-IBD.
Affiliation(s)
- Sepp Hochreiter, Institute of Bioinformatics, Johannes Kepler University, Linz, Austria
|
40
|
Abstract
This study addresses the question of how purifying selection operates during recent rapid population growth such as has been experienced by human populations. This is not a straightforward problem because the human population is not at equilibrium: population genetics predicts that, on the one hand, the efficacy of natural selection increases as population size increases, eliminating ever more weakly deleterious variants; on the other hand, a larger number of deleterious mutations will be introduced into the population and will be more likely to increase in their number of copies as the population grows. To understand how patterns of human genetic variation have been shaped by the interaction of natural selection and population growth, we examined the trajectories of mutations with varying selection coefficients, using computer simulations. We observed that while population growth dramatically increases the number of deleterious segregating sites in the population, it only mildly increases the number carried by each individual. Our simulations also show an increased efficacy of natural selection, reflected in a higher fraction of deleterious mutations eliminated at each generation and a more efficient elimination of the most deleterious ones. As a consequence, while each individual carries a larger number of deleterious alleles than expected in the absence of growth, the average selection coefficient of each segregating allele is less deleterious. Combined, our results suggest that the genetic risk of complex diseases in growing populations might be distributed across a larger number of more weakly deleterious rare variants.
|
41
|
Thomas DC. Some surprising twists on the road to discovering the contribution of rare variants to complex diseases. Hum Hered 2013; 74:113-7. [PMID: 23594489 DOI: 10.1159/000347020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
|