1
Chukwu W, Lee S, Crane A, Zhang S, Webster S, Mittra I, Imielinski M, Beroukhim R, Dubois F, Dalin S. Comparison of germline and somatic structural variants in cancers reveal systematic differences in variant generating and selection processes. bioRxiv 2024:2023.10.09.561462. [PMID: 38106141 PMCID: PMC10723258 DOI: 10.1101/2023.10.09.561462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Although several recent studies have characterized structural variants (SVs) in germline and cancer genomes, the features of SVs in these different contexts have not been directly compared. We examined similarities and differences between 2 million germline and 115 thousand tumor SVs from a cohort of 963 patients from The Cancer Genome Atlas (TCGA). We found significant differences in features related to their genomic sequences and localization that suggest differences between SV-generating processes and selective pressures. For example, we found that transposon-mediated processes shape germline much more than somatic SVs, while somatic SVs more frequently show features characteristic of chromoanagenesis. These differences were extensive enough to enable us to develop a classifier - "the great GaTSV" - that accurately distinguishes between germline and cancer SVs in tumor samples that lack a matched normal sample.
Affiliation(s)
- Wolu Chukwu
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Siyun Lee
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Alexander Crane
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Shu Zhang
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Sophie Webster
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Ipsa Mittra
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Marcin Imielinski
- Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA; New York Genome Center, New York, NY, USA; Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, NY, USA; Meyer Cancer Center, Weill Cornell Medicine, New York, NY, USA; Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA; Department of Pathology and Perlmutter Cancer Center, NYU Grossman School of Medicine, New York, NY, USA
- Rameen Beroukhim
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Frank Dubois
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin; Humboldt-Universität zu Berlin, Institute of Pathology
- Simona Dalin
- Cancer Program, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Departments of Cancer Biology and Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
2
Atkinson EG, Artomov M, Loboda AA, Rehm HL, MacArthur DG, Karczewski KJ, Neale BM, Daly MJ. Discordant calls across genotype discovery approaches elucidate variants with systematic errors. Genome Res 2023; 33:999-1005. [PMID: 37253541 PMCID: PMC10519400 DOI: 10.1101/gr.277908.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 05/19/2023] [Indexed: 06/01/2023]
Abstract
Large-scale high-throughput sequencing data sets have been transformative for informing clinical variant interpretation and for use as reference panels for statistical and population genetic efforts. Although such resources are often treated as ground truth, we find that in widely used reference data sets such as the Genome Aggregation Database (gnomAD), some variants pass gold-standard filters, yet are systematically different in their genotype calls across genotype discovery approaches. The inclusion of such discordant sites in study designs involving multiple genotype discovery strategies could bias results and lead to false-positive hits in association studies owing to technological artifacts rather than a true relationship to the phenotype. Here, we describe this phenomenon of discordant genotype calls across genotype discovery approaches, characterize the error mode of wrong calls, provide a list of discordant sites identified in gnomAD that should be treated with caution in analyses, and present a metric and machine learning classifier trained on gnomAD data to identify likely discordant variants in other data sets. We find that different genotype discovery approaches have different sets of variants at which this problem occurs, but there are characteristic variant features that can be used to predict discordant behavior. Discordant sites are largely shared across ancestry groups, although different populations are powered for the discovery of different variants. We find that the most common error mode is that of a variant being heterozygous for one approach and homozygous for the other, with heterozygous in the genomes and homozygous reference in the exomes making up the majority of miscalls.
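As an illustration only, the kind of per-variant comparison described in this abstract can be sketched as a tally of genotype mismatches between two call sets. This is not the authors' metric or classifier: the function name, the genotype coding, the sample IDs, and the assumption that the same samples were called by both approaches are all hypothetical.

```python
from collections import Counter

# Genotypes are coded as alternate-allele counts: 0 = hom-ref, 1 = het, 2 = hom-alt.
# None marks a missing call. All names and data here are hypothetical.


def discordance_summary(exome_calls, genome_calls):
    """Compare per-sample genotype calls for one variant across two approaches.

    Returns (discordance_rate, Counter of (exome_gt, genome_gt) mismatches).
    """
    mismatches = Counter()
    compared = 0
    for sample, exome_gt in exome_calls.items():
        genome_gt = genome_calls.get(sample)
        if exome_gt is None or genome_gt is None:
            continue  # skip samples not called by both approaches
        compared += 1
        if exome_gt != genome_gt:
            mismatches[(exome_gt, genome_gt)] += 1
    rate = sum(mismatches.values()) / compared if compared else 0.0
    return rate, mismatches


# Toy data: s1 is hom-ref in the exome call set but het in the genome call set.
exome = {"s1": 0, "s2": 0, "s3": 1, "s4": None}
genome = {"s1": 1, "s2": 0, "s3": 1, "s4": 2}
rate, mismatches = discordance_summary(exome, genome)
print(f"discordance rate: {rate:.3f}")
print(mismatches.most_common(1))
```

In this toy input the only mismatch is the (hom-ref in exome, het in genome) pair, which is the error mode the abstract reports as most common.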
Affiliation(s)
- Elizabeth G Atkinson
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
- Mykyta Artomov
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio 43210, USA
- Alexander A Loboda
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- ITMO University, Saint-Petersburg, 197101, Russia
- Almazov National Medical Research Center, St. Petersburg, 197341, Russia
- Heidi L Rehm
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Daniel G MacArthur
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Darlinghurst, New South Wales 2010, Australia
- Konrad J Karczewski
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Benjamin M Neale
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Mark J Daly
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Institute for Molecular Medicine Finland, University of Helsinki, FI-00290 Helsinki, Finland
3
Ramaraj T, Grover CE, Mendoza AC, Arick MA, Jareczek JJ, Leach AG, Peterson DG, Wendel JF, Udall JA. The Gossypium herbaceum L. Wagad genome as a resource for understanding cotton domestication. G3 (Bethesda, Md.) 2022; 13:6858943. [PMID: 36454094 PMCID: PMC9911056 DOI: 10.1093/g3journal/jkac308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Revised: 10/14/2022] [Accepted: 10/23/2022] [Indexed: 12/05/2022]
Abstract
Gossypium herbaceum is a species of cotton native to Africa and Asia that is one of the 2 domesticated diploids. Together with its sister-species G. arboreum, these A-genome taxa represent models of the extinct A-genome donor of modern polyploid cotton, which provides about 95% of cotton grown worldwide. As part of a larger effort to characterize variation and improve resources among diverse diploid and polyploid cotton genomes, we sequenced and assembled the genome of G. herbaceum cultivar (cv.) Wagad, representing the first domesticated accession for this species. This chromosome-level genome was generated using a combination of PacBio long-read technology, Hi-C, and Bionano optical mapping and compared to existing genome sequences in cotton. We compare the genome of this cultivar to the existing genome of wild G. herbaceum subspecies africanum to elucidate changes in the G. herbaceum genome concomitant with domestication and extend these analyses to gene expression using available RNA-seq. Our results demonstrate the utility of the G. herbaceum cv. Wagad genome in understanding domestication in the diploid species, which could inform modern breeding programs.
Affiliation(s)
- Thiruvarangan Ramaraj
- School of Computing, Jarvis College of Computing and Digital Media, DePaul University, Chicago, IL 60605, USA
- Corrinne E Grover
- Ecology, Evolution, and Organismal Biology Department, Iowa State University, Ames, IA 50011, USA
- Azalea C Mendoza
- School of Computing, Jarvis College of Computing and Digital Media, DePaul University, Chicago, IL 60605, USA
- Mark A Arick
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- Josef J Jareczek
- Ecology, Evolution, and Organismal Biology Department, Iowa State University, Ames, IA 50011, USA
- Alexis G Leach
- Ecology, Evolution, and Organismal Biology Department, Iowa State University, Ames, IA 50011, USA
- Daniel G Peterson
- Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- Jonathan F Wendel
- Ecology, Evolution, and Organismal Biology Department, Iowa State University, Ames, IA 50011, USA
- Joshua A Udall
- *Corresponding author: Crop Germplasm Research Unit, USDA/Agricultural Research Service, 2881 F&B Rd., College Station, TX 77845, USA.
4
Nucleotide-based genetic networks: Methods and applications. J Biosci 2022. [PMID: 36226367 PMCID: PMC9554864 DOI: 10.1007/s12038-022-00290-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Genomic variations have been acclaimed as among the key players in understanding the biological mechanisms behind migration, evolution, and adaptation to extreme conditions. Due to stochastic evolutionary forces, the frequency of polymorphisms is affected by changes in the frequency of nearby polymorphisms in the same DNA sample, making them connected in terms of evolution. This article presents all the ingredients to understand the cumulative effects and complex behaviors of genetic variations in the human mitochondrial genome by analyzing co-occurrence networks of nucleotides, and shows key results obtained from such analyses. The article emphasizes recent investigations of these co-occurrence networks, describing the role of interactions between nucleotides in fundamental processes of human migration and viral evolution. The corresponding co-mutation-based genetic networks revealed genetic signatures of human adaptation in extreme environments. This article provides the methods of constructing such networks in detail, along with their graph-theoretical properties, and applications of the genomic networks in understanding the role of nucleotide co-evolution in evolution of the whole genome.
5
FVC as an adaptive and accurate method for filtering variants from popular NGS analysis pipelines. Commun Biol 2022; 5:975. [PMID: 36114280 PMCID: PMC9481582 DOI: 10.1038/s42003-022-03397-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2021] [Accepted: 04/22/2022] [Indexed: 11/08/2022]
Abstract
The quality control of variants from whole-genome sequencing data is vital in clinical diagnosis and human genetics research. However, current filtering methods (Frequency, Hard-Filter, VQSR, GARFIELD, and VEF) were developed to be utilized on particular variant callers and have certain limitations. In particular, the number of eliminated true variants far exceeds the number of removed false variants using these methods. Here, we present FVC, an adaptive method for quality control on genetic variants from different analysis pipelines, and validate it on the variants generated from four popular variant callers (GATK HaplotypeCaller, Mutect2, Varscan2, and DeepVariant). FVC consistently exhibited the best performance. It removed far more false variants than the current state-of-the-art filtering methods and recalled ~51-99% of the true variants filtered out by the other methods. Once trained, FVC can be conveniently integrated into a user-specific variant calling pipeline. FVC is a method for calling specific gene variants from whole genome data, for potential use in clinical diagnosis and human genetics research.
6
Marenne G, Ludwig TE, Bocher O, Herzig AF, Aloui C, Tournier-Lasserve E, Génin E. RAVAQ: An integrative pipeline from quality control to region-based rare variant association analysis. Genet Epidemiol 2022; 46:256-265. [PMID: 35419876 DOI: 10.1002/gepi.22450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 02/04/2022] [Accepted: 03/15/2022] [Indexed: 11/07/2022]
Abstract
Next-generation sequencing technologies have opened up the possibility to sequence large samples of cases and controls to test for association with rare variants. To limit cost and increase sample sizes, data from controls could be used in multiple studies and might thus be generated on different sequencing platforms. This could pose some problems of comparability between cases and controls due to batch effects that could be confounding factors, leading to false-positive association signals. To limit batch effects and ensure comparability of datasets, stringent quality controls are required. We propose an integrative five-step pipeline, RAVAQ, that (a) performs a specific three-step quality control taking into account the case-control status to ensure data comparability, (b) selects qualifying variants as defined by the user, and (c) performs rare variant association tests per genomic region. The RAVAQ pipeline is wrapped in an R package. It is user-friendly and flexible in its arguments to adapt to the specificity of each research project. We provide examples showing how RAVAQ improves rare variant association tests. The default RAVAQ quality control outperformed the widely used Variant Quality Score Recalibration method, removing inflation due to spurious signals. RAVAQ is open source and freely available at https://gitlab.com/gmarenne/ravaq.
Affiliation(s)
- Thomas E Ludwig
- Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
- CHU Brest, Brest, France
- Ozvan Bocher
- Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
- Chaker Aloui
- Université de Paris, NeuroDiderot, Inserm UMR 1141, Paris, France
- Elisabeth Tournier-Lasserve
- Université de Paris, NeuroDiderot, Inserm UMR 1141, Paris, France
- AP-HP, Service de Génétique Moléculaire Neurovasculaire, Hôpital Saint-Louis, Paris, France
- Emmanuelle Génin
- Inserm, Univ Brest, EFS, UMR 1078, GGB, Brest, France
- CHU Brest, Brest, France
7
Belloy ME, Eger SJ, Le Guen Y, Damotte V, Ahmad S, Ikram MA, Ramirez A, Tsolaki AC, Rossi G, Jansen IE, de Rojas I, Parveen K, Sleegers K, Ingelsson M, Hiltunen M, Amin N, Andreassen O, Sánchez-Juan P, Kehoe P, Amouyel P, Sims R, Frikke-Schmidt R, van der Flier WM, Lambert JC, He Z, Han SS, Napolioni V, Greicius MD. Challenges at the APOE locus: a robust quality control approach for accurate APOE genotyping. Alzheimers Res Ther 2022; 14:22. [PMID: 35120553 PMCID: PMC8815198 DOI: 10.1186/s13195-022-00962-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/20/2021] [Accepted: 01/12/2022] [Indexed: 04/22/2023]
Abstract
BACKGROUND: Genetic variants within the APOE locus may modulate Alzheimer's disease (AD) risk independently or in conjunction with APOE*2/3/4 genotypes. Identifying such variants and mechanisms would importantly advance our understanding of APOE pathophysiology and provide critical guidance for AD therapies aimed at APOE. The APOE locus however remains relatively poorly understood in AD, owing to multiple challenges that include its complex linkage structure and uncertainty in APOE*2/3/4 genotype quality. Here, we present a novel APOE*2/3/4 filtering approach and showcase its relevance on AD risk association analyses for the rs439401 variant, which is located 1801 base pairs downstream of APOE and has been associated with a potential regulatory effect on APOE.
METHODS: We used thirty-two AD-related cohorts, with genetic data from various high-density single-nucleotide polymorphism microarrays, whole-genome sequencing, and whole-exome sequencing. Study participants were filtered to be ages 60 and older, non-Hispanic, of European ancestry, and diagnosed as cognitively normal or AD (n = 65,701). Primary analyses investigated AD risk in APOE*4/4 carriers. Additional supporting analyses were performed in APOE*3/4 and 3/3 strata. Outcomes were compared under two different APOE*2/3/4 filtering approaches.
RESULTS: Using more conventional APOE*2/3/4 filtering criteria (approach 1), we showed that, when in-phase with APOE*4, rs439401 was variably associated with protective effects on AD case-control status. However, when applying a novel filter that increases the certainty of the APOE*2/3/4 genotypes by applying more stringent criteria for concordance between the provided APOE genotype and imputed APOE genotype (approach 2), we observed that all significant effects were lost.
CONCLUSIONS: We showed that careful consideration of APOE genotype and appropriate sample filtering were crucial to robustly interrogate the role of the APOE locus on AD risk. Our study presents a novel APOE filtering approach and provides important guidelines for research into the APOE locus, as well as for elucidating genetic interaction effects with APOE*2/3/4.
Affiliation(s)
- Michael E Belloy
- Department of Neurology and Neurological Sciences - Greicius lab, Stanford University, 290 Jane Stanford Way, Stanford, CA, 94304, USA.
- Sarah J Eger
- Department of Neurology and Neurological Sciences - Greicius lab, Stanford University, 290 Jane Stanford Way, Stanford, CA, 94304, USA
- Yann Le Guen
- Department of Neurology and Neurological Sciences - Greicius lab, Stanford University, 290 Jane Stanford Way, Stanford, CA, 94304, USA
- Vincent Damotte
- Univ. Lille, Inserm, CHU Lille, Institut Pasteur de Lille, U1167-RID-AGE Facteurs de risque et déterminants moléculaires des maladies liées au vieillissement, Lille, France
- Shahzad Ahmad
- Department of Epidemiology, ErasmusMC, Rotterdam, The Netherlands
- Division of Systems Biomedicine and Pharmacology, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands
- M Arfan Ikram
- Department of Epidemiology, ErasmusMC, Rotterdam, The Netherlands
- Alfredo Ramirez
- Division of Neurogenetics and Molecular Psychiatry, Department of Psychiatry and Psychotherapy, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Department of Neurodegenerative diseases and Geriatric Psychiatry, Medical Faculty, University Hospital Bonn, Bonn, Germany
- Department of Psychiatry & Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, San Antonio, TX, USA
- German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
- Cluster of Excellence Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
- Anthoula C Tsolaki
- 1st Department of Neurology, AHEPA Hospital, Aristotle University of Thessaloniki, Athens, Greece
- Giacomina Rossi
- Unit of Neurology V and Neuropathology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Iris E Jansen
- Alzheimer Center Amsterdam, Department of Neurology, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, The Netherlands
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, Vrije University, Amsterdam, The Netherlands
- Itziar de Rojas
- Research Center and Memory Clinic, ACE Alzheimer Center Barcelona, Universitat Internacional de Catalunya, Barcelona, Spain
- Networking Research Center on Neurodegenerative Diseases (CIBERNED), Instituto de Salud Carlos III, Madrid, Spain
- Kayenat Parveen
- Division of Neurogenetics and Molecular Psychiatry, Department of Psychiatry and Psychotherapy, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Department of Neurodegenerative diseases and Geriatric Psychiatry, Medical Faculty, University Hospital Bonn, Bonn, Germany
- Kristel Sleegers
- Complex Genetics of Alzheimer's Disease Group, Center for Molecular Neurology, VIB, Antwerp, Belgium
- Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium
- Martin Ingelsson
- Department of Public Health and Caring Sciences/Geriatrics, Uppsala University, Uppsala, Sweden
- Mikko Hiltunen
- Institute of Biomedicine, University of Eastern Finland, Yliopistonranta 1E, 70211, Kuopio, Finland
- Najaf Amin
- Department of Epidemiology, ErasmusMC, Rotterdam, The Netherlands
- Nuffield Department of Population Health Oxford University, Oxford, UK
- Ole Andreassen
- NORMENT Centre, Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Pascual Sánchez-Juan
- CIBERNED, Network Center for Biomedical Research in Neurodegenerative Diseases, National Institute of Health Carlos III, Madrid, Spain
- Neurology Service, Marqués de Valdecilla University Hospital (University of Cantabria and IDIVAL), Santander, Spain
- Patrick Kehoe
- Translational Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Philippe Amouyel
- Univ. Lille, Inserm, CHU Lille, Institut Pasteur de Lille, U1167-RID-AGE Facteurs de risque et déterminants moléculaires des maladies liées au vieillissement, Lille, France
- Rebecca Sims
- Division of Psychological Medicine and Clinical Neuroscience, School of Medicine, Cardiff University, Wales, UK
- Ruth Frikke-Schmidt
- Department of Clinical Biochemistry, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Wiesje M van der Flier
- Alzheimer Center Amsterdam, Department of Neurology, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam UMC, Amsterdam, The Netherlands
- Jean-Charles Lambert
- Univ. Lille, Inserm, CHU Lille, Institut Pasteur de Lille, U1167-RID-AGE Facteurs de risque et déterminants moléculaires des maladies liées au vieillissement, Lille, France
- Zihuai He
- Department of Neurology and Neurological Sciences - Greicius lab, Stanford University, 290 Jane Stanford Way, Stanford, CA, 94304, USA
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, 94304, USA
- Summer S Han
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, 94304, USA
- Department of Neurosurgery, Stanford University, Stanford, CA, 94304, USA
- Valerio Napolioni
- School of Biosciences and Veterinary Medicine, University of Camerino, 62032, Camerino, Italy
- Michael D Greicius
- Department of Neurology and Neurological Sciences - Greicius lab, Stanford University, 290 Jane Stanford Way, Stanford, CA, 94304, USA
8
Lo YH, Cheng HC, Hsiung CN, Yang SL, Wang HY, Peng CW, Chen CY, Lin KP, Kang ML, Chen CH, Chu HW, Lin CF, Lee MH, Liu Q, Satta Y, Lin CJ, Lin M, Chaw SM, Loo JH, Shen CY, Ko WY. Detecting Genetic Ancestry and Adaptation in the Taiwanese Han People. Mol Biol Evol 2021; 38:4149-4165. [PMID: 33170928 PMCID: PMC8476137 DOI: 10.1093/molbev/msaa276] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/19/2022]
Abstract
The Taiwanese people are composed of diverse indigenous populations and the Taiwanese Han. About 95% of the Taiwanese identify themselves as Taiwanese Han, but this may not be a homogeneous population because they migrated to the island from various regions of continental East Asia over a period of 400 years. Little is known about the underlying patterns of genetic ancestry, population admixture, and evolutionary adaptation in the Taiwanese Han people. Here, we analyzed the whole-genome single-nucleotide polymorphism genotyping data from 14,401 individuals of Taiwanese Han collected by the Taiwan Biobank and the whole-genome sequencing data for a subset of 772 people. We detected four major genetic ancestries with distinct geographic distributions (i.e., Northern, Southeastern, Japonic, and Island Southeast Asian ancestries) and signatures of population mixture contributing to the genomes of Taiwanese Han. We further scanned for signatures of positive natural selection that caused unusually long-range haplotypes and elevations of hitchhiked variants. As a result, we identified 16 candidate loci in which selection signals can be unambiguously localized at five single genes: CTNNA2, LRP1B, CSNK1G3, ASTN2, and NEO1. Statistical associations were examined in 16 metabolic-related traits to further elucidate the functional effects of each candidate gene. All five genes appear to have pleiotropic connections to various types of disease susceptibility and significant associations with at least one metabolic-related trait. Together, our results provide critical insights for understanding the evolutionary history and adaptation of the Taiwanese Han population.
Affiliation(s)
- Yun-Hua Lo
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Hsueh-Chien Cheng
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Chia-Ni Hsiung
- Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Show-Ling Yang
- Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Han-Yu Wang
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Chia-Wei Peng
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Chun-Yu Chen
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Kung-Ping Lin
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Mei-Ling Kang
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Chien-Hsiun Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Hou-Wei Chu
- Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Mei-Hsuan Lee
- Institute of Clinical Medicine, National Yang-Ming University, Taipei, Taiwan
- Quintin Liu
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
- Yoko Satta
- Department of Evolutionary Studies of Biosystems, SOKENDAI (The Graduate University for Advanced Studies), Hayama, Japan
- Cheng-Jui Lin
- Molecular Anthropology and Transfusion Medicine Research Laboratory, Mackay Memorial Hospital, Taipei, Taiwan
- Marie Lin
- Molecular Anthropology and Transfusion Medicine Research Laboratory, Mackay Memorial Hospital, Taipei, Taiwan
- Shu-Miaw Chaw
- Biodiversity Research Center, Academia Sinica, Taipei City, Taiwan
- Jun-Hun Loo
- Molecular Anthropology and Transfusion Medicine Research Laboratory, Mackay Memorial Hospital, Taipei, Taiwan
- Chen-Yang Shen
- Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Wen-Ya Ko
- Faculty of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
9
Greater effect of polygenic risk score for Alzheimer's disease among younger cases who are apolipoprotein E-ε4 carriers. Neurobiol Aging 2020; 99:101.e1-101.e9. [PMID: 33164815 DOI: 10.1016/j.neurobiolaging.2020.09.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/02/2020] [Revised: 09/04/2020] [Accepted: 09/10/2020] [Indexed: 12/12/2022]
Abstract
To evaluate how age and apolipoprotein E-ε4 (APOE4) status interact with APOE-independent polygenic risk score (PRSnon-APOE), we estimated PRSnon-APOE in superagers (age ≥ 90 years, N = 346), 89- controls (age 60-89, N = 2930), and Alzheimer's disease (AD) cases (N = 1760). Using superagers, we see a nearly 5 times greater odds ratio (OR) for AD comparing the top PRSnon-APOE decile to the lowest decile (OR = 4.82, p = 2.5 × 10⁻⁶), which is twice the OR as using 89- controls (OR = 2.38, p = 4.6 × 10⁻⁹). Thus PRSnon-APOE is correlated with age, which in turn is associated with APOE. Further exploring these relationships, we find that PRSnon-APOE modifies age at onset among APOE4 carriers, but not among noncarriers. More specifically, PRSnon-APOE in the top decile predicts an age at onset 5 years earlier compared with the lowest decile (70.1 vs. 75.0 years; t-test p = 2.4 × 10⁻⁵) among APOE4 carriers. This disproportionally large PRSnon-APOE among younger APOE4-positive cases is reflected in a significant statistical interaction between APOE4 status and age at onset (β = -0.02, p = 4.8 × 10⁻³) as a predictor of PRSnon-APOE. Thus, the known AD risk variants are particularly detrimental in young APOE4 carriers.
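For readers unfamiliar with decile-based odds ratios, the comparison reported in this abstract can be illustrated with an unadjusted 2x2 calculation. This is a minimal sketch under stated assumptions: the counts are invented, the function name is hypothetical, and the study itself may have used a covariate-adjusted model rather than this simple estimator.

```python
import math


def odds_ratio_with_ci(cases_top, controls_top, cases_bottom, controls_bottom, z=1.96):
    """Unadjusted odds ratio (top vs. bottom decile) with a Wald confidence interval.

    All four counts must be positive; z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (cases_top * controls_bottom) / (controls_top * cases_bottom)
    # Standard error of log(OR) from the four cell counts of the 2x2 table.
    se_log_or = math.sqrt(
        1 / cases_top + 1 / controls_top + 1 / cases_bottom + 1 / controls_bottom
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical counts, not taken from the study.
or_value, (ci_low, ci_high) = odds_ratio_with_ci(40, 10, 20, 20)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With these invented counts the odds of being a case are 4 to 1 in the top decile and 1 to 1 in the bottom decile, so the odds ratio is 4.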