1
Poirier A, Picard C, Labonté A, Aubry I, Auld D, Zetterberg H, Blennow K, Tremblay ML, Poirier J. PTPRS is a novel marker for early Tau pathology and synaptic integrity in Alzheimer's disease. Sci Rep 2024; 14:14718. [PMID: 38926456 PMCID: PMC11208446 DOI: 10.1038/s41598-024-65104-2] [Received: 03/21/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024]
Abstract
We examined the role of protein tyrosine phosphatase receptor sigma (PTPRS) in the context of Alzheimer's disease and synaptic integrity. Publicly available datasets (BRAINEAC, ROSMAP, ADC1) and a cohort of asymptomatic but "at risk" individuals (PREVENT-AD) were used to explore the relationship between PTPRS and various Alzheimer's disease biomarkers. We identified that PTPRS rs10415488 variant C shows features of neuroprotection against early Tau pathology and synaptic degeneration in Alzheimer's disease. This single nucleotide polymorphism correlated with higher PTPRS transcript abundance and lower p(181)Tau and GAP-43 levels in the CSF. In the brain, PTPRS protein abundance was significantly correlated with the quantity of two markers of synaptic integrity: SNAP25 and SYT-1. We also found the presence of sexual dimorphism for PTPRS, with higher CSF concentrations in males than females. Male carriers for variant C were found to have a 10-month delay in the onset of AD. We thus conclude that PTPRS acts as a neuroprotective receptor in Alzheimer's disease. Its protective effect is most important in males, in whom it postpones the age of onset of the disease.
Affiliation(s)
- Alexandre Poirier
  - Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, QC, Canada
  - Goodman Cancer Institute, McGill University, Montréal, Canada
- Cynthia Picard
  - Douglas Mental Health University Institute, Montréal, QC, Canada
  - Centre for the Studies in the Prevention of Alzheimer's Disease, Montréal, QC, Canada
- Anne Labonté
  - Douglas Mental Health University Institute, Montréal, QC, Canada
  - Centre for the Studies in the Prevention of Alzheimer's Disease, Montréal, QC, Canada
- Isabelle Aubry
  - Goodman Cancer Institute, McGill University, Montréal, Canada
  - McGill University, Montréal, QC, Canada
- Daniel Auld
  - McGill University, Montréal, QC, Canada
  - Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montréal, QC, Canada
- Henrik Zetterberg
  - Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Clinical Neurochemistry Laboratory, Sahlgrenska University Hospital, Mölndal, Sweden
  - Department of Neurodegenerative Disease, UCL Institute of Neurology, London, UK
  - UK Dementia Research Institute at UCL, London, UK
  - Hong Kong Center for Neurodegenerative Diseases, Clear Water Bay, Hong Kong, SAR, People's Republic of China
  - Wisconsin Alzheimer's Disease Research Center, University of Wisconsin School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, USA
  - University of Science and Technology of China, Hefei, Anhui, People's Republic of China
- Kaj Blennow
  - Department of Psychiatry and Neurochemistry, Institute of Neuroscience and Physiology, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Clinical Neurochemistry Laboratory, Sahlgrenska University Hospital, Mölndal, Sweden
  - Department of Neurodegenerative Disease, UCL Institute of Neurology, London, UK
  - University of Science and Technology of China, Hefei, Anhui, People's Republic of China
  - Institut du Cerveau et de la Moelle épinière (ICM), Pitié-Salpêtrière Hospital, Sorbonne Université, Paris, France
- Michel L Tremblay
  - Division of Experimental Medicine, Faculty of Medicine and Health Science, McGill University, Montréal, QC, Canada
  - Goodman Cancer Institute, McGill University, Montréal, Canada
  - McGill University, Montréal, QC, Canada
  - Department of Biochemistry, McGill University, Montréal, Canada
- Judes Poirier
  - Douglas Mental Health University Institute, Montréal, QC, Canada
  - Centre for the Studies in the Prevention of Alzheimer's Disease, Montréal, QC, Canada
  - McGill University, Montréal, QC, Canada
2
Martinez KL, Klein A, Martin JR, Sampson CU, Giles JB, Beck ML, Bhakta K, Quatraro G, Farol J, Karnes JH. Disparities in ABO blood type determination across diverse ancestries: a systematic review and validation in the All of Us Research Program. J Am Med Inform Assoc 2024:ocae161. [PMID: 38917427 DOI: 10.1093/jamia/ocae161] [Received: 03/20/2024] [Revised: 05/02/2024] [Accepted: 06/19/2024] [Indexed: 06/27/2024]
Abstract
OBJECTIVES: ABO blood types have widespread clinical use and robust associations with disease. The purpose of this study is to evaluate the portability and suitability of tag single-nucleotide polymorphisms (tSNPs) used to determine ABO alleles and blood types across diverse populations in published literature.
MATERIALS AND METHODS: Bibliographic databases were searched for studies using tSNPs to determine ABO alleles. We calculated linkage between tSNPs and functional variants across inferred continental ancestry groups from 1000 Genomes. We compared r² across ancestry and assessed real-world consequences by comparing tSNP-derived blood types to serology in a diverse population from the All of Us Research Program.
RESULTS: Linkage between functional variants and O allele tSNPs was significantly lower in African (median r² = 0.443) compared to East Asian (r² = 0.946, P = 1.1 × 10⁻⁵) and European (r² = 0.869, P = .023) populations. In All of Us, discordance between tSNP-derived blood types and serology was high across all SNPs in African ancestry individuals and linkage was strongly correlated with discordance across all ancestries (ρ = -0.90, P = 3.08 × 10⁻²³).
DISCUSSION: Many studies determine ABO blood types using tSNPs. However, tSNPs with low linkage disequilibrium promote misinference of ABO blood types, particularly in diverse populations. We observe common use of inappropriate tSNPs to determine ABO blood type, particularly for O alleles and with some tSNPs mistyping up to 58% of individuals.
CONCLUSION: Our results highlight the lack of transferability of tSNPs across ancestries and potential exacerbation of disparities in genomic research for underrepresented populations. This is especially relevant as more diverse cohorts are made publicly available.
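As a reading aid for the linkage measure this abstract relies on, the following is a minimal sketch of the standard r² statistic between a tag SNP and a functional variant, computed from phased haplotypes. The function name `ld_r2`, the 0/1 allele coding, and the toy haplotypes are assumptions made for illustration; this is not the authors' pipeline and uses no 1000 Genomes data.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (tag_allele, functional_allele) pairs, each coded 0 or 1,
    one pair per phased haplotype (hypothetical input format).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Allele frequencies at each locus and frequency of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Toy data: the tag SNP tracks the functional variant perfectly in the first
# set and is statistically independent of it in the second.
perfect = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(perfect), ld_r2(unlinked))
```

A tag SNP with r² near 1 in one ancestry group and much lower in another would reproduce, in miniature, the transferability problem the abstract describes.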
Affiliation(s)
- Kiana L Martinez
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Andrew Klein
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Jennifer R Martin
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
  - Department of the University of Arizona Health Sciences Library, The University of Arizona, Tucson, AZ 85721, United States
- Chinwuwanuju U Sampson
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Jason B Giles
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Madison L Beck
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Krupa Bhakta
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Gino Quatraro
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
- Juvie Farol
  - Department of Clinical and Translational Science, The University of Arizona College of Medicine, Tucson, AZ 85721, United States
- Jason H Karnes
  - Department of Pharmacy Practice and Science, The University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ 85721, United States
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, United States
3
Poirier A, Picard C, Labonté A, Aubry I, Auld D, Zetterberg H, Blennow K, Tremblay ML, Poirier J. PTPRS is a novel marker for early tau pathology and synaptic integrity in Alzheimer's disease. bioRxiv: the preprint server for biology 2024:2024.05.12.593733. [PMID: 38766183 PMCID: PMC11100782 DOI: 10.1101/2024.05.12.593733] [Indexed: 05/22/2024]
Abstract
We examined the role of protein tyrosine phosphatase receptor sigma (PTPRS) in the context of Alzheimer's disease and synaptic integrity. Publicly available datasets (BRAINEAC, ROSMAP, ADC1) and a cohort of asymptomatic but "at risk" individuals (PREVENT-AD) were used to explore the relationship between PTPRS and various Alzheimer's disease biomarkers. We identified that PTPRS rs10415488 variant C shows features of neuroprotection against early tau pathology and synaptic degeneration in Alzheimer's disease. This single nucleotide polymorphism correlated with higher PTPRS transcript abundance and lower P-tau181 and GAP-43 levels in the CSF. In the brain, PTPRS protein abundance was significantly correlated with the quantity of two markers of synaptic integrity: SNAP25 and SYT-1. We also found the presence of sexual dimorphism for PTPRS, with higher CSF concentrations in males than females. Male carriers for variant C were found to have a 10-month delay in the onset of AD. We thus conclude that PTPRS acts as a neuroprotective receptor in Alzheimer's disease. Its protective effect is most important in males, in whom it postpones the age of onset of the disease.
4
Garcia IS, Silva-Vignato B, Cesar ASM, Petrini J, da Silva VH, Morosini NS, Goes CP, Afonso J, da Silva TR, Lima BD, Clemente LG, Regitano LCDA, Mourão GB, Coutinho LL. Novel putative causal mutations associated with fat traits in Nellore cattle uncovered by eQTLs located in open chromatin regions. Sci Rep 2024; 14:10094. [PMID: 38698200 PMCID: PMC11066111 DOI: 10.1038/s41598-024-60703-5] [Received: 10/19/2023] [Accepted: 04/26/2024] [Indexed: 05/05/2024]
Abstract
Intramuscular fat (IMF) and backfat thickness (BFT) are critical economic traits impacting meat quality. However, the genetic variants controlling these traits need to be better understood. To advance knowledge in this area, we integrated RNA-seq and single nucleotide polymorphisms (SNPs) identified in genomic and transcriptomic data to generate a linkage disequilibrium filtered panel of 553,581 variants. Expression quantitative trait loci (eQTL) analysis revealed 36,916 cis-eQTLs and 14,408 trans-eQTLs. Association analysis resulted in three eQTLs associated with BFT and 24 with IMF. Functional enrichment analysis of genes regulated by these 27 eQTLs revealed noteworthy pathways that can play a fundamental role in lipid metabolism and fat deposition, such as immune response, cytoskeleton remodeling, iron transport, and phospholipid metabolism. We next used ATAC-Seq assay to identify and overlap eQTL and open chromatin regions. Six eQTLs were in regulatory regions, four in predicted insulators and possible CCCTC-binding factor DNA binding sites, one in an active enhancer region, and the last in a low signal region. Our results provided novel insights into the transcriptional regulation of IMF and BFT, unraveling putative regulatory variants.
Affiliation(s)
- Ingrid Soares Garcia
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Bárbara Silva-Vignato
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Aline Silva Mello Cesar
  - Department of Agroindustry, Food and Nutrition, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Juliana Petrini
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Vinicius Henrique da Silva
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Natália Silva Morosini
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Carolina Purcell Goes
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Thaís Ribeiro da Silva
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Beatriz Delcarme Lima
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Luan Gaspar Clemente
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Gerson Barreto Mourão
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
- Luiz Lehmann Coutinho
  - Department of Animal Science, College of Agriculture "Luiz de Queiroz", University of São Paulo, Piracicaba, SP, Brazil
5
Sun Q, Yang Y, Rosen JD, Chen J, Li X, Guan W, Jiang MZ, Wen J, Pace RG, Blackman SM, Bamshad MJ, Gibson RL, Cutting GR, O'Neal WK, Knowles MR, Kooperberg C, Reiner AP, Raffield LM, Carson AP, Rich SS, Rotter JI, Loos RJF, Kenny E, Jaeger BC, Min YI, Fuchsberger C, Li Y. MagicalRsq-X: A cross-cohort transferable genotype imputation quality metric. Am J Hum Genet 2024; 111:990-995. [PMID: 38636510 PMCID: PMC11080605 DOI: 10.1016/j.ajhg.2024.04.001] [Received: 01/25/2024] [Revised: 03/29/2024] [Accepted: 04/01/2024] [Indexed: 04/20/2024]
Abstract
Since genotype imputation was introduced, researchers have been relying on the estimated imputation quality from imputation software to perform post-imputation quality control (QC). However, this quality estimate (denoted as Rsq) performs less well for lower-frequency variants. We recently published MagicalRsq, a machine-learning-based imputation quality calibration, which leverages additional typed markers from the same cohort and outperforms Rsq as a QC metric. In this work, we extended the original MagicalRsq to allow cross-cohort model training and named the new model MagicalRsq-X. We removed the cohort-specific estimated minor allele frequency and included linkage disequilibrium scores and recombination rates as additional features. Leveraging whole-genome sequencing data from TOPMed, specifically participants in the BioMe, JHS, WHI, and MESA studies, we performed comprehensive cross-cohort evaluations for predominantly European and African ancestral individuals based on their inferred global ancestry with the 1000 Genomes and Human Genome Diversity Project data as reference. Our results suggest MagicalRsq-X outperforms Rsq in almost every setting, with 7.3%-14.4% improvement in squared Pearson correlation with true R², corresponding to 85-218 K variant gains. We further developed a metric to quantify the genetic distances of a target cohort relative to a reference cohort and showed that such metric largely explained the performance of MagicalRsq-X models. Finally, we found MagicalRsq-X saved up to 53 known genome-wide significant variants in one of the largest blood cell trait GWASs that would be missed using the original Rsq for QC. In conclusion, MagicalRsq-X shows superiority for post-imputation QC and benefits genetic studies by distinguishing well and poorly imputed lower-frequency variants.
Affiliation(s)
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Yingxi Yang
  - Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- Jonathan D Rosen
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jiawen Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Xihao Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Wyliena Guan
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Min-Zhi Jiang
  - Department of Applied Physical Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jia Wen
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Rhonda G Pace
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Scott M Blackman
  - Division of Pediatric Endocrinology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Michael J Bamshad
  - Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Ronald L Gibson
  - Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
- Garry R Cutting
  - Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Wanda K O'Neal
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Michael R Knowles
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA 98109, USA
- Alexander P Reiner
  - Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- April P Carson
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL 35249, USA
- Stephen S Rich
  - Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia School of Medicine, Charlottesville, VA 22908, USA
- Jerome I Rotter
  - The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Ruth J F Loos
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Eimear Kenny
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Byron C Jaeger
  - Wake Forest School of Medicine, Department of Biostatistics and Data Science, Wake Forest University, Winston-Salem, NC 27109, USA
- Yuan-I Min
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
- Christian Fuchsberger
  - Institute for Biomedicine, Eurac Research (affiliated with the University of Lübeck), Bolzano, Italy
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
6
Bougiouri K, Charlton S, Harris A, Carmagnini A, Piličiauskienė G, Feuerborn TR, Scarsbrook L, Tabadda K, Blaževičius P, Parker HG, Gopalakrishnan S, Larson G, Ostrander EA, Irving-Pease EK, Frantz LA, Racimo F. Imputation of ancient canid genomes reveals inbreeding history over the past 10,000 years. bioRxiv: the preprint server for biology 2024:2024.03.15.585179. [PMID: 38903121 PMCID: PMC11188068 DOI: 10.1101/2024.03.15.585179] [Indexed: 06/22/2024]
Abstract
The multi-millennia-long history between dogs and humans has justly placed them at the forefront of archeological and genomic research. Despite ongoing efforts including the analysis of ancient dog and wolf genomes, many questions remain regarding their geographic and temporal origins, and the microevolutionary processes that led to the huge diversity of breeds today. Although ancient genomes provide valuable information, their use is significantly hindered by low depth of coverage and post-mortem damage, which often inhibits confident genotype calling. In the present study, we assess how genotype imputation of ancient dog and wolf genomes, utilising a large reference panel, can improve the resolution afforded by ancient genomic datasets. Imputation accuracy was evaluated by down-sampling 10 high coverage ancient and modern dog and wolf genomes to 0.05-2x coverage and comparing concordance between imputed and high coverage genotypes. We also measured the impact of imputation on principal component analyses (PCA) and runs of homozygosity (ROH). Our findings show high (R² > 0.9) imputation accuracy for dogs with coverage as low as 0.5x and for wolves as low as 1.0x. We then imputed a worldwide dataset of 81 published ancient dog and wolf genomes, in addition to nine newly sequenced medieval and early modern period European dogs, to assess changes in inbreeding during the last 10,000 years of dog evolution. Ancient dog and wolf populations generally exhibited lower inbreeding levels than present-day individuals, though with some exceptions occurring in ancient Arctic and European dogs. Interestingly, regions with low ROH density maintained across ancient and present-day samples were significantly associated with genes related to olfaction and immune response. Our study indicates that imputing ancient canine genomes is a viable strategy that allows for the use of analytical methods previously limited to high-quality genetic data.
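The down-sampling evaluation described in this abstract can be pictured with a small sketch: imputed dosages from a down-sampled genome are compared against genotypes called at high coverage. The abstract does not define its R² precisely, so the squared Pearson correlation used here is an assumption, as are the function name `imputation_r2` and the toy values.

```python
def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between truth and imputed dosage.

    true_genotypes: alternate-allele counts (0, 1, 2) from the high-coverage genome.
    imputed_dosages: expected alternate-allele counts (0.0 to 2.0) after imputing
    the down-sampled genome. Both lists cover the same sites in the same order.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_dosages)
    if var_t == 0 or var_i == 0:
        raise ValueError("R^2 is undefined when either input is constant")
    return cov * cov / (var_t * var_i)


# Toy example (made-up values): six sites, imputed dosages close to the truth.
truth = [0, 1, 2, 1, 0, 2]
dosages = [0.1, 0.9, 1.8, 1.2, 0.0, 2.0]
print(imputation_r2(truth, dosages) > 0.9)
```

In practice such a statistic is usually computed per allele-frequency bin, because rare variants are harder to impute; the sketch pools all sites for brevity.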
Affiliation(s)
- Katia Bougiouri
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Sophy Charlton
  - BioArCh, Department of Archaeology, University of York, York, UK
- Alex Harris
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Alberto Carmagnini
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, Munich, Germany
- Giedrė Piličiauskienė
  - Department of Archeology, Faculty of History, Vilnius University, Vilnius, Lithuania
- Tatiana R. Feuerborn
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Lachie Scarsbrook
  - The Palaeogenomics and Bio-archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
- Kristina Tabadda
  - The Palaeogenomics and Bio-archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
- Povilas Blaževičius
  - Department of Archeology, Faculty of History, Vilnius University, Vilnius, Lithuania
  - National Museum of Lithuania, Vilnius, Lithuania
- Heidi G. Parker
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Shyam Gopalakrishnan
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Greger Larson
  - The Palaeogenomics and Bio-archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
- Elaine A. Ostrander
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Evan K. Irving-Pease
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Laurent A.F. Frantz
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
  - Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, Munich, Germany
- Fernando Racimo
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
7
Garrido Marques A, Rubinacci S, Malaspinas AS, Delaneau O, Sousa da Mota B. Assessing the impact of post-mortem damage and contamination on imputation performance in ancient DNA. Sci Rep 2024; 14:6227. [PMID: 38486065 PMCID: PMC10940295 DOI: 10.1038/s41598-024-56584-3] [Received: 12/19/2023] [Accepted: 03/08/2024] [Indexed: 03/18/2024]
Abstract
Low-coverage imputation is becoming increasingly common in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions at the read termini, and contamination with DNA from closely related species can potentially affect imputation performance in aDNA. In this study, we evaluated imputation performance (i) when using a genotype caller designed for aDNA, ATLAS, compared to bcftools, and (ii) when contamination is present. We evaluated imputation performance with principal component analyses and by calculating imputation error rates. With a particular focus on differently imputed sites, we found that using ATLAS prior to imputation substantially improved imputed genotypes for a very damaged ancient genome (42% PMD). Trimming the ends of the sequencing reads led to similar improvements in imputation accuracy. For the remaining genomes, ATLAS brought limited gains. Finally, to examine the effect of contamination on imputation, we added various amounts of reads from two present-day genomes to a previously downsampled high-coverage ancient genome. We observed that imputation accuracy drastically decreased for contamination rates above 5%. In conclusion, we recommend (i) accounting for PMD by either trimming sequencing reads or using a genotype caller such as ATLAS before imputing highly damaged genomes and (ii) only imputing genomes containing up to 5% of contamination.
Affiliation(s)
- Simone Rubinacci
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Anna-Sapfo Malaspinas
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Bárbara Sousa da Mota
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, University of Lausanne, Lausanne, Switzerland
8
Busonero F, Lenarduzzi S, Crobu F, Gentile RM, Carta A, Cracco F, Maschio A, Camarda S, Marongiu M, Zanetti D, Conversano C, Di Lorenzo G, Mazzà D, De Seta F, Girotto G, Sanna S. The Women4Health cohort: a unique cohort to study women-specific mechanisms of cardio-metabolic regulation. European Heart Journal Open 2024; 4:oeae012. [PMID: 38532851 PMCID: PMC10964981 DOI: 10.1093/ehjopen/oeae012] [Received: 12/06/2023] [Revised: 02/22/2024] [Accepted: 02/26/2024] [Indexed: 03/28/2024]
Abstract
Aims: Epidemiological research has shown relevant differences between sexes in clinical manifestations, severity, and progression of cardiovascular and metabolic disorders. To date, the mechanisms underlying these differences remain unknown. Given the rising incidence of such diseases, gender-specific research on established and emerging risk factors, such as dysfunction of glycaemic and/or lipid metabolism, of sex hormones and of gut microbiome, is of paramount importance. The relationships between sex hormones, gut microbiome, and host glycaemic and/or lipid metabolism are largely unknown even in homoeostasis. Yet filling this knowledge gap would be pivotal to pinpoint key mechanisms that are likely to be disrupted in the disease context.
Methods and results: Here we present the Women4Health (W4H) cohort, a unique cohort comprising up to 300 healthy women followed up during a natural menstrual cycle, set up with the primary goal to investigate the combined role of sex hormones and gut microbiota variations in regulating host lipid and glucose metabolism during homoeostasis, using a multi-omics strategy. Additionally, the W4H cohort will take into consideration another ecosystem that is unique to women, the vaginal microbiome, investigating its interaction with gut microbiome and exploring, for the first time, its role in cardiometabolic disorders.
Conclusion: The W4H cohort study lays a foundation for improving current knowledge of women-specific mechanisms in cardiometabolic regulation. It aspires to transform insights on host-microbiota interactions into prevention and therapeutic approaches for personalized health care.
Affiliation(s)
- Fabio Busonero
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
- Stefania Lenarduzzi
  - Institute for Maternal and Child Health - IRCCS ‘Burlo Garofolo’, Via dell'Istria 65/1, Trieste, 34137, TS, Italy
- Francesca Crobu
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
- Roberta Marie Gentile
  - Department of Medicine, Surgery and Health Sciences, University of Trieste, Piazzale Europa 1, Trieste, 34137, TS, Italy
- Andrea Carta
  - Department of Business and Economics, University of Cagliari, via Università 40, 09124, Cagliari, CA, Italy
- Francesco Cracco
  - Department of Medicine, Surgery and Health Sciences, University of Trieste, Piazzale Europa 1, Trieste, 34137, TS, Italy
- Andrea Maschio
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
- Silvia Camarda
  - Department of Medicine, Surgery and Health Sciences, University of Trieste, Piazzale Europa 1, Trieste, 34137, TS, Italy
- Michele Marongiu
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
- Daniela Zanetti
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
- Claudio Conversano
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
  - Department of Business and Economics, University of Cagliari, via Università 40, 09124, Cagliari, CA, Italy
- Giovanni Di Lorenzo
  - Institute for Maternal and Child Health - IRCCS ‘Burlo Garofolo’, Via dell'Istria 65/1, Trieste, 34137, TS, Italy
- Daniela Mazzà
  - Institute for Maternal and Child Health - IRCCS ‘Burlo Garofolo’, Via dell'Istria 65/1, Trieste, 34137, TS, Italy
- Francesco De Seta
  - Institute for Maternal and Child Health - IRCCS ‘Burlo Garofolo’, Via dell'Istria 65/1, Trieste, 34137, TS, Italy
- Giorgia Girotto
  - Institute for Maternal and Child Health - IRCCS ‘Burlo Garofolo’, Via dell'Istria 65/1, Trieste, 34137, TS, Italy
  - Department of Medicine, Surgery and Health Sciences, University of Trieste, Piazzale Europa 1, Trieste, 34137, TS, Italy
- Serena Sanna
  - Institute of Genetic and Biomedical Research (IRGB), National Research Council (CNR), c/o Cittadella Universitaria di Monserrato, SS554 Km 4500, Monserrato, 09042, CA, Italy
  - Department of Genetics, University Medical Center Groningen, Hanzeplein 1, 9713 GZ, Groningen, The Netherlands
| |
|
9
|
Levi H, Elkon R, Shamir R. The predictive capacity of polygenic risk scores for disease risk is only moderately influenced by imputation panels tailored to the target population. Bioinformatics 2024; 40:btae036. [PMID: 38265251 PMCID: PMC10868313 DOI: 10.1093/bioinformatics/btae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 12/20/2023] [Accepted: 01/20/2024] [Indexed: 01/25/2024] Open
Abstract
MOTIVATION Polygenic risk scores (PRSs) predict individuals' genetic risk of developing complex diseases. They summarize the effect of many variants discovered in genome-wide association studies (GWASs). However, to date, large GWASs exist primarily for the European population and the quality of PRS prediction declines when applied to other ethnicities. Genetic profiling of individuals in the discovery set (on which the GWAS was performed) and target set (on which the PRS is applied) is typically done by SNP arrays that genotype a fraction of common SNPs. Therefore, a key step in GWAS analysis and PRS calculation is imputing untyped SNPs using a panel of fully sequenced individuals. The imputation results depend on the ethnic composition of the imputation panel. Imputing genotypes with a panel of individuals of the same ethnicity as the genotyped individuals typically improves imputation accuracy. However, there has been no systematic investigation into the influence of the ethnic composition of imputation panels on the accuracy of PRS predictions when applied to ethnic groups that differ from the population used in the GWAS. RESULTS We estimated the effect of imputation of the target set on prediction accuracy of PRS when the discovery and the target sets come from different ethnic groups. We analyzed binary phenotypes on ethnically distinct sets from the UK Biobank and other resources. We generated ethnically homogeneous panels, imputed the target sets, and generated PRSs. Then, we assessed the prediction accuracy obtained from each imputation panel. Our analysis indicates that using an imputation panel matched to the ethnicity of the target population yields only a marginal improvement and only under specific conditions. AVAILABILITY AND IMPLEMENTATION The source code used for executing the analyses in this paper is available at https://github.com/Shamir-Lab/PRS-imputation-panels.
Affiliation(s)
- Hagai Levi
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
| | - Ran Elkon
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 69978, Israel
| | - Ron Shamir
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
| |
|
10
|
Yu H, Khanshour AM, Ushiki A, Otomo N, Koike Y, Einarsdottir E, Fan Y, Antunes L, Kidane YH, Cornelia R, Sheng RR, Zhang Y, Pei J, Grishin NV, Evers BM, Cheung JPY, Herring JA, Terao C, Song YQ, Gurnett CA, Gerdhem P, Ikegawa S, Rios JJ, Ahituv N, Wise CA. Association of genetic variation in COL11A1 with adolescent idiopathic scoliosis. eLife 2024; 12:RP89762. [PMID: 38277211 PMCID: PMC10945706 DOI: 10.7554/elife.89762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024] Open
Abstract
Adolescent idiopathic scoliosis (AIS) is a common and progressive spinal deformity in children that exhibits striking sexual dimorphism, with girls at more than fivefold greater risk of severe disease compared to boys. Despite its medical impact, the molecular mechanisms that drive AIS are largely unknown. We previously defined a female-specific AIS genetic risk locus in an enhancer near the PAX1 gene. Here, we sought to define the roles of PAX1 and newly identified AIS-associated genes in the developmental mechanism of AIS. In a genetic study of 10,519 individuals with AIS and 93,238 unaffected controls, significant association was identified with a variant in COL11A1 encoding collagen (α1) XI (rs3753841; NM_080629.2_c.4004C>T; p.(Pro1335Leu); p=7.07E-11, OR = 1.118). Using CRISPR mutagenesis we generated Pax1 knockout mice (Pax1-/-). In postnatal spines we found that PAX1 and collagen (α1) XI protein both localize within the intervertebral disc-vertebral junction region encompassing the growth plate, with less collagen (α1) XI detected in Pax1-/- spines compared to wild-type. By genetic targeting we found that wild-type Col11a1 expression in costal chondrocytes suppresses expression of Pax1 and of Mmp3, encoding the matrix metalloproteinase 3 enzyme implicated in matrix remodeling. However, the latter suppression was abrogated in the presence of the AIS-associated COL11A1 P1335L mutant. Further, we found that either knockdown of the estrogen receptor gene Esr2 or tamoxifen treatment significantly altered Col11a1 and Mmp3 expression in chondrocytes. We propose a new molecular model of AIS pathogenesis wherein genetic variation and estrogen signaling increase disease susceptibility by altering a PAX1-COL11a1-MMP3 signaling axis in spinal chondrocytes.
Affiliation(s)
- Hao Yu
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
| | - Anas M Khanshour
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
| | - Aki Ushiki
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Institute for Human Genetics, University of California, San Francisco, San Francisco, United States
| | - Nao Otomo
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Yoshinao Koike
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Elisabet Einarsdottir
- Science for Life Laboratory, Department of Gene Technology, KTH-Royal Institute of Technology, Solna, Sweden
| | - Yanhui Fan
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Lilian Antunes
- Department of Neurology, Washington University in St. Louis, St. Louis, United States
| | - Yared H Kidane
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
| | - Reuel Cornelia
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
| | - Rory R Sheng
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Institute for Human Genetics, University of California, San Francisco, San Francisco, United States
| | - Yichi Zhang
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Institute for Human Genetics, University of California, San Francisco, San Francisco, United States
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, China
| | - Jimin Pei
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, United States
| | - Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, United States
| | - Bret M Evers
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, United States
- Department of Ophthalmology, University of Texas Southwestern Medical Center, Dallas, United States
| | - Jason Pui Yin Cheung
- Department of Orthopaedics and Traumatology LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - John A Herring
- Department of Orthopedic Surgery, Scottish Rite for Children, Dallas, United States
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, United States
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - You-qiang Song
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Christina A Gurnett
- Department of Neurology, Washington University in St. Louis, St. Louis, United States
| | - Paul Gerdhem
- Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
- Department of Orthopaedics and Hand Surgery, Uppsala University Hospital, Uppsala, Sweden
- Department of Clinical Science, Intervention & Technology (CLINTEC), Karolinska Institutet, Stockholm, Uppsala University, Uppsala, Sweden
| | - Shiro Ikegawa
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Jonathan J Rios
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, United States
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, United States
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, United States
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Institute for Human Genetics, University of California, San Francisco, San Francisco, United States
| | - Carol A Wise
- Center for Translational Research, Scottish Rite for Children, Dallas, United States
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, United States
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, United States
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, United States
| |
|
11
|
Koorevaar T, Willemsen JH, Visser RGF, Arens P, Maliepaard C. Construction of a strawberry breeding core collection to capture and exploit genetic variation. BMC Genomics 2023; 24:740. [PMID: 38053072 DOI: 10.1186/s12864-023-09824-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 11/21/2023] [Indexed: 12/07/2023] Open
Abstract
BACKGROUND Genetic diversity is crucial for the success of plant breeding programs and core collections are important resources to capture this diversity. Many core collections have already been constructed by gene banks, whose main goal is to obtain a panel of a limited number of genotypes to simplify management practices and to improve shareability while retaining as much diversity as possible. However, as gene banks have a different composition and goal than plant breeding programs, constructing a core collection for a plant breeding program should consider different aspects. RESULTS In this study, we present a novel approach for constructing a core collection by integrating both genomic and pedigree information to maximize the representation of the breeding germplasm in a minimum subset of genotypes while accounting for future genetic variation within a strawberry breeding program. Our stepwise approach starts with selecting the most important crossing parents of advanced selections and genotypes included for specific traits, to represent also future genetic variation. We then use pedigree-genomic-based relationship coefficients combined with the 'accession to nearest entry' criterion to complement the core collection and maximize its representativeness of the current breeding program. Combined pedigree-genomic-based relationship coefficients allow for accurate relationship estimation without the need to genotype every individual in the breeding program. CONCLUSIONS This stepwise construction of a core collection in a strawberry breeding program can be applied in other plant breeding programs to construct core collections for various purposes.
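The 'accession to nearest entry' (A-NE) criterion named in this abstract can be illustrated with a minimal Python sketch. It assumes that pairwise distances are already available (for example as one minus a relationship coefficient, which is an assumption here and not stated in the abstract), starts from a pre-selected seed such as key crossing parents, and adds entries greedily. The greedy search, the function names, and the toy distance matrix are all hypothetical; this is not the authors' implementation.

```python
from typing import List, Sequence


def mean_ane(dist: Sequence[Sequence[float]], core: Sequence[int]) -> float:
    """Average distance from every accession to its nearest core entry."""
    return sum(min(row[j] for j in core) for row in dist) / len(dist)


def extend_core(dist: Sequence[Sequence[float]], seed: Sequence[int], size: int) -> List[int]:
    """Greedily add accessions to a seed core, minimising mean A-NE at each step."""
    core = list(seed)
    candidates = [i for i in range(len(dist)) if i not in core]
    while len(core) < size and candidates:
        # Pick the candidate whose addition gives the lowest mean A-NE.
        best = min(candidates, key=lambda i: mean_ane(dist, core + [i]))
        core.append(best)
        candidates.remove(best)
    return core


# Hypothetical symmetric distance matrix for five accessions (illustration only).
dist = [
    [0.0, 0.1, 0.9, 0.8, 0.9],
    [0.1, 0.0, 0.9, 0.8, 0.9],
    [0.9, 0.9, 0.0, 0.2, 0.1],
    [0.8, 0.8, 0.2, 0.0, 0.3],
    [0.9, 0.9, 0.1, 0.3, 0.0],
]

# Accession 0 stands in for a pre-selected crossing parent.
core = extend_core(dist, seed=[0], size=2)
```

In this toy example the second entry is drawn from the cluster that the seed does not already represent, which is the behaviour the criterion is meant to reward.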
Affiliation(s)
- T Koorevaar
- Wageningen University and Research Plant Breeding, Wageningen, The Netherlands.
- Fresh Forward Breeding B.V., Huissen, The Netherlands.
| | - J H Willemsen
- Fresh Forward Breeding B.V., Huissen, The Netherlands
| | - R G F Visser
- Wageningen University and Research Plant Breeding, Wageningen, The Netherlands
| | - P Arens
- Wageningen University and Research Plant Breeding, Wageningen, The Netherlands
| | - C Maliepaard
- Wageningen University and Research Plant Breeding, Wageningen, The Netherlands
| |
|
12
|
Yu H, Khanshour AM, Ushiki A, Otomo N, Koike Y, Einarsdottir E, Fan Y, Antunes L, Kidane YH, Cornelia R, Sheng R, Zhang Y, Pei J, Grishin NV, Evers BM, Cheung JPY, Herring JA, Terao C, Song YQ, Gurnett CA, Gerdhem P, Ikegawa S, Rios JJ, Ahituv N, Wise CA. Association of genetic variation in COL11A1 with adolescent idiopathic scoliosis. bioRxiv 2023:2023.05.26.542293. [PMID: 37292598 PMCID: PMC10245954 DOI: 10.1101/2023.05.26.542293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Adolescent idiopathic scoliosis (AIS) is a common and progressive spinal deformity in children that exhibits striking sexual dimorphism, with girls at more than five-fold greater risk of severe disease compared to boys. Despite its medical impact, the molecular mechanisms that drive AIS are largely unknown. We previously defined a female-specific AIS genetic risk locus in an enhancer near the PAX1 gene. Here we sought to define the roles of PAX1 and newly-identified AIS-associated genes in the developmental mechanism of AIS. In a genetic study of 10,519 individuals with AIS and 93,238 unaffected controls, significant association was identified with a variant in COL11A1 encoding collagen (α1) XI (rs3753841; NM_080629.2_c.4004C>T; p.(Pro1335Leu); P=7.07e-11, OR=1.118). Using CRISPR mutagenesis we generated Pax1 knockout mice (Pax1-/-). In postnatal spines we found that PAX1 and collagen (α1) XI protein both localize within the intervertebral disc (IVD)-vertebral junction region encompassing the growth plate, with less collagen (α1) XI detected in Pax1-/- spines compared to wildtype. By genetic targeting we found that wildtype Col11a1 expression in costal chondrocytes suppresses expression of Pax1 and of Mmp3, encoding the matrix metalloproteinase 3 enzyme implicated in matrix remodeling. However, this suppression was abrogated in the presence of the AIS-associated COL11A1 P1335L mutant. Further, we found that either knockdown of the estrogen receptor gene Esr2, or tamoxifen treatment, significantly altered Col11a1 and Mmp3 expression in chondrocytes. We propose a new molecular model of AIS pathogenesis wherein genetic variation and estrogen signaling increase disease susceptibility by altering a Pax1-Col11a1-Mmp3 signaling axis in spinal chondrocytes.
Affiliation(s)
- Hao Yu
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
| | - Anas M Khanshour
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
| | - Aki Ushiki
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Nao Otomo
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, JP
| | - Yoshinao Koike
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, JP
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, JP
| | - Elisabet Einarsdottir
- Science for Life Laboratory, Department of Gene Technology, KTH-Royal Institute of Technology, Solna, SE
| | - Yanhui Fan
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, CN
| | - Lilian Antunes
- Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
| | - Yared H Kidane
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
| | - Reuel Cornelia
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
| | - Rory Sheng
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Yichi Zhang
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- School of Pharmaceutical Sciences, Tsinghua University, Beijing, CN
| | - Jimin Pei
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Bret M Evers
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Ophthalmology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Jason Pui Yin Cheung
- Department of Orthopaedics and Traumatology LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, CN
| | - John A Herring
- Department of Orthopedic Surgery, Scottish Rite for Children, Dallas, TX, USA
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, JP
| | - You-Qiang Song
- School of Biomedical Sciences, The University of Hong Kong, Hong Kong SAR, CN
| | - Christina A Gurnett
- Department of Neurology, Washington University in St. Louis, St. Louis, MO, USA
| | - Paul Gerdhem
- Department of Clinical Science, Intervention & Technology (CLINTEC), Karolinska Institutet, Stockholm, Uppsala University, Uppsala, SE
- Department of Surgical Sciences, Uppsala University and
- Department of Orthopaedics and Hand Surgery, Uppsala University Hospital, Uppsala, SE
| | - Shiro Ikegawa
- Laboratory of Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, JP
| | - Jonathan J Rios
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Carol A Wise
- Center for Pediatric Bone Biology and Translational Research, Scottish Rite for Children, Dallas, TX, USA
- Department of Orthopaedic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| |
|
13
|
Childebayeva A, Zavala EI. Review: Computational analysis of human skeletal remains in ancient DNA and forensic genetics. iScience 2023; 26:108066. [PMID: 37927550 PMCID: PMC10622734 DOI: 10.1016/j.isci.2023.108066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023] Open
Abstract
Degraded DNA is used to answer questions in the fields of ancient DNA (aDNA) and forensic genetics. While aDNA studies typically center around human evolution and history, and forensic genetics is often more concerned with identifying a specific individual, scientists in both fields face similar challenges. The overlap in source material has prompted periodic discussions and studies on the advantages of collaboration between fields toward mutually beneficial methodological advancements. However, most have been centered around wet laboratory methods (sampling, DNA extraction, library preparation, etc.). In this review, we focus on the computational side of the analytical workflow. We discuss limitations and factors to consider when working with degraded DNA. We hope this review provides a framework to researchers new to computational workflows for how to think about analyzing highly degraded DNA and prompts an increase of collaboration between the forensic genetics and aDNA fields.
Affiliation(s)
- Ainash Childebayeva
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Anthropology, University of Kansas, Lawrence, KS, USA
| | - Elena I. Zavala
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Biology, University of Oregon, Eugene, OR, USA
| |
|
14
|
Arora A, Zareba W, Woosley RL, Klimentidis YC, Patel IY, Quan SF, Wendel C, Shamoun F, Guerra S, Parthasarathy S, Patel SI. Genetic QT Score and Sleep Apnea as Predictors of Sudden Cardiac Death in the UK Biobank. medRxiv 2023:2023.11.07.23298237. [PMID: 37986981 PMCID: PMC10659512 DOI: 10.1101/2023.11.07.23298237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Introduction The goal of this study was to evaluate the association between a polygenic risk score (PRS) for QT prolongation (QTc-PRS), QTc intervals and mortality in patients enrolled in the UK Biobank with and without sleep apnea. Methods The QTc-PRS was calculated using allele copy number and previously reported effect estimates for each single nucleotide polymorphism (SNP). Competing-risk regression models adjusting for age, sex, BMI, QT prolonging medication, race, and comorbid cardiovascular conditions were used for sudden cardiac death (SCD) analyses. Results 500,584 participants were evaluated (56.5 ± 8 years, 54% women, 1.4% diagnosed with sleep apnea). A higher QTc-PRS was independently associated with increased QTc interval duration (p<0.0001). The mean QTc for the top QTc-PRS quintile was 15 msec longer than the bottom quintile (p<0.001). Sleep apnea was found to be an effect modifier in the relationship between QTc-PRS and SCD. The adjusted HR per 5-unit change in QTc-PRS for SCD was 1.64 (95% CI 1.16 - 2.31, p=0.005) among those with sleep apnea and 1.04 (95% CI 0.95 - 1.14, p=0.44) among those without sleep apnea (p for interaction =0.01). Black participants with sleep apnea had significantly elevated adjusted risk of SCD compared to White participants (HR=9.6, 95% CI 1.24 - 74, p=0.03). Conclusion In the UK Biobank population, the QTc-PRS was associated with SCD among participants with sleep apnea but not among those without sleep apnea, indicating that sleep apnea is a significant modifier of the genetic risk. Black participants with sleep apnea had a particularly high risk of SCD.
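The score construction described in the Methods (allele copy number weighted by a reported per-SNP effect estimate) amounts to a weighted sum, which can be sketched in Python as below. The SNP identifiers, effect sizes, and function name are hypothetical placeholders for illustration; the abstract does not list the SNPs or weights, and the sketch ignores any scaling or quality control the authors may have applied.

```python
from typing import Mapping


def polygenic_score(dosages: Mapping[str, float], effects: Mapping[str, float]) -> float:
    """Sum of effect-allele copy number (0, 1 or 2) times per-SNP effect estimate.

    SNPs missing from either mapping are skipped rather than imputed.
    """
    shared = dosages.keys() & effects.keys()
    return sum(dosages[snp] * effects[snp] for snp in shared)


# Hypothetical effect estimates (e.g. msec of QT prolongation per allele copy).
effects = {"snp_a": 1.5, "snp_b": 0.5, "snp_c": 2.0}

# Hypothetical genotype of one participant: copies of the effect allele per SNP.
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_score(dosages, effects)  # 2*1.5 + 1*0.5 + 0*2.0
```

Skipping SNPs that are absent from either input is one possible design choice; a real pipeline would more likely impute or flag missing genotypes.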
|
15
|
Ilska J, Tolhurst D, Tumas H, Maclean JP, Cottrell J, Lee S, Mackay J, Woolliams J. Additive and non-additive genetic variance in juvenile Sitka spruce (Picea sitchensis Bong. Carr). Tree Genetics & Genomes 2023; 19:53. [PMID: 37970220 PMCID: PMC10632294 DOI: 10.1007/s11295-023-01627-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 10/07/2023] [Accepted: 10/16/2023] [Indexed: 11/17/2023]
Abstract
Many quantitative genetic models assume that all genetic variation is additive because of a lack of data with sufficient structure and quality to determine the relative contribution of additive and non-additive variation. Here the fractions of additive (fa) and non-additive (fd) genetic variation were estimated in Sitka spruce for height, bud burst and pilodyn penetration depth. Approximately 1500 offspring were produced in each of three sib families and clonally replicated across three geographically diverse sites. Genotypes from 1525 offspring from all three families were obtained by RADseq, followed by imputation using 1630 loci segregating in all families and mapped using the newly developed linkage map of Sitka spruce. The analyses employed a new approach for estimating fa and fd, which combined all available genotypic and phenotypic data with spatial modelling for each trait and site. The consensus estimate for fa increased with age for height from 0.58 at 2 years to 0.75 at 11 years, with only small overlap in 95% support intervals (I95). The estimated fa for bud burst was 0.83 (I95=[0.78, 0.90]) and 0.84 (I95=[0.77, 0.92]) for pilodyn depth. Overall, there was no evidence of family heterogeneity for height or bud burst, or site heterogeneity for pilodyn depth, and no evidence of inbreeding depression associated with genomic homozygosity, expected if dominance variance was the major component of non-additive variance. The results offer no support for the development of sublines for crossing within the species. The models give new opportunities to assess more accurately the scale of non-additive variation. Supplementary Information The online version contains supplementary material available at 10.1007/s11295-023-01627-5.
Affiliation(s)
- J.J. Ilska
- The Roslin Institute, Royal (Dick) School of Veterinary Science, University of Edinburgh, Easter Bush, Midlothian, Scotland EH25 9RG UK
- Present Address: The Kennel Club, 10 Clarges St, London, W1J 8AB UK
| | - D.J. Tolhurst
- The Roslin Institute, Royal (Dick) School of Veterinary Science, University of Edinburgh, Easter Bush, Midlothian, Scotland EH25 9RG UK
| | - H. Tumas
- Department of Biology, University of Oxford, South Parks Road, Oxford, OX1 3RB UK
- Present Address: Department of Forest and Conservation Sciences, University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4 Canada
| | - J. P. Maclean
- Forest Research, Northern Research Station, Roslin, Midlothian, EH25 9SY UK
- Present Address: Norwegian University of Life Sciences, Postboks 5003, 1432 Ås, Norway
| | - J. Cottrell
- Forest Research, Northern Research Station, Roslin, Midlothian, EH25 9SY UK
| | - S.J. Lee
- Forest Research, Northern Research Station, Roslin, Midlothian, EH25 9SY UK
| | - J. Mackay
- Department of Biology, University of Oxford, South Parks Road, Oxford, OX1 3RB UK
| | - J.A. Woolliams
- The Roslin Institute, Royal (Dick) School of Veterinary Science, University of Edinburgh, Easter Bush, Midlothian, Scotland EH25 9RG UK
| |
|
16
|
Kim J, Rosenberg NA. Record-matching of STR profiles with fragmentary genomic SNP data. Eur J Hum Genet 2023; 31:1283-1290. [PMID: 37567955 PMCID: PMC10620386 DOI: 10.1038/s41431-023-01430-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/19/2022] [Revised: 05/30/2023] [Accepted: 07/03/2023] [Indexed: 08/13/2023] Open
Abstract
In many forensic settings, identity of a DNA sample is sought from poor-quality DNA, for which the typical STR loci tabulated in forensic databases are not possible to reliably genotype. Genome-wide SNPs, however, can potentially be genotyped from such samples via next-generation sequencing, so that queries can in principle compare SNP genotypes from DNA samples of interest to STR genotype profiles that represent proposed matches. We use genetic record-matching to evaluate the possibility of testing SNP profiles obtained from poor-quality DNA samples to identify exact and relatedness matches to STR profiles. Using simulations based on whole-genome sequences, we show that in some settings, similar match accuracies to those seen with full coverage of the genome are obtained by genetic record-matching for SNP data that represent 5-10% genomic coverage. Thus, if even a fraction of random genomic SNPs can be genotyped by next-generation sequencing, then the potential may exist to test the resulting genotype profiles for matches to profiles consisting exclusively of nonoverlapping STR loci. The result has implications in relation to criminal justice, mass disasters, missing-person cases, studies of ancient DNA, and genomic privacy.
Affiliation(s)
- Jaehee Kim
- Department of Computational Biology, Cornell University, Ithaca, NY, 14853, USA
| | - Noah A Rosenberg
- Department of Biology, Stanford University, Stanford, CA, 94305, USA.
| |
|
17
|
Berry DP, Spangler ML. Animal board invited review: Practical applications of genomic information in livestock. Animal 2023; 17:100996. [PMID: 37820404 DOI: 10.1016/j.animal.2023.100996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 09/08/2023] [Accepted: 09/11/2023] [Indexed: 10/13/2023] Open
Abstract
Access to high-dimensional genomic information in many livestock species is accelerating. This has been greatly aided not only by continual reductions in genotyping costs but also an expansion in the services available that leverage genomic information to create a greater return-on-investment. Genomic information on individual animals has many uses including (1) parentage verification and discovery, (2) traceability, (3) karyotyping, (4) sex determination, (5) reporting and monitoring of mutations conferring major effects or congenital defects, (6) better estimating inbreeding of individuals and coancestry among individuals, (7) mating advice, (8) determining breed composition, (9) enabling precision management, and (10) genomic evaluations; genomic evaluations exploit genome-wide genotype information to improve the accuracy of predicting an animal's (and by extension its progeny's) genetic merit. Genomic data also provide a huge resource for research, albeit the outcome from this research, if successful, should eventually be realised through one of the ten applications already mentioned. The process for generating a genotype all the way from sample procurement to identifying erroneous genotypes is described, as are the steps that should be considered when developing a bespoke genotyping panel for practical application.
Affiliation(s)
- D P Berry
- Animal & Grassland Research and Innovation Centre, Teagasc, Moorepark, Fermoy, Cork, Ireland.
| | - M L Spangler
- Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE, United States
| |
|
18
|
Fukasaku H, Meguro A, Takeuchi M, Mizuki N, Ota M, Funakoshi K. Association of PDGFRA polymorphisms with the risk of corneal astigmatism in a Japanese population. Sci Rep 2023; 13:16075. [PMID: 37752244 PMCID: PMC10522672 DOI: 10.1038/s41598-023-43333-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 09/22/2023] [Indexed: 09/28/2023] Open
Abstract
Corneal astigmatism is reportedly associated with polymorphisms of the platelet-derived growth factor receptor alpha (PDGFRA) gene region in Asian populations of Chinese, Malay, and Indian ancestry and in populations of European ancestry. In this study, we investigated whether these PDGFRA polymorphisms are associated with corneal astigmatism in a Japanese population. We recruited 1,535 cases with corneal astigmatism (mean corneal cylinder power across both eyes: ≤ -0.75 diopters [D]) and 842 controls (> -0.75 D) to genotype 13 single-nucleotide polymorphisms (SNPs) in the PDGFRA gene region. We also performed imputation analysis in the region, with 179 imputed SNPs included in the statistical analyses. The PDGFRA SNPs were not significantly associated with corneal astigmatism of ≤ -0.75 D. However, the odds ratios (ORs) of the minor alleles of SNPs in the upstream region of PDGFRA, including rs7673984, rs4864857, and rs11133315, tended to increase with the degree of corneal astigmatism, and these SNPs were significantly associated with corneal astigmatism of ≤ -1.25 D or ≤ -1.50 D (Pc < 0.05, OR = 1.34-1.39). These results suggest that PDGFRA SNPs play a potential role in the development of greater corneal astigmatism.
Affiliation(s)
- Hideharu Fukasaku
- Department of Neuroanatomy, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
- Fukasaku Eye Institute, Yokohama, Kanagawa, 220-0003, Japan
| | - Akira Meguro
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan.
- Department of Advanced Medicine for Ocular Diseases, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan.
| | - Masaki Takeuchi
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
- Department of Advanced Medicine for Ocular Diseases, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
| | - Nobuhisa Mizuki
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
- Department of Advanced Medicine for Ocular Diseases, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
| | - Masao Ota
- Department of Advanced Medicine for Ocular Diseases, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
- Department of Medicine, Division of Hepatology and Gastroenterology, Shinshu University School of Medicine, Matsumoto, Nagano, 390-8621, Japan
| | - Kengo Funakoshi
- Department of Neuroanatomy, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa, 236-0004, Japan
| |
|
19
|
Moorjani P, Hellenthal G. Methods for Assessing Population Relationships and History Using Genomic Data. Annu Rev Genomics Hum Genet 2023; 24:305-332. [PMID: 37220313 PMCID: PMC11040641 DOI: 10.1146/annurev-genom-111422-025117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
Genetic data contain a record of our evolutionary history. The availability of large-scale datasets of human populations from various geographic areas and timescales, coupled with advances in the computational methods to analyze these data, has transformed our ability to use genetic data to learn about our evolutionary past. Here, we review some of the widely used statistical methods to explore and characterize population relationships and history using genomic data. We describe the intuition behind commonly used approaches, their interpretation, and important limitations. For illustration, we apply some of these techniques to genome-wide autosomal data from 929 individuals representing 53 worldwide populations that are part of the Human Genome Diversity Project. Finally, we discuss the new frontiers in genomic methods to learn about population history. In sum, this review highlights the power (and limitations) of DNA to infer features of human evolutionary history, complementing the knowledge gleaned from other disciplines, such as archaeology, anthropology, and linguistics.
Affiliation(s)
- Priya Moorjani
- Department of Molecular and Cell Biology and Center for Computational Biology, University of California, Berkeley, California, USA
| | - Garrett Hellenthal
- UCL Genetics Institute and Research Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
| |
|
20
|
Choi J, Kim S, Kim J, Son HY, Yoo SK, Kim CU, Park YJ, Moon S, Cha B, Jeon MC, Park K, Yun JM, Cho B, Kim N, Kim C, Kwon NJ, Park YJ, Matsuda F, Momozawa Y, Kubo M, Kim HJ, Park JH, Seo JS, Kim JI, Im SW. A whole-genome reference panel of 14,393 individuals for East Asian populations accelerates discovery of rare functional variants. Sci Adv 2023; 9:eadg6319. [PMID: 37556544 PMCID: PMC10411914 DOI: 10.1126/sciadv.adg6319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 07/06/2023] [Indexed: 08/11/2023]
Abstract
Underrepresentation of non-European (EUR) populations hinders growth of global precision medicine. Resources such as imputation reference panels that match the study population are necessary to find low-frequency variants with substantial effects. We created a reference panel consisting of 14,393 whole-genome sequences including more than 11,000 Asian individuals. Genome-wide association studies were conducted using the reference panel and a population-specific genotype array of 72,298 subjects for eight phenotypes. This panel yields improved imputation accuracy of rare and low-frequency variants within East Asian populations compared with the largest reference panel. Thirty-nine previously unidentified associations were found, and more than half of the variants were East Asian specific. We discovered genes with rare protein-altering variants, including LTBP1 for height and GPR75 for body mass index, as well as putative regulatory mechanisms for rare noncoding variants with cell type-specific effects. We suggest that this dataset will add to the potential value of Asian precision medicine.
Affiliation(s)
- Jaeyong Choi
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
| | | | - Juhyun Kim
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Ho-Young Son
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
| | - Seong-Keun Yoo
- The Marc and Jennifer Lipschultz Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Young Jun Park
- Department of Translational Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Sungji Moon
- Interdisciplinary Program in Cancer Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
| | - Bukyoung Cha
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
| | - Min Chul Jeon
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Kyunghyuk Park
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
| | - Jae Moon Yun
- Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Belong Cho
- Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Family Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
| | | | | | | | - Young Joo Park
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Molecular Medicine and Biopharmaceutical Sciences, Graduate School of Convergence Science and Technology, Seoul National University, Seoul, Republic of Korea
| | - Fumihiko Matsuda
- Center for Genomic Medicine, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | | | - Michiaki Kubo
- RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | | | - Hyun-Jin Kim
- National Cancer Control Institute, National Cancer Center, Goyang, Republic of Korea
| | - Jin-Ho Park
- Department of Family Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Family Medicine, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Jeong-Sun Seo
- Macrogen Inc., Seoul, Republic of Korea
- Asian Genome Center, Seoul National University Bundang Hospital, Gyeonggi, Republic of Korea
| | - Jong-Il Kim
- Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Republic of Korea
- Genomic Medicine Institute, Medical Research Center, Seoul National University, Seoul, Republic of Korea
- Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
- Department of Biochemistry and Molecular Biology, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Sun-Wha Im
- Department of Biochemistry and Molecular Biology, Kangwon National University School of Medicine, Gangwon, Republic of Korea
| |
|
21
|
Hassanin E, Maj C, Klinkhammer H, Krawitz P, May P, Bobbili DR. Assessing the performance of European-derived cardiometabolic polygenic risk scores in South-Asians and their interplay with family history. BMC Med Genomics 2023; 16:164. [PMID: 37438803 PMCID: PMC10339617 DOI: 10.1186/s12920-023-01598-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/01/2023] [Indexed: 07/14/2023] Open
Abstract
BACKGROUND & AIMS: We aimed to assess the performance of European-derived polygenic risk scores (PRSs) for common metabolic diseases such as coronary artery disease (CAD), obesity, and type 2 diabetes (T2D) in South Asian (SAS) individuals in the UK Biobank (UKB). Additionally, we studied the interaction between PRS and family history (FH) in the same population. METHODS: To calculate the PRS, we used a previously published model derived from the European (EUR) population and applied it to individuals of SAS ancestry from the UKB study. Each PRS was adjusted according to an individual's genotype location in principal component (PC) space to derive an ancestry-adjusted PRS (aPRS). We calculated percentiles based on aPRS and stratified individuals into three aPRS categories: low, intermediate, and high. Taking the intermediate-aPRS category as the reference, we compared the low and high aPRS categories against it and generated odds ratio (OR) estimates. Further, we measured the combined role of aPRS and first-degree FH in the SAS population. RESULTS: The risk of developing severe obesity was almost twofold higher for SAS individuals with high aPRS than for those with intermediate aPRS, with an OR of 1.95 (95% CI = 1.71-2.23, P < 0.01). At the same time, the risk of severe obesity was lower in the low-aPRS group (OR = 0.60, CI = 0.53-0.67, P < 0.01). Results in the same direction were found in the EUR data, where the low-PRS group had an OR of 0.53 (95% CI = 0.51-0.56, P < 0.01) and the high-PRS group had an OR of 2.06 (95% CI = 2.00-2.12, P < 0.01). We observed similar results for CAD and T2D. Further, we show that SAS individuals with a family history of CAD and T2D who also have a high aPRS are at higher risk of these diseases, implying a greater genetic predisposition.
CONCLUSION: Our findings suggest that CAD, obesity, and T2D GWAS summary statistics generated predominantly from the EUR population can potentially be used to derive aPRS in SAS individuals for risk stratification. With future GWAS recruiting more SAS participants and tailoring the PRSs towards SAS ancestry, the predictive power of PRS is likely to improve further.
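The stratify-then-compare step described in the abstract above can be sketched as follows. This is only an illustration, not the authors' pipeline: the 20th/80th percentile cut-offs, the function names `stratify` and `odds_ratio`, and the unadjusted 2x2 Wald interval are all assumptions made here (the abstract does not state its cut-offs, and the published ORs may well come from covariate-adjusted models).

```python
import math

def stratify(scores, low_pct=20, high_pct=80):
    """Label each score 'low', 'intermediate' or 'high' by percentile.

    The 20/80 cut-offs are illustrative assumptions, not values from the paper.
    """
    ordered = sorted(scores)
    n = len(ordered)
    low_threshold = ordered[n * low_pct // 100]
    high_threshold = ordered[min(n * high_pct // 100, n - 1)]
    labels = []
    for s in scores:
        if s < low_threshold:
            labels.append("low")
        elif s >= high_threshold:
            labels.append("high")
        else:
            labels.append("intermediate")
    return labels

def odds_ratio(group_cases, group_controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio of a group versus the reference group.

    Returns (OR, lower, upper) using a Wald interval on the log-odds scale.
    All four counts must be positive.
    """
    or_value = (group_cases * ref_controls) / (group_controls * ref_cases)
    se_log_or = math.sqrt(
        1 / group_cases + 1 / group_controls + 1 / ref_cases + 1 / ref_controls
    )
    lower = math.exp(math.log(or_value) - z * se_log_or)
    upper = math.exp(math.log(or_value) + z * se_log_or)
    return or_value, lower, upper

# Toy data: ten hypothetical adjusted scores and made-up case/control counts.
labels = stratify(list(range(10)))
high_vs_intermediate = odds_ratio(20, 80, 10, 90)
```

With the intermediate category as the reference, `odds_ratio` would be called once for the high category and once for the low category, mirroring the two comparisons reported in the abstract.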
Affiliation(s)
- Emadeldin Hassanin
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, Belvaux, L-4367, Luxembourg
- Institute for Genomic Statistics and Bioinformatics, University of Bonn, Bonn, Germany
| | - Carlo Maj
- Centre for Human Genetics, University of Marburg, Marburg, Germany
| | - Hannah Klinkhammer
- Institute for Genomic Statistics and Bioinformatics, University of Bonn, Bonn, Germany
- Medical Faculty, Institute for Medical Biometry, Informatics and Epidemiology, University Bonn, Bonn, Germany
| | - Peter Krawitz
- Institute for Genomic Statistics and Bioinformatics, University of Bonn, Bonn, Germany
| | - Patrick May
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, Belvaux, L-4367, Luxembourg
| | - Dheeraj Reddy Bobbili
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, Belvaux, L-4367, Luxembourg.
- Wellytics Technologies Pvt Ltd, Hyderabad, India.
| |
|
22
|
Otomo N, Khanshour AM, Koido M, Takeda K, Momozawa Y, Kubo M, Kamatani Y, Herring JA, Ogura Y, Takahashi Y, Minami S, Uno K, Kawakami N, Ito M, Sato T, Watanabe K, Kaito T, Yanagida H, Taneichi H, Harimaya K, Taniguchi Y, Shigematsu H, Iida T, Demura S, Sugawara R, Fujita N, Yagi M, Okada E, Hosogane N, Kono K, Nakamura M, Chiba K, Kotani T, Sakuma T, Akazawa T, Suzuki T, Nishida K, Kakutani K, Tsuji T, Sudo H, Iwata A, Inami S, Wise CA, Kochi Y, Matsumoto M, Ikegawa S, Watanabe K, Terao C. Evidence of causality of low body mass index on risk of adolescent idiopathic scoliosis: a Mendelian randomization study. Front Endocrinol (Lausanne) 2023; 14:1089414. [PMID: 37415668 PMCID: PMC10319580 DOI: 10.3389/fendo.2023.1089414] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/04/2022] [Accepted: 05/17/2023] [Indexed: 07/08/2023] Open
Abstract
Introduction: Adolescent idiopathic scoliosis (AIS) is a three-dimensional spinal deformity and a common disease affecting 1-5% of adolescents. AIS is also known to be a complex disease involving environmental and genetic factors. A relation between AIS and body mass index (BMI) has been suggested epidemiologically and genetically. However, the causal relationship between AIS and BMI remains to be elucidated. Material and methods: Mendelian randomization (MR) analysis was performed using summary statistics from genome-wide association studies (GWASs) of AIS (Japanese cohort: 5,327 cases, 73,884 controls; US cohort: 1,468 cases, 20,158 controls) and BMI (Biobank Japan: 173,430 individuals; meta-analysis of Genetic Investigation of Anthropometric Traits and UK Biobank: 806,334 individuals; European Children cohort: 39,620 individuals; Population Architecture using Genomics and Epidemiology: 49,335 individuals). In MR analyses evaluating the effect of BMI on AIS, the association between BMI and AIS summary statistics was evaluated in the Japanese data using the inverse-variance weighted (IVW), weighted median, and Egger regression (MR-Egger) methods. Results: A significant causal effect of genetically decreased BMI on risk of AIS was estimated by the IVW method (estimate (beta) [SE] = -0.56 [0.16], p = 1.8 × 10⁻³), the weighted median method (beta = -0.56 [0.18], p = 8.5 × 10⁻³), and the MR-Egger method (beta = -1.50 [0.43], p = 4.7 × 10⁻³). Consistent results were also observed with all three MR methods when using the US AIS summary statistics; however, no significant causality was observed when evaluating the effect of AIS on BMI. Conclusions: Our Mendelian randomization analysis using summary statistics from large GWASs of AIS and BMI revealed that genetic variants contributing to low BMI have a causal effect on the onset of AIS. This result is consistent with those of epidemiological studies and may contribute to the early detection of AIS.
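For readers unfamiliar with the inverse-variance weighted (IVW) method named in the abstract above, the following is a minimal sketch of the standard fixed-effect IVW estimator computed from per-variant summary statistics. It is not the authors' code: the function name `ivw_estimate` and the toy effect sizes are hypothetical, and the sketch ignores uncertainty in the exposure effects and any random-effects correction.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    Each variant contributes the ratio beta_outcome / beta_exposure, weighted by
    beta_exposure**2 / se_outcome**2 (the inverse variance of that ratio when the
    exposure effect is treated as known). Returns (estimate, standard_error).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error

# Hypothetical toy input: three variants whose outcome effect is exactly
# -0.5 times their exposure effect.
bmi_effects = [0.1, 0.2, 0.3]
ais_effects = [-0.05, -0.10, -0.15]
ais_standard_errors = [0.01, 0.01, 0.01]
estimate, standard_error = ivw_estimate(bmi_effects, ais_effects, ais_standard_errors)
```

A negative estimate, as in this toy input, corresponds to the direction reported in the abstract: variants that raise BMI lower the risk of the outcome. The weighted median and MR-Egger methods use the same per-variant ratios but relax different assumptions about invalid instruments.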
Affiliation(s)
- Nao Otomo
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Laboratory for Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Anas M. Khanshour
- Center for Translational Research, Scottish Rite for Children, Dallas, TX, United States
| | - Masaru Koido
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Division of Molecular Pathology, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
| | - Kazuki Takeda
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Yukihide Momozawa
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Michiaki Kubo
- Laboratory for Genotyping Development, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Yoichiro Kamatani
- Division of Molecular Pathology, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Laboratory of Complex Trait Genomics, Graduate School of Frontier Science, The University of Tokyo, Tokyo, Japan
| | - John A. Herring
- Department of Orthopaedic Surgery, Scottish Rite for Children, Dallas, TX, United States
- Department of Orthopaedic Surgery and Pediatric, University of Texas Southwestern Medical Center, Dallas, TX, United States
| | - Yoji Ogura
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Yohei Takahashi
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Shohei Minami
- Department of Orthopaedic Surgery, Seirei Sakura Citizen Hospital, Sakura, Japan
| | - Koki Uno
- Department of Orthopaedic Surgery, National Hospital Organization, Kobe Medical Center, Kobe, Japan
| | - Noriaki Kawakami
- Department of Orthopaedic Surgery, Meijo Hospital, Nagoya, Japan
| | - Manabu Ito
- Department of Orthopaedic Surgery, National Hospital Organization, Hokkaido Medical Center, Sapporo, Japan
| | - Tatsuya Sato
- Department of Orthopaedic Surgery, Juntendo University School of Medicine, Tokyo, Japan
| | - Kei Watanabe
- Department of Orthopaedic Surgery, Niigata University Medical and Dental General Hospital, Niigata, Japan
| | - Takashi Kaito
- Department of Orthopaedic Surgery, Osaka University Graduate School of Medicine, Suita, Japan
| | - Haruhisa Yanagida
- Department of Orthopaedic and Spine Surgery, Fukuoka Children’s Hospital, Fukuoka, Japan
| | - Hiroshi Taneichi
- Department of Orthopaedic Surgery, Dokkyo Medical University School of Medicine, Mibu, Japan
| | - Katsumi Harimaya
- Department of Orthopaedic Surgery, Kyushu University Beppu Hospital, Beppu, Japan
| | - Yuki Taniguchi
- Department of Orthopaedic Surgery, Faculty of Medicine, The University of Tokyo, Tokyo, Japan
| | - Hideki Shigematsu
- Department of Orthopaedic Surgery, Nara Medical University, Kashihara, Japan
| | - Takahiro Iida
- Department of Orthopaedic Surgery, Dokkyo Medical University Saitama Medical Center, Koshigaya, Japan
- Department of Orthopaedic Surgery, Teine Keijinkai Hospital, Sapporo, Japan
| | - Satoru Demura
- Department of Orthopaedic Surgery, Graduate School of Medical Science, Kanazawa University, Kanazawa, Japan
| | - Ryo Sugawara
- Department of Orthopaedic Surgery, Jichi Medical University, Shimotsuke, Japan
| | - Nobuyuki Fujita
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Department of Orthopaedic Surgery, Fujita Health University, Toyoake, Japan
| | - Mitsuru Yagi
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Department of Orthopaedic Surgery, International University of Health and Welfare School of Medicine, Narita, Japan
| | - Eijiro Okada
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Naobumi Hosogane
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Department of Orthopaedic Surgery, National Defense Medical College, Tokorozawa, Japan
| | - Katsuki Kono
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Department of Orthopaedic Surgery, Kono Orthopaedic Clinic, Tokyo, Japan
| | - Masaya Nakamura
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Kazuhiro Chiba
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
- Department of Orthopaedic Surgery, Fujita Health University, Toyoake, Japan
| | - Toshiaki Kotani
- Department of Orthopaedic Surgery, Seirei Sakura Citizen Hospital, Sakura, Japan
| | - Tsuyoshi Sakuma
- Department of Orthopaedic Surgery, Seirei Sakura Citizen Hospital, Sakura, Japan
| | - Tsutomu Akazawa
- Department of Orthopaedic Surgery, Seirei Sakura Citizen Hospital, Sakura, Japan
| | - Teppei Suzuki
- Department of Orthopaedic Surgery, National Hospital Organization, Kobe Medical Center, Kobe, Japan
| | - Kotaro Nishida
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Kenichiro Kakutani
- Department of Orthopaedic Surgery, Kobe University Graduate School of Medicine, Kobe, Japan
| | - Taichi Tsuji
- Department of Orthopaedic Surgery, Meijo Hospital, Nagoya, Japan
| | - Hideki Sudo
- Department of Advanced Medicine for Spine and Spinal Cord Disorders, Hokkaido University Graduate School of Medicine, Sapporo, Japan
| | - Akira Iwata
- Department of Preventive and Therapeutic Research for Metastatic Bone Tumor, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, Sapporo, Japan
| | - Satoshi Inami
- Department of Orthopaedic Surgery, Dokkyo Medical University School of Medicine, Mibu, Japan
| | - Carol A. Wise
- Center for Translational Research, Scottish Rite for Children, Dallas, TX, United States
- Department of Orthopaedic Surgery and Pediatric, University of Texas Southwestern Medical Center, Dallas, TX, United States
- McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, TX, United States
| | - Yuta Kochi
- Department of Genomic Function and Diversity, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
| | - Morio Matsumoto
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Shiro Ikegawa
- Laboratory for Bone and Joint Diseases, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan
| | - Kota Watanabe
- Department of Orthopaedic Surgery, Keio University School of Medicine, Tokyo, Japan
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, RIKEN, Yokohama, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| |
|
23
|
Nannini DR, Zheng Y, Joyce BT, Kim K, Gao T, Wang J, Jacobs DR, Schreiner PJ, Yaffe K, Greenland P, Lloyd-Jones DM, Hou L. Genome-wide DNA methylation association study of recent and cumulative marijuana use in middle aged adults. Mol Psychiatry 2023; 28:2572-2582. [PMID: 37258616 PMCID: PMC10611566 DOI: 10.1038/s41380-023-02106-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/03/2022] [Revised: 04/24/2023] [Accepted: 05/03/2023] [Indexed: 06/02/2023]
Abstract
Marijuana is a widely used psychoactive substance in the US, and medical and recreational legalization has risen over the past decade. Despite the growing number of individuals using marijuana, studies investigating the association between epigenetic factors and recent and cumulative marijuana use remain limited. We therefore investigated the association between recent and cumulative marijuana use and DNA methylation levels. Participants from the Coronary Artery Risk Development in Young Adults Study with whole blood collected at examination years (Y) 15 and Y20 were randomly selected to undergo DNA methylation profiling at both timepoints using the Illumina MethylationEPIC BeadChip. Recent use of marijuana was queried at each examination and used to estimate cumulative marijuana use from Y0 to Y15 and Y20. At Y15 (n = 1023), we observed 22 and 31 methylation markers associated (FDR P ≤ 0.05) with recent and cumulative marijuana use, respectively, and at Y20 (n = 883), 132 and 16 methylation markers. We replicated 8 previously reported methylation markers associated with marijuana use. We further identified 640 cis-meQTLs and 198 DMRs associated with recent and cumulative use at Y15 and Y20. Differentially methylated genes were statistically overrepresented in pathways relating to cellular proliferation, hormone signaling, and infections as well as schizophrenia, bipolar disorder, and substance-related disorders. We identified numerous methylation markers, pathways, and diseases associated with recent and cumulative marijuana use in middle-aged adults, providing additional insight into the association between marijuana use and the epigenome. These results provide novel insights into the effect marijuana has on the epigenome and related health conditions.
Affiliation(s)
- Drew R Nannini
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
| | - Yinan Zheng
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Brian T Joyce
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Kyeezu Kim
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Tao Gao
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Jun Wang
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - David R Jacobs
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Pamela J Schreiner
- Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Kristine Yaffe
- University of California at San Francisco School of Medicine, San Francisco, CA, USA
| | - Philip Greenland
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Donald M Lloyd-Jones
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Lifang Hou
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| |
|
24
|
Stephens M. The Bayesian lens and Bayesian blinkers. Philos Trans A Math Phys Eng Sci 2023; 381:20220144. [PMID: 36970830 PMCID: PMC10041352 DOI: 10.1098/rsta.2022.0144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 12/15/2022] [Indexed: 06/18/2023]
Abstract
I discuss the benefits of looking through the 'Bayesian lens' (seeking a Bayesian interpretation of ostensibly non-Bayesian methods), and the dangers of wearing 'Bayesian blinkers' (eschewing non-Bayesian methods as a matter of philosophical principle). I hope that the ideas may be useful to scientists trying to understand widely used statistical methods (including confidence intervals and p-values), as well as teachers of statistics and practitioners who wish to avoid the mistake of overemphasizing philosophy at the expense of practical matters. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.
Affiliation(s)
- Matthew Stephens
- Department of Statistics and Department of Human Genetics, University of Chicago, Chicago, IL, USA
| |
|
25
|
Zhang Z, Li H, Weng H, Zhou G, Chen H, Yang G, Zhang P, Zhang X, Ji Y, Ying K, Liu B, Xu Q, Tang Y, Zhu G, Liu Z, Xia S, Yang X, Dong L, Zhu L, Zeng M, Yuan Y, Yang Y, Zhang N, Xu X, Pang W, Zhang M, Zhang Y, Zhen K, Wang D, Lei J, Wu S, Shu S, Zhang Y, Zhang S, Gao Q, Huang Q, Deng C, Fu X, Chen G, Duan W, Wan J, Xie W, Zhang P, Wang S, Yang P, Zuo X, Zhai Z, Wang C. Genome-wide association analyses identified novel susceptibility loci for pulmonary embolism among Han Chinese population. BMC Med 2023; 21:153. [PMID: 37076872 PMCID: PMC10116678 DOI: 10.1186/s12916-023-02844-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 03/22/2023] [Indexed: 04/21/2023] Open
Abstract
BACKGROUND: A large proportion of pulmonary embolism (PE) heritability remains unexplained, particularly among the East Asian (EAS) population. Our study aims to expand the genetic architecture of PE and reveal more genetic determinants in Han Chinese. METHODS: We conducted the first genome-wide association study (GWAS) of PE in Han Chinese, then performed a GWAS meta-analysis based on the discovery and replication stages. To validate the effect of the risk allele, qPCR and Western blotting experiments were used to investigate possible changes in gene expression. Mendelian randomization (MR) analysis was employed to implicate pathogenic mechanisms, and a polygenic risk score (PRS) for PE risk prediction was generated. RESULTS: After meta-analysis of the discovery dataset (622 cases, 8853 controls) and replication dataset (646 cases, 8810 controls), GWAS identified 3 independent loci associated with PE, including the reported loci FGG rs2066865 (p-value = 3.81 × 10⁻¹⁴) and ABO rs582094 (p-value = 1.16 × 10⁻¹⁰) and the newly reported locus FABP2 rs1799883 (p-value = 7.59 × 10⁻¹⁷). Ten previously reported variants were successfully replicated in our cohort. Functional experiments confirmed that FABP2-A163G (rs1799883) promoted the transcription and protein expression of FABP2. Meanwhile, MR analysis revealed that high LDL-C and TC levels were associated with an increased risk of PE. Individuals in the top 10% of PRS had over a fivefold increased risk for PE compared to the general population. CONCLUSIONS: We identified FABP2, which is related to the transport of long-chain fatty acids, as contributing to the risk of PE and provided more evidence for the essential role of metabolic pathways in PE development.
Affiliation(s)
- Zhu Zhang
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Haobo Li
- China-Japan Friendship Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College; Department of Pulmonary and Critical Care Medicine, Center of Respiratory Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Haoyi Weng
- Shenzhen WeGene Clinical Laboratory; WeGene, Shenzhen Zaozhidao Technology Co. Ltd; Hunan Provincial Key Lab On Bioinformatics, School of Computer Science and Engineering, Central South University, Shenzhen, 518042, China
| | - Geyu Zhou
- Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Hong Chen
- Department of Pulmonary and Critical Care Medicine, the First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China
| | - Guoru Yang
- Department of Pulmonary and Critical Care Medicine, Weifang No.2 People's Hospital, Weifang, 261021, China
| | - Ping Zhang
- Department of Pulmonary and Critical Care Medicine, Dongguan People's Hospital, Dongguan, 523059, China
| | - Xiangyan Zhang
- Department of Pulmonary and Critical Care Medicine, Guizhou Provincial People's Hospital, Guiyang, 550002, China
| | - Yingqun Ji
- Department of Pulmonary and Critical Care Medicine, Shanghai East Hospital Affiliated by Tongji University, Shanghai, 200120, China
| | - Kejing Ying
- Department of Respiratory Medicine, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, 310020, China
| | - Bo Liu
- Department of Pulmonary and Critical Care Medicine, Department of Clinical Microbiology, Zibo City Key Laboratory of Respiratory Infection and Clinical Microbiology, Linzi District People's Hospital, Zibo, 255400, China
| | - Qixia Xu
- Department of Pulmonary and Critical Care Medicine, the First Affiliated Hospital of University of Science and Technology of China, Hefei, 230001, China
| | - Yongjun Tang
- Department of Pulmonary and Critical Care Medicine, Xiangya Hospital Central South University, Changsha, 410008, China
| | - Guangfa Zhu
- Department of Pulmonary and Critical Care, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
| | - Zhihong Liu
- Fuwai Hospital, Chinese Academy of Medical Science; National Center for Cardiovascular Diseases, Beijing, 100037, China
| | - Shuyue Xia
- Department of Pulmonary and Critical Care Medicine, Central Hospital Affiliated to Shenyang Medical College, Shenyang, 110001, China
| | - Xiaohong Yang
- Department of Pulmonary and Critical Care Medicine, People's Hospital of Xinjiang Uygur Autonomous Region, Xinjiang, 830001, China
| | - Lixia Dong
- Department of Pulmonary and Critical Care Medicine, Tianjin Medical University General Hospital, Tianjin, 300050, China
| | - Ling Zhu
- Department of Pulmonary and Critical Care Medicine, Shandong Provincial Hospital, Jinan, 250021, China
| | - Mian Zeng
- Department of Medical Intensive Care Unit, The First Affiliated Hospital, Sun Yat-Sen University, Guangzhou, 510080, China
| | - Yadong Yuan
- Department of Pulmonary and Critical Care Medicine, The Second Hospital of Hebei Medical University, Shijiazhuang, 050004, China
| | - Yuanhua Yang
- Department of Pulmonary and Critical Care Medicine, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, 100026, China
| | - Nuofu Zhang
- State Key Laboratory of Respiratory Disease and National Clinical Research Center for Respiratory Disease, the First Affiliated Hospital of Guangzhou Medical University, Guangzhou Medical University, Guangzhou, 510230, China
| | - Xiaomao Xu
- Department of Pulmonary and Critical Care Medicine, Beijing Hospital, Beijing, 100080, China
| | - Wenyi Pang
- Department of Pulmonary and Critical Care Medicine, Beijing Jishuitan Hospital, Beijing, 100035, China
| | - Meng Zhang
- Department of Pulmonary and Critical Care, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
| | - Yu Zhang
- China-Japan Friendship Hospital, Capital Medical University; Department of Pulmonary and Critical Care Medicine, Center of Respiratory Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Kaiyuan Zhen
- Institute of Clinical Medical Sciences, China-Japan Friendship Hospital; Department of Pulmonary and Critical Care Medicine, Center of Respiratory Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Dingyi Wang
- Institute of Clinical Medical Sciences, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, China, 100029
| | - Jieping Lei
- Institute of Clinical Medical Sciences, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, China, 100029
| | - Sinan Wu
- Institute of Clinical Medical Sciences, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, China, 100029
| | - Shi Shu
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Yunxia Zhang
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Shuai Zhang
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Qian Gao
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Qiang Huang
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Chao Deng
- Department of Bioinformatics and Biostatistics, SJTU-Yale Joint Center for Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
| | - Xi Fu
- Shenzhen WeGene Clinical Laboratory; WeGene, Shenzhen Zaozhidao Technology Co. Ltd; Hunan Provincial Key Lab On Bioinformatics, School of Computer Science and Engineering, Central South University, Shenzhen, 518042, China
| | - Gang Chen
- Shenzhen WeGene Clinical Laboratory; WeGene, Shenzhen Zaozhidao Technology Co. Ltd; Hunan Provincial Key Lab On Bioinformatics, School of Computer Science and Engineering, Central South University, Shenzhen, 518042, China
| | - Wenxin Duan
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, 100005, China
| | - Jun Wan
- Department of Pulmonary and Critical Care, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
| | - Wanmu Xie
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China
| | - Peng Zhang
- Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
| | - Shengfeng Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, 100191, China
| | - Peiran Yang
- Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, 100005, China
| | - Xianbo Zuo
- Department of Dermatology, China-Japan Friendship Hospital, Beijing, China; Department of Pharmacy, China-Japan Friendship Hospital, No. 2, East Yinghua Road, Chaoyang District, Beijing, 100029, China.
| | - Zhenguo Zhai
- Department of Pulmonary and Critical Care Medicine, China-Japan Friendship Hospital; National Center for Respiratory Medicine; Institute of Respiratory Medicine, Chinese Academy of Medical Sciences; National Clinical Research Center for Respiratory Diseases, Beijing, 100029, China.
| | - Chen Wang
- Department of Pulmonary and Critical Care Medicine, Center of Respiratory Medicine, China-Japan Friendship Hospital, Beijing, China.
- National Center for Respiratory Medicine, Beijing, China.
- Institute of Respiratory Medicine, Chinese Academy of Medical Sciences, Beijing, China.
- National Clinical Research Center for Respiratory Diseases, Beijing, China.
- Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, China.
- Department of Respiratory Medicine, Capital Medical University, Beijing, China.
| |
|
26
|
Appadurai V, Bybjerg-Grauholm J, Krebs MD, Rosengren A, Buil A, Ingason A, Mors O, Børglum AD, Hougaard DM, Nordentoft M, Mortensen PB, Delaneau O, Werge T, Schork AJ. Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks. Commun Biol 2023; 6:101. [PMID: 36697501 PMCID: PMC9876938 DOI: 10.1038/s42003-023-04477-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/26/2022] [Accepted: 01/12/2023] [Indexed: 01/27/2023] Open
Abstract
Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial, and its impact on haplotype estimation (phasing) and whole genome imputation, both necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals genotyped in two stages on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance the accuracy and replicability of complex trait analyses in complex biobanks.
Affiliation(s)
- Vivek Appadurai
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Jonas Bybjerg-Grauholm
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Danish Center for Neonatal Screening, Statens Serum Institut, Copenhagen, Denmark
| | - Morten Dybdahl Krebs
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Anders Rosengren
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Alfonso Buil
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Andrés Ingason
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Ole Mors
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Psychosis Research Unit, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
| | - Anders D. Børglum
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Department of Biomedicine and Center for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark; Center for Genomics and Personalized Medicine, CGPM, Aarhus University, Aarhus, Denmark
| | - David M. Hougaard
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Danish Center for Neonatal Screening, Statens Serum Institut, Copenhagen, Denmark
| | - Merete Nordentoft
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Mental Health Services in the Capital Region of Denmark, Copenhagen, Denmark; Department of Clinical Medicine, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Preben B. Mortensen
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; NCRR - National Center for Register-Based Research, Business and Social Sciences, Aarhus University, Aarhus, Denmark; CIRRAU - Centre for Integrated Register-Based Research, Aarhus University, Aarhus, Denmark
| | - Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
| | - Thomas Werge
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
| | - Andrew J. Schork
- Institute of Biological Psychiatry, Mental Health Center Sankt Hans, Roskilde, 4000 Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; The Translational Genomics Research Institute, Phoenix, AZ, USA
| |
|
27
|
The association between a genetic variant in the SULF2 gene, metabolic parameters and vascular disease in patients at high cardiovascular risk. Cardiovasc Endocrinol Metab 2023; 12:e0278. [PMID: 36699192 PMCID: PMC9870215 DOI: 10.1097/xce.0000000000000278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Accepted: 11/18/2022] [Indexed: 01/27/2023]
Abstract
Clearance of triglyceride-rich lipoproteins (TRLs) is mediated by several receptors, including heparan sulfate proteoglycans (HSPGs). Sulfate glucosamine-6-O-endosulfatase-2 is a gene related to the regulation of HSPG. A variant in this gene, rs2281279, has been shown to be associated with triglycerides and insulin resistance. Objective To determine the relationship between rs2281279, metabolic parameters and vascular events, and type 2 diabetes mellitus (T2DM) in patients at high cardiovascular risk and whether APOE genotype modifies this relationship. Methods Patients (n = 4386) at high cardiovascular risk from the Utrecht Cardiovascular Cohort-Second Manifestations of Arterial Disease study were stratified according to their imputed rs2281279 genotype: AA (n = 2438), AG (n = 1642) and GG (n = 306). Effects of rs2281279 on metabolic parameters, vascular events and T2DM were analyzed with linear regression and Cox models. Results There was no relationship between imputed rs2281279 genotype and triglycerides, non-high-density lipoprotein (HDL)-cholesterol, insulin and quantitative insulin sensitivity check index. During a median follow-up of 11.8 (IQR, 9.3-15.5) years, 1026 cardiovascular events and 320 limb events occurred. The presence of the G allele in rs2281279 did not affect the risk of vascular events [hazard ratio (HR), 1.03; 95% confidence interval (CI), 0.94-1.14] or limb events (HR, 0.92; 95% CI, 0.77-1.10). The presence of the G allele in rs2281279 did not affect the risk of T2DM (HR, 1.09; 95% CI, 0.94-1.27). The presence of the minor G allele of rs2281279 was associated with a beneficial risk profile in ε2ε2 patients, but not in ε3ε3 patients. Conclusions Imputed rs2281279 genotype is not associated with metabolic parameters and does not increase the risk of vascular events or T2DM in patients at high risk for cardiovascular disease.
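The genotype counts reported in this abstract (AA = 2438, AG = 1642, GG = 306) are enough to work out the G allele frequency by simple allele counting. The sketch below does that arithmetic and also derives the genotype counts expected under Hardy-Weinberg proportions; the Hardy-Weinberg comparison is an illustrative assumption and is not an analysis reported in the abstract, and the function name is hypothetical.

```python
def allele_summary(n_aa: int, n_ag: int, n_gg: int):
    """Return sample size, G allele frequency and Hardy-Weinberg expected counts."""
    n = n_aa + n_ag + n_gg
    # Each AG carrier contributes one G allele, each GG carrier two.
    q = (n_ag + 2 * n_gg) / (2 * n)
    p = 1.0 - q
    expected = {"AA": p * p * n, "AG": 2 * p * q * n, "GG": q * q * n}
    return n, q, expected


# Genotype counts as stated in the abstract.
n, g_freq, expected = allele_summary(2438, 1642, 306)
print(n)                  # 4386, matching the reported cohort size
print(round(g_freq, 3))   # 0.257
print({genotype: round(count, 1) for genotype, count in expected.items()})
```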
|
28
|
Baldrighi GN, Nova A, Bernardinelli L, Fazia T. A Pipeline for Phasing and Genotype Imputation on Mixed Human Data (Parents-Offspring Trios and Unrelated Subjects) by Reviewing Current Methods and Software. Life (Basel, Switzerland) 2022; 12:life12122030. [PMID: 36556394 PMCID: PMC9781110 DOI: 10.3390/life12122030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022] [Indexed: 12/09/2022]
Abstract
Genotype imputation has become an essential prerequisite when performing association analysis. It is a computational technique that allows us to infer genetic markers that have not been directly genotyped, thereby increasing statistical power in subsequent association studies, which consequently has a crucial impact on the identification of causal variants. Many features need to be considered when choosing the proper algorithm for imputation, including the target sample on which it is performed, i.e., related individuals, unrelated individuals, or both. Problems could arise when dealing with a target sample made up of mixed data, composed of both related and unrelated individuals, especially since the scientific literature on this topic is not sufficiently clear. To shed light on this issue, we examined existing algorithms and software for performing phasing and imputation on mixed human data from SNP arrays, specifically when related subjects belong to trios. By discussing the advantages and limitations of the current algorithms, we identified LD-based methods as being the most suitable for reconstruction of haplotypes in this specific context, and we proposed a feasible pipeline that can be used for imputing genotypes in both phased and unphased human data.
|
29
|
Muneeb M, Feng S, Henschel A. Transfer learning for genotype-phenotype prediction using deep learning models. BMC Bioinformatics 2022; 23:511. [PMID: 36447153 PMCID: PMC9710151 DOI: 10.1186/s12859-022-05036-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Accepted: 11/05/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND For some understudied populations, genotype data is minimal for genotype-phenotype prediction. However, we can use the data of some other large populations to learn about the disease-causing SNPs and use that knowledge for the genotype-phenotype prediction of small populations. This manuscript illustrates that transfer learning is applicable to genotype data and genotype-phenotype prediction. RESULTS Using HAPGEN2 and PhenotypeSimulator, we generated eight phenotypes for 500 cases/500 controls (CEU, large population) and 100 cases/100 controls (YRI, small population). We considered 5 (4 phenotypes) and 10 (4 phenotypes) different risk SNPs for each phenotype to evaluate the proposed method. The improvement in accuracy with transfer learning for the eight phenotypes was between 2 and 14.2 percent. The two-tailed p-value between the classification accuracies for all phenotypes without transfer learning and with transfer learning was 0.0306 for the five-risk-SNP phenotypes and 0.0478 for the ten-risk-SNP phenotypes. CONCLUSION The proposed pipeline is used to transfer knowledge for the case/control classification of the small population. In addition, we argue that this method can also be used in the realm of endangered species and personalized medicine. If the large population data is extensive compared to the small population data, expect transfer learning results to improve significantly. We show that transfer learning is capable of creating powerful models for genotype-phenotype prediction in large, well-studied populations and of fine-tuning these models to populations where data is sparse.
Affiliation(s)
- Muhammad Muneeb
- Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Al Saada St - Zone 1, Abu Dhabi, United Arab Emirates
| | - Samuel Feng
- Department of Science and Engineering, Sorbonne University Abu Dhabi, PO Box 38044, Abu Dhabi, United Arab Emirates
| | - Andreas Henschel
- Department of Electrical Engineering and Computer Science, Khalifa University of Science and Technology, Al Saada St - Zone 1, Abu Dhabi, United Arab Emirates
| |
|
30
|
Ausmees K, Nettelblad C. Achieving improved accuracy for imputation of ancient DNA. Bioinformatics 2022; 39:6827812. [PMID: 36377787 PMCID: PMC9805568 DOI: 10.1093/bioinformatics/btac738] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/02/2022] [Revised: 11/07/2022] [Accepted: 11/14/2022] [Indexed: 11/16/2022] Open
Abstract
MOTIVATION Genotype imputation has the potential to increase the amount of information that can be gained from the often limited biological material available in ancient samples. As many widely used tools have been developed with modern data in mind, their design is not necessarily reflective of the requirements in studies of ancient DNA. Here, we investigate if an imputation method based on the full probabilistic Li and Stephens model of haplotype frequencies might be beneficial for the particular challenges posed by ancient data. RESULTS We present an implementation called prophaser and compare imputation performance to two alternative pipelines that have been used in the ancient DNA community based on the Beagle software. Considering empirical ancient data downsampled to lower coverages as well as present-day samples with artificially thinned genotypes, we show that the proposed method is advantageous at lower coverages, where it yields improved accuracy and ability to capture rare variation. The software prophaser is optimized for running in a massively parallel manner and achieved reasonable runtimes on the experiments performed when executed on a GPU. AVAILABILITY AND IMPLEMENTATION The C++ code for prophaser is available in the GitHub repository https://github.com/scicompuu/prophaser. SUPPLEMENTARY INFORMATION Supplementary information is available at Bioinformatics online.
|
31
|
Sun Q, Yang Y, Rosen JD, Jiang MZ, Chen J, Liu W, Wen J, Raffield LM, Pace RG, Zhou YH, Wright FA, Blackman SM, Bamshad MJ, Gibson RL, Cutting GR, Knowles MR, Schrider DR, Fuchsberger C, Li Y. MagicalRsq: Machine-learning-based genotype imputation quality calibration. Am J Hum Genet 2022; 109:1986-1997. [PMID: 36198314 PMCID: PMC9674945 DOI: 10.1016/j.ajhg.2022.09.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/19/2022] [Accepted: 09/16/2022] [Indexed: 01/26/2023] Open
Abstract
Whole-genome sequencing (WGS) is the gold standard for fully characterizing genetic variation but is still prohibitively expensive for large samples. To reduce costs, many studies sequence only a subset of individuals or genomic regions, and genotype imputation is used to infer genotypes for the remaining individuals or regions without sequencing data. However, not all variants can be well imputed, and the current state-of-the-art imputation quality metric, denoted as standard Rsq, is poorly calibrated for lower-frequency variants. Here, we propose MagicalRsq, a machine-learning-based method that integrates variant-level imputation and population genetics statistics to provide a better calibrated imputation quality metric. Leveraging WGS data from the Cystic Fibrosis Genome Project (CFGP) and whole-exome sequence data from UK BioBank (UKB), we performed comprehensive experiments to evaluate the performance of MagicalRsq compared to standard Rsq for partially sequenced studies. We found that MagicalRsq aligns better with true R² than standard Rsq in almost every situation evaluated, for both European and African ancestry samples. For example, when applying models trained from 1,992 CFGP sequenced samples to an independent 3,103 samples with no sequencing but TOPMed imputation from array genotypes, MagicalRsq, compared to standard Rsq, achieved net gains of 1.4 million rare, 117k low-frequency, and 18k common variants, where net gain is the number of additional variants correctly distinguished by MagicalRsq relative to standard Rsq. MagicalRsq can serve as an improved post-imputation quality metric and will benefit downstream analysis by better distinguishing well-imputed variants from those poorly imputed. MagicalRsq is freely available on GitHub.
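The "true R2" that this abstract uses as its benchmark is conventionally the squared Pearson correlation between imputed dosages and the genotypes observed by sequencing at the same variant. The sketch below assumes that definition; the function name and the example values are hypothetical, and it does not reproduce how MagicalRsq or standard Rsq are computed.

```python
import math
from typing import Sequence


def true_r2(true_genotypes: Sequence[float],
            imputed_dosages: Sequence[float]) -> float:
    """Squared Pearson correlation between sequenced genotypes and imputed dosages."""
    if len(true_genotypes) != len(imputed_dosages):
        raise ValueError("inputs must have the same length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        # Monomorphic in this sample: the correlation is undefined.
        return math.nan
    return cov * cov / (var_t * var_d)


# Hypothetical variant: sequenced genotypes (0/1/2) and imputed dosages.
sequenced = [0, 1, 2, 0, 1, 0]
imputed = [0.1, 0.8, 1.7, 0.0, 1.2, 0.3]
print(f"true R2 = {true_r2(sequenced, imputed):.3f}")
```

Because this quantity needs sequenced genotypes, it is only available for sequenced samples, which is why a calibrated estimate is useful for samples that were imputed but not sequenced.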
Affiliation(s)
- Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yingxi Yang
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
| | - Jonathan D Rosen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Min-Zhi Jiang
- Department of Applied Physical Sciences, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jiawen Chen
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Weifang Liu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jia Wen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Rhonda G Pace
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yi-Hui Zhou
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Fred A Wright
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA; Bioinformatics Research Center and Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
| | - Scott M Blackman
- Division of Pediatric Endocrinology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Michael J Bamshad
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA; Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Ronald L Gibson
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
| | - Garry R Cutting
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Michael R Knowles
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Daniel R Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Christian Fuchsberger
- Institute for Biomedicine, Eurac Research (affiliated with the University of Lübeck), Bolzano, Italy.
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
| |
|
32
|
Truong VQ, Woerner JA, Cherlin TA, Bradford Y, Lucas AM, Okeh CC, Shivakumar MK, Hui DH, Kumar R, Pividori M, Jones SC, Bossa AC, Turner SD, Ritchie MD, Verma SS. Quality Control Procedures for Genome-Wide Association Studies. Curr Protoc 2022; 2:e603. [PMID: 36441943 DOI: 10.1002/cpz1.603] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. © 2022 Wiley Periodicals LLC.
Affiliation(s)
- Van Q Truong
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Jakob A Woerner
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Tess A Cherlin
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Yuki Bradford
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Anastasia M Lucas
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Chelsea C Okeh
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Manu K Shivakumar
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Daniel H Hui
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Rachit Kumar
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Milton Pividori
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - S Chris Jones
- Genomics and Computational Biology Graduate Group, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Abigail C Bossa
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | | | - Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| | - Shefali S Verma
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, USA
| |
|
33
|
Chen Y, Chen W. Genome-Wide Integration of Genetic and Genomic Studies of Atopic Dermatitis: Insights into Genetic Architecture and Pathogenesis. J Invest Dermatol 2022; 142:2958-2967.e8. [PMID: 35577104 DOI: 10.1016/j.jid.2022.04.021] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/26/2021] [Revised: 04/26/2022] [Accepted: 04/26/2022] [Indexed: 12/23/2022]
Abstract
Atopic dermatitis (AD) is a common heterogeneous, chronic, itching, and inflammatory skin disease. Genetic studies have identified multiple AD susceptibility genes. However, the genetic architecture of AD has not been elucidated. In this study, we conducted a large-scale meta-analysis of AD (35,647 cases and 1,013,885 controls) to characterize the genetic basis of AD. The heritability of AD in different datasets varied from 0.6 to 7.1%. We identified 31 previously unreported genes by integrating multiomics data. Among the 31 genes, MCL1 was identified as a potential treatment target for AD by mediating gene‒drug interactions. Tissue enrichment analyses and phenome-wide association study provided strong support for the role of the hemic and immune systems in AD. Across 1,207 complex traits and diseases, genetic correlations indicated that AD shared links with multiple respiratory phenotypes. The phenome-wide Mendelian randomization analysis (Mendelian randomization‒phenome-wide association study) revealed that the age of onset of diabetes exhibited a positive causal effect on AD (inverse-variance weighted β = 0.39, SEM = 0.09, P = 2.77 × 10⁻⁵). Overall, these results provide important insights into the genetic architecture of AD and will lead to a more thorough and complete understanding of the molecular mechanisms underlying AD.
Affiliation(s)
- Yanxuan Chen
- Department of General Medicine, Shenzhen Longhua District Central Hospital, Shenzhen, China
| | - Wenyan Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, China.
| |
|
34
|
Wang X, Wang L, Shi L, Zhang P, Li Y, Li M, Tian J, Wang L, Zhao F. GWAS of Reproductive Traits in Large White Pigs on Chip and Imputed Whole-Genome Sequencing Data. Int J Mol Sci 2022; 23:13338. [PMID: 36362120 PMCID: PMC9656588 DOI: 10.3390/ijms232113338] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/27/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 12/09/2023] Open
Abstract
Total number born (TNB), number of stillborn (NSB), and gestation length (GL) are economically important traits in pig production, and disentangling the molecular mechanisms associated with traits can provide valuable insights into their genetic structure. Genotype imputation can be used as a practical tool to improve the marker density of single-nucleotide polymorphism (SNP) chips based on sequence data, thereby dramatically improving the power of genome-wide association studies (GWAS). In this study, we applied Beagle software to impute the 50K chip data to the whole-genome sequencing (WGS) data with average imputation accuracy (R²) of 0.876. The target pigs, 2655 Large White pigs introduced from Canadian and French lines, were genotyped by a GeneSeek Porcine 50K chip. The 30 Large White reference pigs were the key ancestral individuals sequenced by whole-genome resequencing. To avoid population stratification, we identified genetic variants associated with reproductive traits by performing within-population GWAS and cross-population meta-analyses with data before and after imputation. Finally, several genes were detected and regarded as potential candidate genes for each of the traits: for the TNB trait: NOTCH2, KLF3, PLXDC2, NDUFV1, TLR10, CDC14A, EPC2, ORC4, ACVR2A, and GSC; for the NSB trait: NUB1, TGFBR3, ZDHHC14, FGF14, BAIAP2L1, EVI5, TAF1B, and BCAR3; for the GL trait: PPP2R2B, AMBP, MALRD1, HOXA11, and BICC1. In conclusion, expanding the size of the reference population and finding an optimal imputation strategy to ensure that more loci are obtained for GWAS under high imputation accuracy will contribute to the identification of causal mutations in pig breeding.
Affiliation(s)
- Xiaoqing Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Ligang Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Liangyu Shi
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
- Laboratory of Genetic Breeding, Reproduction and Precision Livestock Farming, School of Animal Science and Nutritional Engineering, Wuhan Polytechnic University, Wuhan 430023, China
| | - Pengfei Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Yang Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Mianyan Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Jingjing Tian
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Lixian Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| | - Fuping Zhao
- Key Laboratory of Animal Genetics, Breeding and Reproduction (Poultry) of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing 100193, China
| |
|
35
|
Durda P, Raffield LM, Lange EM, Olson NC, Jenny NS, Cushman M, Deichgraeber P, Grarup N, Jonsson A, Hansen T, Mychaleckyj JC, Psaty BM, Reiner AP, Tracy RP, Lange LA. Circulating Soluble CD163, Associations With Cardiovascular Outcomes and Mortality, and Identification of Genetic Variants in Older Individuals: The Cardiovascular Health Study. J Am Heart Assoc 2022; 11:e024374. [PMID: 36314488 PMCID: PMC9673628 DOI: 10.1161/jaha.121.024374] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/11/2023]
Abstract
Background Monocytes/macrophages participate in cardiovascular disease. CD163 (cluster of differentiation 163) is a monocyte/macrophage receptor, and the shed sCD163 (soluble CD163) reflects monocyte/macrophage activation. We examined the association of sCD163 with incident cardiovascular disease events and performed a genome-wide association study to identify sCD163-associated variants. Methods and Results We measured plasma sCD163 in 5214 adults (aged ≥65 years, 58.7% women, 16.2% Black) of the CHS (Cardiovascular Health Study). We used Cox regression models (associations of sCD163 with incident events and mortality); median follow-up was 26 years. Genome-wide association study analyses were stratified on race. Adjusted for age, sex, and race and ethnicity, sCD163 levels were associated with all-cause mortality (hazard ratio [HR], 1.08 [95% CI, 1.04-1.12] per SD increase), cardiovascular disease mortality (HR, 1.15 [95% CI, 1.09-1.21]), incident coronary heart disease (HR, 1.10 [95% CI, 1.04-1.16]), and incident heart failure (HR, 1.18 [95% CI, 1.12-1.25]). When further adjusted (eg, cardiovascular disease risk factors), only incident coronary heart disease lost significance. In European American individuals, genome-wide association studies identified 38 variants on chromosome 2 near MGAT5 (top result rs62165726, P=3.3×10⁻¹⁸), 19 variants near chromosome 17 gene ASGR1 (rs55714927, P=1.5×10⁻¹⁴), and 18 variants near chromosome 11 gene ST3GAL4. These regions replicated in the European ancestry ADDITION-PRO cohort, a longitudinal cohort study nested in the Danish arm of the Anglo-Danish-Dutch study of Intensive Treatment In peOple with screeN-detected Diabetes in Primary Care. In Black individuals, we identified 9 variants on chromosome 6 (rs3129781, P=7.1×10⁻⁹) in the HLA region, and 3 variants (rs115391969, P=4.3×10⁻⁸) near the chromosome 16 gene MYLK3.
Conclusions Monocyte function, as measured by sCD163, may be predictive of overall and cardiovascular-specific mortality and incident heart failure.
Affiliation(s)
- Peter Durda
- Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
| | | | - Ethan M. Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO
| | - Nels C. Olson
- Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
| | - Nancy Swords Jenny
- Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
| | - Mary Cushman
- Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
- Department of Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
| | - Pia Deichgraeber
- Steno Diabetes Center, Aarhus University Hospital, Aarhus, Denmark
- Department of Endocrinology and Internal Medicine, Aarhus University Hospital, Aarhus, Denmark
| | - Niels Grarup
- Novo Nordisk Foundation Center for Basic Metabolic Research, Copenhagen, Denmark
| | - Anna Jonsson
- Novo Nordisk Foundation Center for Basic Metabolic Research, Copenhagen, Denmark
| | - Torben Hansen
- Novo Nordisk Foundation Center for Basic Metabolic Research, Copenhagen, Denmark
| | | | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology and Health Services, University of Washington, Seattle, WA
| | - Alex P. Reiner
- Department of Epidemiology, University of Washington, Seattle, WA
| | - Russell P. Tracy
- Department of Pathology and Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT
- Department of Biochemistry, Larner College of Medicine, University of Vermont, Burlington, VT
| | - Leslie A. Lange
- Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO
| |
|
36
|
Elghzaly AA, Sun C, Looger LL, Hirose M, Salama M, Khalil NM, Behiry ME, Hegazy MT, Hussein MA, Salem MN, Eltoraby E, Tawhid Z, Alwasefy M, Allam W, El-Shiekh I, Elserafy M, Abdelnaser A, Hashish S, Shebl N, Shahba AA, Elgirby A, Hassab A, Refay K, El-Touchy HM, Youssef A, Shabacy F, Hashim AA, Abdelzaher A, Alshebini E, Fayez D, El-Bakry SA, Elzohri MH, Abdelsalam EN, El-Khamisy SF, Ibrahim S, Ragab G, Nath SK. Genome-wide association study for systemic lupus erythematosus in an Egyptian population. Front Genet 2022; 13:948505. [PMID: 36324510 PMCID: PMC9619055 DOI: 10.3389/fgene.2022.948505] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/19/2022] [Accepted: 09/30/2022] [Indexed: 04/11/2024] Open
Abstract
Systemic lupus erythematosus (SLE) susceptibility has a strong genetic component. Genome-wide association studies (GWAS) across trans-ancestral populations show both common and distinct genetic variants of susceptibility across European and Asian ancestries, while many other ethnic populations remain underexplored. We conducted the first SLE GWAS on Egyptians, an admixed North African/Middle Eastern population, using 537 patients and 883 controls. To identify novel susceptibility loci and replicate previously known loci, we performed imputation-based association analysis with 6,382,276 SNPs while accounting for individual admixture. We validated the association analysis using adaptive permutation tests (n = 10⁹). We identified a novel genome-wide significant locus near IRS1/miR-5702 (Pcorrected = 1.98 × 10⁻⁸) and eight novel suggestive loci (Pcorrected < 1.0 × 10⁻⁵). We also replicated (Pperm < 0.01) 97 previously known loci with at least one associated nearby SNP, with ITGAM, DEF6-PPARD and IRF5 the top three replicated loci. SNPs correlated (r² > 0.8) with lead SNPs from four suggestive loci (ARMC9, DIAPH3, IFLDT1, and ENTPD3) were associated with differential gene expression (3.5 × 10⁻⁹⁵ < p < 1.0 × 10⁻²) across diverse tissues. These loci are involved in cellular proliferation and invasion, pathways prominent in lupus and nephritis. Our study highlights the utility of GWAS in an admixed Egyptian population for delineating new genetic associations and for understanding SLE pathogenesis.
Affiliation(s)
- Ashraf A. Elghzaly
- Department of Clinical Pathology, Faculty of Medicine, Mansoura University, El-Mansoura, Egypt
| | - Celi Sun
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, United States
| | - Loren L. Looger
- Department of Neurosciences, Howard Hughes Medical Institute, University of California, San Diego, San Diego, CA, United States
| | - Misa Hirose
- Division of Genetics, Lübeck Institute of Experimental Dermatology, University of Lübeck, Lübeck, Germany
| | - Mohamed Salama
- Institute of Global Health and Human Ecology, The American University in Cairo, New Cairo, Egypt
| | - Noha M. Khalil
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Cairo University, Cairo, Egypt
| | - Mervat Essam Behiry
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Cairo University, Cairo, Egypt
| | - Mohamed Tharwat Hegazy
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Cairo University, Cairo, Egypt
| | - Mohamed Ahmed Hussein
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Cairo University, Cairo, Egypt
| | - Mohamad Nabil Salem
- Department of Internal Medicine, Faculty of Medicine, Beni-Suef University, Beni Suef, Egypt
| | - Ehab Eltoraby
- Department of Internal Medicine, Faculty of Medicine, Mansoura University, El-Mansoura, Egypt
| | - Ziyad Tawhid
- Department of Clinical Pathology, Faculty of Medicine, Mansoura University, El-Mansoura, Egypt
| | - Mona Alwasefy
- Department of Clinical Pathology, Faculty of Medicine, Mansoura University, El-Mansoura, Egypt
| | - Walaa Allam
- Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
| | - Iman El-Shiekh
- Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
| | - Menattallah Elserafy
- Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
| | - Anwar Abdelnaser
- Institute of Global Health and Human Ecology, The American University in Cairo, New Cairo, Egypt
| | - Sara Hashish
- Institute of Global Health and Human Ecology, The American University in Cairo, New Cairo, Egypt
| | - Nourhan Shebl
- Institute of Global Health and Human Ecology, The American University in Cairo, New Cairo, Egypt
| | | | - Amira Elgirby
- Department of Internal Medicine, Faculty of Medicine, Alexandria University, Bab Sharqi, Egypt
| | - Amina Hassab
- Department of Clinical Pathology, Faculty of Medicine, Alexandria University, Bab Sharqi, Egypt
| | - Khalida Refay
- Department of Internal Medicine, Faculty of Medicine, Al-Azhar University, Cairo, Egypt
| | | | - Ali Youssef
- Department of Rheumatology and Immunology, Faculty of Medicine, Benha University Hospital, Benha, Egypt
| | - Fatma Shabacy
- Department of Rheumatology and Immunology, Faculty of Medicine, Benha University Hospital, Benha, Egypt
| | | | - Asmaa Abdelzaher
- Department of Clinical Pathology, Faculty of Medicine, South Valley University, Qena, Egypt
| | - Emad Alshebini
- Department of Internal Medicine, Faculty of Medicine, Menoufia University, Al Minufiyah, Egypt
| | - Dalia Fayez
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Ain Shams University, Cairo, Egypt
| | - Samah A. El-Bakry
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Ain Shams University, Cairo, Egypt
| | - Mona H. Elzohri
- Department of Internal Medicine, Faculty of Medicine, Assiut University, Asyut, Egypt
| | | | - Sherif F. El-Khamisy
- Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
- The Healthy Lifespan Institute, University of Sheffield, Sheffield, United Kingdom
- The Institute of Cancer Therapeutics, University of Bradford, Bradford, United Kingdom
| | - Saleh Ibrahim
- Division of Genetics, Lübeck Institute of Experimental Dermatology, University of Lübeck, Lübeck, Germany
| | - Gaafar Ragab
- Rheumatology and Clinical Immunology Unit, Department of Internal Medicine, Faculty of Medicine, Cairo University, Cairo, Egypt
| | - Swapan K. Nath
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, United States
| |
|
37
|
Bañuelos MM, Zavaleta YJA, Roldan A, Reyes RJ, Guardado M, Chavez Rojas B, Nyein T, Rodriguez Vega A, Santos M, Huerta-Sanchez E, Rohlfs RV. Associations between forensic loci and expression levels of neighboring genes may compromise medical privacy. Proc Natl Acad Sci U S A 2022; 119:e2121024119. [PMID: 36166477 PMCID: PMC9546536 DOI: 10.1073/pnas.2121024119] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/18/2021] [Accepted: 08/29/2022] [Indexed: 11/18/2022] Open
Abstract
A set of 20 short tandem repeats (STRs) is used by the US criminal justice system to identify suspects and to maintain a database of genetic profiles for individuals who have been previously convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts to avoid forensic profiles revealing protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene-expression variation or potential medical information. We find six significant correlations (false discovery rate = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations, showing evidence compatible with forensic STRs causing expression variation or being in linkage disequilibrium with a causal locus in three cases and weaker or potentially spurious associations in the other three cases. Together, these results suggest that forensic genetic loci may reveal expression levels and, perhaps, medical information.
Affiliation(s)
- Mayra M. Bañuelos
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
- Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912
- Center for Computational and Molecular Biology, Brown University, Providence, RI 02912
| | | | - Alennie Roldan
- Department of Biology, San Francisco State University, San Francisco, CA 94132
| | - Rochelle-Jan Reyes
- Department of Biology, San Francisco State University, San Francisco, CA 94132
| | - Miguel Guardado
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
| | | | - Thet Nyein
- Department of Mathematics, San Francisco State University, San Francisco, CA 94132
| | - Ana Rodriguez Vega
- Department of Biology, San Francisco State University, San Francisco, CA 94132
| | - Maribel Santos
- Department of Biology, San Francisco State University, San Francisco, CA 94132
| | - Emilia Huerta-Sanchez
- Ecology, Evolution and Organismal Biology, Brown University, Providence, RI 02912
- Center for Computational and Molecular Biology, Brown University, Providence, RI 02912
| | - Rori V. Rohlfs
- Department of Biology, San Francisco State University, San Francisco, CA 94132
| |
|
38
|
Ma S, Shungin D, Mallick H, Schirmer M, Nguyen LH, Kolde R, Franzosa E, Vlamakis H, Xavier R, Huttenhower C. Population structure discovery in meta-analyzed microbial communities and inflammatory bowel disease using MMUPHin. Genome Biol 2022; 23:208. [PMID: 36192803 PMCID: PMC9531436 DOI: 10.1186/s13059-022-02753-4] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 07/23/2021] [Accepted: 08/19/2022] [Indexed: 01/19/2023] Open
Abstract
Microbiome studies of inflammatory bowel diseases (IBD) have achieved a scale for meta-analysis of dysbioses among populations. To enable microbial community meta-analyses generally, we develop MMUPHin for normalization, statistical meta-analysis, and population structure discovery using microbial taxonomic and functional profiles. Applying it to ten IBD cohorts, we identify consistent associations, including novel taxa such as Acinetobacter and Turicibacter, and additional exposure and interaction effects. A single gradient of dysbiosis severity is favored over discrete types to summarize IBD microbiome population structure. These results provide a benchmark for characterization of IBD and a framework for meta-analysis of any microbial communities.
Affiliation(s)
- Siyuan Ma
- Harvard Chan Microbiome in Public Health Center, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Dmitry Shungin
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Himel Mallick
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Melanie Schirmer
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Long H. Nguyen
- Massachusetts General Hospital, Boston, MA, USA
| | - Raivo Kolde
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eric Franzosa
- Harvard Chan Microbiome in Public Health Center, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Hera Vlamakis
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ramnik Xavier
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Curtis Huttenhower
- Harvard Chan Microbiome in Public Health Center, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| |
|
39
|
Dias R, Evans D, Chen SF, Chen KY, Loguercio S, Chan L, Torkamani A. Rapid, Reference-Free human genotype imputation with denoising autoencoders. eLife 2022; 11:e75600. [PMID: 36148981 PMCID: PMC9555874 DOI: 10.7554/elife.75600] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/16/2021] [Accepted: 09/19/2022] [Indexed: 11/13/2022] Open
Abstract
Genotype imputation is a foundational tool for population genetics. Standard statistical imputation approaches rely on the co-location of large whole-genome sequencing-based reference panels, powerful computing environments, and potentially sensitive genetic study data. This results in computational resource and privacy-risk barriers to access to cutting-edge imputation techniques. Moreover, the accuracy of current statistical approaches is known to degrade in regions of low and complex linkage disequilibrium. Artificial neural network-based imputation approaches may overcome these limitations by encoding complex genotype relationships in easily portable inference models. Here, we demonstrate an autoencoder-based approach for genotype imputation, using a large, commonly used reference panel, and spanning the entirety of human chromosome 22. Our autoencoder-based genotype imputation strategy achieved superior imputation accuracy across the allele-frequency spectrum and across genomes of diverse ancestry, while delivering at least fourfold faster inference run time relative to standard imputation tools.
Affiliation(s)
- Raquel Dias
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
- Department of Microbiology and Cell Science, University of Florida, Gainesville, United States
| | - Doug Evans
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| | - Shang-Fu Chen
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| | - Kai-Yu Chen
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| | - Salvatore Loguercio
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| | - Leslie Chan
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| | - Ali Torkamani
- Scripps Research Translational Institute, Scripps Research Institute, La Jolla, United States
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, United States
| |
|
40
|
Riggio V, Tijjani A, Callaby R, Talenti A, Wragg D, Obishakin ET, Ezeasor C, Jongejan F, Ogo NI, Aboagye-Antwi F, Toure A, Nzalawahej J, Diallo B, Missohou A, Belem AMG, Djikeng A, Juleff N, Fourie J, Labuschagne M, Madder M, Marshall K, Prendergast JGD, Morrison LJ. Assessment of genotyping array performance for genome-wide association studies and imputation in African cattle. Genet Sel Evol 2022; 54:58. [PMID: 36057548 PMCID: PMC9441065 DOI: 10.1186/s12711-022-00751-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/28/2022] [Accepted: 08/17/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In cattle, genome-wide association studies (GWAS) have largely focused on European or Asian breeds, using genotyping arrays that were primarily designed for European cattle. Because there is growing interest in performing GWAS in African breeds, we have assessed the performance of 23 commercial bovine genotyping arrays for capturing the diversity across African breeds and performing imputation. We used 409 whole-genome sequences (WGS) spanning global cattle breeds, and a real cohort of 2481 individuals (including African breeds) that were genotyped with the Illumina high-density (HD) array and the GeneSeek bovine 50k array. RESULTS We found that commercially available arrays were not effective in capturing variants that segregate among African indicine animals. Only 6% of these variants in high linkage disequilibrium (LD) (r² > 0.8) were on the best performing arrays, which contrasts with the 17% and 25% in African and European taurine cattle, respectively. However, imputation from available HD arrays can successfully capture most variants (accuracies up to 0.93), mainly when using a global, not continent-specific, reference panel, which partially reflects the unusually high levels of admixture on the continent. When considering functional variants, the GGPF250 array performed best for tagging WGS variants and imputation. Finally, we show that imputation from low-density arrays can perform almost as well as HD arrays, if a two-stage imputation approach is adopted, i.e. first imputing to HD and then to WGS, which can potentially reduce the costs of GWAS. CONCLUSIONS Our results show that the choice of an array should be based on a balance between the objective of the study and the breed/population considered, with the HD and BOS1 arrays being the best choice for both taurine and indicine breeds when performing GWAS, and the GGPF250 being preferable for fine-mapping studies. Moreover, our results suggest that there is no advantage to using the indicus-specific arrays for indicus breeds, regardless of the objective. Finally, we show that using a reference panel that better represents global bovine diversity improves imputation accuracy, particularly for non-European taurine populations.
Affiliation(s)
- Valentina Riggio
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.,Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
| | - Abdulfatai Tijjani
- Centre for Tropical Livestock Genetics and Health (CTLGH), ILRI Ethiopia, P.O Box 5689, Addis Ababa, Ethiopia
| | - Rebecca Callaby
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.,Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
| | - Andrea Talenti
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK
| | - David Wragg
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK
| | - Emmanuel T Obishakin
- Biotechnology Division, National Veterinary Research Institute, Vom, Plateau State, Nigeria.,Biomedical Research Centre, Ghent University Global Campus, Songdo, Incheon, South Korea
| | - Chukwunonso Ezeasor
- Department of Veterinary Pathology and Microbiology, University of Nigeria, Nsukka, Enugu State, Nigeria
| | - Frans Jongejan
- Department of Veterinary Tropical Diseases, Faculty of Veterinary Science, University of Pretoria, Onderstepoort, South Africa
| | - Ndudim I Ogo
- National Veterinary Research Institute, Vom, Nigeria
| | - Fred Aboagye-Antwi
- Department of Animal Biology and Conservation Sciences, University of Ghana, Accra, Ghana
| | - Alassane Toure
- Laboratoire National d'Appui au Développement Agricole (LANADA)/Laboratoire Central Vétérinaire de Bingerville, Bp: 206, Bingerville, Côte d'Ivoire
| | - Jahashi Nzalawahej
- Department of Microbiology, Parasitology and Biotechnology, Sokoine University of Agriculture, Morogoro, Tanzania
| | | | - Ayao Missohou
- Ecole Inter-Etats des Sciences et Médecine Vétérinaires (EISMV) de Dakar, Dakar, Senegal
| | - Adrien M G Belem
- Université Polytechnique de Bobo-Dioulasso (UPB), Bobo-Dioulasso, Burkina Faso
| | - Appolinaire Djikeng
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.,Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
| | - Nick Juleff
- Bill & Melinda Gates Foundation, Seattle, WA, USA
| | | | - Michel Labuschagne
- Clinomics, Uitzich Road, Bainsvlei, Bloemfontein, 9338, South Africa.,Clinvet, Uitzich Road, Bainsvlei, Bloemfontein, 9338, South Africa
| | - Maxime Madder
- Clinglobal, B03/04, The Tamarin Commercial Hub, Jacaranda Avenue, Tamarin, 90903, Mauritius
| | - Karen Marshall
- Centre for Tropical Livestock Genetics and Health (CTLGH), ILRI Kenya, P.O. Box 30709, Nairobi, 00100, Kenya.,International Livestock Research Institute, P.O. Box 30709, Nairobi, 00100, Kenya
| | - James G D Prendergast
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.,Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
| | - Liam J Morrison
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.,Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
|
41
|
Hanks SC, Forer L, Schönherr S, LeFaive J, Martins T, Welch R, Gagliano Taliun SA, Braff D, Johnsen JM, Kenny EE, Konkle BA, Laakso M, Loos RF, McCarroll S, Pato C, Pato MT, Smith AV, Boehnke M, Scott LJ, Fuchsberger C. Extent to which array genotyping and imputation with large reference panels approximate deep whole-genome sequencing. Am J Hum Genet 2022; 109:1653-1666. [PMID: 35981533 PMCID: PMC9502057 DOI: 10.1016/j.ajhg.2022.07.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/06/2022] [Accepted: 07/20/2022] [Indexed: 01/02/2023] Open
Abstract
Understanding the genetic basis of human diseases and traits is dependent on the identification and accurate genotyping of genetic variants. Deep whole-genome sequencing (WGS), the gold standard technology for SNP and indel identification and genotyping, remains very expensive for most large studies. Here, we quantify the extent to which array genotyping followed by genotype imputation can approximate WGS in studies of individuals of African, Hispanic/Latino, and European ancestry in the US and of Finnish ancestry in Finland (a population isolate). For each study, we performed genotype imputation by using the genetic variants present on the Illumina Core, OmniExpress, MEGA, and Omni 2.5M arrays with the 1000G, HRC, and TOPMed imputation reference panels. Using the Omni 2.5M array and the TOPMed panel, ≥90% of bi-allelic single-nucleotide variants (SNVs) are well imputed (r2 > 0.8) down to minor-allele frequencies (MAFs) of 0.14% in African, 0.11% in Hispanic/Latino, 0.35% in European, and 0.85% in Finnish ancestries. There was little difference in TOPMed-based imputation quality among the arrays with >700k variants. Individual-level imputation quality varied widely between and within the three US studies. Imputation quality also varied across genomic regions, producing regions where even common (MAF > 5%) variants were consistently not well imputed across ancestries. The extent to which array genotyping and imputation can approximate WGS therefore depends on reference panel, genotype array, sample ancestry, and genomic location. Imputation quality by variant or genomic region can be queried with our new tool, RsqBrowser, now deployed on the Michigan Imputation Server.
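As an illustration of the "well imputed (r2 > 0.8)" criterion in the abstract above, the following minimal Python sketch computes a squared Pearson correlation between true genotypes (coded 0/1/2) and imputed dosages for a single variant. It is an assumption that this corresponds to the study's definition of imputation r2; the function names imputation_r2 and is_well_imputed and the example values are hypothetical and are not taken from the paper or from RsqBrowser.

```python
import math


def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2)
    and imputed dosages (floats in [0, 2]) for one variant.

    Assumption: this is one common way to score imputation quality;
    the cited study may define r2 differently.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n

    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    # A monomorphic variant (no variance) has an undefined correlation.
    if var_t == 0 or var_d == 0:
        return math.nan
    return (cov * cov) / (var_t * var_d)


def is_well_imputed(r2, threshold=0.8):
    """Apply the strict r2 > threshold rule quoted in the abstract."""
    return r2 > threshold


if __name__ == "__main__":
    truth = [0, 1, 2, 1, 0, 2]                 # hypothetical sequenced genotypes
    dosage = [0.1, 0.9, 1.8, 1.2, 0.0, 1.9]    # hypothetical imputed dosages
    r2 = imputation_r2(truth, dosage)
    print(f"r2 = {r2:.3f}, well imputed: {is_well_imputed(r2)}")
```

In practice the score would be computed per variant and then summarised within minor-allele-frequency bins, which is how a statement such as "≥90% of SNVs are well imputed down to MAF x" could be derived.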
Affiliation(s)
- Sarah C. Hanks
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Lukas Forer
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Sebastian Schönherr
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Jonathon LeFaive
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Taylor Martins
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Ryan Welch
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Sarah A. Gagliano Taliun
- Department of Medicine and Department of Neurosciences, Université de Montréal, Montreal, QC, Canada,Research Centre, Montreal Heart Institute, Montreal, QC, Canada
| | - David Braff
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Jill M. Johnsen
- Research Institute, Bloodworks, Seattle, WA, USA,Department of Medicine, University of Washington, Seattle, WA, USA
| | - Eimear E. Kenny
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA,Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA,Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Markku Laakso
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
| | - Ruth F.J. Loos
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Steven McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA,Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Carlos Pato
- Departments of Psychiatry, Rutgers University, Robert Wood Johnson Medical School and New Jersey Medical School, New Brunswick, NJ, USA
| | - Michele T. Pato
- Departments of Psychiatry, Rutgers University, Robert Wood Johnson Medical School and New Jersey Medical School, New Brunswick, NJ, USA
| | - Albert V. Smith
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | | | - Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Laura J. Scott
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Christian Fuchsberger
- Institute for Biomedicine (Affiliated with the University of Lübeck), Eurac Research, Bolzano, Italy.
|
42
|
Wang S, Kim M, Jiang X, Harmanci AO. Evaluation of vicinity-based hidden Markov models for genotype imputation. BMC Bioinformatics 2022; 23:356. [PMID: 36038834 PMCID: PMC9422108 DOI: 10.1186/s12859-022-04896-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Accepted: 08/08/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype-phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping of common and uncommon variants. Imputation methods compare the genotypes of the typed variants with large population-specific reference panels and estimate the genotypes of untyped variants by making use of linkage disequilibrium patterns. The most accurate imputation methods are based on the Li-Stephens hidden Markov model (HMM), which treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel. RESULTS Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation has recently been used by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants. CONCLUSIONS Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods and vicinity-based HMMs can be used for re-structuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer.
Affiliation(s)
- Su Wang
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Miran Kim
- Department of Mathematics, Hanyang University, Seoul, 04763, Republic of Korea
| | - Xiaoqian Jiang
- Center for Secure Artificial Intelligence For hEalthcare (SAFE), School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA
| | - Arif Ozgun Harmanci
- Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center, Houston, TX, 77030, USA.
|
43
|
Jiang Y, Song H, Gao H, Zhang Q, Ding X. Exploring the optimal strategy of imputation from SNP array to whole-genome sequencing data in farm animals. Front Genet 2022; 13:963654. [PMID: 36092888 PMCID: PMC9459117 DOI: 10.3389/fgene.2022.963654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
Genotype imputation from BeadChip to whole-genome sequencing (WGS) data is a cost-effective method of obtaining genotypes of WGS variants. Beagle, one of the most popular imputation software programs, has been widely used for genotype inference in humans and non-human species. Few studies have systematically and comprehensively compared the performance of Beagle versions and parameter settings in farm animals. Here, we investigated the imputation performance of three representative versions of Beagle (Beagle 4.1, Beagle 5.0, and Beagle 5.4), and the effective population size (Ne) parameter setting for three species (cattle, pig, and chicken). Six scenarios were investigated to explore the impact of certain key factors on imputation performance. The results showed that the default Ne (1,000,000) is not suitable for livestock and poultry when the reference panel is small or the target panel is genotyped on a low-density array, with 2.47%–10.45% drops in accuracy. Beagle 5 significantly reduced the computation time (4.66-fold–13.24-fold) without an accuracy loss. In addition, using a large combined-reference panel or high-density chip provides greater imputation accuracy, especially for low minor allele frequency (MAF) variants. Finally, a highly significant correlation in the measures of imputation accuracy can be obtained with an MAF equal to or greater than 0.05.
Affiliation(s)
- Yifan Jiang
- National Engineering Laboratory for Animal Breeding, Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Hailiang Song
- Beijing Key Laboratory of Fisheries Biotechnology, Fisheries Science Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Hongding Gao
- Natural Resources Institute Finland (Luke), Helsinki, Finland
| | - Qin Zhang
- Shandong Provincial Key Laboratory of Animal Biotechnology and Disease Control and Prevention, Shandong Agricultural University, Taian, China
| | - Xiangdong Ding
- National Engineering Laboratory for Animal Breeding, Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture and Rural Affairs, College of Animal Science and Technology, China Agricultural University, Beijing, China
- *Correspondence: Xiangdong Ding
|
44
|
Willcox MC, Burgueño JA, Jeffers D, Rodriguez-Chanona E, Guadarrama-Espinoza A, Kehel Z, Chepetla D, Shrestha R, Swarts K, Buckler ES, Hearne S, Chen C. Mining alleles for tar spot complex resistance from CIMMYT's maize Germplasm Bank. Front Sustain Food Syst 2022. [DOI: 10.3389/fsufs.2022.937200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
The tar spot complex (TSC) is a devastating disease of maize (Zea mays L.), occurring in 17 countries throughout Central, South, and North America and the Caribbean, and can cause grain yield losses of up to 80%. As yield losses from the disease continue to intensify in Central America, Phyllachora maydis, one of the causal pathogens of TSC, was first detected in the United States in 2015, and in 2020 in Ontario, Canada. Both the distribution and yield losses due to TSC are increasing, and there is a critical need to identify the genetic resources for TSC resistance. The Seeds of Discovery Initiative at CIMMYT has sought to combine next-generation sequencing technologies and phenotypic characterization to identify valuable alleles held in the CIMMYT Germplasm Bank for use in germplasm improvement programs. Individual landrace accessions of the “Breeders' Core Collection” were crossed to CIMMYT hybrids to form 918 unique accession topcrosses (F1 families), which were evaluated during 2011 and 2012 for TSC disease reaction. A total of 16 associated SNP variants were identified for resistance to TSC foliar damage and increased grain yield. These variants were confirmed by evaluating the TSC reaction of previously untested selections of the larger F1 testcross population (4,471 accessions) based on the presence of identified favorable SNPs. We demonstrated the usefulness of mining for donor alleles in Germplasm Bank accessions for newly emerging diseases using genomic variation in landraces.
|
45
|
Chen AS, Liu H, Wu Y, Luo S, Patz EF, Glass C, Su L, Du M, Christiani DC, Wei Q. Genetic variants in DDO and PEX5L in peroxisome-related pathways predict non-small cell lung cancer survival. Mol Carcinog 2022; 61:619-628. [PMID: 35502931 DOI: 10.1002/mc.23400] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/08/2022] [Revised: 03/05/2022] [Accepted: 03/10/2022] [Indexed: 01/14/2023]
Abstract
Peroxisomes play a role in lipid metabolism and regulation of reactive oxygen species, but their role in the development and progression of non-small cell lung cancer (NSCLC) is not well understood. Here, we investigated the associations between 9708 single-nucleotide polymorphisms (SNPs) in 113 genes in the peroxisome-related pathways and survival of NSCLC patients from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) and the Harvard Lung Cancer Susceptibility (HLCS) study. In 1185 NSCLC patients from the PLCO trial, we found that 213 SNPs were significantly associated with NSCLC overall survival (OS) (p ≤ 0.05, Bayesian false discovery probability [BFDP] ≤ 0.80), of which eight SNPs were validated in the HLCS data set. In a multivariate Cox proportional hazards regression model, two independent SNPs (rs9384742 DDO and rs9825224 PEX5L) were significantly associated with NSCLC survival (hazard ratios [HR] of 1.17 with 95% confidence interval [CI] of 1.06-1.28 and 0.86 with 95% CI of 0.77-0.96, respectively). Patients with one or two protective genotypes had a significantly higher OS (HR: 0.787 [95% CI: 0.620-0.998] and 0.691 [95% CI: 0.543-0.879], respectively). Further expression quantitative trait loci analysis using whole blood and lung tissue showed that the minor allele of rs9384742 DDO was significantly associated with decreased messenger RNA (mRNA) expression levels and that DDO expression was also decreased in NSCLC tumor tissue. Additionally, high PEX5L expression levels were significantly associated with lower survival of NSCLC. Our data suggest that variants in these peroxisome-related genes may influence gene regulation and are potential predictors of NSCLC OS, once validated by additional studies.
Affiliation(s)
- Allan S Chen
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA
| | - Hongliang Liu
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA
| | - Yufeng Wu
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA
| | - Sheng Luo
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA
| | - Edward F Patz
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Departments of Radiology, Pharmacology and Cancer Biology, Duke University Medical Center, Durham, North Carolina, USA
| | - Carolyn Glass
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Department of Pathology, Duke University School of Medicine, Durham, North Carolina, USA
| | - Li Su
- Departments of Environmental Health and Epidemiology, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA
| | - Mulong Du
- Departments of Environmental Health and Epidemiology, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA
| | - David C Christiani
- Departments of Environmental Health and Epidemiology, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA.,Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Qingyi Wei
- Duke Cancer Institute, Duke University Medical Center, Durham, North Carolina, USA.,Department of Population Health Sciences, Duke University School of Medicine, Durham, North Carolina, USA.,Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA.,Duke Global Health Institute, Duke University, Durham, North Carolina, USA
|
46
|
Multivariate genome-wide association study models to improve prediction of Crohn’s disease risk and identification of potential novel variants. Comput Biol Med 2022; 145:105398. [DOI: 10.1016/j.compbiomed.2022.105398] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/20/2021] [Revised: 03/09/2022] [Accepted: 03/09/2022] [Indexed: 12/21/2022]
|
47
|
Gong H, Han B. Genotype calling and haplotype inference from low coverage sequence data in heterozygous plant genome using HetMap. Theor Appl Genet 2022; 135:2157-2166. [PMID: 35504967 DOI: 10.1007/s00122-022-04105-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 04/12/2022] [Indexed: 06/14/2023]
Abstract
This study developed a new genotyping method that can accurately infer heterozygous genotype information from complex plant genome sequence data, which helped discover new alleles in association studies. Many software packages and pipelines have been developed to handle the sequence data of model species. However, genotyping of complex heterozygous plant genomes still requires improvement over previous methods. Here we present a new pipeline (available at https://github.com/Ncgrhg/HetMapv1) for variant calling and missing genotype imputation from low coverage sequence data for heterozygous plant genomes. To check the performance of HetMap on real sequence data, HetMap was applied to both an F1 hybrid rice population, which consists of 1495 samples, and a wild rice population with 446 samples. The high coverage sequence data of four hybrid rice accessions and two wild rice accessions, which were also included in the low coverage sequence data, were used to validate the accuracy of genotype inference. The validation results showed that HetMap achieved significant improvement in heterozygous genotype inference accuracy (13.65% for hybrid rice, 26.05% for wild rice) and total accuracy compared with similar software packages. Applying the new genotypes in a genome-wide association study also improved association power for the wild rice awn length phenotype. HetMap could achieve high genotype inference accuracy at low sequence coverage in a small population, for both natural and constructed recombination populations. HetMap provides a powerful tool for heterozygous plant genome sequence data analysis, which may help to discover new phenotype regions for plant species with complex heterozygous genomes.
Affiliation(s)
- Hao Gong
- School of Life Science, Huizhou University, Huizhou, 516007, China.
| | - Bin Han
- State Key Laboratory of Plant Molecular Genetics, National Center for Gene Research, Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, 200233, China.
|
48
|
Zeng Q, Zhao B, Wang H, Wang M, Teng M, Hu J, Bao Z, Wang Y. Aquaculture Molecular Breeding Platform (AMBP): a comprehensive web server for genotype imputation and genetic analysis in aquaculture. Nucleic Acids Res 2022; 50:W66-W74. [PMID: 35639514 PMCID: PMC9252723 DOI: 10.1093/nar/gkac424] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/21/2022] [Revised: 04/19/2022] [Accepted: 05/09/2022] [Indexed: 12/26/2022] Open
Abstract
It is of vital importance to understand the population structure, dissect the genetic bases of performance traits, and make proper strategies for selection in breeding programs. However, there is no single webserver covering the specific needs in aquaculture. We present Aquaculture Molecular Breeding Platform (AMBP), the first web server for genetic data analysis in aquatic species of farming interest. AMBP integrates the haplotype reference panels of 18 aquaculture species, which greatly improves the accuracy of genotype imputation. It also supports multiple tools to infer genetic structures, dissect the genetic architecture of performance traits, estimate breeding values, and predict optimum contribution. All the tools are coherently linked in a web-interface for users to generate interpretable results and evaluate statistical appropriateness. The webserver supports standard VCF and PLINK (PED, MAP) files, and implements automated pipelines for format transformation and visualization to simplify the process of analysis. As a demonstration, we applied the webserver to Pacific white shrimp and Atlantic salmon datasets. In summary, AMBP constitutes comprehensive resources and analytical tools for exploring genetic data and guiding practical breeding programs. AMBP is available at http://mgb.qnlm.ac.
Affiliation(s)
- Qifan Zeng
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Oceanog Inst, Ocean Univ China, Sanya 572000, Peoples R China.,Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Baojun Zhao
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Hao Wang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Mengqiu Wang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Mingxuan Teng
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China
| | - Jingjie Hu
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Oceanog Inst, Ocean Univ China, Sanya 572000, Peoples R China
| | - Zhenmin Bao
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Key Laboratory of Tropical Aquatic Germplasm of Hainan Province, Sanya Oceanog Inst, Ocean Univ China, Sanya 572000, Peoples R China.,Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
| | - Yangfan Wang
- MOE Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China.,Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
|
49
|
See GM, Fix JS, Schwab CR, Spangler ML. Imputation of non-genotyped F1 dams to improve genetic gain in swine crossbreeding programs. J Anim Sci 2022; 100:6572187. [PMID: 35451025 PMCID: PMC9126202 DOI: 10.1093/jas/skac148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Accepted: 04/20/2022] [Indexed: 11/12/2022] Open
Abstract
This study investigated using imputed genotypes from non-genotyped animals which were not in the pedigree for the purpose of genetic selection and improving genetic gain for economically relevant traits. Simulations were used to mimic a 3-breed crossbreeding system that resembled a modern swine breeding scheme. The simulation consisted of three purebred (PB) breeds A, B, and C each with 25 and 425 mating males and females, respectively. Males from A and females from B were crossed to produce AB females (n = 1,000), which were crossed with males from C to produce crossbreds (CB; n = 10,000). The genome consisted of three chromosomes with 300 quantitative trait loci and ~9,000 markers. Lowly heritable reproductive traits were simulated for A, B, and AB (h2 = 0.2, 0.2, and 0.15, respectively), whereas a moderately heritable carcass trait was simulated for C (h2 = 0.4). Genetic correlations between reproductive traits in A, B, and AB were moderate (rg = 0.65). The goal trait of the breeding program was AB performance. Selection was practiced for four generations where AB and CB animals were first produced in generations 1 and 2, respectively. Non-genotyped AB dams were imputed using FImpute beginning in generation 2. Genotypes of PB and CB were used for imputation. Imputation strategies differed by three factors: 1) AB progeny genotyped per generation (2, 3, 4, or 6), 2) known or unknown mates of AB dams, and 3) genotyping rate of females from breeds A and B (0% or 100%). PB selection candidates from A and B were selected using estimated breeding values for AB performance, whereas candidates from C were selected by phenotype. Response to selection using imputed genotypes of non-genotyped animals was then compared to the scenarios where true AB genotypes (trueGeno) or no AB genotypes/phenotypes (noGeno) were used in genetic evaluations. The simulation was replicated 20 times. 
The average increase in genotype concordance between unknown and known sire imputation strategies was 0.22. Genotype concordance increased as the number of genotyped CB increased with little additional gain beyond 9 progeny. When mates of AB were known and more than 4 progeny were genotyped per generation, the phenotypic response in AB did not differ (P > 0.05) from trueGeno yet was greater (P < 0.05) than noGeno. Imputed genotypes of non-genotyped animals can be used to increase performance when 4 or more progeny are genotyped and sire pedigrees of CB animals are known.
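Genotype concordance, as reported in the abstract above, can be illustrated as the fraction of genotype calls that agree between the true and the imputed genotypes of an animal. The Python sketch below is an assumption about the general idea, not the authors' exact computation; the function name genotype_concordance, the 0/1/2 coding, and the use of None for missing calls are hypothetical, and the code is unrelated to FImpute itself.

```python
def genotype_concordance(true_calls, imputed_calls):
    """Fraction of markers where the imputed genotype equals the true one.

    Genotypes are coded 0/1/2 (copies of one allele); None marks a
    missing call and that marker is skipped. Hypothetical helper for
    illustration only.
    """
    if len(true_calls) != len(imputed_calls):
        raise ValueError("genotype vectors must have the same length")

    # Keep only markers that are called in both vectors.
    pairs = [(t, i) for t, i in zip(true_calls, imputed_calls)
             if t is not None and i is not None]
    if not pairs:
        raise ValueError("no markers with calls in both vectors")

    matches = sum(1 for t, i in pairs if t == i)
    return matches / len(pairs)


if __name__ == "__main__":
    # Hypothetical AB dam: true genotypes vs. genotypes imputed from progeny.
    true_dam = [0, 1, 2, 1, 0, 2, 1, None]
    imputed_dam = [0, 1, 1, 1, 0, 2, 2, 1]
    print(f"concordance = {genotype_concordance(true_dam, imputed_dam):.3f}")
```

Comparing this value between scenarios (for example, known versus unknown mates) gives the kind of difference in concordance that the abstract summarises.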
Affiliation(s)
- Garrett M See
- Department of Animal Science, University of Nebraska - Lincoln, Lincoln, NE 68588, USA
| | | | | | - Matthew L Spangler
- Department of Animal Science, University of Nebraska - Lincoln, Lincoln, NE 68588, USA
| |
|
50
|
Chen L, Yang S, Araya S, Quigley C, Taliercio E, Mian R, Specht JE, Diers BW, Song Q. Genotype imputation for soybean nested association mapping population to improve precision of QTL detection. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2022; 135:1797-1810. [PMID: 35275252 PMCID: PMC9110473 DOI: 10.1007/s00122-022-04070-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Accepted: 02/25/2022] [Indexed: 06/14/2023]
Abstract
KEY MESSAGE: Software that achieves high imputation accuracy in soybean was identified. An imputed dataset could significantly reduce the intervals of genomic regions controlling traits, thus greatly improving the efficiency of candidate gene identification. Genotype imputation is a strategy to increase the marker density of existing datasets without additional genotyping. We compared the imputation performance of BEAGLE 5.0, IMPUTE 5 and AlphaPlantImpute and tested software parameters that may help to improve imputation accuracy in soybean populations. Several factors, including marker density, extent of linkage disequilibrium (LD), minor allele frequency (MAF), etc., were examined for their effects on imputation accuracy across the different software. Our results showed that AlphaPlantImpute had higher imputation accuracy than BEAGLE 5.0 or IMPUTE 5 in each soybean family tested, especially when the study progeny were genotyped with an extremely low number of markers. LD extent, MAF and reference panel size were positively correlated with imputation accuracy; a minimum of 50 markers per chromosome and a MAF of SNPs > 0.2 in soybean lines were required to avoid a significant loss of imputation accuracy. Using the software, we imputed 5176 soybean lines in the soybean nested association mapping (NAM) population with high-density markers of the 40 parents. The dataset containing 423,419 markers for 5176 lines and 40 parents was deposited at SoyBase. The imputed NAM dataset was further examined for improvement in mapping quantitative trait loci (QTL) controlling soybean seed protein content. Most of the QTL identified were at identical or similar positions in the initial and imputed datasets; however, QTL intervals were greatly narrowed. The resulting genotypic dataset of the NAM population will facilitate QTL mapping of traits and downstream applications. The information will also help to improve genotype imputation accuracy in self-pollinated crops.
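The two thresholds quoted above (MAF > 0.2 and at least 50 markers per chromosome) can be read as a pre-imputation check. The sketch below is only an illustration under stated assumptions: genotypes are coded as alternate-allele counts (0/1/2, with None for missing calls), and the function names, data layout and chromosome labels are hypothetical rather than part of the authors' pipeline or of any of the software compared.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Compute MAF for one marker from 0/1/2 allele counts (None = missing call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def chromosomes_below_marker_minimum(markers, min_maf=0.2, min_markers=50):
    """List chromosomes with fewer than `min_markers` markers passing the MAF filter.

    `markers` is an iterable of (chromosome, genotypes) pairs, one per marker.
    The default thresholds are the values quoted in the abstract.
    """
    passing = Counter()
    chromosomes = set()
    for chromosome, genotypes in markers:
        chromosomes.add(chromosome)
        if minor_allele_frequency(genotypes) > min_maf:
            passing[chromosome] += 1
    return sorted(c for c in chromosomes if passing[c] < min_markers)


# Hypothetical toy data: two markers on "Gm01", one monomorphic marker on "Gm02".
toy_markers = [
    ("Gm01", [0, 0, 2, 2]),   # MAF 0.5, passes the filter
    ("Gm01", [0, 0, 0, 0]),   # monomorphic, filtered out
    ("Gm02", [0, 0, 0, 0]),   # monomorphic, filtered out
]
# With a toy minimum of 1 marker per chromosome, only "Gm02" is flagged.
print(chromosomes_below_marker_minimum(toy_markers, min_markers=1))  # ['Gm02']
```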
Affiliation(s)
- Linfeng Chen
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, MD, 20705, USA
- National Center for Soybean Improvement, Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, College of Agriculture, Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, China
| | - Shouping Yang
- National Center for Soybean Improvement, Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, College of Agriculture, Soybean Research Institute, Nanjing Agricultural University, Nanjing, 210095, China.
| | - Susan Araya
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, MD, 20705, USA
| | - Charles Quigley
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, MD, 20705, USA
| | - Earl Taliercio
- Soybean and Nitrogen Fixation Research, USDA-ARS, Raleigh, NC, 27607, USA
| | - Rouf Mian
- Soybean and Nitrogen Fixation Research, USDA-ARS, Raleigh, NC, 27607, USA
| | - James E Specht
- Department of Agronomy and Horticulture, University of Nebraska, Lincoln, NE, 68583, USA
| | - Brian W Diers
- Department of Crop Sciences, National Soybean Research Center, University of Illinois, 1101 West Peabody Drive, Urbana, IL, 61801, USA
| | - Qijian Song
- Soybean Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, MD, 20705, USA.
| |
|