1
Chattopadhyay A, Lee CY, Shen YC, Lu KC, Hsiao TH, Lin CH, La LC, Tsai MH, Lu TP, Chuang EY. Multi-ethnic imputation system (MI-System): a genotype imputation server for high-dimensional data. J Biomed Inform 2023:104423. [PMID: 37308034 DOI: 10.1016/j.jbi.2023.104423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 05/11/2023] [Accepted: 06/09/2023] [Indexed: 06/14/2023]
Abstract
OBJECTIVE Genotype imputation is a commonly used technique that infers untyped variants into a study's genotype data, allowing better identification of causal variants in disease studies. However, because studies of Caucasian populations are overrepresented, the genetic basis of health outcomes in other ethnic populations is poorly understood. Therefore, facilitating imputation of missing key predictor variants that can potentially improve a health-outcome risk prediction model, specifically for Asian ancestry, is of utmost relevance. METHODS We aimed to construct an imputation and analysis web platform that primarily facilitates, but is not limited to, genotype imputation on East Asians. The goal is to provide a collaborative imputation platform for researchers in the public domain, enabling rapid, efficient, and accurate genotype imputation. RESULTS We present an online genotype imputation platform, Multi-ethnic Imputation System (MI-System) (https://misystem.cgm.ntu.edu.tw/), that offers users 3 established pipelines, SHAPEIT2-IMPUTE2, SHAPEIT4-IMPUTE5, and Beagle5.1, for conducting imputation analyses. In addition to 1000 Genomes and HapMap3, a new customized Taiwan Biobank (TWB) reference panel, specifically created for Taiwanese-Chinese ancestry, is provided. MI-System further offers functions to create customized reference panels to be used for imputation, conduct quality control, split whole-genome data into chromosomes, and convert genome builds. CONCLUSION Users can upload their genotype data and perform imputation with minimum effort and resources. The utility functions can further be used to preprocess user-uploaded data with a few clicks. MI-System potentially contributes to Asian-population genetics research, while eliminating the requirement for high-performance computational resources and bioinformatics expertise. It will enable an increased pace of research and provide a knowledge base for genetic carriers of complex diseases, thereby greatly enhancing patient-driven research. STATEMENT OF SIGNIFICANCE Multi-ethnic Imputation System (MI-System) primarily facilitates, but is not limited to, imputation on East Asians, through 3 established prephasing-imputation pipelines, SHAPEIT2-IMPUTE2, SHAPEIT4-IMPUTE5, and Beagle5.1, where users can upload their genotype data and perform imputation and other utility functions with minimum effort and resources. A new customized Taiwan Biobank (TWB) reference panel, specifically created for Taiwanese-Chinese ancestry, is provided. Utility functions include (a) creating customized reference panels, (b) conducting quality control, (c) splitting whole-genome data into chromosomes, and (d) converting genome builds. Users can also combine 2 reference panels using the system and use the combined panel as the reference for imputation with MI-System.
Affiliation(s)
- Amrita Chattopadhyay
- Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan
| | - Chien-Yueh Lee
- Master Program for Biomedical Engineering, College of Biomedical Engineering, China Medical University, Taichung 40402, Taiwan
| | - Ying-Cheng Shen
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
| | - Kuan-Chen Lu
- Department of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei 10055, Taiwan
| | - Tzu-Hung Hsiao
- Department of Medical Research, Taichung Veterans General Hospital, Taiwan
| | - Ching-Heng Lin
- Department of Medical Research, Taichung Veterans General Hospital, Taiwan
| | - Liang-Chuan La
- Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Graduate Institute of Physiology, College of Medicine, National Taiwan University, Taipei 10051, Taiwan
| | - Mong-Hsun Tsai
- Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Institute of Biotechnology, National Taiwan University, Taipei 10672, Taiwan; Center of Biotechnology, National Taiwan University, Taipei 10672, Taiwan
| | - Tzu-Pin Lu
- Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Department of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei 10055, Taiwan
| | - Eric Y Chuang
- Bioinformatics and Biostatistics Core, Centre of Genomic and Precision Medicine, National Taiwan University, Taipei 10055, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan; Biomedical Technology and Device Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan.
2
Sun Q, Liu W, Rosen JD, Huang L, Pace RG, Dang H, Gallins PJ, Blue EE, Ling H, Corvol H, Strug LJ, Bamshad MJ, Gibson RL, Pugh EW, Blackman SM, Cutting GR, O'Neal WK, Zhou YH, Wright FA, Knowles MR, Wen J, Li Y. Leveraging TOPMed imputation server and constructing a cohort-specific imputation reference panel to enhance genotype imputation among cystic fibrosis patients. HGG ADVANCES 2022; 3:100090. [PMID: 35128485 PMCID: PMC8804187 DOI: 10.1016/j.xhgg.2022.100090] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/21/2021] [Accepted: 01/06/2022] [Indexed: 11/25/2022] Open
Abstract
Cystic fibrosis (CF) is a severe genetic disorder that can cause multiple comorbidities affecting the lungs, the pancreas, the luminal digestive system and beyond. In our previous genome-wide association studies (GWAS), we genotyped approximately 8,000 CF samples using a mixture of different genotyping platforms. More recently, the Cystic Fibrosis Genome Project (CFGP) performed deep (approximately 30×) whole genome sequencing (WGS) of 5,095 samples to better understand the genetic mechanisms underlying clinical heterogeneity among patients with CF. For mixtures of GWAS array and WGS data, genotype imputation has proven effective in increasing effective sample size. Therefore, we first performed imputation for the approximately 8,000 CF samples with GWAS array genotype using the Trans-Omics for Precision Medicine (TOPMed) freeze 8 reference panel. Our results demonstrate that TOPMed can provide high-quality imputation for patients with CF, boosting genomic coverage from approximately 0.3-4.2 million genotyped markers to approximately 11-43 million well-imputed markers, and significantly improving polygenic risk score (PRS) prediction accuracy. Furthermore, we built a CF-specific CFGP reference panel based on WGS data of patients with CF. We demonstrate that despite having approximately 3% the sample size of TOPMed, our CFGP reference panel can still outperform TOPMed when imputing some CF disease-causing variants, likely owing to allele and haplotype differences between patients with CF and general populations. We anticipate our imputed data for 4,656 samples without WGS data will benefit our subsequent genetic association studies, and the CFGP reference panel built from CF WGS samples will benefit other investigators studying CF.
Affiliation(s)
- Quan Sun
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Weifang Liu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jonathan D. Rosen
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Le Huang
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Rhonda G. Pace
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Hong Dang
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Paul J. Gallins
- Bioinformatics Research Center and Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
| | - Elizabeth E. Blue
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Brotman Baty Institute, Seattle, WA 98195, USA
| | - Hua Ling
- Center for Inherited Disease Research (CIDR), Johns Hopkins University, Baltimore, MD 21205, USA
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Harriet Corvol
- Sorbonne Université, Inserm, Centre de Recherche Saint-Antoine, Assistance Publique-Hôpitaux de Paris (APHP), Hôpital Trousseau, Service de Pneumologie Pédiatrique, Paris, France
| | - Lisa J. Strug
- Departments of Statistical Sciences and Computer Science and Division of Biostatistics, University of Toronto, Toronto, ON, Canada
- Program in Genetics and Genome Biology and The Centre for Applied Genomics, The Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
| | - Michael J. Bamshad
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Division of Genetic Medicine, Seattle Children's Hospital, Seattle, WA 98105, USA
- Brotman Baty Institute, Seattle, WA 98195, USA
| | - Ronald L. Gibson
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
| | - Elizabeth W. Pugh
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Scott M. Blackman
- Division of Pediatric Endocrinology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Garry R. Cutting
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Wanda K. O'Neal
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yi-Hui Zhou
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Fred A. Wright
- Bioinformatics Research Center and Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
| | - Michael R. Knowles
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Jia Wen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Cystic Fibrosis Genome Project
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Marsico Lung Institute/UNC CF Research Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Bioinformatics Research Center and Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Center for Inherited Disease Research (CIDR), Johns Hopkins University, Baltimore, MD 21205, USA
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Sorbonne Université, Inserm, Centre de Recherche Saint-Antoine, Assistance Publique-Hôpitaux de Paris (APHP), Hôpital Trousseau, Service de Pneumologie Pédiatrique, Paris, France
- Departments of Statistical Sciences and Computer Science and Division of Biostatistics, University of Toronto, Toronto, ON, Canada
- Program in Genetics and Genome Biology and The Centre for Applied Genomics, The Hospital for Sick Children, University of Toronto, Toronto, ON, Canada
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Division of Genetic Medicine, Seattle Children's Hospital, Seattle, WA 98105, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Division of Pediatric Endocrinology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Brotman Baty Institute, Seattle, WA 98195, USA
3
Gong J, He G, Wang C, Bartlett C, Panjwani N, Mastromatteo S, Lin F, Keenan K, Avolio J, Halevy A, Shaw M, Esmaeili M, Côté-Maurais G, Adam D, Bégin S, Bjornson C, Chilvers M, Reisman J, Price A, Parkins M, van Wylick R, Berthiaume Y, Bilodeau L, Mateos-Corral D, Hughes D, Smith MJ, Morrison N, Brusky J, Tullis E, Stephenson AL, Quon BS, Wilcox P, Leung WM, Solomon M, Sun L, Brochiero E, Moraes TJ, Gonska T, Ratjen F, Rommens JM, Strug LJ. Genetic evidence supports the development of SLC26A9 targeting therapies for the treatment of lung disease. NPJ Genom Med 2022; 7:28. [PMID: 35396391 PMCID: PMC8993824 DOI: 10.1038/s41525-022-00299-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Accepted: 03/04/2022] [Indexed: 12/19/2022] Open
Abstract
Over 400 variants in the cystic fibrosis (CF) transmembrane conductance regulator (CFTR) are CF-causing. CFTR modulators target variants to improve lung function, but marked variability in response exists and current therapies do not address all CF-causing variants, highlighting unmet needs. Alternative epithelial ion channels/transporters such as SLC26A9 could compensate for CFTR dysfunction, providing therapeutic targets that may benefit all individuals with CF. We investigate the relationship between rs7512462, a marker of SLC26A9 activity, and lung function pre- and post-treatment with CFTR modulators in Canadian and US CF cohorts, in the general population, and in those with chronic obstructive pulmonary disease (COPD). Rs7512462 CC genotype is associated with greater lung function in CF individuals with minimal function variants (for which there are currently no approved therapies; p = 0.008); and for gating (p = 0.033) and p.Phe508del/p.Phe508del (p = 0.006) genotypes upon treatment with CFTR modulators. In parallel, human nasal epithelia with CC and p.Phe508del/p.Phe508del after Ussing chamber analysis of a combination of approved and experimental modulator treatments show greater CFTR function (p = 0.0022). Beyond CF, rs7512462 is associated with peak expiratory flow in a meta-analysis of the UK Biobank and Spirometa Consortium (p = 2.74 × 10^-44) and provides p = 0.0891 in an analysis of COPD case-control status in the UK Biobank defined by spirometry. These findings support SLC26A9 as a therapeutic target to improve lung function for all people with CF and in individuals with other obstructive lung diseases.
Affiliation(s)
- Jiafen Gong
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Gengming He
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
| | - Cheng Wang
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Claire Bartlett
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Naim Panjwani
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Scott Mastromatteo
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Fan Lin
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Katherine Keenan
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Julie Avolio
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Anat Halevy
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Michelle Shaw
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
| | - Mohsen Esmaeili
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
| | - Guillaume Côté-Maurais
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | - Damien Adam
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
| | - Stéphanie Bégin
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | | | - Mark Chilvers
- British Columbia Children's Hospital, Vancouver, BC, Canada
| | - Joe Reisman
- The Children's Hospital of Eastern Ontario, Ottawa, ON, Canada
| | - April Price
- The Children's Hospital, London Health Science Centre, London, ON, Canada
| | | | | | - Yves Berthiaume
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
| | - Lara Bilodeau
- Centre de recherche de l'Institut universitaire de cardiologie et de pneumologie de Québec-Université Laval, Québec City, QC, Canada
| | | | | | - Mary J Smith
- Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, Canada
| | - Nancy Morrison
- Queen Elizabeth II Health Sciences Centre, Halifax, NS, Canada
| | - Janna Brusky
- Department of Pediatrics, University of Saskatchewan, Saskatoon, SK, Canada
| | | | | | | | | | | | - Melinda Solomon
- Respiratory Medicine, Hospital for Sick Children, Toronto, ON, Canada
- Department of Paediatrics, University of Toronto, Toronto, ON, Canada
| | - Lei Sun
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
| | - Emmanuelle Brochiero
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
- Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
| | - Theo J Moraes
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Respiratory Medicine, Hospital for Sick Children, Toronto, ON, Canada
| | - Tanja Gonska
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Division of Gastroenterology, Hepatology and Nutrition, The Hospital for Sick Children, Toronto, ON, Canada
| | - Felix Ratjen
- Program in Translational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Paediatrics, University of Toronto, Toronto, ON, Canada
| | - Johanna M Rommens
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
| | - Lisa J Strug
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada.
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
- Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada.
- The Centre for Applied Genomics, Hospital for Sick Children, Toronto, ON, Canada.
- Department of Computer Science, University of Toronto, Toronto, ON, Canada.
4
Dang H, Polineni D, Pace RG, Stonebraker JR, Corvol H, Cutting GR, Drumm ML, Strug LJ, O’Neal WK, Knowles MR. Mining GWAS and eQTL data for CF lung disease modifiers by gene expression imputation. PLoS One 2020; 15:e0239189. [PMID: 33253230 PMCID: PMC7703903 DOI: 10.1371/journal.pone.0239189] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2019] [Accepted: 09/02/2020] [Indexed: 12/18/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified several genomic loci with candidate modifiers of cystic fibrosis (CF) lung disease, but only a small proportion of the expected genetic contribution is accounted for at these loci. We leveraged expression data from CF cohorts and Genotype-Tissue Expression (GTEx) reference data sets from multiple human tissues to generate predictive models, which were used to impute transcriptional regulation from genetic variance in our GWAS population. The imputed gene expression was tested for association with CF lung disease severity. By comparing and combining results from alternative approaches, we identified 379 candidate modifier genes. We delved into 52 modifier candidates that showed consensus between approaches, and 28 of them were near known GWAS loci. A number of these genes are implicated in the pathophysiology of CF lung disease (e.g., immunity, infection, inflammation, HLA pathways, glycosylation, and mucociliary clearance) and in CFTR protein biology (e.g., cytoskeleton, microtubule, mitochondrial function, lipid metabolism, endoplasmic reticulum/Golgi, and ubiquitination). Gene set enrichment results are consistent with current knowledge of CF lung disease pathogenesis. HLA Class II genes on chr6, and CEP72, EXOC3, and TPPP near the GWAS peak on chr5, are most consistently associated with CF lung disease severity across the tissues tested. The results help to prioritize genes in the GWAS regions, predict direction of gene expression regulation, and identify new candidate modifiers throughout the genome for potential therapeutic development.
Affiliation(s)
- Hong Dang
- Marsico Lung Institute, University of North Carolina at Chapel Hill School of Medicine Cystic Fibrosis/Pulmonary Research & Treatment Center, Chapel Hill, North Carolina, United States of America
| | - Deepika Polineni
- University of Kansas Medical Center, Kansas City, Kansas, United States of America
| | - Rhonda G. Pace
- Marsico Lung Institute, University of North Carolina at Chapel Hill School of Medicine Cystic Fibrosis/Pulmonary Research & Treatment Center, Chapel Hill, North Carolina, United States of America
| | - Jaclyn R. Stonebraker
- Marsico Lung Institute, University of North Carolina at Chapel Hill School of Medicine Cystic Fibrosis/Pulmonary Research & Treatment Center, Chapel Hill, North Carolina, United States of America
| | - Harriet Corvol
- Pediatric Pulmonary Department, Assistance Publique-Hôpitaux de Paris (AP-HP), Hôpital Trousseau, Institut National de la Santé et la Recherche Médicale (INSERM) U938, Paris, France
- Sorbonne Universités, Université Pierre et Marie Curie (UPMC), Paris 6, Paris, France
| | - Garry R. Cutting
- McKusick-Nathans Institute of Genetic Medicine, Baltimore, Maryland, United States of America
- Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Mitchell L. Drumm
- Department of Pediatrics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Lisa J. Strug
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
| | - Wanda K. O’Neal
- Marsico Lung Institute, University of North Carolina at Chapel Hill School of Medicine Cystic Fibrosis/Pulmonary Research & Treatment Center, Chapel Hill, North Carolina, United States of America
| | - Michael R. Knowles
- Marsico Lung Institute, University of North Carolina at Chapel Hill School of Medicine Cystic Fibrosis/Pulmonary Research & Treatment Center, Chapel Hill, North Carolina, United States of America
5
Panjwani N, Wang F, Mastromatteo S, Bao A, Wang C, He G, Gong J, Rommens JM, Sun L, Strug LJ. LocusFocus: Web-based colocalization for the annotation and functional follow-up of GWAS. PLoS Comput Biol 2020; 16:e1008336. [PMID: 33090994 PMCID: PMC7608978 DOI: 10.1371/journal.pcbi.1008336] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/24/2020] [Revised: 11/03/2020] [Accepted: 09/13/2020] [Indexed: 01/10/2023] Open
Abstract
Genome-wide association studies (GWAS) have primarily identified trait-associated loci in the non-coding genome. Colocalization analyses of SNP associations from GWAS with expression quantitative trait loci (eQTL) evidence enable the generation of hypotheses about responsible mechanism, genes and tissues of origin to guide functional characterization. Here, we present a web-based colocalization browsing and testing tool named LocusFocus (https://locusfocus.research.sickkids.ca). LocusFocus formally tests colocalization using our established Simple Sum method to identify the most relevant genes and tissues for a particular GWAS locus in the presence of high linkage disequilibrium and/or allelic heterogeneity. We demonstrate the utility of LocusFocus, following up on a genome-wide significant locus from a GWAS of meconium ileus (an intestinal obstruction in cystic fibrosis). Using LocusFocus for colocalization analysis with eQTL data suggests variation in ATP12A gene expression in the pancreas rather than intestine is responsible for the GWAS locus. LocusFocus has no operating system dependencies and may be installed in a local web server. LocusFocus is available under the MIT license, with full documentation and source code accessible on GitHub at https://github.com/naim-panjwani/LocusFocus.
Affiliation(s)
- Naim Panjwani
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Fan Wang
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
| | - Scott Mastromatteo
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Allen Bao
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Cheng Wang
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Gengming He
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Jiafen Gong
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Johanna M. Rommens
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
| | - Lei Sun
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
| | - Lisa J. Strug
- Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
6
Baskurt Z, Mastromatteo S, Gong J, Wintle RF, Scherer SW, Strug LJ. VikNGS: a C++ variant integration kit for next generation sequencing association analysis. Bioinformatics 2020; 36:1283-1285. [PMID: 31580400 PMCID: PMC7703770 DOI: 10.1093/bioinformatics/btz716] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/28/2019] [Revised: 08/13/2019] [Accepted: 09/25/2019] [Indexed: 11/14/2022] Open
Abstract
SUMMARY Integration of next-generation sequencing (NGS) data across different research studies can improve the power of genetic association testing by increasing sample size and can obviate the need for sequencing controls. If differential genotype uncertainty across studies is not accounted for, combining datasets can produce spurious association results. We developed the Variant Integration Kit for NGS (VikNGS), a fast cross-platform software package, to enable aggregation of several datasets for rare and common variant genetic association analysis of quantitative and binary traits with covariate adjustment. VikNGS also includes a graphical user interface, power simulation functionality and data visualization tools. AVAILABILITY AND IMPLEMENTATION The VikNGS package can be downloaded at http://www.tcag.ca/tools/index.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zeynep Baskurt
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada
| | - Scott Mastromatteo
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada
| | - Jiafen Gong
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada
| | - Richard F Wintle
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada
| | - Stephen W Scherer
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, ON M5G 0A4, Canada
| | - Lisa J Strug
- Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; The Centre for Applied Genomics, The Hospital for Sick Children, Toronto, ON M5G0A4, Canada; Division of Biostatistics and Department of Statistical Sciences, University of Toronto, Toronto, ON, M5T3M7, Canada
7
Genetic association and transcriptome integration identify contributing genes and tissues at cystic fibrosis modifier loci. PLoS Genet 2019; 15:e1008007. [PMID: 30807572 PMCID: PMC6407791 DOI: 10.1371/journal.pgen.1008007] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 08/27/2018] [Revised: 03/08/2019] [Accepted: 02/06/2019] [Indexed: 01/09/2023] Open
Abstract
Cystic Fibrosis (CF) exhibits morbidity in several organs, including progressive lung disease in all patients and intestinal obstruction at birth (meconium ileus) in ~15%. Individuals with the same causal CFTR mutations show variable disease presentation which is partly attributed to modifier genes. With >6,500 participants from the International CF Gene Modifier Consortium, genome-wide association investigation identified a new modifier locus for meconium ileus encompassing ATP12A on chromosome 13 (min p = 3.83 × 10^-10); replicated loci encompassing SLC6A14 on chromosome X and SLC26A9 on chromosome 1 (min p < 2.2 × 10^-16 and 2.81 × 10^-11, respectively); and replicated a suggestive locus on chromosome 7 near PRSS1 (min p = 2.55 × 10^-7). PRSS1 is exclusively expressed in the exocrine pancreas and was previously associated with non-CF pancreatitis with functional characterization demonstrating impact on PRSS1 gene expression. We thus asked whether the other meconium ileus modifier loci impact gene expression and in which organ. We developed and applied a colocalization framework called the Simple Sum (SS) that integrates regulatory and genetic association information, and also contrasts colocalization evidence across tissues or genes. The associated modifier loci colocalized with expression quantitative trait loci (eQTLs) for ATP12A (p = 3.35 × 10^-8), SLC6A14 (p = 1.12 × 10^-10) and SLC26A9 (p = 4.48 × 10^-5) in the pancreas, even though meconium ileus manifests in the intestine. The meconium ileus susceptibility locus on chromosome X appeared shifted in location from a previously identified locus for CF lung disease severity. Using the SS we integrated the lung disease association locus with eQTLs from nasal epithelia of 63 CF participants and demonstrated evidence of colocalization with airway-specific regulation of SLC6A14 (p = 2.3 × 10^-4). Cystic Fibrosis is realizing the promise of personalized medicine, and identification of the contributing organ and understanding of tissue specificity for a gene modifier is essential for the next phase of personalizing therapeutic strategies.