1
Hamilton MC, Fife JD, Akinci E, Yu T, Khowpinitchai B, Cha M, Barkal S, Thi TT, Yeo GH, Ramos Barroso JP, Francoeur MJ, Velimirovic M, Gifford DK, Lettre G, Yu H, Cassa CA, Sherwood RI. Systematic elucidation of genetic mechanisms underlying cholesterol uptake. Cell Genom 2023; 3:100304. [PMID: 37228746 PMCID: PMC10203276 DOI: 10.1016/j.xgen.2023.100304] [Received: 06/27/2022] [Revised: 12/02/2022] [Accepted: 03/24/2023] [Indexed: 05/27/2023]
Abstract
Genetic variation contributes greatly to LDL cholesterol (LDL-C) levels and coronary artery disease risk. By combining analysis of rare coding variants from the UK Biobank and genome-scale CRISPR-Cas9 knockout and activation screening, we substantially improve the identification of genes whose disruption alters serum LDL-C levels. We identify 21 genes in which rare coding variants significantly alter LDL-C levels at least partially through altered LDL-C uptake. We use co-essentiality-based gene module analysis to show that dysfunction of the RAB10 vesicle transport pathway leads to hypercholesterolemia in humans and mice by impairing surface LDL receptor levels. Further, we demonstrate that loss of function of OTX2 leads to robust reduction in serum LDL-C levels in mice and humans by increasing cellular LDL-C uptake. Altogether, we present an integrated approach that improves our understanding of the genetic regulators of LDL-C levels and provides a roadmap for further efforts to dissect complex human disease genetics.
Affiliation(s)
- Marisa C. Hamilton
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- James D. Fife
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Ersin Akinci
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Tian Yu
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Benyapa Khowpinitchai
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Minsun Cha
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Sammy Barkal
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Thi Tun Thi
- Precision Medicine Research Programme, Cardiovascular Disease Research Programme, and Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Grace H.T. Yeo
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biological Engineering, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Juan Pablo Ramos Barroso
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Matthew Jake Francoeur
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Minja Velimirovic
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- David K. Gifford
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biological Engineering, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Guillaume Lettre
- Montreal Heart Institute, Montréal, QC H1T 1C8, Canada
- Faculté de Médecine, Université de Montréal, Montréal, QC H3T 1J4, Canada
- Haojie Yu
- Precision Medicine Research Programme, Cardiovascular Disease Research Programme, and Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Christopher A. Cassa
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Richard I. Sherwood
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
2
Abstract
BACKGROUND: Genome sequencing efforts for individuals with rare Mendelian disease have increased the research focus on the noncoding genome and the clinical need for methods that prioritize potentially disease-causal noncoding variants. Some tools for assessing variant pathogenicity, as well as some annotations, are not available for the current human genome build (GRCh38), whose adoption in databases, software, and pipelines has been slow. RESULTS: Here, we present an updated version of the Regulatory Mendelian Mutation (ReMM) score, retrained on features and variants derived from the GRCh38 genome build. Like its GRCh37 version, it achieves good performance on its highly imbalanced data. To improve accessibility and provide users with a toolbox to score their variant files and look up scores in the genome, we developed a website and API for easy score lookup. CONCLUSIONS: Scores of the GRCh38 genome build are highly correlated with the prior release, with a performance increase due to better feature coverage. For prioritization of noncoding mutations in imbalanced datasets, the ReMM score performed much better than other variation scores. Prescored whole-genome files of GRCh37 and GRCh38 genome builds are cited in the article and the website; UCSC genome browser tracks and an API are available at https://remm.bihealth.org.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
- Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
- Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité–Universitätsmedizin Berlin, 10117 Berlin, Germany
- Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, 23562 Lübeck, Germany
3
Zhang S, Cooper-Knock J, Weimer AK, Shi M, Kozhaya L, Unutmaz D, Harvey C, Julian TH, Furini S, Frullanti E, Fava F, Renieri A, Gao P, Shen X, Timpanaro IS, Kenna KP, Baillie JK, Davis MM, Tsao PS, Snyder MP. Multiomic analysis reveals cell-type-specific molecular determinants of COVID-19 severity. Cell Syst 2022; 13:598-614.e6. [PMID: 35690068 PMCID: PMC9163145 DOI: 10.1016/j.cels.2022.05.007] [Received: 12/08/2021] [Revised: 04/02/2022] [Accepted: 05/18/2022] [Indexed: 01/26/2023]
Abstract
The determinants of severe COVID-19 in healthy adults are poorly understood, which limits the opportunity for early intervention. We present a multiomic analysis using machine learning to characterize the genomic basis of COVID-19 severity. We use single-cell multiome profiling of human lungs to link genetic signals to cell-type-specific functions. We discover >1,000 risk genes across 19 cell types, which account for 77% of the SNP-based heritability for severe disease. Genetic risk is particularly focused within natural killer (NK) cells and T cells, placing the dysfunction of these cells upstream of severe disease. Mendelian randomization and single-cell profiling of human NK cells support the role of NK cells and further localize genetic risk to CD56bright NK cells, which are key cytokine producers during the innate immune response. Rare variant analysis confirms the enrichment of severe-disease-associated genetic variation within NK-cell risk genes. Our study provides insights into the pathogenesis of severe COVID-19 with potential therapeutic targets.
Affiliation(s)
- Sai Zhang
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; VA Palo Alto Epidemiology Research and Information Center for Genomics, VA Palo Alto Health Care System, Palo Alto, CA 94304, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA; Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Johnathan Cooper-Knock
- Sheffield Institute for Translational Neuroscience, University of Sheffield, Sheffield S10 2HQ, UK
- Annika K Weimer
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Minyi Shi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lina Kozhaya
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Derya Unutmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Calum Harvey
- Sheffield Institute for Translational Neuroscience, University of Sheffield, Sheffield S10 2HQ, UK
- Thomas H Julian
- Sheffield Institute for Translational Neuroscience, University of Sheffield, Sheffield S10 2HQ, UK
- Simone Furini
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Elisa Frullanti
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy; Medical Genetics, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Francesca Fava
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy; Medical Genetics, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy; Genetica Medica, Azienda Ospedaliero-Universitaria Senese, 53100 Siena, Italy
- Alessandra Renieri
- Med Biotech Hub and Competence Center, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy; Medical Genetics, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy; Genetica Medica, Azienda Ospedaliero-Universitaria Senese, 53100 Siena, Italy
- Peng Gao
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Xiaotao Shen
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ilia Sarah Timpanaro
- Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, 3584 CX Utrecht, the Netherlands
- Kevin P Kenna
- Department of Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht, 3584 CX Utrecht, the Netherlands
- J Kenneth Baillie
- Roslin Institute, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK; MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK; Intensive Care Unit, Royal Infirmary of Edinburgh, Edinburgh EH16 4SA, UK
- Mark M Davis
- Institute for Immunity, Transplantation and Infection, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA; Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Philip S Tsao
- VA Palo Alto Epidemiology Research and Information Center for Genomics, VA Palo Alto Health Care System, Palo Alto, CA 94304, USA; Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA.
- Michael P Snyder
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Center for Genomics and Personalized Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA; Stanford Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA 94305, USA.
4
Anfinson M, Fitts RH, Lough JW, James JM, Simpson PM, Handler SS, Mitchell ME, Tomita-Mitchell A. Significance of α-Myosin Heavy Chain (MYH6) Variants in Hypoplastic Left Heart Syndrome and Related Cardiovascular Diseases. J Cardiovasc Dev Dis 2022; 9. [PMID: 35621855 DOI: 10.3390/jcdd9050144] [Received: 04/06/2022] [Revised: 04/27/2022] [Accepted: 04/29/2022] [Indexed: 02/04/2023] [Open Access]
Abstract
Hypoplastic left heart syndrome (HLHS) is a severe congenital heart disease (CHD) with complex genetic inheritance. HLHS segregates with other left ventricular outflow tract (LVOT) malformations in families, and can present as either an isolated phenotype or as a feature of a larger genetic disorder. The multifactorial etiology of HLHS makes it difficult to interpret the clinical significance of genetic variants. Specific genes have been implicated in HLHS, including rare, predicted damaging MYH6 variants that are present in >10% of HLHS patients, and which have been shown to be associated with decreased transplant-free survival in our previous studies. MYH6 (α-myosin heavy chain, α-MHC) variants have been reported in HLHS and numerous other CHDs, including LVOT malformations, and may provide a genetic link to these disorders. In this paper, we outline the MYH6 variants that have been identified, discuss how bioinformatic and functional studies can inform clinical decision making, and highlight the importance of genetic testing in HLHS.
5
Teerlink CC, Miller JB, Vance EL, Staley LA, Stevens J, Tavana JP, Cloward ME, Page ML, Dayton L, Cannon-Albright LA, Kauwe JSK. Analysis of high-risk pedigrees identifies 11 candidate variants for Alzheimer's disease. Alzheimers Dement 2021; 18:307-317. [PMID: 34151536 PMCID: PMC9291865 DOI: 10.1002/alz.12397] [Received: 09/16/2020] [Revised: 04/15/2021] [Accepted: 05/11/2021] [Indexed: 11/08/2022]
Abstract
Introduction: Analysis of sequence data in high-risk pedigrees is a powerful approach to detect rare predisposition variants. Methods: Rare, shared candidate predisposition variants were identified from exome sequencing of 19 Alzheimer's disease (AD)-affected cousin pairs selected from high-risk pedigrees. Variants were further prioritized by risk association in various external datasets. Candidate variants emerging from these analyses were tested for co-segregation to additional affected relatives of the original sequenced pedigree members. Results: AD-affected high-risk cousin pairs contained 564 shared rare variants. Eleven variants spanning 10 genes were prioritized in external datasets: rs201665195 (ABCA7) and rs28933981 (TTR) were previously implicated in AD pathology; rs141402160 (NOTCH3) and rs140914494 (NOTCH3) were previously reported; rs200290640 (PIDD1) and rs199752248 (PIDD1) were present in more than one cousin pair; rs61729902 (SNAP91), rs140129800 (COX6A2, AC026471), and rs191804178 (MUC16) were not present in a longevity cohort; and rs148294193 (PELI3) and rs147599881 (FCHO1) approached significance from analysis of AD-related phenotypes. Three variants were validated via evidence of co-segregation to additional relatives (PELI3, ABCA7, and SNAP91). Discussion: These analyses support ABCA7 and TTR as AD risk genes, expand on previously reported NOTCH3 variant identification, and prioritize seven additional candidate variants.
Affiliation(s)
- Craig C Teerlink
- Department of Internal Medicine, University of Utah, Salt Lake City, Utah, USA
- Justin B Miller
- Department of Biomedical Informatics, University of Kentucky Sanders-Brown Center on Aging, Lexington, Kentucky, USA
- Lyndsay A Staley
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Jeffrey Stevens
- Department of Internal Medicine, University of Utah, Salt Lake City, Utah, USA
- Justina P Tavana
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Madeline L Page
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Louisa Dayton
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Lisa A Cannon-Albright
- Department of Internal Medicine, University of Utah, Salt Lake City, Utah, USA; George E. Wahlen Department of Veterans Affairs Medical Center, Salt Lake City, Utah, USA; Huntsman Cancer Institute, Salt Lake City, Utah, USA
- John S K Kauwe
- Department of Biology, Brigham Young University, Provo, Utah, USA
6
Bang L, Shivakumar M, Garg T, Kim D. Genetic Analysis Reveals Rare Variants in T-Cell Response Gene MR1 Associated with Poor Overall Survival after Urothelial Cancer Diagnosis. Cancers (Basel) 2021; 13:1864. [PMID: 33919687 DOI: 10.3390/cancers13081864] [Received: 01/22/2021] [Revised: 04/06/2021] [Accepted: 04/08/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
Urothelial carcinoma of the bladder (UC) is the fifth most common cancer in the United States. Germline variants, especially rare germline variants, may account for a portion of the disparity seen among patients in terms of UC incidence, presentation, and outcomes. The objectives of this study were to identify rare germline variant associations with UC incidence and to determine their association with clinical outcomes. Using exome sequencing data from the DiscovEHR UC cohort (n = 446), a European-ancestry, North American population, the complex influence of germline variants on known clinical phenotypes was analyzed using dispersion and burden metrics with regression tests. Outcomes were derived from the electronic health record (EHR) and included UC incidence, age at diagnosis, and overall survival (OS). The rare variant association tests implicated two key genes, MR1 and ADGRL2. Kaplan-Meier survival analysis revealed that individuals with MR1 germline variants had significantly worse OS than those without any (log-rank p-value = 3.46 × 10⁻⁷). Those with ADGRL2 variants were found to be slightly more likely to have UC compared to a matched control cohort (FDR q-value = 0.116). These associations highlight several candidate genes that have the potential to explain clinical disparities in UC and predict UC outcomes.
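The Kaplan-Meier comparison mentioned in this abstract can be illustrated with a minimal sketch. The follow-up times and event flags below are made-up toy values, not DiscovEHR data, and `kaplan_meier` is a hypothetical helper rather than the authors' code. It computes only the product-limit survival estimate and does not implement the log-rank test.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration for each individual (any consistent unit)
    events: 1 if death was observed at that time, 0 if the individual was censored
    Returns a list of (time, survival_probability) at each time with >= 1 death.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy example (hypothetical data): carriers vs. non-carriers of a variant.
carriers = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
non_carriers = kaplan_meier([4, 6, 9, 10, 12], [0, 1, 0, 1, 0])
for label, curve in (("carriers", carriers), ("non-carriers", non_carriers)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored individuals reduce the number at risk without lowering the survival estimate, which is why the estimator suits EHR-derived follow-up of unequal length.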
7
Read RW, Schlauch KA, Lombardi VC, Cirulli ET, Washington NL, Lu JT, Grzymski JJ. Genome-Wide Identification of Rare and Common Variants Driving Triglyceride Levels in a Nevada Population. Front Genet 2021; 12:639418. [PMID: 33763119 PMCID: PMC7982958 DOI: 10.3389/fgene.2021.639418] [Received: 12/09/2020] [Accepted: 02/12/2021] [Indexed: 01/08/2023] [Open Access]
Abstract
Clinical conditions correlated with elevated triglyceride levels are well-known: coronary heart disease, hypertension, and diabetes. Underlying genetic and phenotypic mechanisms are not fully understood, partially due to a lack of coordinated genotypic-phenotypic data. Here we use a subset of the Healthy Nevada Project, a population of 9,183 sequenced participants with longitudinal electronic health records, to examine consequences of altered triglyceride levels. Specifically, Healthy Nevada Project participants sequenced by the Helix Exome+ platform were cross-referenced to their electronic medical records to identify: (1) rare and common single-variant genome-wide associations; (2) gene-based associations using a Sequence Kernel Association Test; (3) phenome-wide associations with triglyceride levels; and (4) pleiotropic variants linked to triglyceride levels. The study identified 549 significant single-variant associations (p < 8.75 × 10⁻⁹), many in chromosome 11's triglyceride hotspot: ZPR1, BUD13, APOC3, APOA5. A well-known protective loss-of-function variant in APOC3 (R19X) was associated with a 51% decrease in triglyceride levels in the cohort. Sixteen gene-based triglyceride associations were identified; six of these genes surprisingly did not include a single variant with significant associations. Results at the variant and gene level were validated with the UK Biobank. The combination of a single-variant genome-wide association, a gene-based association method, and phenome-wide association studies identified rare and common variants, genes, and phenotypes associated with elevated triglyceride levels, some of which may have been overlooked with standard approaches.
Affiliation(s)
- Robert W. Read
- Center for Genomic Medicine, Desert Research Institute, Reno, NV, United States
- Karen A. Schlauch
- Center for Genomic Medicine, Desert Research Institute, Reno, NV, United States
- Vincent C. Lombardi
- Department of Microbiology and Immunology, School of Medicine, University of Nevada, Reno, Reno, NV, United States
- James T. Lu
- Helix Opco, LLC., San Mateo, CA, United States
- Joseph J. Grzymski
- Center for Genomic Medicine, Desert Research Institute, Reno, NV, United States
- Renown Health, Reno, NV, United States
8
Sayaman RW, Saad M, Thorsson V, Hu D, Hendrickx W, Roelands J, Porta-Pardo E, Mokrab Y, Farshidfar F, Kirchhoff T, Sweis RF, Bathe OF, Heimann C, Campbell MJ, Stretch C, Huntsman S, Graff RE, Syed N, Radvanyi L, Shelley S, Wolf D, Marincola FM, Ceccarelli M, Galon J, Ziv E, Bedognetti D. Germline genetic contribution to the immune landscape of cancer. Immunity 2021; 54:367-386.e8. [PMID: 33567262 DOI: 10.1016/j.immuni.2021.01.011] [Received: 01/20/2020] [Revised: 10/14/2020] [Accepted: 01/13/2021] [Indexed: 02/07/2023]
Abstract
Understanding the contribution of the host's genetic background to cancer immunity may lead to improved stratification for immunotherapy and to the identification of novel therapeutic targets. We investigated the effect of common and rare germline variants on 139 well-defined immune traits in ∼9000 cancer patients enrolled in TCGA. High heritability was observed for estimates of NK cell and T cell subset infiltration and for interferon signaling. Common variants of IFIH1, TMEM173 (STING1), and TMEM108 were associated with differential interferon signaling and variants mapping to RBL1 correlated with T cell subset abundance. Pathogenic or likely pathogenic variants in BRCA1 and in genes involved in telomere stabilization and Wnt-β-catenin also acted as immune modulators. Our findings provide evidence for the impact of germline genetics on the composition and functional orientation of the tumor immune microenvironment. The curated datasets, variants, and genes identified provide a resource toward further understanding of tumor-immune interactions.
Affiliation(s)
- Rosalyn W Sayaman
- Department of Population Sciences, Beckman Research Institute, City of Hope Comprehensive Cancer Center, Duarte, CA 91010, USA; Department of Laboratory Medicine, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94143, USA; Biological Sciences and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
- Mohamad Saad
- Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar; Neuroscience Research Center, Faculty of Medical Sciences, Lebanese University, Beirut, Lebanon
- Donglei Hu
- Department of Medicine, Institute for Human Genetics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94143, USA
- Wouter Hendrickx
- Research Branch, Sidra Medicine, PO Box 26999 Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
- Jessica Roelands
- Research Branch, Sidra Medicine, PO Box 26999 Doha, Qatar; Department of Surgery, Leiden University Medical Center, 2333 ZA Leiden, the Netherlands
- Eduard Porta-Pardo
- Barcelona Supercomputing Center (BSC); Josep Carreras Leukaemia Research Institute (IJC), Badalona, 08034 Barcelona, Catalonia, Spain
- Younes Mokrab
- Research Branch, Sidra Medicine, PO Box 26999 Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar; Weill Cornell Medicine, Doha, Qatar
- Farshad Farshidfar
- Department of Oncology, University of Calgary, Alberta AB T2N 4N1, Canada; Arnie Charbonneau Cancer Institute, Calgary, Alberta AB T2N 4N1, Canada; Department of Biomedical Data Science and Institute for Stem Cell Biology and Regenerative Medicine, School of Medicine, Stanford University, Stanford, CA 94305, USA; Tenaya Therapeutics, South San Francisco, CA 94080, USA
- Tomas Kirchhoff
- Perlmutter Cancer Center, New York University School of Medicine, New York University Langone Health, New York, NY 10016, USA
- Randy F Sweis
- Department of Medicine, Section of Hematology/Oncology, Committee on Clinical Pharmacology and Pharmacogenomics, Committee on Immunology, University of Chicago, Chicago, IL 60637, USA
- Oliver F Bathe
- Department of Oncology, University of Calgary, Alberta AB T2N 4N1, Canada; Arnie Charbonneau Cancer Institute, Calgary, Alberta AB T2N 4N1, Canada; Department of Surgery, University of Calgary, Calgary, Alberta AB T2N 4N1, Canada
- Michael J Campbell
- Department of Surgery, University of California, San Francisco, San Francisco, CA 94143, USA
- Cynthia Stretch
- Department of Oncology, University of Calgary, Alberta AB T2N 4N1, Canada; Arnie Charbonneau Cancer Institute, Calgary, Alberta AB T2N 4N1, Canada
- Scott Huntsman
- Department of Medicine, Institute for Human Genetics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94143, USA
- Rebecca E Graff
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94143, USA
- Najeeb Syed
- Research Branch, Sidra Medicine, PO Box 26999 Doha, Qatar; Department of Science and Technology, University of Sannio, 82100 Benevento, Italy
- Laszlo Radvanyi
- Ontario Institute for Cancer Research, Toronto, Ontario M5G 0A3, Canada
- Simon Shelley
- Department of Research and Development, Leukemia Therapeutics, LLC, Hull, MA 02045, USA
- Denise Wolf
- Department of Laboratory Medicine, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94143, USA
- Michele Ceccarelli
- Department of Electrical Engineering and Information Technology, University of Naples "Federico II," 80128 Naples, Italy; Istituto di Ricerche Genetiche "G. Salvatore," Biogem s.c.ar.l., 83031 Ariano Irpino, Italy
- Jérôme Galon
- INSERM, Laboratory of Integrative Cancer Immunology, Equipe Labellisée Ligue Contre Le Cancer, Centre de Recherche de Cordeliers, Université de Paris, Sorbonne Université, Paris, France
- Elad Ziv
- Department of Medicine, Institute for Human Genetics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA 94143, USA.
- Davide Bedognetti
- Research Branch, Sidra Medicine, PO Box 26999 Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar; Department of Internal Medicine and Medical Specialties (Di.M.I.), University of Genoa, 16132 Genoa, Italy.
9
Fore R, Boehme J, Li K, Westra J, Tintle N. Multi-Set Testing Strategies Show Good Behavior When Applied to Very Large Sets of Rare Variants. Front Genet 2020; 11:591606. [PMID: 33240333 PMCID: PMC7680887 DOI: 10.3389/fgene.2020.591606] [Received: 08/05/2020] [Accepted: 10/05/2020] [Indexed: 12/22/2022] [Open Access]
Abstract
Gene-based tests of association (e.g., variance components and burden tests) are now common practice for analyses attempting to elucidate the contribution of rare genetic variants to common disease. As sequencing datasets continue to grow in size, the number of variants within each set (e.g., gene) being tested is also continuing to grow. Pathway-based methods have been used to allow for the initial aggregation of gene-based statistical evidence and then the subsequent aggregation of evidence across the pathway. This “multi-set” approach (first gene-based test, followed by pathway-based) lacks thorough exploration in regard to evaluating genotype–phenotype associations in the age of large, sequenced datasets. In particular, we ask whether there are statistical and biological characteristics that make the multi-set approach optimal compared with simply doing all gene-based tests. In this paper, we provide an intuitive framework for evaluating these questions and use simulated data to affirm this intuition. A real data application is provided demonstrating how our insights manifest themselves in practice. Ultimately, we find that when initial subsets are biologically informative (e.g., tending to aggregate causal genetic variants within one or more subsets, often genes), multi-set strategies can improve statistical power, with particular gains in cases where causal variants are aggregated in subsets with fewer variants overall (high proportion of causal variants in the subset). However, we find that there is little advantage when the sets are non-informative (similar proportion of causal variants in the subsets). Our application to real data further demonstrates this intuition. In practice, we recommend wider use of pathway-based methods and further exploration of optimal ways of aggregating variants into subsets based on emerging biological evidence of the genetic architecture of complex disease.
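One way to picture the multi-set idea is to compute gene-level p-values first and then aggregate them across a pathway. The abstract does not say which aggregation rule the authors use; the sketch below assumes Fisher's method purely for illustration, treats the gene-level p-values as independent, and uses hypothetical gene names and p-values.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (X/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total


# Hypothetical gene-level p-values (e.g., from burden tests) in one pathway.
pathway_gene_p = {"GENE_A": 0.04, "GENE_B": 0.20, "GENE_C": 0.03}
pathway_p = fisher_combine(list(pathway_gene_p.values()))
print(f"pathway-level p-value: {pathway_p:.4f}")
```

The alternative the abstract contrasts this with, testing every gene separately, would instead compare each gene-level p-value against a multiple-testing threshold.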
Affiliation(s)
- Ruby Fore
  - Department of Biostatistics, Brown University, Providence, RI, United States
- Jaden Boehme
  - Department of Mathematics, Oregon State University, Corvallis, OR, United States
- Kevin Li
  - Department of Mathematics, School of Arts and Sciences, Columbia University, New York, NY, United States
- Jason Westra
  - Department of Mathematics and Statistics, Dordt University, Sioux Center, IA, United States
- Nathan Tintle
  - Department of Mathematics and Statistics, Dordt University, Sioux Center, IA, United States
|
10
|
Lim E, Chen H, Dupuis J, Liu CT. A unified method for rare variant analysis of gene-environment interactions. Stat Med 2020; 39:801-813. [PMID: 31799744 DOI: 10.1002/sim.8446] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/12/2018] [Revised: 11/19/2019] [Accepted: 11/21/2019] [Indexed: 01/17/2023]
Abstract
Advanced technology in whole-genome sequencing has offered the opportunity to comprehensively investigate the genetic contribution, particularly of rare variants, to complex traits. Several region-based tests have been developed to jointly model the marginal effect of rare variants, but methods to detect gene-environment (GE) interactions are underdeveloped. Identifying the modification effects of environmental factors on genetic risk poses a considerable challenge. To tackle this challenge, we develop a method to detect GE interactions for rare variants using a generalized linear mixed effect model. The proposed method can accommodate either binary or continuous traits in related or unrelated samples. Under this model, genetic main effects, GE interactions, and sample relatedness are modeled as random effects. We adopt a kernel-based method to leverage the joint information across rare variants and implement variance component score tests to reduce the computational burden. Our simulation studies of continuous and binary traits show that the proposed method maintains correct type I error rates and appropriate power under various scenarios, such as genotype main effects and GE interaction effects in opposite directions and varying the proportion of causal variants in the model. We apply our method in the Framingham Heart Study to test GE interaction of smoking on body mass index or overweight status and replicate the Cholinergic Receptor Nicotinic Beta 4 gene association reported in a previous large consortium meta-analysis of single nucleotide polymorphism-smoking interaction. Our proposed set-based GE test is computationally efficient and is applicable to both binary and continuous phenotypes, while appropriately accounting for familial or cryptic relatedness.
Affiliation(s)
- Elise Lim
  - Department of Biostatistics, Boston University, Boston, Massachusetts
- Han Chen
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas; Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas
- Josée Dupuis
  - Department of Biostatistics, Boston University, Boston, Massachusetts
- Ching-Ti Liu
  - Department of Biostatistics, Boston University, Boston, Massachusetts
|
11
|
Konigorski S, Yilmaz YE, Janke J, Bergmann MM, Boeing H, Pischon T. Powerful rare variant association testing in a copula-based joint analysis of multiple phenotypes. Genet Epidemiol 2019; 44:26-40. [PMID: 31732979 DOI: 10.1002/gepi.22265] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/23/2019] [Revised: 08/13/2019] [Accepted: 09/16/2019] [Indexed: 12/16/2022]
Abstract
In genetic association studies of rare variants, the low power of association tests is one of the main challenges. In this study, we propose a new single-marker association test called C-JAMP (Copula-based Joint Analysis of Multiple Phenotypes), which is based on a joint model of multiple phenotypes given genetic markers and other covariates. We evaluated its performance and compared its empirical type I error and power with existing univariate and multivariate single-marker and multi-marker rare-variant tests in extensive simulation studies. C-JAMP yielded unbiased genetic effect estimates and valid type I errors with an adjusted test statistic. When strongly dependent traits were jointly analyzed, C-JAMP had the highest power in all scenarios except when a high percentage of variants were causal with moderate/small effect sizes. When traits with weak or moderate dependence were analyzed, whether C-JAMP or competing approaches had higher power depended on the effect size. When C-JAMP was applied with a misspecified copula function, it still achieved high power in some of the scenarios considered. In a real-data application, we analyzed sequencing data using C-JAMP and performed the first genome-wide association studies of high-molecular-weight and medium-molecular-weight adiponectin plasma concentrations. C-JAMP identified 20 rare variants with p-values smaller than 10⁻⁵, while all other tests resulted in the identification of fewer variants with higher p-values. In summary, the results indicate that C-JAMP is a powerful, flexible, and robust method for association studies, and we identified novel candidate markers for adiponectin. C-JAMP is implemented as an R package and freely available from https://cran.r-project.org/package=CJAMP.
Affiliation(s)
- Stefan Konigorski
  - Molecular Epidemiology Research Group, Max Delbrück Center (MDC) for Molecular Medicine in the Helmholtz Association, Berlin, Germany; Digital Health and Machine Learning Research Group, Hasso Plattner Institute for Digital Engineering, Potsdam, Germany
- Yildiz E Yilmaz
  - Department of Mathematics and Statistics, Memorial University of Newfoundland, St. John's, NL, Canada; Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, Canada; Discipline of Medicine, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, Canada
- Jürgen Janke
  - Molecular Epidemiology Research Group, Max Delbrück Center (MDC) for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Manuela M Bergmann
  - Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbrücke (DIfE), Nuthetal, Germany
- Heiner Boeing
  - Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbrücke (DIfE), Nuthetal, Germany
- Tobias Pischon
  - Molecular Epidemiology Research Group, Max Delbrück Center (MDC) for Molecular Medicine in the Helmholtz Association, Berlin, Germany; Charité-Universitätsmedizin Berlin, Berlin, Germany; DZHK (German Center for Cardiovascular Research), partner site Berlin, Berlin, Germany
|
12
|
Li Z, Li X, Liu Y, Shen J, Chen H, Zhou H, Morrison AC, Boerwinkle E, Lin X. Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies. Am J Hum Genet 2019; 104:802-814. [PMID: 30982610 PMCID: PMC6507043 DOI: 10.1016/j.ajhg.2019.03.002] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 11/25/2018] [Accepted: 03/01/2019] [Indexed: 11/19/2022]
Abstract
Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intron regions that consist of a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are subject to limitations given that the sizes of genetic-association regions are likely to vary across the genome and phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without the need to specify a prior, fixed window size. The proposed method controls for the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulated studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant associations while controlling for the genome-wise type I error rates. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.
Affiliation(s)
- Zilin Li
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Xihao Li
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Yaowu Liu
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Jincheng Shen
  - Department of Population Health Sciences, University of Utah, Salt Lake City, UT 84108, USA
- Han Chen
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, the University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, School of Public Health and School of Biomedical Informatics, the University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Hufeng Zhou
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Alanna C Morrison
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, the University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Eric Boerwinkle
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, the University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Xihong Lin
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Statistics, Harvard University, Cambridge, MA 02138, USA
|
13
|
Zhou Y, Fujikura K, Mkrtchian S, Lauschke VM. Computational Methods for the Pharmacogenetic Interpretation of Next Generation Sequencing Data. Front Pharmacol 2018; 9:1437. [PMID: 30564131 PMCID: PMC6288784 DOI: 10.3389/fphar.2018.01437] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 08/03/2018] [Accepted: 11/20/2018] [Indexed: 12/21/2022]
Abstract
Up to half of all patients do not respond to pharmacological treatment as intended. A substantial fraction of these inter-individual differences is due to heritable factors, and a growing number of associations between genetic variations and drug response phenotypes have been identified. Importantly, the rapid progress in Next Generation Sequencing technologies in recent years unveiled the true complexity of the genetic landscape in pharmacogenes, with tens of thousands of rare genetic variants. As each individual was found to harbor numerous such rare variants, they are anticipated to be important contributors to the genetically encoded inter-individual variability in drug effects. The fundamental challenge, however, is their functional interpretation, because the sheer scale of the problem renders systematic experimental characterization of these variants currently unfeasible. Here, we review concepts and important progress in the development of computational prediction methods that allow evaluation of the effect of amino acid sequence alterations in drug metabolizing enzymes and transporters. In addition, we discuss recent advances in the interpretation of functional effects of non-coding variants, such as variations in splice sites, regulatory regions and miRNA binding sites. We anticipate that these methodologies will provide a useful toolkit to facilitate the integration of the vast extent of rare genetic variability into drug response predictions in a precision medicine framework.
Affiliation(s)
- Yitian Zhou
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Kohei Fujikura
  - Department of Diagnostic Pathology, Kobe University Graduate School of Medicine, Kobe, Japan
- Souren Mkrtchian
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Volker M. Lauschke
  - Section of Pharmacogenetics, Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
|
14
|
Miller JE, Shivakumar MK, Risacher SL, Saykin AJ, Lee S, Nho K, Kim D. Codon bias among synonymous rare variants is associated with Alzheimer's disease imaging biomarker. Pac Symp Biocomput 2018; 23:365-376. [PMID: 29218897 PMCID: PMC5756629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disorder with few biomarkers even though it impacts a relatively large portion of the population and is predicted to affect significantly more individuals in the future. Neuroimaging has been used in concert with genetic information to improve our understanding of how AD arises and how it can potentially be diagnosed. Additionally, evidence suggests synonymous variants can have a functional impact on gene regulatory mechanisms, including those related to AD. Some synonymous codons are preferred over others, leading to a codon bias. The bias can arise with respect to codons that are more or less frequently used in the genome. A bias can also result from optimal and non-optimal codons, which have stronger and weaker codon anti-codon interactions, respectively. Although association tests have been utilized before to identify genes associated with AD, it remains unclear how codon bias plays a role and if it can improve rare variant analysis. In this work, rare variants from whole-genome sequencing from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort were binned into genes using BioBin. An association analysis of the genes with an AD-related neuroimaging biomarker was performed using SKAT-O. While using all synonymous variants we did not identify any genome-wide significant associations, using only synonymous variants that affected codon frequency we identified several genes as significantly associated with the imaging phenotype. Additionally, significant associations were found using only rare variants that contain an optimal codon in the minor allele and a non-optimal codon in the major allele. These results suggest that codon bias may play a role in AD and that it can be used to improve detection power in rare variant association analysis.
Affiliation(s)
- Jason E Miller
  - Biomedical and Translational Informatics Institute, Geisinger Health System, Danville, PA, USA
|
15
|
Lutz SM, Fingerlin TE, Hokanson JE, Lange C. A general approach to testing for pleiotropy with rare and common variants. Genet Epidemiol 2016; 41:163-170. [PMID: 27900789 DOI: 10.1002/gepi.22011] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 01/19/2016] [Revised: 08/01/2016] [Accepted: 09/19/2016] [Indexed: 12/22/2022]
Abstract
Through genome-wide association studies, numerous genes have been shown to be associated with multiple phenotypes. To determine the overlap of genetic susceptibility of correlated phenotypes, one can apply multivariate regression or dimension reduction techniques, such as principal components analysis, and test for the association with the principal components of the phenotypes rather than the individual phenotypes. However, as these approaches test whether there is a genetic effect for at least one of the phenotypes, a significant test result does not necessarily imply pleiotropy. Recently, a method called Pleiotropy Estimation and Test Bootstrap (PET-B) has been proposed to specifically test for pleiotropy (i.e., that two normally distributed phenotypes are both associated with the single nucleotide polymorphism of interest). Although the method examines the genetic overlap between the two quantitative phenotypes, the extension to binary phenotypes, three or more phenotypes, and rare variants is not straightforward. We provide two approaches to formally test this pleiotropic relationship in multiple scenarios. These approaches depend on permuting the phenotypes of interest and comparing the set of observed P-values to the set of permuted P-values in relation to the origin (e.g., a vector of zeros) either using the Hausdorff metric or a cutoff-based approach. These approaches are appropriate for categorical and quantitative phenotypes, more than two phenotypes, common variants and rare variants. We evaluate these approaches under various simulation scenarios and apply them to the COPDGene study, a case-control study of chronic obstructive pulmonary disease in current and former smokers.
Affiliation(s)
- Sharon M Lutz
  - Department of Biostatistics, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Tasha E Fingerlin
  - Department of Biostatistics, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA; Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA
- John E Hokanson
  - Department of Epidemiology, University of Colorado, Anschutz Medical Campus, Aurora, CO, USA
- Christoph Lange
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
|
16
|
Abstract
With the advance of next-generation sequencing technologies in recent years, rare genetic variant data have now become available for genetic epidemiology studies. For family samples, however, only a few statistical methods for association analysis of rare genetic variants have been developed. Rare variant approaches are of great interest, particularly for family data, because samples enriched for trait-relevant variants can be ascertained and rare variants are putatively enriched through segregation. To facilitate the evaluation of existing and new rare variant testing approaches for analyzing family data, Genetic Analysis Workshop 18 (GAW18) provided genotype and next-generation sequencing data and longitudinal blood pressure traits from extended pedigrees of Mexican American families from the San Antonio Family Study. Our GAW18 group members analyzed real and simulated phenotype data from GAW18 by using generalized linear mixed-effects models or principal components to adjust for familial correlation or by testing binary traits using a correction factor for familial effects. With one exception, approaches dealt with the extended pedigrees in their original state using information based on the kinship matrix or alternative genetic similarity measures. For simulated data our group demonstrated that the family-based kernel machine score test is superior in power to family-based single-marker or burden tests, except in a few specific scenarios. For real data three contributions identified significant associations. They substantially reduced the number of tests before performing the association analysis. We conclude from our real data analyses that further development of strategies for targeted testing or more focused screening of genetic variants is strongly desirable.
Affiliation(s)
- Han Chen
  - Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
|
17
|
Chen H, Lumley T, Brody J, Heard-Costa NL, Fox CS, Cupples LA, Dupuis J. Sequence kernel association test for survival traits. Genet Epidemiol 2014; 38:191-7. [PMID: 24464521 DOI: 10.1002/gepi.21791] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 07/31/2013] [Revised: 12/20/2013] [Accepted: 12/21/2013] [Indexed: 11/11/2022]
Abstract
Rare variant tests have been of great interest in testing genetic associations with diseases and disease-related quantitative traits in recent years. Among these tests, the sequence kernel association test (SKAT) is an omnibus test for effects of rare genetic variants, in a linear or logistic regression framework. It is often described as a variance component test treating the genotypic effects as random. When the linear kernel is used, its test statistic can be expressed as a weighted sum of single-marker score test statistics. In this paper, we extend the test to survival phenotypes in a Cox regression framework. Because of the anticonservative small-sample performance of the score test in a Cox model, we substitute signed square-root likelihood ratio statistics for the score statistics, and confirm that the small-sample control of type I error is greatly improved. This test can also be applied in meta-analysis. We show in our simulation studies that this test has superior statistical power except in a few specific scenarios, as compared to burden tests in a Cox model. We also present results in an application to time-to-obesity using genotypes from Framingham Heart Study SNP Health Association Resource.
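The statement that, with a linear kernel, the SKAT statistic can be written as a weighted sum of single-marker score statistics can be checked numerically. The sketch below does so for the simplest case, a quantitative trait with an intercept-only null model; it does not implement the Cox-model extension proposed in this paper, where signed square-root likelihood ratio statistics replace the score statistics. The genotypes, phenotypes, weights, and function names are hypothetical, and weighting conventions differ across references (some write the weight as w_j, others as w_j squared); this sketch uses w_j squared.

```python
def skat_q_from_scores(genotypes, residuals, weights):
    """Q = sum_j w_j^2 * S_j^2, where S_j = sum_i G[i][j] * r_i is the score for variant j."""
    scores = [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(len(weights))
    ]
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


def skat_q_from_kernel(genotypes, residuals, weights):
    """Same statistic as the quadratic form r' K r with weighted linear kernel K = G W W G'."""
    q = 0.0
    for row_a, r_a in zip(genotypes, residuals):
        for row_b, r_b in zip(genotypes, residuals):
            k_ab = sum(w * w * ga * gb for w, ga, gb in zip(weights, row_a, row_b))
            q += r_a * k_ab * r_b
    return q


# Hypothetical toy data: 4 subjects x 3 rare variants (minor-allele counts).
genotypes = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 2],
    [0, 1, 1],
]
phenotype = [1.2, 0.7, 2.5, 1.6]
weights = [1.0, 2.0, 0.5]  # hypothetical per-variant weights

# Residuals under an intercept-only null model: phenotype minus its mean.
mean_y = sum(phenotype) / len(phenotype)
residuals = [y - mean_y for y in phenotype]

q_scores = skat_q_from_scores(genotypes, residuals, weights)
q_kernel = skat_q_from_kernel(genotypes, residuals, weights)
```

The two values agree up to floating-point error, which is the identity the abstract refers to. Obtaining a p-value additionally requires the null distribution of Q (a mixture of chi-square variables), which is not shown here.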
Affiliation(s)
- Han Chen
  - Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America; Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
|
18
|
Yoo YJ, Sun L, Bull SB. Gene-based multiple regression association testing for combined examination of common and low frequency variants in quantitative trait analysis. Front Genet 2013; 4:233. [PMID: 24273553 PMCID: PMC3824159 DOI: 10.3389/fgene.2013.00233] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/30/2013] [Accepted: 10/21/2013] [Indexed: 11/18/2022]
Abstract
Multi-marker methods for genetic association analysis can be performed for common and low frequency SNPs to improve power. Regression models are an intuitive way to formulate multi-marker tests. In previous studies we evaluated regression-based multi-marker tests for common SNPs, and through identification of bins consisting of correlated SNPs, developed a multi-bin linear combination (MLC) test that is a compromise between a 1 df linear combination test and a multi-df global test. Bins of SNPs in high linkage disequilibrium (LD) are identified, and a linear combination of individual SNP statistics is constructed within each bin. Then association with the phenotype is represented by an overall statistic whose df equals the number of bins. In this report we evaluate multi-marker tests for SNPs that occur at low frequencies. There are many linear and quadratic multi-marker tests that are suitable for common or low frequency variant analysis. We compared the performance of the MLC tests with various linear and quadratic statistics in joint or marginal regressions. For these comparisons, we performed a simulation study of genotypes and quantitative traits for 85 genes with many low frequency SNPs based on HapMap Phase III. We compared the tests using (1) the set of all SNPs in a gene, (2) the set of common SNPs in a gene (MAF ≥ 5%), and (3) the set of low frequency SNPs (1% ≤ MAF < 5%). For different trait models based on low frequency causal SNPs, we found that combined analysis using all SNPs, including common and low frequency SNPs, is a good and robust choice, whereas using common SNPs alone or low frequency SNPs alone can lose power. MLC tests performed well in combined analysis except where two low frequency causal SNPs with opposing effects are positively correlated. Overall, across different sets of analysis, the joint regression Wald test showed consistently good performance whereas other statistics, including the ones based on marginal regression, had lower power for some situations.
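The three SNP sets compared in this abstract are defined by minor allele frequency (MAF) thresholds, and the partition can be sketched as below. The SNP identifiers and MAF values are hypothetical, and the function name is illustrative. Note one assumption: the "all" set here contains every SNP supplied, including any below the 1% bound, whereas the abstract does not say how variants rarer than 1% were handled.

```python
def partition_by_maf(maf_by_snp, common_cutoff=0.05, low_cutoff=0.01):
    """Split a gene's SNPs into the three analysis sets named in the abstract.

    common:        MAF >= common_cutoff
    low_frequency: low_cutoff <= MAF < common_cutoff
    all:           every SNP supplied (assumption: no lower MAF bound applied)
    """
    all_snps = sorted(maf_by_snp)
    common = [s for s in all_snps if maf_by_snp[s] >= common_cutoff]
    low_frequency = [
        s for s in all_snps if low_cutoff <= maf_by_snp[s] < common_cutoff
    ]
    return {"all": all_snps, "common": common, "low_frequency": low_frequency}


# Hypothetical MAFs for the SNPs of one gene.
example_mafs = {
    "snp_a": 0.30,
    "snp_b": 0.05,
    "snp_c": 0.049,
    "snp_d": 0.01,
    "snp_e": 0.004,
}
snp_sets = partition_by_maf(example_mafs)
```

Each multi-marker test would then be run once per set, which is how the abstract's comparison of combined, common-only, and low-frequency-only analyses is organized.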
Affiliation(s)
- Yun Joo Yoo
  - Department of Mathematics Education, Seoul National University, Seoul, South Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
|