1
Patiño-García A, Guruceaga E, Andueza MP, Ocón M, Fodop Sokoudjou JJ, de Villalonga Zornoza N, Alkorta-Aranburu G, Uria IT, Gurpide A, Camps C, Jantus-Lewintre E, Navamuel-Andueza M, Sanmamed MF, Melero I, Elgendy M, Fusco JP, Zulueta JJ, de-Torres JP, Bastarrika G, Seijo L, Pio R, Montuenga LM, Hernáez M, Ochoa I, Perez-Gracia JL. Whole exome sequencing and machine learning germline analysis of individuals presenting with extreme phenotypes of high and low risk of developing tobacco-associated lung adenocarcinoma. EBioMedicine 2024; 102:105048. [PMID: 38484556 PMCID: PMC10955643 DOI: 10.1016/j.ebiom.2024.105048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 02/15/2024] [Accepted: 02/22/2024] [Indexed: 03/24/2024] Open
Abstract
BACKGROUND Tobacco is the main risk factor for developing lung cancer. Yet, while some heavy smokers develop lung cancer at a young age, other heavy smokers never develop it, even at an advanced age, suggesting a remarkable variability in the individual susceptibility to the carcinogenic effects of tobacco. We characterized the germline profile of subjects presenting these extreme phenotypes with Whole Exome Sequencing (WES) and Machine Learning (ML).
METHODS We sequenced germline DNA from heavy smokers who either developed lung adenocarcinoma at an early age (extreme cases) or who did not develop lung cancer at an advanced age (extreme controls), selected from databases including over 6600 subjects. We selected individual coding genetic variants and variant-rich genes showing a significantly different distribution between extreme cases and controls. We validated the results from our discovery cohort in a validation cohort, in which we analysed by WES extreme cases and controls presenting similar phenotypes. We developed ML models using both cohorts.
FINDINGS Mean age for extreme cases and controls was 50.7 and 79.1 years respectively, and mean tobacco consumption was 34.6 and 62.3 pack-years. We validated 16 individual variants and 33 variant-rich genes. The gene harbouring the most validated variants was HLA-A in extreme controls (4 variants in the discovery cohort, p = 3.46E-07; and 4 in the validation cohort, p = 1.67E-06). We trained ML models using as input the 16 individual variants in the discovery cohort and tested them on the validation cohort, obtaining an accuracy of 76.5% and an AUC-ROC of 83.6%. Functions of validated genes included candidate oncogenes, tumour-suppressors, DNA repair, HLA-mediated antigen presentation and regulation of proliferation, apoptosis, inflammation and immune response.
INTERPRETATION Individuals presenting extreme phenotypes of high and low risk of developing tobacco-associated lung adenocarcinoma show different germline profiles. Our strategy may allow the identification of high-risk subjects and the development of new therapeutic approaches.
FUNDING See a detailed list of funding bodies in the Acknowledgements section at the end of the manuscript.
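The abstract above reports two test-set metrics, accuracy and AUC-ROC. As a minimal sketch of how such metrics are commonly computed (not the authors' pipeline, which the abstract does not describe), the Python below derives both from binary labels and model scores. The function names `accuracy` and `auc_roc`, the 0.5 threshold, and the toy data are all illustrative assumptions.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a random positive outscores a random
    negative (ties count as one half). Assumes labels are 0 or 1 and that
    both classes are present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = extreme case, 0 = extreme control.
labels = [1, 1, 0, 0]
model_scores = [0.9, 0.4, 0.6, 0.1]
# Thresholding at 0.5 is an arbitrary choice made for this illustration.
predicted = [1 if s >= 0.5 else 0 for s in model_scores]

print(accuracy(labels, predicted))
print(auc_roc(labels, model_scores))
```

Accuracy depends on the chosen decision threshold, whereas AUC-ROC summarises ranking quality across all thresholds, which is why the two values can differ as they do in the abstract.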
Affiliation(s)
- Ana Patiño-García
  - Department of Pediatrics and Clinical Genetics, Clínica Universidad de Navarra (CUN), Cancer Center Clínica Universidad de Navarra (CCUN), Program in Solid Tumors, Center for Applied Medical Research (Cima) and Navarra Institute for Health Research (IdisNA), University of Navarra, Pamplona, Spain
- Elizabeth Guruceaga
  - Bioinformatics Platform, Cima and IdisNA, University of Navarra, Pamplona, Spain
- Maria Pilar Andueza
  - Department of Oncology, CUN, CCUN and IdisNA, University of Navarra, Pamplona, Spain
- Marimar Ocón
  - Pulmonary Department, CUN, CCUN and IdisNA, University of Navarra, Pamplona, Spain
- Ibon Tamayo Uria
  - Bioinformatics Platform, Cima and IdisNA, University of Navarra, Pamplona, Spain
- Alfonso Gurpide
  - Department of Oncology, CUN, CCUN and IdisNA, University of Navarra, Pamplona, Spain
- Carlos Camps
  - Department of Medical Oncology, Hospital General Universitario de Valencia, Unidad Mixta TRIAL (Fundación para la Investigación del Hospital General Universitario de Valencia y Centro de Investigación Príncipe Felipe) and Centro de Investigación Biomédica en Red Cáncer (CIBERONC), Valencia, Spain
- Eloísa Jantus-Lewintre
  - Department of Biotechnology, Universitat Politècnica de València, Unidad Mixta TRIAL (Fundación para la Investigación del Hospital General Universitario de Valencia y Centro de Investigación Príncipe Felipe) and CIBERONC, Valencia, Spain
- Miguel F Sanmamed
  - Department of Oncology, CUN, Division of Immunology, Cima, CCUN, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
- Ignacio Melero
  - Division of Immunology, Cima and Immunotherapy, CUN, CCUN, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
- Mohamed Elgendy
  - Institute for Clinical Chemistry and Laboratory Medicine, Mildred-Scheel Early Career Center, National Center for Tumor Diseases Dresden (NCT/UCC), University Hospital and Faculty of Medicine, Medical Clinic I, University Hospital Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
  - Laboratory of Cancer Cell Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Juan Pablo Fusco
  - Department of Medical Oncology Hospital La Luz, Quirón, Madrid, Spain
- Javier J Zulueta
  - Pulmonary, Critical Care, and Sleep Division, Mount Sinai Morningside Hospital, New York, USA
- Juan P de-Torres
  - Pulmonary Department, CUN, CCUN and IdisNA, University of Navarra, Pamplona, Spain
- Luis Seijo
  - Pulmonary Department, CUN, CCUN and Centro de Investigación Biomédica en Red de Enfermedades Respiratorias (CIBERES), University of Navarra, Madrid, Spain
- Ruben Pio
  - Program in Solid Tumors, Cima-CCUN, Department of Biochemistry and Genetics, School of Science, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
- Luis M Montuenga
  - Program in Solid Tumors, Cima, Department of Pathology, Anatomy and Physiology, Schools of Medicine and Sciences, CCUN, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
- Mikel Hernáez
  - Computational Biology Program, Cima, Data Science and Artificial Intelligence Institute (DATAI), CCUN, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
- Idoia Ochoa
  - Electrical and Electronic Engineering Department, Tecnun, DATAI, University of Navarra, San Sebastian, Spain
- Jose Luis Perez-Gracia
  - Department of Oncology, CUN, CCUN, IdisNA and CIBERONC, University of Navarra, Pamplona, Spain
2
Boutry S, Helaers R, Lenaerts T, Vikkula M. Rare variant association on unrelated individuals in case-control studies using aggregation tests: existing methods and current limitations. Brief Bioinform 2023; 24:bbad412. [PMID: 37974506 DOI: 10.1093/bib/bbad412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 10/14/2023] [Accepted: 10/28/2023] [Indexed: 11/19/2023] Open
Abstract
Over the past years, progress made in next-generation sequencing technologies and bioinformatics has sparked a surge in association studies. In particular, genome-wide association studies (GWASs) have demonstrated their effectiveness in identifying disease associations with common genetic variants. Yet, rare variants can contribute to additional disease risk or trait heterogeneity. Because GWASs are underpowered for detecting association with such variants, numerous statistical methods have been recently proposed. Aggregation tests collapse multiple rare variants within a genetic region (e.g. gene, gene set, genomic locus) to test for association. An increasing number of studies using such methods have successfully identified trait-associated rare variants and led to a better understanding of the underlying disease mechanism. In this review, we compare existing aggregation tests, their statistical features and scope of application, splitting them into the five classical classes: burden, adaptive burden, variance-component, omnibus and other. Finally, we describe some limitations of current aggregation tests, highlighting potential directions for further investigation.
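To make the idea of collapsing rare variants within a region concrete, here is a minimal sketch of the simplest burden-style test: each individual is reduced to a carrier or non-carrier flag for a gene, and carrier counts in cases and controls are compared with a two-sided Fisher exact test. This is an illustration only, not any specific method from the review; the function names (`collapse_carriers`, `fisher_exact_two_sided`, `burden_test`) and the toy genotypes are assumptions.

```python
from math import comb


def collapse_carriers(genotypes):
    """Collapse each individual's rare-variant allele counts (one list per
    individual) into a single flag: True if any rare allele is carried."""
    return [any(g > 0 for g in row) for row in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]],
    summing hypergeometric probabilities no larger than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def burden_test(case_genotypes, control_genotypes):
    """Return the carrier table (a, b, c, d) and its Fisher exact p-value."""
    a = sum(collapse_carriers(case_genotypes))
    b = len(case_genotypes) - a
    c = sum(collapse_carriers(control_genotypes))
    d = len(control_genotypes) - c
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


# Toy data (hypothetical): rows are individuals, columns are rare variants in one gene.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
table, p_value = burden_test(cases, controls)
print(table, p_value)
```

A carrier-collapsing test of this kind assumes that all aggregated variants shift risk in the same direction, which is the main limitation that variance-component and omnibus tests are designed to relax.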
Affiliation(s)
- Simon Boutry
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, 1050 Brussels, Belgium
- Raphaël Helaers
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
- Tom Lenaerts
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, 1050 Brussels, Belgium
  - Machine Learning Group, Université Libre de Bruxelles, 1050 Brussels, Belgium
  - Artificial Intelligence laboratory, Vrije Universiteit Brussel, 1050 Brussels, Belgium
- Miikka Vikkula
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Avenue Hippocrate 74 (+5) bte B1.74.06, 1200 Brussels, Belgium
  - WELBIO department, WEL Research Institute, avenue Pasteur, 6, 1300 Wavre, Belgium
3
Boutry S, Helaers R, Lenaerts T, Vikkula M. Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data. PLoS Comput Biol 2023; 19:e1011488. [PMID: 37708232 PMCID: PMC10522036 DOI: 10.1371/journal.pcbi.1011488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 09/26/2023] [Accepted: 09/04/2023] [Indexed: 09/16/2023] Open
Abstract
The development of high-throughput next-generation sequencing technologies and large-scale genetic association studies has produced numerous advances in the biostatistics field. Various aggregation tests, i.e. statistical methods that analyze associations of a trait with multiple markers within a genomic region, have produced a variety of novel discoveries. Notwithstanding their usefulness, there is no single test that fits all needs, each suffering from specific drawbacks. Selecting the right aggregation test, while considering an unknown underlying genetic model of the disease, remains an important challenge. Here we propose a new ensemble method, called Excalibur, based on an optimal combination of 36 aggregation tests created after an in-depth study of the limitations of each test and their impact on the quality of the results. Our findings demonstrate the ability of our method to control type I error and illustrate that it offers the best average power across all scenarios. The proposed method allows for novel advances in Whole Exome/Genome sequencing association studies: it can handle a wide range of association models and provides researchers with an optimal aggregation analysis for the genetic regions of interest.
Affiliation(s)
- Simon Boutry
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, Brussels, Belgium
- Raphaël Helaers
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
- Tom Lenaerts
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussels, Brussels, Belgium
  - Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium
  - Artificial Intelligence laboratory, Vrije Universiteit Brussel, Brussels, Belgium
- Miikka Vikkula
  - Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium
  - WELBIO department, WEL Research Institute, Wavre, Belgium
4
Patiño-García A, Guruceaga E, Segura V, Sánchez Bayona R, Andueza MP, Tamayo Uria I, Serrano G, Fusco JP, Pajares MJ, Gurpide A, Ocón M, Sanmamed MF, Rodriguez Ruiz M, Melero I, Lozano MD, de Andrea C, Pita G, Gonzalez-Neira A, Gonzalez A, Zulueta JJ, Montuenga LM, Pio R, Perez-Gracia JL. Whole exome sequencing characterization of individuals presenting extreme phenotypes of high and low risk of developing tobacco-induced lung adenocarcinoma. Transl Lung Cancer Res 2021; 10:1327-1337. [PMID: 33889513 PMCID: PMC8044482 DOI: 10.21037/tlcr-20-1197] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/12/2023]
Abstract
Background Tobacco is the main risk factor for developing lung cancer. Yet, some heavy smokers do not develop lung cancer at advanced ages while others develop it at young ages. Here, we assess for the first time the genetic background of these clinically relevant extreme phenotypes using whole exome sequencing (WES).
Methods We performed WES of germline DNA from heavy smokers who either developed lung adenocarcinoma at an early age (extreme cases, n=50) or did not present lung adenocarcinoma or other tumors at an advanced age (extreme controls, n=50). We selected non-synonymous variants located in exonic regions and consensus splice sites of the genes that showed significantly different allelic frequencies between both cohorts. We validated our results in all the additional extreme cases (i.e., heavy smokers who developed lung adenocarcinoma at an early age) available from The Cancer Genome Atlas (TCGA).
Results The mean age for the extreme cases and controls was 49.7 and 77.5 years, respectively. Mean tobacco consumption was 43.6 and 56.8 pack-years. We identified 619 significantly different variants between both cohorts, and we validated 108 of these in extreme cases selected from TCGA. Nine validated variants, located in relevant cancer-related genes such as PARP4, HLA-A or NQO1, achieved statistical significance in the False Discovery Rate test. The most significant validated variant (P=4.48×10⁻⁵) was located in the tumor-suppressor gene ALPK2.
Conclusions We describe genetic variants associated with extreme phenotypes of high and low risk for the development of tobacco-induced lung adenocarcinoma. Our results and our strategy may help to identify high-risk subjects and to develop new therapeutic approaches.
Affiliation(s)
- Ana Patiño-García
  - Department of Pediatrics and Clinical Genetics, Clinica Universidad de Navarra, Pamplona, Spain
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Program in Solid Tumors, Center for Applied Medical Research (CIMA), Pamplona, Spain
- Elizabeth Guruceaga
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Bioinformatics Platform, CIMA, Universidad de Navarra, Pamplona, Spain
- Victor Segura
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Bioinformatics Platform, CIMA, Universidad de Navarra, Pamplona, Spain
- Rodrigo Sánchez Bayona
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
- Maria Pilar Andueza
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
- Ibon Tamayo Uria
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Bioinformatics Platform, CIMA, Universidad de Navarra, Pamplona, Spain
- Guillermo Serrano
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Program in Solid Tumors, Center for Applied Medical Research (CIMA), Pamplona, Spain
- María José Pajares
  - Biochemistry Area, Department of Health Science, Public University of Navarre, Pamplona, Spain
- Alfonso Gurpide
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
- Marimar Ocón
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Pulmonary, Clinica Universidad de Navarra, Pamplona, Spain
- Miguel F Sanmamed
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
- Maria Rodriguez Ruiz
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
- Ignacio Melero
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Division of Immunology and Immunotherapy, CIMA, Universidad de Navarra and Instituto de Investigación Sanitaria de Navarra (IdisNA), Pamplona, Spain
  - Department of Immunology, Clinica Universidad de Navarra and CIMA, Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Maria Dolores Lozano
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
  - Department of Pathology, Clinica Universidad de Navarra, Pamplona, Spain
- Carlos de Andrea
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Pathology, Clinica Universidad de Navarra, Pamplona, Spain
- Guillermo Pita
  - Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Anna Gonzalez-Neira
  - Human Genotyping Unit-CeGen, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Alvaro Gonzalez
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Biochemistry, Clinica Universidad de Navarra, Pamplona, Spain
- Javier J Zulueta
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Division of Immunology and Immunotherapy, CIMA, Universidad de Navarra and Instituto de Investigación Sanitaria de Navarra (IdisNA), Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Luis M Montuenga
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Program in Solid Tumors, Center for Applied Medical Research (CIMA), Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
  - Department of Pathology, Anatomy and Physiology, Schools of Medicine and Sciences, University of Navarra, Pamplona, Spain
- Ruben Pio
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Program in Solid Tumors, Center for Applied Medical Research (CIMA), Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
- Jose Luis Perez-Gracia
  - Health Research Institute of Navarra (IdisNA), Pamplona, Spain
  - Department of Oncology, Clinica Universidad de Navarra, Pamplona, Spain
  - Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Spain
5
Mirabello L, Zhu B, Koster R, Karlins E, Dean M, Yeager M, Gianferante M, Spector LG, Morton LM, Karyadi D, Robison LL, Armstrong GT, Bhatia S, Song L, Pankratz N, Pinheiro M, Gastier-Foster JM, Gorlick R, de Toledo SRC, Petrilli AS, Patino-Garcia A, Lecanda F, Gutierrez-Jimeno M, Serra M, Hattinger C, Picci P, Scotlandi K, Flanagan AM, Tirabosco R, Amary MF, Kurucu N, Ilhan IE, Ballinger ML, Thomas DM, Barkauskas DA, Mejia-Baltodano G, Valverde P, Hicks BD, Zhu B, Wang M, Hutchinson AA, Tucker M, Sampson J, Landi MT, Freedman ND, Gapstur S, Carter B, Hoover RN, Chanock SJ, Savage SA. Frequency of Pathogenic Germline Variants in Cancer-Susceptibility Genes in Patients With Osteosarcoma. JAMA Oncol 2020; 6:724-734. [PMID: 32191290 DOI: 10.1001/jamaoncol.2020.0197] [Citation(s) in RCA: 154] [Impact Index Per Article: 38.5] [Indexed: 01/04/2023]
Abstract
Importance Osteosarcoma, the most common malignant bone tumor in children and adolescents, occurs in a high number of cancer predisposition syndromes that are defined by highly penetrant germline mutations. The germline genetic susceptibility to osteosarcoma outside of familial cancer syndromes remains unclear.
Objective To investigate the germline genetic architecture of 1244 patients with osteosarcoma.
Design, Setting, and Participants Whole-exome sequencing (n = 1104) or targeted sequencing (n = 140) of the DNA of 1244 patients with osteosarcoma from 10 participating international centers or studies was conducted from April 21, 2014, to September 1, 2017. The results were compared with the DNA of 1062 individuals without cancer assembled internally from 4 participating studies who underwent comparable whole-exome sequencing and 27 173 individuals of non-Finnish European ancestry who were identified through the Exome Aggregation Consortium (ExAC) database. In the analysis, 238 high-interest cancer-susceptibility genes were assessed followed by testing of the mutational burden across 736 additional candidate genes. Principal component analyses were used to identify 732 European patients with osteosarcoma and 994 European individuals without cancer, with outliers removed for patient-control group comparisons. Patients were subsequently compared with individuals in the ExAC group. All data were analyzed from June 1, 2017, to July 1, 2019.
Main Outcomes and Measures The frequency of rare pathogenic or likely pathogenic genetic variants.
Results Among 1244 patients with osteosarcoma (mean [SD] age at diagnosis, 16 [8.9] years [range, 2-80 years]; 684 patients [55.0%] were male), an analysis restricted to individuals with European ancestry indicated a significantly higher pathogenic or likely pathogenic variant burden in 238 high-interest cancer-susceptibility genes among patients with osteosarcoma compared with the control group (732 vs 994, respectively; P = 1.3 × 10⁻¹⁸). A pathogenic or likely pathogenic cancer-susceptibility gene variant was identified in 281 of 1004 patients with osteosarcoma (28.0%), of which nearly three-quarters had a variant that mapped to an autosomal-dominant gene or a known osteosarcoma-associated cancer predisposition syndrome gene. The frequency of a pathogenic or likely pathogenic cancer-susceptibility gene variant was 128 of 1062 individuals (12.1%) in the control group and 2527 of 27 173 individuals (9.3%) in the ExAC group. A higher than expected frequency of pathogenic or likely pathogenic variants was observed in genes not previously linked to osteosarcoma (eg, CDKN2A, MEN1, VHL, POT1, APC, MSH2, and ATRX) and in the Li-Fraumeni syndrome-associated gene, TP53.
Conclusions and Relevance In this study, approximately one-fourth of patients with osteosarcoma unselected for family history had a highly penetrant germline mutation requiring additional follow-up analysis and possible genetic counseling with cascade testing.
Affiliation(s)
- Lisa Mirabello
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Bin Zhu
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Roelof Koster
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Eric Karlins
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Michael Dean
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Meredith Yeager
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Matthew Gianferante
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Logan G Spector
  - Department of Pediatrics, University of Minnesota, Minneapolis
- Lindsay M Morton
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Danielle Karyadi
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Leslie L Robison
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, Tennessee
- Gregory T Armstrong
  - Department of Epidemiology and Cancer Control, St Jude Children's Research Hospital, Memphis, Tennessee
- Smita Bhatia
  - Institute for Cancer Outcomes and Survivorship, University of Alabama at Birmingham, Birmingham
- Lei Song
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Nathan Pankratz
  - Department of Pediatrics, University of Minnesota, Minneapolis
- Maisa Pinheiro
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Julie M Gastier-Foster
  - Department of Pathology and Pediatrics, Nationwide Children's Hospital, The Ohio State University, Columbus
- Richard Gorlick
  - Department of Pediatrics, University of Texas MD Anderson Cancer Center, Houston
- Silvia Regina Caminada de Toledo
  - Laboratorio de Genetica, Instituto de Oncologia Pediatrica, Grupo de Apoio ao Adolescente e a Crianca com Cancer/Universidade Federal de Sao Paulo, Sao Paulo, Brazil
- Antonio S Petrilli
  - Laboratorio de Genetica, Instituto de Oncologia Pediatrica, Grupo de Apoio ao Adolescente e a Crianca com Cancer/Universidade Federal de Sao Paulo, Sao Paulo, Brazil
- Ana Patino-Garcia
  - Solid Tumor Division, Department of Pediatrics, University Clinic of Navarra and Center for Applied Medical Research, Navarra Institute for Health Research, Pamplona, Spain
  - Center for Applied Medical Research, University of Navarra, Instituto de Investigacion Sanitaria de Navarra, and Centro de Investigacion Biomedica en Red Cancer, Pamplona, Spain
- Fernando Lecanda
  - Solid Tumor Division, Department of Pediatrics, University Clinic of Navarra and Center for Applied Medical Research, Navarra Institute for Health Research, Pamplona, Spain
  - Center for Applied Medical Research, University of Navarra, Instituto de Investigacion Sanitaria de Navarra, and Centro de Investigacion Biomedica en Red Cancer, Pamplona, Spain
- Miriam Gutierrez-Jimeno
  - Solid Tumor Division, Department of Pediatrics, University Clinic of Navarra and Center for Applied Medical Research, Navarra Institute for Health Research, Pamplona, Spain
- Massimo Serra
  - Laboratory of Experimental Oncology, Istituto di Ricovero e Cura a Carattere Scientifico, Istituto Ortopedico Rizzoli, Bologna, Italy
- Claudia Hattinger
  - Laboratory of Experimental Oncology, Istituto di Ricovero e Cura a Carattere Scientifico, Istituto Ortopedico Rizzoli, Bologna, Italy
- Piero Picci
  - Laboratory of Experimental Oncology, Istituto di Ricovero e Cura a Carattere Scientifico, Istituto Ortopedico Rizzoli, Bologna, Italy
- Katia Scotlandi
  - Laboratory of Experimental Oncology, Istituto di Ricovero e Cura a Carattere Scientifico, Istituto Ortopedico Rizzoli, Bologna, Italy
- Adrienne M Flanagan
  - Research Department of Pathology, UCL Cancer Institute, London, United Kingdom
  - Royal National Orthopaedic Hospital NHS Trust, Stanmore, Middlesex, United Kingdom
- Roberto Tirabosco
  - Royal National Orthopaedic Hospital NHS Trust, Stanmore, Middlesex, United Kingdom
- Maria Fernanda Amary
  - Royal National Orthopaedic Hospital NHS Trust, Stanmore, Middlesex, United Kingdom
- Nilgün Kurucu
  - Department of Pediatric Oncology, A.Y. Ankara Oncology Training and Research Hospital, Yenimahalle, Ankara, Turkey
- Inci Ergurhan Ilhan
  - Department of Pediatric Oncology, A.Y. Ankara Oncology Training and Research Hospital, Yenimahalle, Ankara, Turkey
- Mandy L Ballinger
  - Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
  - St. Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- David M Thomas
  - Garvan Institute of Medical Research, Darlinghurst, New South Wales, Australia
  - St. Vincent's Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Donald A Barkauskas
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles
- Belynda D Hicks
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Bin Zhu
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Mingyi Wang
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Amy A Hutchinson
  - Cancer Genomics Research Laboratory, Frederick National Laboratory for Cancer Research, Frederick, Maryland
- Margaret Tucker
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Joshua Sampson
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Maria T Landi
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Neal D Freedman
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Susan Gapstur
  - Epidemiology Research Program, American Cancer Society, Atlanta, Georgia
- Brian Carter
  - Epidemiology Research Program, American Cancer Society, Atlanta, Georgia
- Robert N Hoover
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Stephen J Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
- Sharon A Savage
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
6
Marceau West R, Lu W, Rotroff DM, Kuenemann MA, Chang SM, Wu MC, Wagner MJ, Buse JB, Motsinger-Reif AA, Fourches D, Tzeng JY. Identifying individual risk rare variants using protein structure guided local tests (POINT). PLoS Comput Biol 2019; 15:e1006722. [PMID: 30779729 PMCID: PMC6396946 DOI: 10.1371/journal.pcbi.1006722] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/28/2018] [Revised: 03/01/2019] [Accepted: 12/17/2018] [Indexed: 01/08/2023] Open
Abstract
Rare variants are of increasing interest to genetic association studies because of their etiological contributions to human complex diseases. Due to the rarity of the mutant events, rare variants are routinely analyzed on an aggregate level. While aggregation analyses improve the detection of global-level signal, they are not able to pinpoint causal variants within a variant set. To perform inference on a localized level, additional information, e.g., biological annotation, is often needed to boost the information content of a rare variant. Following the observation that important variants are likely to cluster together on functional domains, we propose a protein structure guided local test (POINT) to provide variant-specific association information using structure-guided aggregation of signal. Constructed under a kernel machine framework, POINT performs local association testing by borrowing information from neighboring variants in the 3-dimensional protein space in a data-adaptive fashion. Besides merely providing a list of promising variants, POINT assigns each variant a p-value to permit variant ranking and prioritization. We assess the selection performance of POINT using simulations and illustrate how it can be used to prioritize individual rare variants in PCSK9, ANGPTL4 and CETP in the Action to Control Cardiovascular Risk in Diabetes (ACCORD) clinical trial data.
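The idea of borrowing information from neighbouring variants in 3-dimensional protein space can be illustrated with a distance-based kernel. The sketch below is not the POINT statistic (the abstract does not specify it, and POINT additionally assigns p-values); it only shows one plausible ingredient, assuming a Gaussian kernel over residue coordinates. The function names, the bandwidth value and the toy coordinates are hypothetical.

```python
from math import dist, exp


def gaussian_kernel_weights(coords, target, bandwidth):
    """Weight every variant by its 3D distance to the target variant.
    The target itself gets weight 1; distant variants decay towards 0."""
    return [
        exp(-dist(coords[target], c) ** 2 / (2 * bandwidth ** 2))
        for c in coords
    ]


def local_burden(genotype_row, coords, target, bandwidth):
    """Kernel-weighted allele count for one individual around one variant:
    nearby variants contribute almost fully, distant ones almost nothing."""
    weights = gaussian_kernel_weights(coords, target, bandwidth)
    return sum(w * g for w, g in zip(weights, genotype_row))


# Toy data (hypothetical): three variants with residue coordinates in angstroms.
variant_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
individual = [0, 1, 1]  # rare-allele counts at the three variants

weights = gaussian_kernel_weights(variant_coords, target=0, bandwidth=1.0)
score = local_burden(individual, variant_coords, target=0, bandwidth=1.0)
print(weights, score)
```

In this toy case the individual carries no allele at the target variant, yet the local score is driven almost entirely by the neighbouring variant 1 unit away, while the variant 10 units away contributes essentially nothing. A data-adaptive method would additionally tune the bandwidth rather than fix it as done here.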
Affiliation(s)
- Rachel Marceau West
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Daniel M. Rotroff
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America
- Melaine A. Kuenemann
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Sheng-Mao Chang
  - Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
- Michael C. Wu
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Michael J. Wagner
  - Center for Pharmacogenomics and Individualized Therapy, University of North Carolina, Chapel Hill, North Carolina, United States of America
- John B. Buse
  - Department of Medicine, University of North Carolina School of Medicine, Chapel Hill, North Carolina, United States of America
- Alison A. Motsinger-Reif
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
- Denis Fourches
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina, United States of America
- Jung-Ying Tzeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
  - Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan