1
Bonetti E, Tini G, Mazzarella L. Accuracy of renovo predictions on variants reclassified over time. J Transl Med 2024; 22:713. [PMID: 39085881 PMCID: PMC11293099 DOI: 10.1186/s12967-024-05508-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 07/14/2024] [Indexed: 08/02/2024] Open
Abstract
BACKGROUND Interpreting the clinical consequences of genetic variants is the central problem in modern clinical genomics, for both hereditary diseases and oncology. However, clinical validation lags behind the pace of discovery, leading to distressing uncertainty for patients, physicians and researchers. This "interpretation gap" changes over time as evidence accumulates, and variants initially deemed of uncertain significance (VUS) may subsequently be reclassified as pathogenic or benign. We previously developed RENOVO, a random forest-based tool that predicts variant pathogenicity from publicly available information in gnomAD and dbNSFP, and tested it on variants that have changed their classification status over time. Here, we comprehensively evaluated the accuracy of RENOVO predictions on variants that have been reclassified over the last four years. METHODS We retrieved 16 retrospective instances of the ClinVar database, one every 3 months from March 2020 to March 2024, and analyzed time trends of variant classifications. We identified variants that changed their status over time and compared RENOVO predictions generated in 2020 with the actual reclassifications. RESULTS VUS have become the most represented class in ClinVar (44.97% vs. 9.75% (likely) pathogenic and 40.33% (likely) benign). The rate of VUS reclassification is linear and slow compared with the rate of VUS reporting, which is exponential and currently ~30x faster, creating a growing divide between what can be sequenced and what can be interpreted. Out of 10,196 VUS variants in January 2020 that had undergone a clinically meaningful reclassification by March 2024, RENOVO correctly classified 82.6% in 2020. In addition, RENOVO correctly identified the majority of the few variants that switched clinically meaningful classes (e.g., from benign to pathogenic and vice versa). We highlight variant classes and clinically relevant genes for which RENOVO provides particularly accurate estimates, in particular genes characterized by a large prevalence of high- or low-impact variants (e.g., POLE, NOTCH1, FANCM). Suboptimal RENOVO predictions mostly concern genes validated through dedicated consortia (e.g., BRCA1/2), in which RENOVO would in any case have limited impact. CONCLUSIONS Time trend analysis demonstrates that the current model of variant interpretation cannot keep up with variant discovery. Machine learning-based tools like RENOVO show high accuracy and can aid clinical practice and research.
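As a rough illustration of the kind of evaluation this abstract describes (comparing predictions made at an earlier date with classifications assigned later), the following Python sketch scores a handful of made-up variants. The records, labels, and function names are hypothetical and are not taken from RENOVO or ClinVar; the sketch only shows the arithmetic behind an accuracy figure, not the authors' pipeline.

```python
from collections import Counter

# Hypothetical records: (variant_id, earlier_prediction, later_classification).
# These variants and labels are invented for illustration only.
records = [
    ("var1", "pathogenic", "pathogenic"),
    ("var2", "benign", "benign"),
    ("var3", "benign", "pathogenic"),
    ("var4", "pathogenic", "pathogenic"),
]


def reclassification_accuracy(records):
    """Return the fraction of variants whose earlier prediction
    matches the class they were later reclassified into."""
    if not records:
        raise ValueError("no reclassified variants to score")
    correct = sum(1 for _, predicted, actual in records if predicted == actual)
    return correct / len(records)


def confusion_counts(records):
    """Count (predicted, actual) pairs to show where predictions disagree."""
    return Counter((predicted, actual) for _, predicted, actual in records)


if __name__ == "__main__":
    print(f"accuracy: {reclassification_accuracy(records):.2%}")
    for (predicted, actual), n in sorted(confusion_counts(records).items()):
        print(f"predicted={predicted:<10} actual={actual:<10} n={n}")
```

Keeping the per-pair counts alongside the single accuracy number makes it easier to see whether errors cluster in one direction, for example benign predictions later reclassified as pathogenic.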
Affiliation(s)
- Emanuele Bonetti
- Department of Experimental Oncology, European Institute of Oncology, IEO-IRCCS, Milan, 20139, Italy
- Giulia Tini
- Department of Experimental Oncology, European Institute of Oncology, IEO-IRCCS, Milan, 20139, Italy
- Luca Mazzarella
- Department of Experimental Oncology, European Institute of Oncology, IEO-IRCCS, Milan, 20139, Italy.
2
Qahwaji R, Ashankyty I, Sannan NS, Hazzazi MS, Basabrain AA, Mobashir M. Pharmacogenomics: A Genetic Approach to Drug Development and Therapy. Pharmaceuticals (Basel) 2024; 17:940. [PMID: 39065790 PMCID: PMC11279827 DOI: 10.3390/ph17070940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Revised: 07/03/2024] [Accepted: 07/10/2024] [Indexed: 07/28/2024] Open
Abstract
The majority of the well-known pharmacogenomics research used in the medical sciences contributes to our understanding of medication interactions. It has a significant impact on treatment and drug development. The broad use of pharmacogenomics is required for the progress of therapy. The main focus is on how genes and an intricate gene system affect the body's reaction to medications. Novel biomarkers that help identify a patient group that is more or less likely to respond to a certain medication have been discovered as a result of recent developments in the field of clinical therapeutics. It aims to improve customized therapy by giving the appropriate drug at the right dose at the right time and making sure that the right prescriptions are issued. A combination of genetic, environmental, and patient variables that impact the pharmacokinetics and/or pharmacodynamics of medications results in interindividual variance in drug response. Drug development, illness susceptibility, and treatment efficacy are all impacted by pharmacogenomics. The purpose of this work is to give a review that might serve as a foundation for the creation of new pharmacogenomics applications, techniques, or strategies.
Affiliation(s)
- Rowaid Qahwaji
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 22254, Saudi Arabia; (R.Q.); (I.A.); (M.S.H.); (A.A.B.)
- Hematology Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Ibraheem Ashankyty
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 22254, Saudi Arabia; (R.Q.); (I.A.); (M.S.H.); (A.A.B.)
- Naif S. Sannan
- College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Ar Rimayah, Riyadh 14611, Saudi Arabia;
- King Abdullah International Medical Research Center, Jeddah 22384, Saudi Arabia
- Mohannad S. Hazzazi
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 22254, Saudi Arabia; (R.Q.); (I.A.); (M.S.H.); (A.A.B.)
- Hematology Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Ammar A. Basabrain
- Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 22254, Saudi Arabia; (R.Q.); (I.A.); (M.S.H.); (A.A.B.)
- Hematology Research Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Mohammad Mobashir
- Department of Biomedical Laboratory Science, Faculty of Natural Sciences, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
3
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
4
Wang D, Li J, Wang E, Wang Y. DVA: predicting the functional impact of single nucleotide missense variants. BMC Bioinformatics 2024; 25:100. [PMID: 38448823 PMCID: PMC10916336 DOI: 10.1186/s12859-024-05709-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 02/16/2024] [Indexed: 03/08/2024] Open
Abstract
BACKGROUND In the past decade, single nucleotide variants (SNVs) have been identified as having a significant relationship with the development and treatment of diseases. Among them, prioritizing missense variants for further functional impact investigation is an essential challenge in the study of common disease and cancer. Although several computational methods have been developed to predict the functional impacts of variants, the predictive ability of these methods is still insufficient in the Mendelian and cancer missense variants. RESULTS We present a novel prediction method called the disease-related variant annotation (DVA) method that predicts the effect of missense variants based on a comprehensive feature set of variants, notably, the allele frequency and protein-protein interaction network feature based on graph embedding. Benchmarked against datasets of single nucleotide missense variants, the DVA method outperforms the state-of-the-art methods by up to 0.473 in the area under receiver operating characteristic curve. The results demonstrate that the proposed method can accurately predict the functional impact of single nucleotide missense variants and substantially outperforms existing methods. CONCLUSIONS DVA is an effective framework for identifying the functional impact of disease missense variants based on a comprehensive feature set. Based on different datasets, DVA shows its generalization ability and robustness, and it also provides innovative ideas for the study of the functional mechanism and impact of SNVs.
Affiliation(s)
- Dong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Harbin, Harbin, Heilongjiang, China
- Jie Li
- School of Computer Science and Technology, Harbin Institute of Technology Harbin, Harbin, Heilongjiang, China.
- Edwin Wang
- Cumming School of Medicine, University of Calgary, Calgary, Canada
- Yadong Wang
- School of Computer Science and Technology, Harbin Institute of Technology Harbin, Harbin, Heilongjiang, China
5
Pathan N, Deng WQ, Di Scipio M, Khan M, Mao S, Morton RW, Lali R, Pigeyre M, Chong MR, Paré G. A method to estimate the contribution of rare coding variants to complex trait heritability. Nat Commun 2024; 15:1245. [PMID: 38336875 PMCID: PMC10858280 DOI: 10.1038/s41467-024-45407-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 01/22/2024] [Indexed: 02/12/2024] Open
Abstract
It has been postulated that rare coding variants (RVs; MAF < 0.01) contribute to the "missing" heritability of complex traits. We developed a framework, the Rare variant heritability (RARity) estimator, to assess RV heritability (h2RV) without assuming a particular genetic architecture. We applied RARity to 31 complex traits in the UK Biobank (n = 167,348) and showed that gene-level RV aggregation suffers from 79% (95% CI: 68-93%) loss of h2RV. Using unaggregated variants, 27 traits had h2RV > 5%, with height having the highest h2RV at 21.9% (95% CI: 19.0-24.8%). The total heritability, including common and rare variants, recovered pedigree-based estimates for 11 traits. RARity can estimate gene-level h2RV, enabling the assessment of gene-level characteristics and revealing 11, previously unreported, gene-phenotype relationships. Finally, we demonstrated that in silico pathogenicity prediction (variant-level) and gene-level annotations do not generally enrich for RVs that over-contribute to complex trait variance, and thus, innovative methods are needed to predict RV functionality.
Affiliation(s)
- Nazia Pathan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
- Wei Q Deng
- Peter Boris Centre for Addictions Research, St. Joseph's Healthcare Hamilton, Hamilton, Canada
- Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Canada
- Matteo Di Scipio
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
- Mohammad Khan
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
- Shihong Mao
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Robert W Morton
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
- Ricky Lali
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
- Marie Pigeyre
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Canada
- Michael R Chong
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, Canada
- Guillaume Paré
- Population Health Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Canada.
- Department of Pathology and Molecular Medicine, McMaster University, Michael G. DeGroote School of Medicine, Hamilton, Canada.
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada.
- Thrombosis and Atherosclerosis Research Institute, David Braley Cardiac, Vascular and Stroke Research Institute, Hamilton, Canada.
6
Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res 2024; 52:D1143-D1154. [PMID: 38183205 PMCID: PMC10767851 DOI: 10.1093/nar/gkad989] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 01/07/2024] Open
Abstract
Machine Learning-based scoring and classification of genetic variants aids the assessment of clinical findings and is employed to prioritize variants in diverse genetic studies and analyses. Combined Annotation-Dependent Depletion (CADD) is one of the first methods for the genome-wide prioritization of variants across different molecular functions and has been continuously developed and improved since its original publication. Here, we present our most recent release, CADD v1.7. We explored and integrated new annotation features, among them state-of-the-art protein language model scores (Meta ESM-1v), regulatory variant effect predictions (from sequence-based convolutional neural networks) and sequence conservation scores (Zoonomia). We evaluated the new version on data sets derived from ClinVar, ExAC/gnomAD and 1000 Genomes variants. For coding effects, we tested CADD on 31 Deep Mutational Scanning (DMS) data sets from ProteinGym and, for regulatory effect prediction, we used saturation mutagenesis reporter assay data of promoter and enhancer sequences. The inclusion of new features further improved the overall performance of CADD. As with previous releases, all data sets, genome-wide CADD v1.7 scores, scripts for on-site scoring and an easy-to-use webserver are readily provided via https://cadd.bihealth.org/ or https://cadd.gs.washington.edu/ to the community.
Affiliation(s)
- Max Schubach
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Thorben Maass
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
- Lusiné Nazaretyan
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Sebastian Röner
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Martin Kircher
- Exploratory Diagnostic Sciences, Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Institute of Human Genetics, University Hospital Schleswig-Holstein, University of Lübeck, Lübeck, Germany
7
Chi YI, Jorge SD, Jensen DR, Smith BC, Volkman BF, Mathison AJ, Lomberk G, Zimmermann MT, Urrutia R. A multi-layered computational structural genomics approach enhances domain-specific interpretation of Kleefstra syndrome variants in EHMT1. Comput Struct Biotechnol J 2023; 21:5249-5258. [PMID: 37954151 PMCID: PMC10632586 DOI: 10.1016/j.csbj.2023.10.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 10/06/2023] [Accepted: 10/12/2023] [Indexed: 11/14/2023] Open
Abstract
This study investigates the functional significance of assorted variants of uncertain significance (VUS) in euchromatic histone lysine methyltransferase 1 (EHMT1), which is critical for early development and normal physiology. EHMT1 mutations cause Kleefstra syndrome and are linked to various human cancers. However, accurate functional interpretations of these variants are yet to be made, limiting diagnoses and future research. To overcome this, we integrate conventional tools for variant calling with computational biophysics and biochemistry to conduct multi-layered mechanistic analyses of the SET catalytic domain of EHMT1, which is critical for this protein's function. We use molecular mechanics and molecular dynamics (MD)-based metrics to analyze the SET domain structure and functional motions resulting from 97 Kleefstra syndrome missense variants within the domain. Our approach allows us to classify the variants in a mechanistic manner into SV (Structural Variant), DV (Dynamic Variant), SDV (Structural and Dynamic Variant), and VUS (Variant of Uncertain Significance). Our findings reveal that the damaging variants are mostly mapped around the active site, substrate binding site, and pre-SET regions. Overall, we report an improvement for this method over conventional tools for variant interpretation and simultaneously provide a molecular mechanism for variant dysfunction.
Affiliation(s)
- Young-In Chi
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Division of Research, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Salomão D. Jorge
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Davin R. Jensen
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
- Brian C. Smith
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
- Brian F. Volkman
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
- Angela J. Mathison
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Division of Research, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Gwen Lomberk
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Division of Research, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Pharmacology and Toxicology, Medical College of Wisconsin, Milwaukee, WI, USA
- Michael T. Zimmermann
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
- Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, WI, USA
- Raul Urrutia
- Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, WI, USA
- Division of Research, Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, WI, USA
8
Anwar MY, Graff M, Highland HM, Smit R, Wang Z, Buchanan VL, Young KL, Kenny EE, Fernandez-Rhodes L, Liu S, Assimes T, Garcia DO, Daeeun K, Gignoux CR, Justice AE, Haiman CA, Buyske S, Peters U, Loos RJF, Kooperberg C, North KE. Assessing efficiency of fine-mapping obesity-associated variants through leveraging ancestry architecture and functional annotation using PAGE and UKBB cohorts. Hum Genet 2023; 142:1477-1489. [PMID: 37658231 DOI: 10.1007/s00439-023-02593-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 08/10/2023] [Indexed: 09/03/2023]
Abstract
Inadequate representation of non-European ancestry populations in genome-wide association studies (GWAS) has limited opportunities to isolate functional variants. Fine-mapping in multi-ancestry populations should improve the efficiency of prioritizing variants for functional interrogation. To evaluate this hypothesis, we leveraged ancestry architecture to perform comparative GWAS and fine-mapping of obesity-related phenotypes in European ancestry populations from the UK Biobank (UKBB) and multi-ancestry samples from the Population Architecture for Genetic Epidemiology (PAGE) consortium with comparable sample sizes. In the investigated regions with genome-wide significant associations for obesity-related traits, fine-mapping in our ancestrally diverse sample led to 95% and 99% credible sets (CS) with fewer variants than in the European ancestry sample. Lead fine-mapped variants in PAGE regions had higher average coding scores, and higher average posterior probabilities for causality compared to UKBB. Importantly, 99% CS in PAGE loci contained strong expression quantitative trait loci (eQTLs) in adipose tissues or harbored more variants in tighter linkage disequilibrium (LD) with eQTLs. Leveraging ancestrally diverse populations with heterogeneous ancestry architectures, coupled with functional annotation, increased fine-mapping efficiency and performance, and reduced the set of candidate variants for consideration for future functional studies. Significant overlap in genetic causal variants across populations suggests generalizability of genetic mechanisms underpinning obesity-related traits across populations.
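The 95% and 99% credible sets mentioned in this abstract are commonly built by ranking variants by posterior probability of causality and keeping the smallest prefix whose cumulative probability reaches the target coverage. The Python sketch below illustrates that generic construction under the assumption of a single causal variant per region; the variant names and probabilities are invented, and the fine-mapping method actually used in the study may differ.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest list of variants, ranked by posterior probability,
    whose cumulative (normalized) probability reaches `coverage`.

    `posteriors` maps a variant identifier to its posterior probability of
    being causal within one region. Values are normalized to sum to 1.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")

    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)

    selected, cumulative = [], 0.0
    for variant, prob in ranked:
        selected.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return selected


# Hypothetical posterior probabilities for four variants in one region.
example = {"rsA": 0.60, "rsB": 0.30, "rsC": 0.06, "rsD": 0.04}

if __name__ == "__main__":
    print("95% credible set:", credible_set(example, 0.95))
    print("99% credible set:", credible_set(example, 0.99))
```

A smaller credible set at the same coverage means fewer candidate variants to carry into functional follow-up, which is the sense in which the abstract speaks of improved fine-mapping efficiency.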
Affiliation(s)
- Mohammad Yaser Anwar
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Mariaelisa Graff
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Heather M Highland
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Roelof Smit
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Zhe Wang
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Victoria L Buchanan
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Kristin L Young
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Eimear E Kenny
- Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lindsay Fernandez-Rhodes
- Department of Biobehavioral Health, College of Health and Human Development, Pennsylvania State University, University Park, PA, 16802, USA
- Simin Liu
- Department of Epidemiology and Center for Global Cardiometabolic Health, School of Public Health, Brown University, Providence, RI, 02903, USA
- Themistocles Assimes
- Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, CA, 94305, USA
- David O Garcia
- Department of Health Promotion Sciences, Mel & Enid Zuckerman College of Public Health, University of Arizona, Tucson, AZ, 85724, USA
- Kim Daeeun
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Christopher R Gignoux
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Anne E Justice
- Department of Population Health Sciences, Geisinger Health, Danville, PA, 17822, USA
- Christopher A Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Steve Buyske
- Department of Statistics, Rutgers University, Piscataway, NJ, 08854, USA
- Ulrike Peters
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Ruth J F Loos
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Kari E North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
9
Tremmel R, Pirmann S, Zhou Y, Lauschke VM. Translating pharmacogenomic sequencing data into drug response predictions-How to interpret variants of unknown significance. Br J Clin Pharmacol 2023. [PMID: 37759374 DOI: 10.1111/bcp.15915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 09/20/2023] [Accepted: 09/22/2023] [Indexed: 09/29/2023] Open
Abstract
The rapid development of sequencing technologies during the past 20 years has provided a variety of methods and tools to interrogate human genomic variations at the population level. Pharmacogenes are well known to be highly polymorphic and a plethora of pharmacogenomic variants has been identified in population sequencing data. However, so far only a small number of these variants have been functionally characterized regarding their impact on drug efficacy and toxicity and the significance of the vast majority remains unknown. It is therefore of high importance to develop tools and frameworks to accurately infer the effects of pharmacogenomic variants and, eventually, aggregate the effect of individual variations into personalized drug response predictions. To address this challenge, we here first describe the technological advances, including sequencing methods and accompanying bioinformatic processing pipelines that have enabled reliable variant identification. Subsequently, we highlight advances in computational algorithms for pharmacogenomic variant interpretation and discuss the added value of emerging strategies, such as machine learning and the integrative use of omics techniques that have the potential to further contribute to the refinement of personalized pharmacological response predictions. Lastly, we provide an overview of experimental and clinical approaches to validate in silico predictions. We conclude that the iterative feedback between computational predictions and experimental validations is likely to rapidly improve the accuracy of pharmacogenomic prediction models, which might soon allow for an incorporation of the entire pharmacogenetic profile into personalized response predictions.
Affiliation(s)
- Roman Tremmel
  - Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany
  - University of Tübingen, Tübingen, Germany
- Sebastian Pirmann
  - Computational Oncology Group, Molecular Precision Oncology Program, National Center for Tumor Diseases (NCT) Heidelberg and German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Helmholtz Information and Data Science School for Health, Karlsruhe/Heidelberg, Germany
  - Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Yitian Zhou
  - Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
- Volker M Lauschke
  - Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany
  - University of Tübingen, Tübingen, Germany
  - Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden
|
10
|
Chi YI, Jorge SD, Jensen DR, Smith BC, Volkman BF, Mathison AJ, Lomberk G, Zimmermann MT, Urrutia R. A Multi-Layered Computational Structural Genomics Approach Enhances Domain-Specific Interpretation of Kleefstra Syndrome Variants in EHMT1. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.06.556558. [PMID: 37786696 PMCID: PMC10541560 DOI: 10.1101/2023.09.06.556558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2023]
Abstract
This study investigates the functional significance of assorted variants of uncertain significance (VUS) in euchromatic histone lysine methyltransferase 1 (EHMT1), which is critical for early development and normal physiology. EHMT1 mutations cause Kleefstra syndrome and are linked to various human cancers. However, accurate functional interpretation of these variants is yet to be made, limiting diagnoses and future research. To overcome this, we integrate conventional tools for variant calling with computational biophysics and biochemistry to conduct multi-layered mechanistic analyses of the SET catalytic domain of EHMT1, which is critical for this protein's function. We use molecular mechanics and molecular dynamics (MD)-based metrics to analyze the SET domain structure and functional motions resulting from 97 Kleefstra syndrome missense variants within this domain. Our approach allows us to classify the variants in a mechanistic manner into SV (Structural Variant), DV (Dynamic Variant), SDV (Structural and Dynamic Variant), and VUS (Variant of Uncertain Significance). Our findings reveal that the damaging variants are mostly mapped around the active site, substrate binding site, and pre-SET regions. Overall, we report an improvement for this method over conventional tools for variant interpretation and simultaneously provide a molecular mechanism of variant dysfunction.
|
11
|
Al-Kafaji G, Jassim G, AlHajeri A, Alawadhi AMT, Fida M, Sahin I, Alali F, Fadel E. Investigation of germline variants in Bahraini women with breast cancer using next-generation sequencing based-multigene panel. PLoS One 2023; 18:e0291015. [PMID: 37656691 PMCID: PMC10473515 DOI: 10.1371/journal.pone.0291015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 08/20/2023] [Indexed: 09/03/2023] Open
Abstract
Germline variants in BRCA1 and BRCA2 (BRCA1/2) genes are the most common cause of hereditary breast cancer. However, a significant number of cases are not linked to these two genes and additional high-, moderate- and low-penetrance genes have been identified in breast cancer. The advent of next-generation sequencing (NGS) allowed simultaneous sequencing of multiple cancer-susceptibility genes and prompted research in this field. So far, cancer-predisposition genes other than BRCA1/2 have not been studied in the population of Bahrain. We performed a targeted NGS using a multi-panel covering 180 genes associated with cancer predisposition to investigate the spectrum and frequency of germline variants in 54 women with a positive personal and/or family history of breast cancer. Sequencing analysis revealed germline variants in 29 (53.7%) patients. Five pathogenic/likely pathogenic variants in four DNA repair pathway-related genes were identified in five unrelated patients (9.3%). Two BRCA1 variants, namely the missense variant c.287A>G (p.Asp96Gly) and the truncating variant c.1066C>T (p.Gln356Ter), were detected in two patients (3.7%). Three variants in non-BRCA1/2 genes were detected in three patients (1.85% each) with a strong family history of breast cancer. These included a monoallelic missense variant c.1187G>A (p.Gly396Asp) in MUTYH gene, and two truncating variants namely c.3343C>T (p.Arg1115Ter) in MLH3 gene and c.1826G>A (p.Trp609Ter) in PMS1 gene. Other variants of uncertain significance (VUS) were also detected, and some of them were found together with the deleterious variants. In this first application of NGS-based multigene testing in Bahraini women with breast cancer, we show that multigene testing can yield additional genomic information on low-penetrance genes, although the clinical significance of these genes has not been fully appreciated yet. 
Our findings also provide valuable epidemiological information for future studies and highlight the importance of genetic testing; an NGS-based multigene analysis may be applied as a supplement to traditional genetic counseling.
Affiliation(s)
- Ghada Al-Kafaji
  - Department of Molecular Medicine and Al-Jawhara Centre for Molecular Medicine, Genetics, and Inherited Disorders, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Kingdom of Bahrain
- Ghufran Jassim
  - Department of Family Medicine, Royal College of Surgeons in Ireland-Bahrain, Manama, Kingdom of Bahrain
- Amani AlHajeri
  - Department of Genetics, Salmaniya Medical Complex, Manama, Kingdom of Bahrain
- Mariam Fida
  - Bahrain Oncology Center, King Hamad University Hospital, Manama, Kingdom of Bahrain
- Ibrahim Sahin
  - Department of Molecular Medicine and Al-Jawhara Centre for Molecular Medicine, Genetics, and Inherited Disorders, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Kingdom of Bahrain
- Faisal Alali
  - North western Hospital, Chicago Medical School, North Chicago, Illinois, United States of America
- Elias Fadel
  - Bahrain Oncology Center, King Hamad University Hospital, Manama, Kingdom of Bahrain
|
12
|
Pandey M, Anoosha P, Yesudhas D, Gromiha MM. Identification of potential driver mutations in glioblastoma using machine learning. Brief Bioinform 2022; 23:6764546. [PMID: 36266243 DOI: 10.1093/bib/bbac451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Revised: 09/13/2022] [Accepted: 09/22/2022] [Indexed: 12/14/2022] Open
Abstract
Glioblastoma is a fast and aggressively growing tumor in the brain and spinal cord. Mutation of amino acid residues in target proteins, which are involved in glioblastoma, alters the structure and function and may lead to disease. In this study, we collected a set of 9386 disease-causing (driver) mutations based on the recurrence in patient samples and experimentally annotated as pathogenic and 8728 as neutral (passenger) mutations. We observed that Arg is highly preferred at the mutant sites of drivers, whereas Met and Ile showed preferences in passengers. Inspecting neighboring residues at the mutant sites revealed that the motifs YP, CP and GRH are preferred in drivers, whereas SI, IQ and TVI are dominant in neutral mutations. In addition, we have computed other sequence-based features such as conservation scores, Position Specific Scoring Matrices (PSSM) and physicochemical properties, and developed a machine learning-based method, GBMDriver (GlioBlastoma Multiforme Drivers), for distinguishing between driver and passenger mutations. Our method showed an accuracy and AUC of 73.59% and 0.82, respectively, on 10-fold cross-validation and 81.99% and 0.87 in a blind set of 1809 mutants. The tool is available at https://web.iitm.ac.in/bioinfo2/GBMDriver/index.html. We envisage that the present method is helpful to prioritize driver mutations in glioblastoma and assist in identifying therapeutic targets.
Affiliation(s)
- Medha Pandey
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- P Anoosha
  - Division of Medical Oncology, Department of Internal Medicine, Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, USA
- Dhanusha Yesudhas
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
|
13
|
Zhou Y, Tremmel R, Schaeffeler E, Schwab M, Lauschke VM. Challenges and opportunities associated with rare-variant pharmacogenomics. Trends Pharmacol Sci 2022; 43:852-865. [PMID: 36008164 DOI: 10.1016/j.tips.2022.07.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/13/2022] [Revised: 06/15/2022] [Accepted: 07/29/2022] [Indexed: 12/26/2022]
Abstract
Recent advances in next-generation sequencing (NGS) have resulted in the identification of tens of thousands of rare pharmacogenetic variations with unknown functional effects. However, although such pharmacogenetic variations have been estimated to account for a considerable amount of the heritable variability in drug response and toxicity, accurate interpretation at the level of the individual patient remains challenging. We discuss emerging strategies and concepts to close this translational gap. We illustrate how massively parallel experimental assays, artificial intelligence (AI), and machine learning can synergize with population-scale biobank projects to facilitate the interpretation of NGS data to individualize clinical decision-making and personalized medicine.
Affiliation(s)
- Yitian Zhou
  - Department of Physiology and Pharmacology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Roman Tremmel
  - Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany
- Elke Schaeffeler
  - Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany; Cluster of Excellence iFIT (EXC2180) Image-Guided and Functionally Instructed Tumor Therapies, University of Tübingen, Tübingen, Germany
- Matthias Schwab
  - Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; Cluster of Excellence iFIT (EXC2180) Image-Guided and Functionally Instructed Tumor Therapies, University of Tübingen, Tübingen, Germany; Department of Clinical Pharmacology, and Department of Biochemistry and Pharmacy, University of Tübingen, Tübingen, Germany
- Volker M Lauschke
  - Department of Physiology and Pharmacology, Karolinska Institutet, 171 77 Stockholm, Sweden; Dr Margarete Fischer-Bosch Institute of Clinical Pharmacology, Stuttgart, Germany; University of Tübingen, Tübingen, Germany.
|
14
|
Omidpanah N, Saadat M. Introducing a new index for selecting genetic polymorphisms for association studies. EXCLI JOURNAL 2022; 21:814-817. [PMID: 35949492 PMCID: PMC9360471 DOI: 10.17179/excli2022-5004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 06/09/2022] [Indexed: 11/07/2022]
Affiliation(s)
- Nafiseh Omidpanah
  - Department of Biology, College of Sciences, Shiraz University, Shiraz, Iran
- Mostafa Saadat
  - Department of Biology, College of Sciences, Shiraz University, Shiraz, Iran. *To whom correspondence should be addressed: Mostafa Saadat, Department of Biology, College of Science, Shiraz University, Shiraz 71467-13565, Iran, E-mail:
|