1
Leclercq J, Torres-Paz J, Policarpo M, Agnès F, Rétaux S. Evolution of the regulation of developmental gene expression in blind Mexican cavefish. Development 2024; 151:dev202610. [PMID: 39007346 DOI: 10.1242/dev.202610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 07/08/2024] [Indexed: 07/16/2024]
Abstract
Developmental evolution and diversification of morphology can arise through changes in the regulation of gene expression or protein-coding sequence. To unravel mechanisms underlying early developmental evolution in cavefish of the species Astyanax mexicanus, we compared transcriptomes of surface-dwelling and blind cave-adapted morphs at the end of gastrulation. Twenty percent of the transcriptome was differentially expressed. Allelic expression ratios in cave × surface hybrids showed that cis-regulatory changes are the quasi-exclusive contributors to inter-morph variations in gene expression. Among a list of 108 genes with changes at the cis-regulatory level, we explored the control of expression of rx3, which is a master eye gene. We discovered that cellular rx3 levels are cis-regulated in a cell-autonomous manner, whereas rx3 domain size depends on non-autonomous Wnt and Bmp signalling. These results highlight how uncoupled mechanisms and regulatory modules control developmental gene expression and shape morphological changes. Finally, a transcriptome-wide search for fixed coding mutations and differential exon use suggested that variations in coding sequence make a minor contribution. Thus, during early embryogenesis, changes in gene expression regulation are the main drivers of cavefish developmental evolution.
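The allelic-ratio reasoning in this abstract can be illustrated with a minimal sketch. In an F1 hybrid, both parental alleles share the same trans-acting environment, so an allelic imbalance in the hybrid that matches the expression ratio between the parental morphs points to a cis-regulatory change. The function name, the log2 cutoff, and the read counts below are hypothetical and are not taken from the paper, which does not describe its classification procedure in the abstract.

```python
import math


def classify_regulation(cave_parent, surface_parent,
                        cave_allele, surface_allele, threshold=1.0):
    """Classify a gene as cis, trans, cis+trans, or conserved.

    All four inputs are positive expression counts (assumption: no zeros).
    `threshold` is an illustrative log2 fold-change cutoff, not a value
    from the paper.
    """
    # Total expression divergence between the two parental morphs.
    parental = math.log2(cave_parent / surface_parent)
    # Allelic ratio in the hybrid: both alleles see the same trans
    # environment, so this is attributed to cis-regulatory divergence.
    cis = math.log2(cave_allele / surface_allele)
    # Whatever divergence the hybrid ratio does not explain is
    # attributed to trans-acting factors.
    trans = parental - cis

    cis_changed = abs(cis) >= threshold
    trans_changed = abs(trans) >= threshold
    if cis_changed and trans_changed:
        return "cis+trans"
    if cis_changed:
        return "cis"
    if trans_changed:
        return "trans"
    return "conserved"


# Hypothetical counts: a 4-fold parental difference fully preserved
# between the two alleles in the hybrid.
print(classify_regulation(400, 100, 80, 20))
```

A real analysis would replace the fixed cutoff with a statistical test on replicated allele-specific counts.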
Affiliation(s)
- Julien Leclercq
  - Paris-Saclay Institute of Neuroscience, CNRS and University Paris-Saclay, 91400 Saclay, France
- Jorge Torres-Paz
  - Paris-Saclay Institute of Neuroscience, CNRS and University Paris-Saclay, 91400 Saclay, France
- Maxime Policarpo
  - Paris-Saclay Institute of Neuroscience, CNRS and University Paris-Saclay, 91400 Saclay, France
- François Agnès
  - Paris-Saclay Institute of Neuroscience, CNRS and University Paris-Saclay, 91400 Saclay, France
- Sylvie Rétaux
  - Paris-Saclay Institute of Neuroscience, CNRS and University Paris-Saclay, 91400 Saclay, France
2
Ahmad RM, Ali BR, Al-Jasmi F, Al Dhaheri N, Al Turki S, Kizhakkedath P, Mohamad MS. AI-derived comparative assessment of the performance of pathogenicity prediction tools on missense variants of breast cancer genes. Hum Genomics 2024; 18:99. [PMID: 39256852 PMCID: PMC11389290 DOI: 10.1186/s40246-024-00667-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 08/22/2024] [Indexed: 09/12/2024] Open
Abstract
Single nucleotide variants (SNVs) can exert substantial and extremely variable impacts on various cellular functions, making accurate prediction of their consequences challenging, albeit crucial, especially in clinical settings such as oncology. Laboratory-based experimental methods for assessing these effects are time-consuming and often impractical, highlighting the importance of in-silico tools for variant impact prediction. However, the performance metrics of currently available tools on breast cancer missense variants from benchmarking databases have not been thoroughly investigated, creating a knowledge gap in the accurate prediction of pathogenicity. In this study, the benchmarking datasets ClinVar and HGMD were used to evaluate 21 Artificial Intelligence (AI)-derived in-silico tools. Missense variants in breast cancer genes were extracted from ClinVar and HGMD professional v2023.1. The HGMD dataset focused on pathogenic variants only; to ensure balance, benign variants for the same genes were included from the ClinVar database. Interestingly, our analysis of both datasets revealed variants across genes with low and moderate as well as high penetrance, reinforcing the value of disease-specific tools. The top-performing tools identified on the ClinVar dataset were MutPred (Accuracy = 0.73), MetaRNN (Accuracy = 0.72), ClinPred (Accuracy = 0.71), and Meta-SVM, REVEL, and Fathmm-XF (Accuracy = 0.70). On the HGMD dataset, they were ClinPred (Accuracy = 0.72), MetaRNN (Accuracy = 0.71), CADD (Accuracy = 0.69), Fathmm-MKL (Accuracy = 0.68), and Fathmm-XF (Accuracy = 0.67). These findings offer clinicians and researchers valuable insights for selecting, improving, and developing effective in-silico tools for breast cancer pathogenicity prediction. Bridging this knowledge gap contributes to advancing precision medicine and enhancing diagnostic and therapeutic approaches for breast cancer patients, with potential implications for other conditions.
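For readers unfamiliar with the accuracy figures quoted above, the following minimal sketch shows how such a value is typically obtained: each tool's score is converted into a pathogenic/benign call and the fraction of calls that agree with the database label is reported. The function name, the 0.5 cutoff, and the example scores are hypothetical; the abstract does not state which thresholds the authors applied to each tool.

```python
def accuracy_at_threshold(scores, labels, threshold=0.5):
    """Fraction of variants whose call matches the benchmark label.

    scores: predictor outputs, higher meaning more likely pathogenic.
    labels: 1 for pathogenic, 0 for benign (benchmark truth).
    threshold: illustrative cutoff (assumption, tool-specific in practice).
    """
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equal length")
    # Call a variant pathogenic when its score reaches the cutoff.
    calls = [1 if s >= threshold else 0 for s in scores]
    correct = sum(c == y for c, y in zip(calls, labels))
    return correct / len(labels)


# Hypothetical scores for four variants; two of the four calls are right.
print(accuracy_at_threshold([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1]))
```

Accuracy is only easy to interpret when the pathogenic and benign classes are roughly balanced, which is presumably why the abstract mentions adding benign ClinVar variants to the HGMD set.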
Affiliation(s)
- Rahaf M Ahmad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Bassam R Ali
  - Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Fatma Al-Jasmi
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Noura Al Dhaheri
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Saeed Al Turki
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Praseetha Kizhakkedath
  - Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Mohd Saberi Mohamad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Center for Engineering Computational Intelligence, Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia
3
Platt CJ, Bierzynska A, Ding W, Saleem SA, Koziell A, Saleem MA. Rare heterozygous variants in paediatric steroid resistant nephrotic syndrome - a population-based analysis of their significance. Sci Rep 2024; 14:18568. [PMID: 39127776 DOI: 10.1038/s41598-024-68837-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Genetic testing in nephrotic syndrome may identify heterozygous predicted-pathogenic variants (HPPVs) in autosomal recessive (AR) genes that are known to cause disease in the homozygous or compound heterozygous state. In such cases, it can be difficult to define the variant's true significance, and questions remain about whether a second pathogenic variant has been missed during analysis or whether the variant is an incidental finding. Over 70 genes are now known to be associated with nephrotic syndrome, the majority inherited as an AR trait. Knowing whether such HPPVs occur with equal frequency in patients and in the general population would assist interpretation of their significance. Exome sequencing was performed on 187 paediatric Steroid-Resistant Nephrotic Syndrome (SRNS) patients recruited to a UK rare disease registry or from clinics at Evelina, London. 59 AR podocytopathy-linked genes were analysed in each patient and a list of HPPVs created. We compared the frequency of detected HPPVs with a 'control' population from the gnomAD database containing exome data from approximately 50,000 individuals. A bespoke filtering process was used for both patients and controls to predict 'likely pathogenicity' of variants. In total, 130 Caucasian SRNS patients were screened across 59 AR genes and 201 rare heterozygous variants were identified. 17/201 (8.5%) were assigned as 'likely pathogenic' (HPPV) using our bespoke filtering method. Comparing each gene in turn, for SRNS patients with a confirmed genetic diagnosis, we found no statistically significant difference in the frequency of these HPPVs between patients and controls in 57 of the 59 genes (in genes ARHGDIA and TP53RK, we identified a significantly higher number of HPPVs in the control population than in the patients, and only when filtering was performed with 'high stringency' settings). In the SRNS patients without a confirmed genetic diagnosis, no statistically significant difference between patients and controls was identified in any gene. In children with SRNS, we propose that identification of an HPPV in AR podocytopathy-linked genes is not necessarily representative of pathogenicity, given that the frequency is similar to that seen in controls for the majority. Whilst this may not exclude the presence of genetic kidney disease, this type of heterozygous variant is unlikely to be causal and each result must be interpreted in its clinical context.
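The per-gene comparison of HPPV frequency between patients and controls can be pictured as a test on a 2×2 table of carriers versus non-carriers. The sketch below uses a two-sided Fisher's exact test written with the standard library only; this is an assumption for illustration, since the abstract does not name the statistical test the authors used, and the counts in the example are hypothetical rather than taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    a: patients carrying an HPPV in the gene, b: patients without one,
    c: controls carrying one,                 d: controls without one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x carriers among patients,
        # given the fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_observed * (1 + 1e-9))
    return min(p, 1.0)


# Hypothetical counts: 3 of 4 patients vs 1 of 4 controls carry a variant.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

With 59 genes tested in turn, a real analysis would also need a multiple-testing correction before calling any single gene significant.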
Affiliation(s)
- C J Platt
  - Bristol Royal Hospital for Children, Bristol, BS2 8NJ, UK
- A Bierzynska
  - Bristol Renal, University of Bristol, Bristol, UK
- W Ding
  - Bristol Renal, University of Bristol, Bristol, UK
- A Koziell
  - King's College and Evelina, London, UK
- M A Saleem
  - Bristol Renal, University of Bristol, Bristol, UK
4
Jain S, Trinidad M, Nguyen TB, Jones K, Neto SD, Ge F, Glagovsky A, Jones C, Moran G, Wang B, Rahimi K, Çalıcı SZ, Cedillo LR, Berardelli S, Özden B, Chen K, Katsonis P, Williams A, Lichtarge O, Rana S, Pradhan S, Srinivasan R, Sajeed R, Joshi D, Faraggi E, Jernigan R, Kloczkowski A, Xu J, Song Z, Özkan S, Padilla N, de la Cruz X, Acuna-Hidalgo R, Grafmüller A, Jiménez Barrón LT, Manfredi M, Savojardo C, Babbi G, Martelli PL, Casadio R, Sun Y, Zhu S, Shen Y, Pucci F, Rooman M, Cia G, Raimondi D, Hermans P, Kwee S, Chen E, Astore C, Kamandula A, Pejaver V, Ramola R, Velyunskiy M, Zeiberg D, Mishra R, Sterling T, Goldstein JL, Lugo-Martinez J, Kazi S, Li S, Long K, Brenner SE, Bakolitsa C, Radivojac P, Suhr D, Suhr T, Clark WT. Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A. bioRxiv 2024:2024.05.16.594558. [PMID: 38798479 PMCID: PMC11118473 DOI: 10.1101/2024.05.16.594558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research.
Affiliation(s)
- Shantanu Jain
  - The Institute for Experiential AI, Northeastern University, Boston, MA, USA
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Marena Trinidad
  - Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA
  - Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- Thanh Binh Nguyen
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
- Fang Ge
  - State Key Laboratory of Organic Electronics and Information Displays & Institute of Advanced Materials (IAM), Nanjing University of Posts & Telecommunications, Nanjing, China
- Boqi Wang
  - Department of Bioinformatics and System Biology, University of California, San Diego, La Jolla, CA, USA
- Kobra Rahimi
  - Department of Computational Biology, School of Life Sciences, Ochanomizu University, Tokyo, Japan
- Sümeyra Zeynep Çalıcı
  - Department of Genomics, Faculty of Aquatic Science, Istanbul University, Istanbul, Türkiye
- Silvia Berardelli
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
  - enGenome srl, Pavia, Italy
- Buse Özden
  - Program of Molecular Biotechnology and Genetics, Institute of Science, Istanbul University, Istanbul, Türkiye
- Ken Chen
  - University of California, Berkeley, Berkeley, CA, USA
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Amanda Williams
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Eshel Faraggi
  - Research and Information Systems LLC, Indianapolis, IN, USA
  - Physics Department, Indiana University-Purdue University, Indianapolis, IN, USA
- Robert Jernigan
  - Roy J. Carver Department of Biochemistry, Iowa State University, Ames, IA, USA
- Andrzej Kloczkowski
  - Institute for Genomic Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Jierui Xu
  - University of California, Berkeley, Berkeley, CA, USA
- Selen Özkan
  - Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
  - Universitat Autònoma de Barcelona, Barcelona, Spain
- Natàlia Padilla
  - Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
  - Universitat Autònoma de Barcelona, Barcelona, Spain
- Xavier de la Cruz
  - Vall d'Hebron Institute of Research (VHIR), Barcelona, Spain
  - Universitat Autònoma de Barcelona, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Giulia Babbi
  - Biocomputing Group, University of Bologna, Bologna, Italy
- Rita Casadio
  - Biocomputing Group, University of Bologna, Bologna, Italy
- Yuanfei Sun
  - Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Shaowen Zhu
  - Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Yang Shen
  - Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Gabriel Cia
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Pauline Hermans
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Bruxelles, Belgium
- Sofia Kwee
  - University of California, Berkeley, Berkeley, CA, USA
- Ella Chen
  - University of California, Berkeley, Berkeley, CA, USA
- Akash Kamandula
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Vikas Pejaver
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Rashika Ramola
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Michelle Velyunskiy
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Daniel Zeiberg
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Reet Mishra
  - Department of Bioengineering, University of California, Berkeley, CA, USA
  - Department of Bioengineering, University of California, San Francisco, CA, USA
- Jennifer L Goldstein
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Jose Lugo-Martinez
  - Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Sindy Li
  - University of California, Berkeley, Berkeley, CA, USA
- Kinsey Long
  - University of California, Berkeley, Berkeley, CA, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
5
Mohammed EEA, Fayez AG, Abdelfattah NM, Fateen E. Novel gene-specific Bayesian Gaussian mixture model to predict the missense variants pathogenicity of Sanfilippo syndrome. Sci Rep 2024; 14:12148. [PMID: 38802532 PMCID: PMC11130188 DOI: 10.1038/s41598-024-62352-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 05/16/2024] [Indexed: 05/29/2024] Open
Abstract
MPS III is an autosomal recessive lysosomal storage disease caused mainly by missense variants in the NAGLU, GNS, HGSNAT, and SGSH genes. The pathogenicity interpretation of missense variants is still challenging. We aimed to develop unsupervised clustering-based pathogenicity predictor scores using features extracted from eight in silico predictors to predict the impact of novel missense variants of Sanfilippo syndrome. The model was trained on a dataset consisting of 415 missense NAGLU variants of uncertain significance (VUS). The performance of the SanfilippoPred tool was evaluated on validation and test datasets consisting of 197 labelled NAGLU missense variants, and was compared with that of individual pathogenicity predictors using receiver operating characteristic (ROC) analysis. Moreover, we tested the SanfilippoPred tool using an additional 427 labelled missense variants to assess its specificity and sensitivity threshold. Application of the trained machine learning (ML) model to the test dataset of labelled NAGLU missense variants showed that SanfilippoPred has an accuracy of 0.93 (95% CI 0.86-0.97), sensitivity of 0.93, and specificity of 0.92. SanfilippoPred performed better (AUC = 0.908) than the individual predictors SIFT (AUC = 0.756), Polyphen-2 (AUC = 0.788), CADD (AUC = 0.568), REVEL (AUC = 0.548), MetaLR (AUC = 0.751), and AlphaMissense (AUC = 0.885). Analysis of high-confidence labelled NAGLU variants showed that SanfilippoPred has an 85.7% sensitivity threshold. The poor correlation between the Sanfilippo syndrome phenotype and genotype creates a demand for a new tool to classify its missense variants. This study provides a significant tool for preventing the misinterpretation of missense variants of the Sanfilippo syndrome-relevant genes. Finally, it seems that ML-based pathogenicity predictors and Sanfilippo syndrome-specific prediction tools could be feasible and efficient pathogenicity predictors in the future.
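The AUC values used above to compare predictors have a simple interpretation: the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The sketch below computes it directly from that definition. The function name and the example scores are hypothetical, and this is a generic illustration of ROC AUC rather than the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predictor outputs, higher meaning more likely pathogenic.
    labels: 1 for pathogenic, 0 for benign.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one pathogenic and one benign variant")
    # Count pathogenic/benign pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: three of the four pathogenic/benign pairs are
# ranked correctly.
print(roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))
```

The pairwise form is quadratic in the number of variants, which is acceptable for datasets of a few hundred variants such as those described here.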
Affiliation(s)
- Eman E A Mohammed
  - Medical Molecular Genetics Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt
- Alaaeldin G Fayez
  - Molecular Genetics and Enzymology Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt
- Ekram Fateen
  - Biochemical Genetics Department, Human Genetics and Genome Research Institute, National Research Centre, Giza, Egypt
6
Jain S, Bakolitsa C, Brenner SE, Radivojac P, Moult J, Repo S, Hoskins RA, Andreoletti G, Barsky D, Chellapan A, Chu H, Dabbiru N, Kollipara NK, Ly M, Neumann AJ, Pal LR, Odell E, Pandey G, Peters-Petrulewicz RC, Srinivasan R, Yee SF, Yeleswarapu SJ, Zuhl M, Adebali O, Patra A, Beer MA, Hosur R, Peng J, Bernard BM, Berry M, Dong S, Boyle AP, Adhikari A, Chen J, Hu Z, Wang R, Wang Y, Miller M, Wang Y, Bromberg Y, Turina P, Capriotti E, Han JJ, Ozturk K, Carter H, Babbi G, Bovo S, Di Lena P, Martelli PL, Savojardo C, Casadio R, Cline MS, De Baets G, Bonache S, Díez O, Gutiérrez-Enríquez S, Fernández A, Montalban G, Ootes L, Özkan S, Padilla N, Riera C, De la Cruz X, Diekhans M, Huwe PJ, Wei Q, Xu Q, Dunbrack RL, Gotea V, Elnitski L, Margolin G, Fariselli P, Kulakovskiy IV, Makeev VJ, Penzar DD, Vorontsov IE, Favorov AV, Forman JR, Hasenahuer M, Fornasari MS, Parisi G, Avsec Z, Çelik MH, Nguyen TYD, Gagneur J, Shi FY, Edwards MD, Guo Y, Tian K, Zeng H, Gifford DK, Göke J, Zaucha J, Gough J, Ritchie GRS, Frankish A, Mudge JM, Harrow J, Young EL, Yu Y, Huff CD, Murakami K, Nagai Y, Imanishi T, Mungall CJ, Jacobsen JOB, Kim D, Jeong CS, Jones DT, Li MJ, Guthrie VB, Bhattacharya R, Chen YC, Douville C, Fan J, Kim D, Masica D, Niknafs N, Sengupta S, Tokheim C, Turner TN, Yeo HTG, Karchin R, Shin S, Welch R, Keles S, Li Y, Kellis M, Corbi-Verge C, Strokach AV, Kim PM, Klein TE, Mohan R, Sinnott-Armstrong NA, Wainberg M, Kundaje A, Gonzaludo N, Mak ACY, Chhibber A, Lam HYK, Dahary D, Fishilevich S, Lancet D, Lee I, Bachman B, Katsonis P, Lua RC, Wilson SJ, Lichtarge O, Bhat RR, Sundaram L, Viswanath V, Bellazzi R, Nicora G, Rizzo E, Limongelli I, Mezlini AM, Chang R, Kim S, Lai C, O’Connor R, Topper S, van den Akker J, Zhou AY, Zimmer AD, Mishne G, Bergquist TR, Breese MR, Guerrero RF, Jiang Y, Kiga N, Li B, Mort M, Pagel KA, Pejaver V, Stamboulian MH, Thusberg J, Mooney SD, Teerakulkittipong N, Cao C, Kundu K, Yin Y, Yu CH, Kleyman M, Lin CF, Stackpole M, Mount SM, Eraslan 
G, Mueller NS, Naito T, Rao AR, Azaria JR, Brodie A, Ofran Y, Garg A, Pal D, Hawkins-Hooker A, Kenlay H, Reid J, Mucaki EJ, Rogan PK, Schwarz JM, Searls DB, Lee GR, Seok C, Krämer A, Shah S, Huang CV, Kirsch JF, Shatsky M, Cao Y, Chen H, Karimi M, Moronfoye O, Sun Y, Shen Y, Shigeta R, Ford CT, Nodzak C, Uppal A, Shi X, Joseph T, Kotte S, Rana S, Rao A, Saipradeep VG, Sivadasan N, Sunderam U, Stanke M, Su A, Adzhubey I, Jordan DM, Sunyaev S, Rousseau F, Schymkowitz J, Van Durme J, Tavtigian SV, Carraro M, Giollo M, Tosatto SCE, Adato O, Carmel L, Cohen NE, Fenesh T, Holtzer T, Juven-Gershon T, Unger R, Niroula A, Olatubosun A, Väliaho J, Yang Y, Vihinen M, Wahl ME, Chang B, Chong KC, Hu I, Sun R, Wu WKK, Xia X, Zee BC, Wang MH, Wang M, Wu C, Lu Y, Chen K, Yang Y, Yates CM, Kreimer A, Yan Z, Yosef N, Zhao H, Wei Z, Yao Z, Zhou F, Folkman L, Zhou Y, Daneshjou R, Altman RB, Inoue F, Ahituv N, Arkin AP, Lovisa F, Bonvini P, Bowdin S, Gianni S, Mantuano E, Minicozzi V, Novak L, Pasquo A, Pastore A, Petrosino M, Puglisi R, Toto A, Veneziano L, Chiaraluce R, Ball MP, Bobe JR, Church GM, Consalvi V, Cooper DN, Buckley BA, Sheridan MB, Cutting GR, Scaini MC, Cygan KJ, Fredericks AM, Glidden DT, Neil C, Rhine CL, Fairbrother WG, Alontaga AY, Fenton AW, Matreyek KA, Starita LM, Fowler DM, Löscher BS, Franke A, Adamson SI, Graveley BR, Gray JW, Malloy MJ, Kane JP, Kousi M, Katsanis N, Schubach M, Kircher M, Mak ACY, Tang PLF, Kwok PY, Lathrop RH, Clark WT, Yu GK, LeBowitz JH, Benedicenti F, Bettella E, Bigoni S, Cesca F, Mammi I, Marino-Buslje C, Milani D, Peron A, Polli R, Sartori S, Stanzial F, Toldo I, Turolla L, Aspromonte MC, Bellini M, Leonardi E, Liu X, Marshall C, McCombie WR, Elefanti L, Menin C, Meyn MS, Murgia A, Nadeau KCY, Neuhausen SL, Nussbaum RL, Pirooznia M, Potash JB, Dimster-Denk DF, Rine JD, Sanford JR, Snyder M, Cote AG, Sun S, Verby MW, Weile J, Roth FP, Tewhey R, Sabeti PC, Campagna J, Refaat MM, Wojciak J, Grubb S, Schmitt N, Shendure J, Spurdle AB, 
Stavropoulos DJ, Walton NA, Zandi PP, Ziv E, Burke W, Chen F, Carr LR, Martinez S, Paik J, Harris-Wai J, Yarborough M, Fullerton SM, Koenig BA, McInnes G, Shigaki D, Chandonia JM, Furutsuki M, Kasak L, Yu C, Chen R, Friedberg I, Getz GA, Cong Q, Kinch LN, Zhang J, Grishin NV, Voskanian A, Kann MG, Tran E, Ioannidis NM, Hunter JM, Udani R, Cai B, Morgan AA, Sokolov A, Stuart JM, Minervini G, Monzon AM, Batzoglou S, Butte AJ, Greenblatt MS, Hart RK, Hernandez R, Hubbard TJP, Kahn S, O’Donnell-Luria A, Ng PC, Shon J, Veltman J, Zook JM. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods. Genome Biol 2024; 25:53. [PMID: 38389099 PMCID: PMC10882881 DOI: 10.1186/s13059-023-03113-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Accepted: 11/17/2023] [Indexed: 02/24/2024] Open
Abstract
BACKGROUND The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. RESULTS Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. CONCLUSIONS Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
7
Ahmad RM, Ali BR, Al-Jasmi F, Sinnott RO, Al Dhaheri N, Mohamad MS. A review of genetic variant databases and machine learning tools for predicting the pathogenicity of breast cancer. Brief Bioinform 2023; 25:bbad479. [PMID: 38149678 PMCID: PMC10782903 DOI: 10.1093/bib/bbad479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 09/22/2023] [Accepted: 12/04/2023] [Indexed: 12/28/2023] Open
Abstract
Studies continue to uncover contributing risk factors for breast cancer (BC) development including genetic variants. Advances in machine learning and big data generated from genetic sequencing can now be used for predicting BC pathogenicity. However, it is unclear which tool developed for pathogenicity prediction is most suited for predicting the impact and pathogenicity of variant effects. A significant challenge is to determine the most suitable data source for each tool since different tools can yield different prediction results with different data inputs. To this end, this work reviews genetic variant databases and tools used specifically for the prediction of BC pathogenicity. We provide a description of existing genetic variants databases and, where appropriate, the diseases for which they have been established. Through example, we illustrate how they can be used for prediction of BC pathogenicity and discuss their associated advantages and disadvantages. We conclude that the tools that are specialized by training on multiple diverse datasets from different databases for the same disease have enhanced accuracy and specificity and are thereby more helpful to the clinicians in predicting and diagnosing BC as early as possible.
Affiliation(s)
- Rahaf M Ahmad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Bassam R Ali
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
- Fatma Al-Jasmi
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Richard O Sinnott
  - School of Computing and Information System, Faculty of Engineering and Information Technology, The University of Melbourne, Melbourne, Victoria, Australia
- Noura Al Dhaheri
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
  - Division of Metabolic Genetics, Department of Pediatrics, Tawam Hospital, Al Ain, United Arab Emirates
- Mohd Saberi Mohamad
  - Health Data Science Lab, Department of Genetics and Genomics, College of Medical and Health Sciences, United Arab Emirates University, Tawam road, Al Maqam district, Al Ain, Abu Dhabi, United Arab Emirates
8
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| | - Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA.
| |
|
9
|
ADGRL3 genomic variation implicated in neurogenesis and ADHD links functional effects to the incretin polypeptide GIP. Sci Rep 2022; 12:15922. [PMID: 36151371 PMCID: PMC9508192 DOI: 10.1038/s41598-022-20343-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 09/12/2022] [Indexed: 11/09/2022] Open
Abstract
Attention deficit/hyperactivity disorder (ADHD) is the most common childhood neurodevelopmental disorder. Single nucleotide polymorphisms (SNPs) in the Adhesion G Protein-Coupled Receptor L3 (ADGRL3) gene are associated with increased susceptibility to developing ADHD worldwide. However, the effect of ADGRL3 non-synonymous SNPs (nsSNPs) on ADGRL3 protein function is largely unknown. Using several bioinformatics tools to evaluate the impact of mutations, we found that nsSNPs rs35106420, rs61747658, and rs734644, previously reported to be associated and in linkage with ADHD in disparate populations worldwide, are predicted to be pathogenic variants. Docking analysis of rs35106420, harbored in the ADGRL3 hormone receptor domain (HRM, a common extracellular domain of the secretin-like GPCR family), showed that HRM interacts with the glucose-dependent insulinotropic polypeptide (GIP), a member of the incretin hormone family. GIP has been linked to the pathogenesis of diabetes mellitus, and our analyses suggest a potential link to ADHD. Overall, the comprehensive application of bioinformatics tools showed that functional mutations in the ADGRL3 gene disrupt the wild-type ADGRL3 structure, most likely affecting its metabolic regulation. Further in vitro experiments are warranted to evaluate these in silico predictions of the ADGRL3-GIP interaction and dissect the complexity underlying the development of ADHD.
|
10
|
Veatch OJ, Mazzotti DR, Schultz RT, Abel T, Michaelson JJ, Brodkin ES, Tunc B, Assouline SG, Nickl-Jockschat T, Malow BA, Sutcliffe JS, Pack AI. Calculating genetic risk for dysfunction in pleiotropic biological processes using whole exome sequencing data. J Neurodev Disord 2022; 14:39. [PMID: 35751013 PMCID: PMC9233372 DOI: 10.1186/s11689-022-09448-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2021] [Accepted: 06/08/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Numerous genes are implicated in autism spectrum disorder (ASD). ASD encompasses a wide range and severity of symptoms and co-occurring conditions; however, the details of how genetic variation contributes to phenotypic differences are unclear. This creates a challenge for translating genetic evidence into clinically useful knowledge. Sleep disturbances are particularly prevalent co-occurring conditions in ASD, and genetics may inform treatment. Identifying convergent mechanisms with evidence for dysfunction that connect ASD and sleep biology could help identify better treatments for sleep disturbances in these individuals. METHODS: To identify mechanisms that influence risk for ASD and co-occurring sleep disturbances, we analyzed whole exome sequence data from individuals in the Simons Simplex Collection (n = 2380). We predicted protein damaging variants (PDVs) in genes currently implicated in either ASD or sleep duration in typically developing children. We predicted a network of ASD-related proteins with direct evidence for interaction with sleep duration-related proteins encoded by genes with PDVs. Overrepresentation analyses of Gene Ontology-defined biological processes were conducted on the resulting gene set. We calculated the likelihood of dysfunction in the top overrepresented biological process. We then tested if scores reflecting genetic dysfunction in the process were associated with parent-reported sleep duration. RESULTS: There were 29 genes with PDVs in the ASD dataset where variation was reported in the literature to be associated with both ASD and sleep duration. A network of 108 proteins encoded by ASD and sleep duration candidate genes with PDVs was identified. The mechanism overrepresented in PDV-containing genes that encode proteins in the interaction network with the most evidence for dysfunction was cerebral cortex development (GO:0021987). Scores reflecting dysfunction in this process were associated with sleep durations; the largest effects were observed in adolescents (p = 4.65 × 10⁻³). CONCLUSIONS: Our bioinformatic-driven approach detected a biological process enriched for genes encoding a protein-protein interaction network linking ASD gene products with sleep duration gene products where accumulation of potentially damaging variants in individuals with ASD was associated with sleep duration as reported by the parents. Specifically, genetic dysfunction impacting development of the cerebral cortex may affect sleep by disrupting sleep homeostasis which is evidenced to be regulated by this brain region. Future functional assessments and objective measurements of sleep in adolescents with ASD could provide the basis for more informed treatment of sleep problems in these individuals.
Affiliation(s)
- Olivia J Veatch
- Department of Psychiatry and Behavioral Sciences, Medical Center, University of Kansas, Kansas City, KS, USA.
| | - Diego R Mazzotti
- Division of Medical Informatics, Department of Internal Medicine, Medical Center, University of Kansas, Kansas City, KS, USA
| | - Robert T Schultz
- Center for Autism Research, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Ted Abel
- Department of Neuroscience and Pharmacology, Iowa Neuroscience Institute, University of Iowa, Iowa City, Iowa, USA
| | | | - Edward S Brodkin
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Birkan Tunc
- Center for Autism Research, Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Susan G Assouline
- Belin-Blank Center for Gifted Education and Talent Development, University of Iowa, Iowa City, Iowa, USA
| | | | - Beth A Malow
- Division of Sleep Medicine, Department of Neurology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - James S Sutcliffe
- Department of Molecular Physiology and Biophysics, Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
| | - Allan I Pack
- Division of Sleep Medicine, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| |
|
11
|
Analysis of missense variants in the human genome reveals widespread gene-specific clustering and improves prediction of pathogenicity. Am J Hum Genet 2022; 109:457-470. [PMID: 35120630 PMCID: PMC8948164 DOI: 10.1016/j.ajhg.2022.01.006] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 08/20/2021] [Accepted: 01/11/2022] [Indexed: 12/11/2022] Open
Abstract
We used a machine learning approach to analyze the within-gene distribution of missense variants observed in hereditary conditions and cancer. When applied to 840 genes from the ClinVar database, this approach detected a significant non-random distribution of pathogenic and benign variants in 387 (46%) and 172 (20%) genes, respectively, revealing that variant clustering is widespread across the human exome. This clustering likely occurs as a consequence of mechanisms shaping pathogenicity at the protein level, as illustrated by the overlap of some clusters with known functional domains. We then took advantage of these findings to develop a pathogenicity predictor, MutScore, that integrates qualitative features of DNA substitutions with the new additional information derived from this positional clustering. Using a random forest approach, MutScore was able to identify pathogenic missense mutations with very high accuracy, outperforming existing predictive tools, especially for variants associated with autosomal-dominant disease and cancer. Thus, the within-gene clustering of pathogenic and benign DNA changes is an important and previously underappreciated feature of the human exome, which can be harnessed to improve the prediction of pathogenicity and disambiguation of DNA variants of uncertain significance.
|
12
|
Jiang Y, Urresti J, Pagel KA, Pramod AB, Iakoucheva LM, Radivojac P. Prioritizing de novo autism risk variants with calibrated gene- and variant-scoring models. Hum Genet 2021; 141:1595-1613. [PMID: 34549350 DOI: 10.1007/s00439-021-02356-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 08/26/2021] [Indexed: 12/17/2022]
Abstract
Whole-exome and whole-genome sequencing studies in autism spectrum disorder (ASD) have identified hundreds of thousands of exonic variants. Only a handful of them, primarily loss-of-function variants, have been shown to increase the risk for ASD, while the contributory roles of other variants, including most missense variants, remain unknown. New approaches that combine tissue-specific molecular profiles with patients' genetic data can thus play an important role in elucidating the functional impact of exonic variation and improve understanding of ASD pathogenesis. Here, we integrate spatio-temporal gene co-expression networks from the developing human brain and protein-protein interaction networks to first reach accurate prioritization of ASD risk genes based on their connectivity patterns with previously known high-confidence ASD risk genes. We subsequently integrate these gene scores with variant pathogenicity predictions to further prioritize individual exonic variants based on the positive-unlabeled learning framework with gene- and variant-score calibration. We demonstrate that this approach discriminates among variants between cases and controls at the high end of the prediction range. Finally, we experimentally validate our top-scoring de novo mutation NP_001243143.1:p.Phe309Ser in the sodium/potassium-transporting ATPase ATP1A3 to disrupt protein binding with different partners.
Affiliation(s)
- Yuxiang Jiang
- Department of Computer Science, Indiana University, Bloomington, IN, USA
| | - Jorge Urresti
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Kymberleigh A Pagel
- Department of Computer Science, Indiana University, Bloomington, IN, USA; Institute for Computational Medicine, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Akula Bala Pramod
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Lilia M Iakoucheva
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA.
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
| |
|
13
|
Tarca AL, Pataki BÁ, Romero R, Sirota M, Guan Y, Kutum R, Gomez-Lopez N, Done B, Bhatti G, Yu T, Andreoletti G, Chaiworapongsa T, Hassan SS, Hsu CD, Aghaeepour N, Stolovitzky G, Csabai I, Costello JC. Crowdsourcing assessment of maternal blood multi-omics for predicting gestational age and preterm birth. Cell Rep Med 2021; 2:100323. [PMID: 34195686 PMCID: PMC8233692 DOI: 10.1016/j.xcrm.2021.100323] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 10/29/2020] [Revised: 01/18/2021] [Accepted: 05/20/2021] [Indexed: 12/15/2022]
Abstract
Identification of pregnancies at risk of preterm birth (PTB), the leading cause of newborn deaths, remains challenging given the syndromic nature of the disease. We report a longitudinal multi-omics study coupled with a DREAM challenge to develop predictive models of PTB. The findings indicate that whole-blood gene expression predicts ultrasound-based gestational ages in normal and complicated pregnancies (r = 0.83) and, using data collected before 37 weeks of gestation, also predicts the delivery date in both normal pregnancies (r = 0.86) and those with spontaneous preterm birth (r = 0.75). Based on samples collected before 33 weeks in asymptomatic women, our analysis suggests that expression changes preceding preterm prelabor rupture of the membranes are consistent across time points and cohorts and involve leukocyte-mediated immunity. Models built from plasma proteomic data predict spontaneous preterm delivery with intact membranes with higher accuracy and earlier in pregnancy than transcriptomic models (AUROC = 0.76 versus AUROC = 0.6 at 27-33 weeks of gestation).
Affiliation(s)
- Adi L. Tarca
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Department of Computer Science, Wayne State University College of Engineering, Detroit, MI 48202, USA
| | - Bálint Ármin Pataki
- Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
| | - Roberto Romero
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA
- Detroit Medical Center, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Florida International University, Miami, FL 33199, USA
| | - Marina Sirota
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Yuanfang Guan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Rintu Kutum
- Informatics and Big Data Unit, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
| | - Nardhy Gomez-Lopez
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Department of Biochemistry, Microbiology, and Immunology, Wayne State University School of Medicine, Detroit, MI 48201 USA
| | - Bogdan Done
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
| | - Gaurav Bhatti
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
| | | | - Gaia Andreoletti
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Tinnakorn Chaiworapongsa
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
| | - The DREAM Preterm Birth Prediction Challenge Consortium
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Department of Computer Science, Wayne State University College of Engineering, Detroit, MI 48202, USA
- Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
- Department of Obstetrics and Gynecology, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI 48824, USA
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI 48201, USA
- Detroit Medical Center, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Florida International University, Miami, FL 33199, USA
- Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, USA
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Informatics and Big Data Unit, CSIR-Institute of Genomics and Integrative Biology, New Delhi, India
- Department of Biochemistry, Microbiology, and Immunology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Sage Bionetworks, Seattle, WA, USA
- Office of Women’s Health, Integrative Biosciences Center, Wayne State University, Detroit, MI 48202, USA
- Department of Physiology, Wayne State University School of Medicine, Detroit, MI 48201, USA
- Department of Anesthesiology, Perioperative, and Pain Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
- Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Sonia S. Hassan
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Office of Women’s Health, Integrative Biosciences Center, Wayne State University, Detroit, MI 48202, USA
- Department of Physiology, Wayne State University School of Medicine, Detroit, MI 48201, USA
| | - Chaur-Dong Hsu
- Perinatology Research Branch, Division of Obstetrics and Maternal-Fetal Medicine, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, US Department of Health and Human Services, Detroit, MI 48201, USA
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI 48201 USA
- Department of Physiology, Wayne State University School of Medicine, Detroit, MI 48201, USA
| | - Nima Aghaeepour
- Department of Anesthesiology, Perioperative, and Pain Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Gustavo Stolovitzky
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
| | - Istvan Csabai
- Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
| | - James C. Costello
- Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
| |
|
14
|
Popov AV, Endutkin AV, Yatsenko DD, Yudkina AV, Barmatov AE, Makasheva KA, Raspopova DY, Diatlova EA, Zharkov DO. Molecular dynamics approach to identification of new OGG1 cancer-associated somatic variants with impaired activity. J Biol Chem 2021; 296:100229. [PMID: 33361155 PMCID: PMC7948927 DOI: 10.1074/jbc.ra120.014455] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/19/2020] [Revised: 12/22/2020] [Accepted: 12/23/2020] [Indexed: 01/02/2023] Open
Abstract
DNA of living cells is always exposed to damaging factors. To counteract the consequences of DNA lesions, cells have evolved several DNA repair systems, among which base excision repair is one of the most important systems. Many currently used antitumor drugs act by damaging DNA, and DNA repair often interferes with chemotherapy and radiotherapy in cancer cells. Tumors are usually extremely genetically heterogeneous, often bearing mutations in DNA repair genes. Thus, knowledge of the functionality of cancer-related variants of proteins involved in DNA damage response and repair is of great interest for personalization of cancer therapy. Although computational methods to predict the variant functionality have attracted much attention, at present, they are mostly based on sequence conservation and make little use of modern capabilities in computational analysis of 3D protein structures. We have used molecular dynamics (MD) to model the structures of 20 clinically observed variants of a DNA repair enzyme, 8-oxoguanine DNA glycosylase. In parallel, we have experimentally characterized the activity, thermostability, and DNA binding in a subset of these mutant proteins. Among the analyzed variants of 8-oxoguanine DNA glycosylase, three (I145M, G202C, and V267M) were significantly functionally impaired and were successfully predicted by MD. Alone or in combination with sequence-based methods, MD may be an important functional prediction tool for cancer-related protein variants of unknown significance.
Affiliation(s)
- Aleksandr V Popov
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia.
| | - Anton V Endutkin
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
| | - Darya D Yatsenko
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Anna V Yudkina
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
| | - Alexander E Barmatov
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
| | - Kristina A Makasheva
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Darya Yu Raspopova
- Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Evgeniia A Diatlova
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia
| | - Dmitry O Zharkov
- Laboratory of Genome and Protein Engineering, SB RAS Institute of Chemical Biology and Fundamental Medicine, Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia.
| |
|
15
|
Iqbal S, Pérez-Palma E, Jespersen JB, May P, Hoksza D, Heyne HO, Ahmed SS, Rifat ZT, Rahman MS, Lage K, Palotie A, Cottrell JR, Wagner FF, Daly MJ, Campbell AJ, Lal D. Comprehensive characterization of amino acid positions in protein structures reveals molecular effect of missense variants. Proc Natl Acad Sci U S A 2020; 117:28201-28211. [PMID: 33106425 PMCID: PMC7668189 DOI: 10.1073/pnas.2002660117] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Indexed: 12/19/2022] Open
Abstract
Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, with the inference of the effect of amino acid-substituting missense variations on protein structure and function being especially challenging. Here we characterize the three-dimensional (3D) amino acid positions affected in pathogenic and population variants from 1,330 disease-associated genes using over 14,000 experimentally solved human protein structures. By measuring the statistical burden of variations (i.e., point mutations) from all genes on 40 3D protein features, accounting for the structural, chemical, and functional context of the variations' positions, we identify features that are generally associated with pathogenic and population missense variants. We then perform the same amino acid-level analysis individually for 24 protein functional classes, which reveals unique characteristics of the positions of the altered amino acids: We observe up to 46% divergence of the class-specific features from the general characteristics obtained by the analysis on all genes, which is consistent with the structural diversity of essential regions across different protein classes. We demonstrate that the function-specific 3D features of the variants match the readouts of mutagenesis experiments for BRCA1 and PTEN, and positively correlate with an independent set of clinically interpreted pathogenic and benign missense variants. Finally, we make our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step toward translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the variants' pathogenicity in terms of the perturbed molecular mechanisms.
Affiliation(s)
- Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114
| | - Eduardo Pérez-Palma
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195
| | - Jakob B Jespersen
- Department of Bio and Health Informatics, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
| | - Patrick May
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg
| | - David Hoksza
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg
- Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague 11636, Czech Republic
| | - Henrike O Heyne
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114
- Institute for Molecular Medicine Finland, University of Helsinki, 00100 Helsinki, Finland
| | - Shehab S Ahmed
- Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
| | - Zaara T Rifat
- Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
| | - M Sohel Rahman
- Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
| | - Kasper Lage
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Department of Surgery, Massachusetts General Hospital, Boston, MA 02114
| | - Aarno Palotie
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Institute for Molecular Medicine Finland, University of Helsinki, 00100 Helsinki, Finland
| | - Jeffrey R Cottrell
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
| | - Florence F Wagner
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
| | - Mark J Daly
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114
- Institute for Molecular Medicine Finland, University of Helsinki, 00100 Helsinki, Finland
| | - Arthur J Campbell
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
| | - Dennis Lal
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195
- Cologne Center for Genomics, University of Cologne, 50931 Cologne, Germany
- Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, OH 44195
| |
|
16
|
Perdomo-Ramirez A, Antón-Gamero M, Rizzo DS, Trindade A, Ramos-Trujillo E, Claverie-Martin F. Two new missense mutations in the protein interaction ASH domain of OCRL1 identified in patients with Lowe syndrome. Intractable Rare Dis Res 2020; 9:222-228. [PMID: 33139981 PMCID: PMC7586875 DOI: 10.5582/irdr.2020.03092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022] Open
Abstract
The oculocerebrorenal syndrome of Lowe is a rare X-linked disease characterized by congenital cataracts, proximal renal tubulopathy, muscular hypotonia and mental impairment. This disease is caused by mutations in the OCRL gene encoding membrane bound inositol polyphosphate 5-phosphatase OCRL1. Here, we examined the OCRL gene of two Lowe syndrome patients and report two new missense mutations that affect the ASH domain involved in protein-protein interactions. Genomic DNA was extracted from peripheral blood of two non-related patients and their relatives. Exons and flanking intronic regions of OCRL were analyzed by direct sequencing. Several bioinformatics tools were used to assess the pathogenicity of the variants. The three-dimensional structure of wild-type and mutant ASH domains was modeled using the online server SWISS-MODEL. Clinical features suggesting the diagnosis of Lowe syndrome were observed in both patients. Genetic analysis revealed two novel missense variants, c.1907T>A (p.V636E) and c.1979A>C (p.H660P) in exon 18 of the OCRL gene confirming the clinical diagnosis in both cases. Variant c.1907T>A (p.V636E) was inherited from the patient's mother, while variant c.1979A>C (p.H660P) seems to have originated de novo. Analysis with bioinformatics tools indicated that both variants are pathogenic. Both amino acid changes affect the structure of the OCRL1 ASH domain. In conclusion, the identification of two novel missense mutations located in the OCRL1 ASH domain may shed more light on the functional importance of this domain. We suggest that p.V636E and p.H660P cause Lowe syndrome by disrupting the interaction of OCRL1 with other proteins or by impairing protein stability.
Affiliation(s)
- Ana Perdomo-Ramirez
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | | | | | - Amelia Trindade
- Departamento de Medicina, Universidade Federal de Sao Carlos, Sao Paulo, Brazil
| | - Elena Ramos-Trujillo
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Felix Claverie-Martin
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Address correspondence to: Félix Claverie-Martín, Unidad de Investigación, Hospital Nuestra Señora de Candelaria, Carretera del Rosario 145, 38010 Santa Cruz de Tenerife, Spain. E-mail:
| |
|
17
|
Lodewijk GA, Fernandes DP, Vretzakis I, Savage JE, Jacobs FMJ. Evolution of Human Brain Size-Associated NOTCH2NL Genes Proceeds toward Reduced Protein Levels. Mol Biol Evol 2020; 37:2531-2548. [PMID: 32330268 PMCID: PMC7475042 DOI: 10.1093/molbev/msaa104] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022] Open
Abstract
Ever since the availability of genomes from Neanderthals, Denisovans, and ancient humans, the field of evolutionary genomics has been searching for protein-coding variants that may hold clues to how our species evolved over the last ∼600,000 years. In this study, we identify such variants in the human-specific NOTCH2NL gene family, which were recently identified as possible contributors to the evolutionary expansion of the human brain. We find evidence for the existence of unique protein-coding NOTCH2NL variants in Neanderthals and Denisovans which could affect their ability to activate Notch signaling. Furthermore, in the Neanderthal and Denisovan genomes, we find unusual NOTCH2NL configurations, not found in any of the modern human genomes analyzed. Finally, genetic analysis of archaic and modern humans reveals ongoing adaptive evolution of modern human NOTCH2NL genes, identifying three structural variants that act in a complementary manner to drive our genome to produce a lower dosage of NOTCH2NL protein. Because copy-number variations of the 1q21.1 locus, encompassing NOTCH2NL genes, are associated with severe neurological disorders, this seemingly contradictory drive toward low levels of NOTCH2NL protein indicates that the optimal dosage of NOTCH2NL may not yet have been settled in the human population.
Affiliation(s)
- Gerrald A Lodewijk
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Diana P Fernandes
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Iraklis Vretzakis
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
| | - Jeanne E Savage
- Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands
- Amsterdam Neuroscience, Complex Trait Genetics
| | - Frank M J Jacobs
- Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Amsterdam Neuroscience, Complex Trait Genetics
| |
|
18
|
García-Castaño A, Perdomo-Ramirez A, Vall-Palomar M, Ramos-Trujillo E, Madariaga L, Ariceta G, Claverie-Martin F. Novel compound heterozygous mutations of CLDN16 in a patient with familial hypomagnesemia with hypercalciuria and nephrocalcinosis. Mol Genet Genomic Med 2020; 8:e1475. [PMID: 32869508 PMCID: PMC7667358 DOI: 10.1002/mgg3.1475] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/13/2020] [Revised: 07/13/2020] [Accepted: 08/04/2020] [Indexed: 12/23/2022] Open
Abstract
Background: Familial hypomagnesemia with hypercalciuria and nephrocalcinosis (FHHNC) is an autosomal recessive tubulopathy characterized by excessive urinary wasting of magnesium and calcium, bilateral nephrocalcinosis, and progressive chronic renal failure in childhood or adolescence. FHHNC is caused by mutations in CLDN16 and CLDN19, which encode the tight‐junction proteins claudin‐16 and claudin‐19, respectively. Most of these mutations are missense mutations and large deletions are rare. Methods: We examined the clinical and biochemical features of a Spanish boy with early onset of FHHNC symptoms. Exons and flanking intronic segments of CLDN16 and CLDN19 were analyzed by direct sequencing. We developed a new assay based on Quantitative Multiplex PCR of Short Fluorescent Fragments (QMPSF) to investigate large CLDN16 deletions. Results: Genetic analysis revealed two novel compound heterozygous mutations of CLDN16, comprising a missense mutation, c.277G>A; p.(Ala93Thr), in one allele, and a gross deletion that lacked exons 4 and 5, c.(840+25_?)del, in the other allele. The patient inherited these variants from his mother and father, respectively. Conclusions: Using direct sequencing and our QMPSF assay, we identified the genetic cause of FHHNC in our patient. This QMPSF assay should facilitate the genetic diagnosis of FHHNC. Our study provided additional data on the genotypic spectrum of the CLDN16 gene.
Affiliation(s)
| | - Ana Perdomo-Ramirez
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Mònica Vall-Palomar
- Fisiopatologia Renal, Centro de Investigaciones en Bioquímica y Biología Molecular (CIBBIM), Vall d'Hebron Institut de Recerca (VHIR), Barcelona, Spain
| | - Elena Ramos-Trujillo
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Leire Madariaga
- Biocruces Bizkaia Research Institute, Barakaldo, Bizkaia, Spain
- Pediatric Nephrology Department, Cruces University Hospital, UPV/EHU, Barakaldo, Spain
| | - Gema Ariceta
- Fisiopatologia Renal, Centro de Investigaciones en Bioquímica y Biología Molecular (CIBBIM), Vall d'Hebron Institut de Recerca (VHIR), Barcelona, Spain
- Servicio de Nefrología Pediátrica, Hospital Universitari Vall d'Hebron, Barcelona, Spain
- Departamento de Pediatría, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
| | - Felix Claverie-Martin
- Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| |
|
19
|
Havranek B, Islam SM. Prediction and evaluation of deleterious and disease causing non-synonymous SNPs (nsSNPs) in human NF2 gene responsible for neurofibromatosis type 2 (NF2). J Biomol Struct Dyn 2020; 39:7044-7055. [PMID: 32787631 DOI: 10.1080/07391102.2020.1805018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Abstract
The majority of genetic variations in the human genome that lead to a variety of different diseases are caused by non-synonymous single nucleotide polymorphisms (nsSNPs). Neurofibromatosis type 2 (NF2) is a deadly disease caused by nsSNPs in the NF2 gene, which encodes a protein called merlin. This study used various in silico methods, SIFT, Polyphen-2, PhD-SNP and MutPred, to investigate the pathogenic effect of 14 nsSNPs in the merlin FERM domain. The G197C and L234R mutations were found to be two deleterious and disease mutations associated with the mild and severe forms of NF2, respectively. Molecular dynamics (MD) simulations were conducted to understand the stability, structure and dynamics of these mutations. Both mutant structures experienced larger flexibility compared to the wildtype. The L234R mutant suffered from more prominent structural instability, which may help to explain why it is associated with the more severe form of NF2. The intramolecular hydrogen bonding in the L234R mutant decreased relative to the wildtype, while intermolecular hydrogen bonding of the L234R mutant with solvent greatly increased. The native contacts were also found to be important. Protein-protein docking revealed that the L234R mutation decreased the binding complementarity and binding affinity of LATS2 to merlin, which may have an impact on merlin's ability to regulate the Hippo signaling pathway. The calculated binding affinity of LATS2 to the L234R mutant and wildtype merlin protein is found to be 21.73 and -11 kcal/mol, respectively. The binding affinity of the wildtype merlin agreed very well with the experimental value, -8 kcal/mol. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Brandon Havranek
- Department of Chemistry, University of Illinois at Chicago, Chicago, IL, USA
| | - Shahidul M Islam
- Department of Chemistry, University of Illinois at Chicago, Chicago, IL, USA
| |
|
20
|
Cruz JDO, Conceição IMCA, Sousa SMB, Luizon MR. Functional prediction and frequency of coding variants in human ACE2 at binding sites with SARS-CoV-2 spike protein on different populations. J Med Virol 2020; 93:71-73. [PMID: 32492195 PMCID: PMC7300988 DOI: 10.1002/jmv.26126] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/23/2020] [Revised: 05/31/2020] [Accepted: 06/01/2020] [Indexed: 11/17/2022]
Affiliation(s)
- Juliana de O Cruz
- Genetics Graduate Program, Institute of Biological Sciences, Federal University of Minas Gerais (UFMG), Belo Horizonte, Minas Gerais, Brazil
| | - Izabela M C A Conceição
- Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais (UFMG), Belo Horizonte, Minas Gerais, Brazil
| | - Sandra Mara B Sousa
- Department of Natural Sciences, State University of Southwest Bahia (UESB), Vitória da Conquista, Bahia, Brazil
| | - Marcelo R Luizon
- Genetics Graduate Program, Institute of Biological Sciences, Federal University of Minas Gerais (UFMG), Belo Horizonte, Minas Gerais, Brazil
- Department of Genetics, Ecology and Evolution, Institute of Biological Sciences, Federal University of Minas Gerais (UFMG), Belo Horizonte, Minas Gerais, Brazil
| |
|
21
|
Combinatorial approach of in silico and in vitro evaluation of MLH1 variant associated with Lynch syndrome like metastatic colorectal cancer. Biosci Rep 2020; 40:224895. [PMID: 32432717 PMCID: PMC7269917 DOI: 10.1042/bsr20200225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2020] [Revised: 05/06/2020] [Accepted: 05/12/2020] [Indexed: 11/23/2022] Open
Abstract
Colorectal cancer (CRC) is the third most commonly developing cancer worldwide, and Lynch syndrome (LS) accounts for 3–4% of CRC. A genetic alteration in any of the DNA mismatch repair (MMR) genes is the major cause of LS and disrupts the normal upstream and downstream MMR events. Carriers of a heterozygous germline mutation of MLH1 have an increased risk of CRC. A defective MMR pathway mostly results in microsatellite instability (MSI), which occurs in a high percentage of CRC-associated tumors. Here, we report a patient with LS-like metastatic CRC (mCRC) associated with other related cancers. Whole exome sequencing (WES) of the proband was performed to identify the potential causative gene. Genetic screening validated by Sanger sequencing identified a heterozygous missense mutation in exon 12 of MLH1 (c.1151T>A, p.V384D). The clinical significance of the identified variant was elucidated on the basis of clinicopathological data, computational predictions and various in vitro functional analyses. In silico predictions classified the variant as deleterious and evolutionarily conserved. In vitro functional studies revealed a significant decrease in protein expression because of a stability defect, leading to loss of MMR activity. The mutant residue was found in the MutL transducer domain of MLH1, which localized in the nucleus, but translocation was not found to be significantly disturbed. In conclusion, our study gives insight into the reliability of a combinatorial prediction approach of in silico and in vitro expression analysis. Hence, we highlight the pathogenic correlation of the MLH1 variant with LS-associated CRC, which may help in earlier diagnosis and surveillance for improved management and genetic counselling.
|
22
|
Takeda JI, Nanatsue K, Yamagishi R, Ito M, Haga N, Hirata H, Ogi T, Ohno K. InMeRF: prediction of pathogenicity of missense variants by individual modeling for each amino acid substitution. NAR Genom Bioinform 2020; 2:lqaa038. [PMID: 33543123 PMCID: PMC7671370 DOI: 10.1093/nargab/lqaa038] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/14/2019] [Revised: 03/03/2020] [Accepted: 05/13/2020] [Indexed: 12/15/2022] Open
Abstract
In predicting the pathogenicity of a nonsynonymous single-nucleotide variant (nsSNV), a radical change in amino acid properties is prone to be classified as being pathogenic. However, not all such nsSNVs are associated with human diseases. We generated random forest (RF) models individually for each amino acid substitution to differentiate pathogenic nsSNVs in the Human Gene Mutation Database and common nsSNVs in dbSNP. We named the set of our models ‘Individual Meta RF’ (InMeRF). Ten-fold cross-validation of InMeRF showed that the areas under the curves (AUCs) of receiver operating characteristic (ROC) and precision–recall curves were on average 0.941 and 0.957, respectively. To compare InMeRF with seven other tools, the eight tools were generated using the same training dataset, and were compared using the same three testing datasets. The ROC-AUCs of InMeRF ranked first among the eight tools. We applied InMeRF to 155 pathogenic and 125 common nsSNVs in seven major genes causing congenital myasthenic syndromes, as well as in VANGL1 causing spina bifida, and found that the sensitivity and specificity of InMeRF were 0.942 and 0.848, respectively. We made the InMeRF web service, and also made genome-wide InMeRF scores available online (https://www.med.nagoya-u.ac.jp/neurogenetics/InMeRF/).
Affiliation(s)
- Jun-Ichi Takeda
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Kentaro Nanatsue
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Ryosuke Yamagishi
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Mikako Ito
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| | - Nobuhiko Haga
- Department of Rehabilitation Medicine, The University of Tokyo Hospital, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8655, Japan
| | - Hiromi Hirata
- Department of Chemistry and Biological Science, College of Science and Engineering, Aoyama Gakuin University, 5-10-1 Fuchinobe, Chuo-ku, Sagamihara 252-5258, Japan
| | - Tomoo Ogi
- Department of Genetics, Research Institute of Environmental Medicine (RIeM), Nagoya University, Furo, Chikusa-ku, Nagoya 464-8601, Japan
| | - Kinji Ohno
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan
| |
|
23
|
Vázquez-Moreno M, Zeng H, Locia-Morales D, Peralta-Romero J, Asif H, Maharaj A, Tam V, Romero-Figueroa MDS, Sosa-Bustamante GP, Méndez-Martínez S, Mejía-Benítez A, Valladares-Salgado A, Wacher-Rodarte N, Cruz M, Meyre D. The Melanocortin 4 Receptor p.Ile269Asn Mutation Is Associated with Childhood and Adult Obesity in Mexicans. J Clin Endocrinol Metab 2020; 105:5679482. [PMID: 31841602 DOI: 10.1210/clinem/dgz276] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/13/2019] [Accepted: 12/13/2019] [Indexed: 12/19/2022]
Abstract
CONTEXT: Rare partial/complete loss-of-function mutations in the melanocortin-4 receptor (MC4R) gene are the most common cause of Mendelian obesity in European populations, but their contribution to obesity in the Mexican population is unclear. OBJECTIVE AND DESIGN: We investigated whether deleterious mutations in MC4R contribute to obesity in Mexican children and adults. RESULTS: We provide evidence that the MC4R p.Ile269Asn (rs79783591) mutation may have arisen in modern human populations from a founder event in native Mexicans. The MC4R Isoleucine 269 is perfectly conserved across 184 species, which suggests a critical role for the amino acid in MC4R activity. Four in silico tools (SIFT, PolyPhen-2, CADD, MutPred2) predicted a deleterious impact of the p.Ile269Asn substitution on MC4R function. The MC4R p.Ile269Asn mutation was associated with childhood (Ncontrols = 952, Ncases = 661, odds ratio (OR) = 3.06, 95% confidence interval (95%CI) [1.94-4.85]) and adult obesity (Ncontrols = 1445, Ncases = 2487, OR = 2.58, 95%CI [1.52-4.39]). The frequency of the MC4R p.Ile269Asn mutation ranged from 0.52 to 0.59% and 1.53 to 1.59% in children and adults with normal weight and obesity, respectively. The MC4R p.Ile269Asn mutation co-segregated perfectly with obesity in 5 multigenerational Mexican pedigrees. While adults with obesity carrying the p.Ile269Asn mutation had higher BMI values than noncarriers, this trend was not observed in children. The MC4R p.Ile269Asn mutation accounted for a population attributable risk of 1.28% and 0.68% for childhood and adult obesity, respectively, in the Mexican population. CONCLUSION: The MC4R p.Ile269Asn mutation may have emerged as a founder mutation in native Mexicans and is associated with childhood and adult obesity in the modern Mexican population.
Affiliation(s)
- Miguel Vázquez-Moreno
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI del Instituto Mexicano del Seguro Social, Ciudad de México, México
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Helen Zeng
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Daniel Locia-Morales
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI del Instituto Mexicano del Seguro Social, Ciudad de México, México
| | - Jesús Peralta-Romero
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI del Instituto Mexicano del Seguro Social, Ciudad de México, México
| | - Hamza Asif
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Arjuna Maharaj
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Vivian Tam
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - María D S Romero-Figueroa
- Centro de Investigación en Ciencias de la Salud (CICSA), Facultad de Ciencias de la Salud, Universidad Anáhuac, Campus Norte, Huixquilucan, México
| | | | - Socorro Méndez-Martínez
- Coordinación de Investigación en Salud, Instituto Mexicano del Seguro Social Puebla, Puebla, México
| | - Aurora Mejía-Benítez
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
| | - Adan Valladares-Salgado
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI del Instituto Mexicano del Seguro Social, Ciudad de México, México
| | - Niels Wacher-Rodarte
- Unidad de Investigación en Epidemiología Clínica, Hospital de Especialidades Bernardo Sepúlveda, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Ciudad de México, México
| | | | - Miguel Cruz
- Unidad de Investigación Médica en Bioquímica, Hospital de Especialidades, Centro Médico Nacional Siglo XXI del Instituto Mexicano del Seguro Social, Ciudad de México, México
| | - David Meyre
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Canada
| |
|
24
|
Pal LR, Kundu K, Yin Y, Moult J. Matching whole genomes to rare genetic disorders: Identification of potential causative variants using phenotype-weighted knowledge in the CAGI SickKids5 clinical genomes challenge. Hum Mutat 2020; 41:347-362. [PMID: 31680375 PMCID: PMC7182498 DOI: 10.1002/humu.23933] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/09/2019] [Revised: 09/26/2019] [Accepted: 10/13/2019] [Indexed: 02/06/2023]
Abstract
Precise identification of causative variants from whole-genome sequencing data, including both coding and noncoding variants, is challenging. The Critical Assessment of Genome Interpretation 5 SickKids clinical genome challenge provided an opportunity to assess our ability to extract such information. Participants in the challenge were required to match each of the 24 whole-genome sequences to the correct phenotypic profile and to identify the disease class of each genome. These are all rare disease cases that have resisted genetic diagnosis in a state-of-the-art pipeline. The patients have a range of eye, neurological, and connective-tissue disorders. We used a gene-centric approach to address this problem, assigning each gene a multiphenotype-matching score. Mutations in the top-scoring genes for each phenotype profile were ranked on a 6-point scale of pathogenicity probability, resulting in an approximately equal number of top-ranked coding and noncoding candidate variants overall. We were able to assign the correct disease class for 12 cases and the correct genome to a clinical profile for five cases. The challenge assessor found genes in three of these five cases as likely appropriate. In the postsubmission phase, after careful screening of the genes in the correct genome, we identified additional potential diagnostic variants, a high proportion of which are noncoding.
Affiliation(s)
- Lipika R. Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Kunal Kundu
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, MD 20742, USA
| | - Yizhou Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
| |
|
25
|
Febres-Aldana CA, Alvarez Moreno JC, Rivera M, Kaplan S, Paramo J, Poppiti R. Understanding the histogenesis of a HRAS-PIK3R1 co-driven metastatic metaplastic breast carcinoma associated with squamous metaplasia of lactiferous ducts. Pathol Int 2019; 70:101-107. [PMID: 31867792 DOI: 10.1111/pin.12887] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/09/2019] [Accepted: 11/25/2019] [Indexed: 12/19/2022]
Abstract
Metaplastic breast carcinoma (MBC) represents a heterogeneous group of aggressive primary breast cancers that can show differentiation into carcinomatous and sarcomatous elements. Due to its rapid growth, this malignancy can replace precursor lesions, which remain unknown in most cases. Herein, we describe a MBC presenting as a deceptive post-biopsy hematoma. Histopathological and immunohistochemical evaluation of the primary tumor revealed a squamous cell carcinoma arising in a background of squamous metaplasia of lactiferous ducts (SMOLD). In the absence of ductal carcinoma in situ, we consider SMOLD as a nonobligatory precursor of MBC. The tumor showed 'dedifferentiation' into spindle, mucin-producing, osteoclast-like giant cell and fibromatosis-like carcinoma. Next-generation sequencing revealed the driver mutations HRAS Q61R and PIK3R1 c.1738_1745+2del in addition to MYH11 S638L and amplification of ERCC5 and FGF14, which were potential contributors to tumor phenotype. Tumor dedifferentiation was probably facilitated by epithelial-to-mesenchymal transition (EMT) with aberrant expression of platelet and endothelial adhesion molecule-1, leading to early metastasis via hematogenous route rather than lymphatic. The co-occurrence of phosphoinositide 3-kinase and mitogen-activated protein kinase pathway abnormalities along with EMT could mediate divergent growth in breast cancer.
Affiliation(s)
| | - Juan C Alvarez Moreno
- A.M. Rywlin, Department of Pathology, Mount Sinai Medical Center, Miami Beach, FL, USA
| | - Melissa Rivera
- Department of Radiology, Mount Sinai Medical Center, Miami Beach, FL, USA
| | - Stuart Kaplan
- Department of Radiology, Mount Sinai Medical Center, Miami Beach, FL, USA
| | - Juan Paramo
- Department of Surgery, Mount Sinai Medical Center, Miami Beach, FL, USA
| | - Robert Poppiti
- A.M. Rywlin, Department of Pathology, Mount Sinai Medical Center, Miami Beach, FL, USA
- Herbert Wertheim College of Medicine, Florida International University, Miami, FL, USA
| |
|
26
|
Tran AK, Pearce A, López-Sánchez M, Pérez-Jurado LA, Barnett C. Novel KIT mutation presenting as marked lentiginosis. Pediatr Dermatol 2019; 36:922-925. [PMID: 31497890 DOI: 10.1111/pde.13952] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/16/2022]
Abstract
Although lentigines are usually benign, they can be associated with a number of genetic syndromes in which neoplasms and other multi-system pathological processes occur. Here, we report the case of a 6-year-old girl who presented with atypical lentiginosis and hyperpigmentation caused by a de novo genetic variant in the KIT gene.
Affiliation(s)
- Alain K Tran
- Flinders Medical Centre, Adelaide, South Australia, Australia
| | - Annette Pearce
- Adelaide Dermatology Associates, Western Hospital, Henley Beach, South Australia, Australia
| | - Marcos López-Sánchez
- Hospital del Mar Research Institute (IMIM), Network Centre for Biomedical Research in Rare Diseases (CIBERER) and Universitat Pompeu Fabra, Barcelona, Spain
| | - Luis A Pérez-Jurado
- Hospital del Mar Research Institute (IMIM), Network Centre for Biomedical Research in Rare Diseases (CIBERER) and Universitat Pompeu Fabra, Barcelona, Spain
- South Australian Health and Medical Research Institute, University of Adelaide, Adelaide, South Australia, Australia
- Paediatric and Reproductive Genetic Unit, Women's and Children's Hospital, North Adelaide, South Australia, Australia
| | - Christopher Barnett
- Paediatric and Reproductive Genetic Unit, Women's and Children's Hospital, North Adelaide, South Australia, Australia
| |
|
27
|
Pejaver V, Babbi G, Casadio R, Folkman L, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Miller M, Moult J, Pal LR, Savojardo C, Yin Y, Zhou Y, Radivojac P, Bromberg Y. Assessment of methods for predicting the effects of PTEN and TPMT protein variants. Hum Mutat 2019; 40:1495-1506. [PMID: 31184403 PMCID: PMC6744362 DOI: 10.1002/humu.23838] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/20/2019] [Revised: 05/27/2019] [Accepted: 06/06/2019] [Indexed: 01/16/2023]
Abstract
Thermodynamic stability is a fundamental property shared by all proteins. Changes in stability due to mutation are a widespread molecular mechanism in genetic diseases. Methods for the prediction of mutation-induced stability change have typically been developed and evaluated on incomplete and/or biased data sets. As part of the Critical Assessment of Genome Interpretation, we explored the utility of high-throughput variant stability profiling (VSP) assay data as an alternative for the assessment of computational methods and evaluated state-of-the-art predictors against over 7,000 nonsynonymous variants from two proteins. We found that predictions were modestly correlated with actual experimental values. Predictors fared better when evaluated as classifiers of extreme stability effects. While different methods emerged as top performers depending on the metric, it is nontrivial to draw conclusions on their adoption or improvement. Our analyses revealed that only 16% of all variants in VSP assays could be confidently defined as stability-affecting. Furthermore, it is unclear to what extent VSP abundance scores were reasonable proxies for the stability-related quantities that participating methods were designed to predict. Overall, our observations underscore the need for clearly defined objectives when developing and using both computational and experimental methods in the context of measuring variant impact.
Affiliation(s)
- Vikas Pejaver
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington
- The eScience Institute, University of Washington, Seattle, Washington
| | - Giulia Babbi
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Rita Casadio
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Lukas Folkman
- School of Information and Communication Technology, Griffith University, Southport, Australia
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
| | - Kunal Kundu
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas
- Department of Pharmacology, Baylor College of Medicine, Houston, Texas
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
| | - Pier Luigi Martelli
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Maximilian Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Lipika R Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
| | - Castrense Savojardo
- Department of Pharmacy and Biotechnology, Biocomputing Group, University of Bologna, Bologna, Italy
| | - Yizhou Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
| | - Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Australia
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, Massachusetts
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey
- Department of Genetics, Human Genetics Institute, Rutgers University, Piscataway, New Jersey
- Institute for Advanced Study at Technische Universität München (TUM-IAS), Garching/Munich, Germany
| |
|
28
|
Li C, Liu T, Liu B, Hernandez R, Facelli JC, Grossman D. A novel CDKN2A variant (p16 L117P) in a patient with familial and multiple primary melanomas. Pigment Cell Melanoma Res 2019; 32:734-738. [PMID: 31001908 PMCID: PMC6751567 DOI: 10.1111/pcmr.12787] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/14/2018] [Revised: 03/18/2019] [Accepted: 04/09/2019] [Indexed: 12/29/2022]
Abstract
Germline mutations in CDKN2A (p16) are commonly found in patients with a family history of melanoma or a personal history of multiple primary melanomas. The p16 tumor suppressor gene regulates cell cycle progression and senescence through binding of cyclin-dependent kinases (CDK) and also regulates cellular oxidative stress independently of cell cycle control. We identified a germline missense (c.350T>C, p.Leu117Pro) CDKN2A mutation in a patient who had a history of four primary melanomas, numerous nevi, and a self-reported family history of melanoma. This particular CDKN2A mutation has not been reported in prior large studies of melanoma kindreds or patients with multiple primary melanomas. Compared with wild-type p16, the p16L117P mutant largely retained binding capacity for CDK4 and CDK6 but exhibited impaired capacity for repressing cell cycle progression and inducing senescence, while retaining its ability to reduce mitochondrial reactive oxygen species. Structural modeling predicted that the Leu117Pro mutation disrupts a putative adenosine monophosphate (AMP) binding pocket involving residue 117 in the fourth ankyrin domain. Identification of this new likely pathogenic variant extends our understanding of CDKN2A in melanoma susceptibility and implicates AMP as a potential regulator of p16.
Affiliation(s)
- Christopher Li
- Huntsman Cancer Institute, University of Utah Health Sciences Center, Salt Lake City, Utah
| | - Tong Liu
- Huntsman Cancer Institute, University of Utah Health Sciences Center, Salt Lake City, Utah
| | - Bin Liu
- Huntsman Cancer Institute, University of Utah Health Sciences Center, Salt Lake City, Utah
| | - Rolando Hernandez
- Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah
| | - Julio C. Facelli
- Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah
| | - Douglas Grossman
- Huntsman Cancer Institute, University of Utah Health Sciences Center, Salt Lake City, Utah
- Department of Dermatology, University of Utah Health Sciences Center, Salt Lake City, Utah
- Department of Oncological Sciences, University of Utah, Salt Lake City, Utah
| |
|
29
|
Cao Y, Sun Y, Karimi M, Chen H, Moronfoye O, Shen Y. Predicting pathogenicity of missense variants with weakly supervised regression. Hum Mutat 2019; 40:1579-1592. [PMID: 31144781 PMCID: PMC6744350 DOI: 10.1002/humu.23826] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/11/2019] [Revised: 05/23/2019] [Accepted: 05/27/2019] [Indexed: 12/27/2022]
Abstract
Quickly growing genetic variation data of unknown clinical significance demand computational methods that can reliably predict clinical phenotypes and deeply unravel molecular mechanisms. On the platform enabled by the Critical Assessment of Genome Interpretation (CAGI), we develop a novel "weakly supervised" regression (WSR) model that not only predicts precise clinical significance (probability of pathogenicity) from inexact training annotations (class of pathogenicity) but also infers underlying molecular mechanisms in a variant-specific manner. Compared to multiclass logistic regression, a representative multiclass classifier, our kernelized WSR improves the performance for the ENIGMA Challenge set from 0.72 to 0.97 in binary area under the receiver operating characteristic curve (AUC) and from 0.64 to 0.80 in ordinal multiclass AUC. WSR model interpretation and protein structural interpretation reach consensus in corroborating the most probable molecular mechanisms by which some pathogenic BRCA1 variants confer clinical significance, namely metal-binding disruption for p.C44F and p.C47Y, protein-binding disruption for p.M18T, and structure destabilization for p.S1715N.
Affiliation(s)
- Yue Cao
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| | - Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| | - Mostafa Karimi
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| | - Haoran Chen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| | - Oluwaseyi Moronfoye
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, 77843-3128, United States
| |
|
30
|
Kasak L, Bakolitsa C, Hu Z, Yu C, Rine J, Dimster-Denk DF, Pandey G, Baets GD, Bromberg Y, Cao C, Capriotti E, Casadio R, Durme JV, Giollo M, Karchin R, Katsonis P, Leonardi E, Lichtarge O, Martelli PL, Masica D, Mooney SD, Olatubosun A, Pal LR, Radivojac P, Rousseau F, Savojardo C, Schymkowitz J, Thusberg J, Tosatto SC, Vihinen M, Väliaho J, Repo S, Moult J, Brenner SE, Friedberg I. Assessing computational predictions of the phenotypic effect of cystathionine-beta-synthase variants. Hum Mutat 2019; 40:1530-1545. [PMID: 31301157 PMCID: PMC7325732 DOI: 10.1002/humu.23868] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/11/2019] [Revised: 06/22/2019] [Accepted: 07/09/2019] [Indexed: 12/28/2022]
Abstract
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.
Affiliation(s)
- Laura Kasak
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Institute of Biomedicine and Translational Medicine, University of Tartu, Tartu, Estonia
| | - Constantina Bakolitsa
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - Changhua Yu
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - Jasper Rine
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA, USA
| | - Dago F. Dimster-Denk
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA, USA
| | - Gaurav Pandey
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - Greet De Baets
- Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium
- Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, USA
| | - Chen Cao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, USA
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, MD, USA
| | - Emidio Capriotti
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - Rita Casadio
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Joost Van Durme
- Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium
- Vrije Universiteit Brussel, Brussels, Belgium
| | - Manuel Giollo
- Department of Biomedical Sciences, University of Padua, Padua, Italy
| | - Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | | | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
| | - Pier Luigi Martelli
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - David Masica
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, MD, USA
| | | | - Ayodeji Olatubosun
- Institute of Medical Technology, University of Tampere, Tampere, Finland
| | - Lipika R. Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, MD, USA
| | - Predrag Radivojac
- School of Informatics and Computing, Indiana University, Bloomington, IN, USA
| | - Frederic Rousseau
- Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium
- Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - Castrense Savojardo
- Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Joost Schymkowitz
- Switch Laboratory, VIB Center for Brain and Disease Research, Leuven, Belgium
- Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | | | | | - Mauno Vihinen
- Institute of Medical Technology, University of Tampere, Tampere, Finland
| | - Jouni Väliaho
- Institute of Medical Technology, University of Tampere, Tampere, Finland
| | - Susanna Repo
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - John Moult
- Department of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| | - Steven E. Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
| | - Iddo Friedberg
- Department of Microbiology, Miami University, Oxford, OH, USA
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA, USA
| |
|
31
|
Voskanian A, Katsonis P, Lichtarge O, Pejaver V, Radivojac P, Mooney SD, Capriotti E, Bromberg Y, Wang Y, Miller M, Martelli PL, Savojardo C, Babbi G, Casadio R, Cao Y, Sun Y, Shen Y, Garg A, Pal D, Yu Y, Huff CD, Tavtigian SV, Young E, Neuhausen SL, Ziv E, Pal LR, Andreoletti G, Brenner S, Kann MG. Assessing the performance of in silico methods for predicting the pathogenicity of variants in the gene CHEK2, among Hispanic females with breast cancer. Hum Mutat 2019; 40:1612-1622. [PMID: 31241222 PMCID: PMC6744287 DOI: 10.1002/humu.23849] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/25/2019] [Revised: 05/23/2019] [Accepted: 06/21/2019] [Indexed: 01/22/2023]
Abstract
The availability of disease-specific genomic data is critical for developing new computational methods that predict the pathogenicity of human variants and advance the field of precision medicine. However, the lack of gold standards to properly train and benchmark such methods is one of the greatest challenges in the field. In response to this challenge, the scientific community is invited to participate in the Critical Assessment of Genome Interpretation (CAGI), where unpublished disease variants are available for classification by in silico methods. As part of the CAGI-5 challenge, we evaluated the performance of 18 submissions and three additional methods in predicting the pathogenicity of single nucleotide variants (SNVs) in checkpoint kinase 2 (CHEK2) for cases of breast cancer in Hispanic females. As part of the assessment, the efficacy of the analysis method and the setup of the challenge were also considered. The results indicated that, although the challenge could benefit from additional participant data, the combined generalized linear model analysis and odds of pathogenicity analysis provided a framework to evaluate the methods submitted for SNV pathogenicity identification and for comparison to other available methods. The outcome of this challenge and the approaches used can help guide further advancements in identifying SNV-disease relationships.
Affiliation(s)
- Alin Voskanian
- Department of Biological Sciences, University of Maryland, Baltimore County, MD, U.S.A
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, U.S.A
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, U.S.A
- Department of Pharmacology, Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Vikas Pejaver
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, U.S.A
- The eScience Institute, University of Washington, Seattle, Washington, U.S.A
| | - Predrag Radivojac
- Khoury College of Computer and Information Sciences, Northeastern University, Boston, Massachusetts, U.S.A
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, U.S.A
| | - Emidio Capriotti
- BioFolD Unit, Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via Selmi 3, 40126 Bologna, Italy
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey, U.S.A
- Department of Genetics, Rutgers University, New Brunswick, New Jersey, U.S.A
- Technical University of Munich Institute for Advanced Study, (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany
| | - Yanran Wang
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey, U.S.A
| | - Max Miller
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, New Jersey, U.S.A
| | - Pier Luigi Martelli
- Biocomputing Group, BiGeA/Giorgio Prodi Interdepartmental Center for Cancer Research, University of Bologna, Via F. Selmi 3, Bologna, 40126, Italy
| | - Castrense Savojardo
- Biocomputing Group, BiGeA/Giorgio Prodi Interdepartmental Center for Cancer Research, University of Bologna, Via F. Selmi 3, Bologna, 40126, Italy
| | - Giulia Babbi
- Biocomputing Group, BiGeA/Giorgio Prodi Interdepartmental Center for Cancer Research, University of Bologna, Via F. Selmi 3, Bologna, 40126, Italy
| | - Rita Casadio
- Biocomputing Group, BiGeA/Giorgio Prodi Interdepartmental Center for Cancer Research, University of Bologna, Via F. Selmi 3, Bologna, 40126, Italy
| | - Yue Cao
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843-3128, U.S.A
| | - Yuanfei Sun
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843-3128, U.S.A
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843-3128, U.S.A
| | - Aditi Garg
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru 560 012, India
| | - Debnath Pal
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru 560 012, India
| | - Yao Yu
- Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, U.S.A
| | - Chad D. Huff
- Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX 77030, U.S.A
| | - Sean V. Tavtigian
- Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT 84132, U.S.A
| | - Erin Young
- Huntsman Cancer Institute, University of Utah School of Medicine, Salt Lake City, UT 84132, U.S.A
| | - Susan L. Neuhausen
- Department of Population Sciences, Beckman Research Institute of City of Hope, Duarte, CA, 91010 U.S.A
| | - Elad Ziv
- Division of General Internal Medicine, Department of Medicine, Institute of Human Genetics, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, U.S.A
| | - Lipika R. Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA
| | - Gaia Andreoletti
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
| | - Steven Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
| | - Maricel G. Kann
- Department of Biological Sciences, University of Maryland, Baltimore County, MD, U.S.A
| |
|
32
|
Adhikari AN. Gene-specific features enhance interpretation of mutational impact on acid α-glucosidase enzyme activity. Hum Mutat 2019; 40:1507-1518. [PMID: 31228295 DOI: 10.1002/humu.23846] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/14/2019] [Revised: 05/21/2019] [Accepted: 06/17/2019] [Indexed: 01/30/2023]
Abstract
We present a computational model for predicting mutational impact on enzymatic activity of human acid α-glucosidase (GAA), an enzyme associated with Pompe disease. Using a model that combines features specific to GAA with other general evolutionary and physicochemical features, we made blind predictions of enzymatic activity relative to wildtype human GAA for >300 GAA mutants, as part of the Critical Assessment of Genome Interpretation 5 GAA challenge. We found that gene-specific features can improve the performance of existing impact prediction tools that mostly rely on general features for pathogenicity prediction. The majority of the poorly predicted mutants that lower wildtype GAA enzyme activity occurred on the surface of the GAA protein. We also found that gene-specific features were uncorrelated with existing methods and provided orthogonal information for interpreting the origin of pathogenicity, particularly in variants that are poorly predicted by existing general methods. Specific variants in GAA, when investigated in the context of its protein structure, suggested gene-specific information, such as the disruption of local backbone torsional geometry and of particular sidechain-sidechain hydrogen bonds, as potential sources of pathogenicity.
Affiliation(s)
- Aashish N Adhikari
- Department of Plant and Microbial Biology, University of California, Berkeley, California
| |
|
33
|
Quartier A, Courraud J, Thi Ha T, McGillivray G, Isidor B, Rose K, Drouot N, Savidan MA, Feger C, Jagline H, Chelly J, Shaw M, Laumonnier F, Gecz J, Mandel JL, Piton A. Novel mutations in NLGN3 causing autism spectrum disorder and cognitive impairment. Hum Mutat 2019; 40:2021-2032. [PMID: 31184401 DOI: 10.1002/humu.23836] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 01/08/2019] [Revised: 05/10/2019] [Accepted: 06/05/2019] [Indexed: 12/22/2022]
Abstract
The X-linked NLGN3 gene, encoding a postsynaptic cell adhesion molecule, was implicated in a nonsyndromic monogenic form of autism spectrum disorder (ASD) through the description of a single missense variant, p.Arg451Cys (Jamain et al. 2003). Here, we investigated the pathogenicity of additional missense variants identified in two multiplex families with intellectual disability (ID) and ASD: c.1789C>T, p.Arg597Trp, previously reported by our group (Redin et al. 2014) and present in three affected cousins, and c.1540C>T, p.Pro514Ser, identified in two affected brothers. Overexpression experiments in HEK293 and HeLa cell lines revealed that both variants affect the level of the mature NLGN3 protein, its localization at the plasma membrane and its presence as a cleaved form in the extracellular environment, even more drastically than what was reported for the initial p.Arg451Cys mutation. The variants also induced an unfolded protein response, probably due to the retention of immature NLGN3 proteins in the endoplasmic reticulum. In comparison, the c.1894A>G, p.Ala632Thr and c.1022T>C, p.Val341Ala variants, present in males from the general population, have no effect. Our report of two missense variants affecting the normal localization of NLGN3 in a total of five affected individuals reinforces the involvement of the NLGN3 gene in a neurodevelopmental disorder characterized by ID and ASD.
Affiliation(s)
- Angélique Quartier
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France
| | - Jérémie Courraud
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France
| | - Thuong Thi Ha
- School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia; Adelaide Medical School and Robinson Research Institute, University of Adelaide, Adelaide, South Australia, Australia
| | - George McGillivray
- Victorian Clinical Genetics Services, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Victoria, Australia
| | - Bertrand Isidor
- Service de Génétique Médicale, CHU de Nantes, Nantes, France
| | - Katherine Rose
- Monash Genetics, Monash Health, Clayton, Victoria, Australia
| | - Nathalie Drouot
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France
| | - Marie-Armel Savidan
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France
| | - Claire Feger
- Molecular Genetic Unit, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Hélène Jagline
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France
| | - Jamel Chelly
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France; Molecular Genetic Unit, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Marie Shaw
- School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia; Adelaide Medical School and Robinson Research Institute, University of Adelaide, Adelaide, South Australia, Australia
| | - Frédéric Laumonnier
- UMR 1253, iBrain, Université de Tours, Inserm, Tours, France; Service de Génétique, Centre Hospitalier Universitaire de Tours, Tours, France
| | - Jozef Gecz
- School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia; Adelaide Medical School and Robinson Research Institute, University of Adelaide, Adelaide, South Australia, Australia
| | - Jean-Louis Mandel
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France; University of Strasbourg Institute of Advanced Studies, Strasbourg, France
| | - Amélie Piton
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France; Centre National de la Recherche Scientifique, Illkirch, France; Institut National de la Santé et de la Recherche Médicale, Illkirch, France; Université de Strasbourg, Illkirch, France; Molecular Genetic Unit, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| |
|
34
|
Xu J, Yang P, Xue S, Sharma B, Sanchez-Martin M, Wang F, Beaty KA, Dehan E, Parikh B. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet 2019; 138:109-124. [PMID: 30671672 PMCID: PMC6373233 DOI: 10.1007/s00439-019-01970-5] [Citation(s) in RCA: 95] [Impact Index Per Article: 19.0] [Received: 09/27/2018] [Accepted: 01/02/2019] [Indexed: 02/07/2023]
Abstract
In the field of cancer genomics, the broad availability of genetic information offered by next-generation sequencing technologies and rapid growth in biomedical publication have led to the advent of the big-data era. Integration of artificial intelligence (AI) approaches such as machine learning, deep learning, and natural language processing (NLP) to tackle the challenges of scalability and high dimensionality of data and to transform big data into clinically actionable knowledge is expanding and becoming the foundation of precision medicine. In this paper, we review the current status and future directions of AI application in cancer genomics within the context of workflows to integrate genomic analysis for precision cancer care. The existing solutions of AI and their limitations in cancer genetic testing and diagnostics such as variant calling and interpretation are critically analyzed. Publicly available tools or algorithms for key NLP technologies in the literature mining for evidence-based clinical recommendations are reviewed and compared. In addition, the present paper highlights the challenges to AI adoption in digital healthcare with regard to data requirements, algorithmic transparency, reproducibility, and real-world assessment, and discusses the importance of preparing patients and physicians for modern digitized healthcare. We believe that AI will remain the main driver of healthcare transformation toward precision medicine, yet the unprecedented challenges posed should be addressed to ensure safety and a beneficial impact on healthcare.
Affiliation(s)
- Jia Xu
- IBM Watson Health, Cambridge, MA, USA
| | | | - Shang Xue
- IBM Watson Health, Cambridge, MA, USA
| | | | | | - Fang Wang
- IBM Watson Health, Cambridge, MA, USA
| | | | | | | |
|
35
|
Hodges AM, Fenton AW, Dougherty LL, Overholt AC, Swint-Kruse L. RheoScale: A tool to aggregate and quantify experimentally determined substitution outcomes for multiple variants at individual protein positions. Hum Mutat 2018; 39:1814-1826. [PMID: 30117637 DOI: 10.1002/humu.23616] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/30/2018] [Revised: 07/31/2018] [Accepted: 08/13/2018] [Indexed: 12/25/2022]
Abstract
Human mutations often cause amino acid changes (variants) that can alter protein function or stability. Some variants fall at protein positions that experimentally exhibit "rheostatic" mutation outcomes (different amino acid substitutions lead to a range of functional outcomes). In ongoing studies of rheostat positions, we encountered the need to aggregate experimental results from multiple variants to describe the overall roles of individual positions. Here, we present "RheoScale", which generates quantitative scores to discriminate rheostat positions from those with "toggle" (most substitutions abolish function) or "neutral" (most substitutions have wild-type function) outcomes. RheoScale scores facilitate correlations of experimental data (such as binding affinity or stability) with structural and bioinformatic analyses. The RheoScale calculator is encoded into a Microsoft Excel workbook and an R script. Example analyses are shown for three model protein systems, including one assessed via deep mutational scanning. The RheoScale calculator quickly and efficiently provided quantitative descriptions that were in good agreement with prior qualitative observations. As an example application, scores were compared to the example proteins' structures; strong rheostat positions tended to occur in dynamic locations. In the future, RheoScale scores can be easily integrated into computational studies to facilitate improved algorithms for predicting outcomes of human variants.
Affiliation(s)
- Abby M Hodges
- Department of Natural, Health, and Mathematical Sciences, MidAmerica Nazarene University, Olathe, Kansas, USA
- Aron W Fenton
- The Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Larissa L Dougherty
- The Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
- Andrew C Overholt
- Department of Natural, Health, and Mathematical Sciences, MidAmerica Nazarene University, Olathe, Kansas, USA
- Liskin Swint-Kruse
- The Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
|
36
|
Oliveira J, Gruber A, Cardoso M, Taipa R, Fineza I, Gonçalves A, Laner A, Winder TL, Schroeder J, Rath J, Oliveira ME, Vieira E, Sousa AP, Vieira JP, Lourenço T, Almendra L, Negrão L, Santos M, Melo-Pires M, Coelho T, den Dunnen JT, Santos R, Sousa M. LAMA2 gene mutation update: Toward a more comprehensive picture of the laminin-α2 variome and its related phenotypes. Hum Mutat 2018; 39:1314-1337. [PMID: 30055037 DOI: 10.1002/humu.23599] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Received: 03/03/2018] [Revised: 07/05/2018] [Accepted: 07/25/2018] [Indexed: 12/15/2022]
Abstract
Congenital muscular dystrophy type 1A (MDC1A) is one of the main subtypes of early-onset muscle disease, caused by disease-associated variants in the laminin-α2 (LAMA2) gene. MDC1A usually presents as a severe neonatal hypotonia and failure to thrive. Muscle weakness compromises normal motor development, leading to the inability to sit unsupported or to walk independently. The phenotype associated with LAMA2 defects has been expanded to include milder and atypical cases, being now collectively known as LAMA2-related muscular dystrophies (LAMA2-MD). Through an international multicenter collaborative effort, 61 new LAMA2 disease-associated variants were identified in 86 patients, representing the largest number of patients and new disease-causing variants in a single report. The collaborative variant collection was supported by the LOVD-powered LAMA2 gene variant database (https://www.LOVD.nl/LAMA2), updated as part of this work. As of December 2017, the database contains 486 unique LAMA2 variants (309 disease-associated), obtained from direct submissions and literature reports. Database content was systematically reviewed and further insights concerning LAMA2-MD are presented. We focus on the impact of missense changes, especially the c.2461A>C (p.Thr821Pro) variant and its association with late-onset LAMA2-MD. Finally, we report diagnostically challenging cases, highlighting the relevance of modern genetic analysis in the characterization of clinically heterogeneous muscle diseases.
Affiliation(s)
- Jorge Oliveira
- Unidade de Genética Molecular, Centro de Genética Médica Dr. Jacinto Magalhães, Centro Hospitalar do Porto, Porto, Portugal; Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal
- Márcio Cardoso
- Consulta de Doenças Neuromusculares e Serviço de Neurofisiologia, Departamento de Neurociências, Centro Hospitalar do Porto, Porto, Portugal
- Ricardo Taipa
- Unidade de Neuropatologia, Centro Hospitalar do Porto, Porto, Portugal
- Isabel Fineza
- Unidade de Neuropediatria, Centro de Desenvolvimento da Criança Luís Borges, Hospital Pediátrico de Coimbra, Centro Hospitalar Universitário de Coimbra, Coimbra, Portugal
- Ana Gonçalves
- Unidade de Genética Molecular, Centro de Genética Médica Dr. Jacinto Magalhães, Centro Hospitalar do Porto, Porto, Portugal; Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal
- Julie Rath
- PreventionGenetics, Marshfield, Wisconsin
- Márcia E Oliveira
- Unidade de Genética Molecular, Centro de Genética Médica Dr. Jacinto Magalhães, Centro Hospitalar do Porto, Porto, Portugal; Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal
- Emília Vieira
- Unidade de Genética Molecular, Centro de Genética Médica Dr. Jacinto Magalhães, Centro Hospitalar do Porto, Porto, Portugal; Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal
- Ana Paula Sousa
- Consulta de Doenças Neuromusculares e Serviço de Neurofisiologia, Departamento de Neurociências, Centro Hospitalar do Porto, Porto, Portugal
- José Pedro Vieira
- Serviço de Neurologia, Hospital de Dona Estefânia, Centro Hospitalar de Lisboa Central, Lisboa, Portugal
- Teresa Lourenço
- Serviço de Genética Médica, Hospital de Dona Estefânia, Centro Hospitalar de Lisboa Central, Lisboa, Portugal
- Luciano Almendra
- Consulta de Doenças Neuromusculares, Hospitais da Universidade de Coimbra, Centro Hospitalar Universitário de Coimbra, Coimbra, Portugal
- Luís Negrão
- Consulta de Doenças Neuromusculares, Hospitais da Universidade de Coimbra, Centro Hospitalar Universitário de Coimbra, Coimbra, Portugal
- Manuela Santos
- Consulta de Doenças Neuromusculares e Serviço de Neuropediatria, Centro Hospitalar do Porto, Porto, Portugal
- Manuel Melo-Pires
- Unidade de Neuropatologia, Centro Hospitalar do Porto, Porto, Portugal
- Teresa Coelho
- Consulta de Doenças Neuromusculares e Serviço de Neurofisiologia, Departamento de Neurociências, Centro Hospitalar do Porto, Porto, Portugal
- Johan T den Dunnen
- Departments of Human Genetics and Clinical Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Rosário Santos
- Unidade de Genética Molecular, Centro de Genética Médica Dr. Jacinto Magalhães, Centro Hospitalar do Porto, Porto, Portugal; Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal; UCIBIO/REQUIMTE, Departamento de Ciências Biológicas, Laboratório de Bioquímica, Faculdade de Farmácia, Universidade do Porto, Porto, Portugal
- Mário Sousa
- Unidade Multidisciplinar de Investigação Biomédica (UMIB), Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal; Departamento de Microscopia, Laboratório de Biologia Celular, Instituto de Ciências Biomédicas Abel Salazar (ICBAS), Universidade do Porto, Porto, Portugal; Centro de Genética da Reprodução Prof. Alberto Barros, Porto, Portugal
|
37
|
Hoskins RA, Repo S, Barsky D, Andreoletti G, Moult J, Brenner SE. Reports from CAGI: The Critical Assessment of Genome Interpretation. Hum Mutat 2017; 38:1039-1041. [PMID: 28817245 PMCID: PMC5606199 DOI: 10.1002/humu.23290] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/07/2017] [Accepted: 07/08/2017] [Indexed: 12/20/2022]
Affiliation(s)
- Roger A Hoskins
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Susanna Repo
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Daniel Barsky
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- Gaia Andreoletti
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
- John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742
- Steven E Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA
|