Reference Citation Analysis

For: Livesey BJ, Marsh JA. Updated benchmarking of variant effect predictors using deep mutational scanning. Mol Syst Biol 2023;19:e11474. [PMID: 37310135 PMCID: PMC10407742 DOI: 10.15252/msb.202211474] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/23/2022] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/14/2023] Open

Cited by Other Article(s)

Harding-Larsen D, Funk J, Madsen NG, Gharabli H, Acevedo-Rocha CG, Mazurenko S, Welner DH. Protein representations: Encoding biological information for machine learning in biocatalysis. Biotechnol Adv 2024;77:108459. [PMID: 39366493 DOI: 10.1016/j.biotechadv.2024.108459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/19/2024] [Accepted: 09/29/2024] [Indexed: 10/06/2024]

Abstract

Enzymes offer a more environmentally friendly and low-impact solution to conventional chemistry, but they often require additional engineering for their application in industrial settings, an endeavour that is challenging and laborious. To address this issue, the power of machine learning can be harnessed to produce predictive models that enable the in silico study and engineering of improved enzymatic properties. Such machine learning models, however, require the conversion of the complex biological information to a numerical input, also called protein representations. These inputs demand special attention to ensure the training of accurate and precise models, and, in this review, we therefore examine the critical step of encoding protein information to numeric representations for use in machine learning. We selected the most important approaches for encoding the three distinct biological protein representations - primary sequence, 3D structure, and dynamics - to explore their requirements for employment and inductive biases. Combined representations of proteins and substrates are also introduced as emergent tools in biocatalysis. We propose the division of fixed representations, a collection of rule-based encoding strategies, and learned representations extracted from the latent spaces of large neural networks. To select the most suitable protein representation, we propose two main factors to consider. The first one is the model setup, which is influenced by the size of the training dataset and the choice of architecture. The second factor is the model objectives such as consideration about the assayed property, the difference between wild-type models and mutant predictors, and requirements for explainability. This review is aimed at serving as a source of information and guidance for properly representing enzymes in future machine learning models for biocatalysis.

Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024:10.1038/s41586-024-07966-0. [PMID: 39322666 DOI: 10.1038/s41586-024-07966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]

Underwood M, Bidlack C, Desch KC. Venous thromboembolic disease genetics: from variants to function. J Thromb Haemost 2024;22:2393-2403. [PMID: 38908832 DOI: 10.1016/j.jtha.2024.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/05/2024] [Accepted: 06/06/2024] [Indexed: 06/24/2024]

Badonyi M, Marsh JA. Proteome-scale prediction of molecular mechanisms underlying dominant genetic diseases. PLoS One 2024;19:e0307312. [PMID: 39172982 PMCID: PMC11341024 DOI: 10.1371/journal.pone.0307312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 06/26/2024] [Indexed: 08/24/2024] Open

Guclu TF, Atilgan AR, Atilgan C. Deciphering GB1's Single Mutational Landscape: Insights from MuMi Analysis. J Phys Chem B 2024;128:7987-7996. [PMID: 39115184 DOI: 10.1021/acs.jpcb.4c04916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]

David C, Arango-Franco CA, Badonyi M, Fouchet J, Rice GI, Didry-Barca B, Maisonneuve L, Seabra L, Kechiche R, Masson C, Cobat A, Abel L, Talouarn E, Béziat V, Deswarte C, Livingstone K, Paul C, Malik G, Ross A, Adam J, Walsh J, Kumar S, Bonnet D, Bodemer C, Bader-Meunier B, Marsh JA, Casanova JL, Crow YJ, Manoury B, Frémond ML, Bohlen J, Lepelley A. Gain-of-function human UNC93B1 variants cause systemic lupus erythematosus and chilblain lupus. J Exp Med 2024;221:e20232066. [PMID: 38869500 PMCID: PMC11176256 DOI: 10.1084/jem.20232066] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/10/2023] [Revised: 03/29/2024] [Accepted: 05/15/2024] [Indexed: 06/14/2024] Open

Affiliation(s)

Clémence David Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France
Carlos A. Arango-Franco Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France Department of Microbiology and Parasitology, Group of Primary Immunodeficiencies, School of Medicine, University of Antioquia, Medellín, Colombia
Mihaly Badonyi MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Julien Fouchet Faculté de Médecine Necker, Institut Necker Enfants Malades, INSERM U1151-CNRS UMR 8253, Université Paris Cité, Paris, France
Gillian I. Rice Faculty of Biology, Medicine and Health, Division of Evolution and Genomic Sciences, School of Biological Sciences, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
Blaise Didry-Barca Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France
Lucie Maisonneuve Faculté de Médecine Necker, Institut Necker Enfants Malades, INSERM U1151-CNRS UMR 8253, Université Paris Cité, Paris, France
Luis Seabra Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France
Robin Kechiche Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France Department of Paediatric Hematology-Immunology and Rheumatology, Necker-Enfants Malades Hospital, Assistance publique–hôpitaux de Paris (AP-HP), Paris, France
Cécile Masson Bioinformatics Core Facility, Université Paris Cité-Structure Fédérative de Recherche Necker, INSERM US24/CNRS UMS3633, Paris, France
Aurélie Cobat Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY, USA Imagine Institute, Université Paris Cité, Paris, France
Laurent Abel Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY, USA Imagine Institute, Université Paris Cité, Paris, France
Estelle Talouarn Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France Imagine Institute, Université Paris Cité, Paris, France
Vivien Béziat Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY, USA Imagine Institute, Université Paris Cité, Paris, France
Caroline Deswarte Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France Imagine Institute, Université Paris Cité, Paris, France
Katie Livingstone MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Carle Paul Université Toulouse Paul Sabatier, Toulouse, France
Gulshan Malik Paediatric Rheumatology, Royal Aberdeen Children’s Hospital, Aberdeen, UK
Alison Ross Paediatric Rheumatology, Royal Aberdeen Children’s Hospital, Aberdeen, UK
Jane Adam Paediatric Rheumatology, Royal Aberdeen Children’s Hospital, Aberdeen, UK
Jo Walsh Department of Paediatric Rheumatology, Royal Hospital for Children, Glasgow, UK
Sathish Kumar Department of Pediatrics, Pediatric Rheumatology, Christian Medical College, Vellore, India
Damien Bonnet Medical and Surgical Unit of Congenital and Paediatric Cardiology, Reference Centre for Complex Congenital Heart Defects—M3C, University Hospital Necker-Enfants Malades, Paris, France Université Paris Cité, Paris, France
Christine Bodemer Department of Dermatology, Hospital Necker-Enfants Malades, AP-HP. Université Paris Cité, Paris, France
Brigitte Bader-Meunier Department of Paediatric Hematology-Immunology and Rheumatology, Necker-Enfants Malades Hospital, Assistance publique–hôpitaux de Paris (AP-HP), Paris, France Centre for Inflammatory Rheumatism, AutoImmune Diseases and Systemic Interferonopathies in Children (RAISE), Paris, France
Joseph A. Marsh MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Jean-Laurent Casanova Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY, USA Imagine Institute, Université Paris Cité, Paris, France Howard Hughes Medical Institute, New York, NY, USA Department of Pediatrics, Necker Hospital for Sick Children, Paris, France
Yanick J. Crow Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK Université Paris Cité, Paris, France
Bénédicte Manoury Faculté de Médecine Necker, Institut Necker Enfants Malades, INSERM U1151-CNRS UMR 8253, Université Paris Cité, Paris, France
Marie-Louise Frémond Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France Department of Paediatric Hematology-Immunology and Rheumatology, Necker-Enfants Malades Hospital, Assistance publique–hôpitaux de Paris (AP-HP), Paris, France Centre for Inflammatory Rheumatism, AutoImmune Diseases and Systemic Interferonopathies in Children (RAISE), Paris, France
Jonathan Bohlen Laboratory of Human Genetics of Infectious Diseases, INSERM UMR1163, Necker Hospital for Sick Children, Paris, France Imagine Institute, Université Paris Cité, Paris, France
Alice Lepelley Laboratory of Neurogenetics and Neuroinflammation, Imagine Institute, INSERM UMR1163, Paris, France

McCarthy-Leo CE, Brush GS, Pique-Regi R, Luca F, Tainsky MA, Finley RL. Comprehensive analysis of the functional impact of single nucleotide variants of human CHEK2. PLoS Genet 2024;20:e1011375. [PMID: 39146382 PMCID: PMC11349238 DOI: 10.1371/journal.pgen.1011375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 08/27/2024] [Accepted: 07/25/2024] [Indexed: 08/17/2024] Open

Correa Marrero M, Jänes J, Baptista D, Beltrao P. Integrating Large-Scale Protein Structure Prediction into Human Genetics Research. Annu Rev Genomics Hum Genet 2024;25:123-140. [PMID: 38621234 DOI: 10.1146/annurev-genom-120622-020615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]

Ozkan S, Padilla N, de la Cruz X. QAFI: a novel method for quantitative estimation of missense variant impact using protein-specific predictors and ensemble learning. Hum Genet 2024:10.1007/s00439-024-02692-z. [PMID: 39048855 DOI: 10.1007/s00439-024-02692-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 07/14/2024] [Indexed: 07/27/2024]

McDonnell AF, Plech M, Livesey BJ, Gerasimavicius L, Owen LJ, Hall HN, FitzPatrick DR, Marsh JA, Kudla G. Deep mutational scanning quantifies DNA binding and predicts clinical outcomes of PAX6 variants. Mol Syst Biol 2024;20:825-844. [PMID: 38849565 PMCID: PMC11219921 DOI: 10.1038/s44320-024-00043-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/04/2023] [Revised: 04/05/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open

Rubin AF. A new way of looking at transcription factor assays. Mol Syst Biol 2024;20:741-743. [PMID: 38849564 PMCID: PMC11219719 DOI: 10.1038/s44320-024-00044-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open

Tabet DR, Kuang D, Lancaster MC, Li R, Liu K, Weile J, Coté AG, Wu Y, Hegele RA, Roden DM, Roth FP. Benchmarking computational variant effect predictors by their ability to infer human traits. Genome Biol 2024;25:172. [PMID: 38951922 PMCID: PMC11218265 DOI: 10.1186/s13059-024-03314-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Accepted: 06/17/2024] [Indexed: 07/03/2024] Open

Affiliation(s)

Daniel R Tabet Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Da Kuang Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Megan C Lancaster Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
Roujia Li Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Karen Liu Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Jochen Weile Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Atina G Coté Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Yingzhou Wu Donnelly Centre, University of Toronto, Toronto, ON, Canada Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada Department of Computer Science, University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
Robert A Hegele Department of Medicine, Department of Biochemistry, Schulich School of Medicine and Dentistry, Robarts Research Institute, Western University, London, ON, Canada
Dan M Roden Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA Department of Pharmacology, Vanderbilt University Medical Centre, Nashville, TN, USA Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
Frederick P Roth Donnelly Centre, University of Toronto, Toronto, ON, Canada. Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada. Department of Computer Science, University of Toronto, Toronto, ON, Canada. Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada. Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.

Jänes J, Müller M, Selvaraj S, Manoel D, Stephenson J, Gonçalves C, Lafita A, Polacco B, Obernier K, Alasoo K, Lemos MC, Krogan N, Martin M, Saraiva LR, Burke D, Beltrao P. Predicted mechanistic impacts of human protein missense variants. bioRxiv 2024:2024.05.29.596373. [PMID: 38854010 PMCID: PMC11160786 DOI: 10.1101/2024.05.29.596373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]

Affiliation(s)

Jürgen Jänes Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland Swiss Institute of Bioinformatics, Lausanne, Switzerland
Marc Müller Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
Senthil Selvaraj Sidra Medicine, Doha, Qatar College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
Diogo Manoel Sidra Medicine, Doha, Qatar College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
James Stephenson European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK Open Targets, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
Catarina Gonçalves Sidra Medicine, Doha, Qatar College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
Aleix Lafita Human Genetics and Genomics, GSK, Stevenage, UK
Benjamin Polacco Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, USA Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
Kirsten Obernier Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, USA Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA
Kaur Alasoo Institute of Computer Science, University of Tartu, Tartu, Estonia
Manuel C. Lemos CICS-UBI, Health Sciences Research Centre, University of Beira Interior, 6200-506, Covilhã, Portugal
Nevan Krogan Quantitative Biosciences Institute (QBI), University of California, San Francisco, CA, USA Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA, USA J. David Gladstone Institutes, San Francisco, CA, USA
Maria Martin European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK Open Targets, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
Luis R. Saraiva Sidra Medicine, Doha, Qatar College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
David Burke Faculty of Life Sciences and Medicine, King’s College, London, UK
Pedro Beltrao Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland Swiss Institute of Bioinformatics, Lausanne, Switzerland European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK Open Targets, Wellcome Genome Campus, Cambridge, CB10 1SA, UK

Brock DC, Wang M, Hussain HMJ, Rauch DE, Marra M, Pennesi ME, Yang P, Everett L, Ajlan RS, Colbert J, Porto FBO, Matynia A, Gorin MB, Koenekoop RK, Lopez I, Sui R, Zou G, Li Y, Chen R. Comparative analysis of in-silico tools in identifying pathogenic variants in dominant inherited retinal diseases. Hum Mol Genet 2024;33:945-957. [PMID: 38453143 PMCID: PMC11102593 DOI: 10.1093/hmg/ddae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/16/2024] [Accepted: 02/19/2024] [Indexed: 03/09/2024] Open

Affiliation(s)

Daniel C Brock Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States Medical Scientist Training Program, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
Meng Wang Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
Hafiz Muhammad Jafar Hussain Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
David E Rauch Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
Molly Marra Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
Mark E Pennesi Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
Paul Yang Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
Lesley Everett Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239, United States
Radwan S Ajlan Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
Jason Colbert Department of Ophthalmology, University of Kansas School of Medicine, 3901 Rainbow Blvd, Kansas City, KS 66160, United States
Fernanda Belga Ottoni Porto INRET Clínica e Centro de Pesquisa, Rua dos Otoni, 735/507 - Santa Efigênia, Belo Horizonte, MG 30150270, Brazil Department of Ophthalmology, Santa Casa de Misericórdia de Belo Horizonte, Av. Francisco Sales, 1111 - Santa Efigênia, Belo Horizonte, MG 30150221, Brazil Centro Oftalmológico de Minas Gerais, R. Santa Catarina, 941 - Lourdes, Belo Horizonte, MG 30180070, Brazil
Anna Matynia College of Optometry, University of Houston, 4401 Martin Luther King Boulevard, Houston, TX 77004, United States
Michael B Gorin Jules Stein Eye Institute, University of California Los Angeles, 100 Stein Plaza, Los Angeles, CA 90095, United States Department of Ophthalmology, University of California Los Angeles David Geffen School of Medicine, 10833 Le Conte Ave, Los Angeles, CA 90095, United States
Robert K Koenekoop McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
Irma Lopez McGill Ocular Genetics Laboratory and Centre, Department of Paediatric Surgery, Human Genetics, and Ophthalmology, McGill University Health Centre, 5252 Boul de Maisonneuve ouest, Montreal, QC H4A 3S5, Canada
Ruifang Sui Department of Ophthalmology, Peking Union Medical College Hospital, Peking Union Medical College, Chinese Academy of Medical Sciences, WC67+HW Dongcheng, Beijing 100005, China
Gang Zou Department of Ophthalmology, Ningxia Eye Hospital, People's Hospital of Ningxia Hui Autonomous Region, First Affiliated Hospital of Northwest University for Nationalities, Ningxia Clinical Research Center on Diseases of Blindness in Eye, F4RJ+43 Xixia District, Yinchuan, Ningxia, China
Yumei Li Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States
Rui Chen Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, United States

Riccio C, Jansen ML, Guo L, Ziegler A. Variant effect predictors: a systematic review and practical guide. Hum Genet 2024;143:625-634. [PMID: 38573379 PMCID: PMC11098935 DOI: 10.1007/s00439-024-02670-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Accepted: 03/11/2024] [Indexed: 04/05/2024]

Livesey BJ, Badonyi M, Dias M, Frazer J, Kumar S, Lindorff-Larsen K, McCandlish DM, Orenbuch R, Shearer CA, Muffley L, Foreman J, Glazer AM, Lehner B, Marks DS, Roth FP, Rubin AF, Starita LM, Marsh JA. Guidelines for releasing a variant effect predictor. arXiv 2024:arXiv:2404.10807v1. [PMID: 38699161 PMCID: PMC11065047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]

Affiliation(s)

Benjamin J. Livesey MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Mihaly Badonyi MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Mafalda Dias Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Jonathan Frazer Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Sushant Kumar Department of Medical Biophysics, University of Toronto; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Kresten Lindorff-Larsen Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
David M. McCandlish Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Rose Orenbuch Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Courtney A. Shearer Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Lara Muffley Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
Julia Foreman European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
Andrew M. Glazer Vanderbilt University Medical Center, Nashville, TN, USA
Ben Lehner Wellcome Sanger Institute, Cambridge, UK; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Debora S. Marks Department of Systems Biology, Harvard Medical School, Boston, MA, USA Broad Institute of MIT and Harvard, Boston, MA, USA
Frederick P. Roth Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
Alan F. Rubin Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research; Department of Medical Biology, University of Melbourne, Parkville, Australia
Lea M. Starita Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
Joseph A. Marsh MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

Saez-Matia A, Ibarluzea MG, M-Alicante S, Muguruza-Montero A, Nuñez E, Ramis R, Ballesteros OR, Lasa-Goicuria D, Fons C, Gallego M, Casis O, Leonardo A, Bergara A, Villarroel A. MLe-KCNQ2: An Artificial Intelligence Model for the Prognosis of Missense KCNQ2 Gene Variants. Int J Mol Sci 2024;25:2910. [PMID: 38474157 DOI: 10.3390/ijms25052910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024] Open

Abstract

Despite the increasing availability of genomic data and enhanced data analysis procedures, predicting the severity of associated diseases remains elusive in the absence of clinical descriptors. To address this challenge, we have focused on the KV7.2 voltage-gated potassium channel gene (KCNQ2), known for its link to developmental delays and various epilepsies, including self-limited benign familial neonatal epilepsy and epileptic encephalopathy. Genome-wide tools often exhibit a tendency to overestimate deleterious mutations, frequently overlooking tolerated variants, and lack the capacity to discriminate variant severity. This study introduces a novel approach by evaluating multiple machine learning (ML) protocols and descriptors. The combination of genomic information with a novel Variant Frequency Index (VFI) builds a robust foundation for constructing reliable gene-specific ML models. The ensemble model, MLe-KCNQ2, formed through logistic regression, support vector machine, random forest and gradient boosting algorithms, achieves specificity and sensitivity values surpassing 0.95 (AUC-ROC > 0.98). The ensemble MLe-KCNQ2 model also categorizes pathogenic mutations as benign or severe, with an area under the receiver operating characteristic curve (AUC-ROC) above 0.67. This study not only presents a transferable methodology for accurately classifying KCNQ2 missense variants, but also provides valuable insights for clinical counseling and aids in the determination of variant severity. The research context emphasizes the necessity of precise variant classification, especially for genes like KCNQ2, contributing to the broader understanding of gene-specific challenges in the field of genomic research. The MLe-KCNQ2 model stands as a promising tool for enhancing clinical decision making and prognosis in the realm of KCNQ2-related pathologies.

Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M. CADD v1.7: using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions. Nucleic Acids Res 2024;52:D1143-D1154. [PMID: 38183205 PMCID: PMC10767851 DOI: 10.1093/nar/gkad989] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/14/2023] [Revised: 10/14/2023] [Accepted: 10/17/2023] [Indexed: 01/07/2024] Open

Weissenow K, Rost B. Rendering protein mutation movies with MutAmore. BMC Bioinformatics 2023;24:469. [PMID: 38087198 PMCID: PMC10714560 DOI: 10.1186/s12859-023-05610-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2023] [Accepted: 12/08/2023] [Indexed: 12/18/2023] Open

James JK, Norland K, Johar AS, Kullo IJ. Deep generative models of LDLR protein structure to predict variant pathogenicity. J Lipid Res 2023;64:100455. [PMID: 37821076 PMCID: PMC10696256 DOI: 10.1016/j.jlr.2023.100455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 09/16/2023] [Accepted: 10/05/2023] [Indexed: 10/13/2023] Open

Cheng J, Novati G, Pan J, Bycroft C, Žemgulytė A, Applebaum T, Pritzel A, Wong LH, Zielinski M, Sargeant T, Schneider RG, Senior AW, Jumper J, Hassabis D, Kohli P, Avsec Ž. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 2023;381:eadg7492. [PMID: 37733863 DOI: 10.1126/science.adg7492] [Citation(s) in RCA: 317] [Impact Index Per Article: 317.0] [Received: 01/19/2023] [Accepted: 08/23/2023] [Indexed: 09/23/2023]

Marsh JA, Teichmann SA. Predicting pathogenic protein variants. Science 2023;381:1284-1285. [PMID: 37725046 DOI: 10.1126/science.adj8672] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 09/21/2023]

Livesey BJ, Marsh JA. Advancing variant effect prediction using protein language models. Nat Genet 2023;55:1426-1427. [PMID: 37563330 DOI: 10.1038/s41588-023-01470-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/12/2023]

Brandes N, Goldman G, Wang CH, Ye CJ, Ntranos V. Genome-wide prediction of disease variant effects with a deep protein language model. Nat Genet 2023;55:1512-1522. [PMID: 37563329 PMCID: PMC10484790 DOI: 10.1038/s41588-023-01465-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 69.0] [Received: 08/08/2022] [Accepted: 07/05/2023] [Indexed: 08/12/2023]

Livesey BJ, Marsh JA. Updated benchmarking of variant effect predictors using deep mutational scanning. Mol Syst Biol 2023;19:e11474. [PMID: 37310135 PMCID: PMC10407742 DOI: 10.15252/msb.202211474] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/23/2022] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/14/2023] Open

Jagota M, Ye C, Albors C, Rastogi R, Koehl A, Ioannidis N, Song YS. Cross-protein transfer learning substantially improves disease variant prediction. Genome Biol 2023;24:182. [PMID: 37550700 PMCID: PMC10408151 DOI: 10.1186/s13059-023-03024-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 12/06/2022] [Accepted: 07/27/2023] [Indexed: 08/09/2023] Open

Gerasimavicius L, Livesey BJ, Marsh JA. Correspondence between functional scores from deep mutational scans and predicted effects on protein stability. Protein Sci 2023;32:e4688. [PMID: 37243972 PMCID: PMC10273344 DOI: 10.1002/pro.4688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 04/19/2023] [Accepted: 05/24/2023] [Indexed: 05/29/2023]

Abstract

Many methodologically diverse computational methods have been applied to the growing challenge of predicting and interpreting the effects of protein variants. As many pathogenic mutations have a perturbing effect on protein stability or intermolecular interactions, one highly interpretable approach is to use protein structural information to model the physical impacts of variants and predict their likely effects on protein stability and interactions. Previous efforts have assessed the accuracy of stability predictors in reproducing thermodynamically accurate values and evaluated their ability to distinguish between known pathogenic and benign mutations. Here, we take an alternate approach, and explore how well stability predictor scores correlate with functional impacts derived from deep mutational scanning (DMS) experiments. In this work, we compare the predictions of 9 protein stability-based tools against mutant protein fitness values from 49 independent DMS datasets, covering 170,940 unique single amino acid variants. We find that FoldX and Rosetta show the strongest correlations with DMS-based functional scores, similar to their previous top performance in distinguishing between pathogenic and benign variants. For both methods, performance is considerably improved when considering intermolecular interactions from protein complex structures, when available. Furthermore, using these two predictors, we derive a "Foldetta" consensus score, which improves upon the performance of both, and manages to match dedicated variant effect predictors in reflecting variant functional impacts. Finally, we also highlight that predicted stability effects show consistently higher correlations with certain DMS experimental phenotypes, particularly those based upon protein abundance, and, in certain cases, can significantly outcompete sequence-based variant effect prediction methodologies for predicting functional scores from DMS experiments.

Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. Gigascience 2022;12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023] Open