Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gray VE, Hause RJ, Luebeck J, Shendure J, Fowler DM. Quantitative Missense Variant Effect Prediction Using Large-Scale Mutagenesis Data. Cell Syst 2017;6:116-124.e3. [PMID: 29226803 DOI: 10.1016/j.cels.2017.11.003] [Citation(s) in RCA: 118] [Impact Index Per Article: 16.9] [Received: 05/20/2017] [Revised: 08/30/2017] [Accepted: 11/03/2017] [Indexed: 11/26/2022]

Cited by Other Article(s)

Pan Q, Parra GB, Myung Y, Portelli S, Nguyen TB, Ascher DB. AlzDiscovery: A computational tool to identify Alzheimer's disease-causing missense mutations using protein structure information. Protein Sci 2024;33:e5147. [PMID: 39276018 DOI: 10.1002/pro.5147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 07/14/2024] [Accepted: 07/31/2024] [Indexed: 09/16/2024]

Hu Y, Zhang Q, Bai X, Men L, Ma J, Li D, Xu M, Wei Q, Chen R, Wang D, Yin X, Hu T, Xie T. Screening and modification of (+)-germacrene A synthase for the production of the anti-tumor drug (-)-β-elemene in engineered Saccharomyces cerevisiae. Int J Biol Macromol 2024;279:135455. [PMID: 39260653 DOI: 10.1016/j.ijbiomac.2024.135455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 09/06/2024] [Accepted: 09/06/2024] [Indexed: 09/13/2024]

Affiliation(s)

Yuhan Hu School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
Qin Zhang School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Xue Bai School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Lianhui Men School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Jing Ma School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Dengyu Li School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Mengdie Xu School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
Qiuhui Wei School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China; Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China
Rong Chen School of Public Health, Hangzhou Normal University, Hangzhou 311121, China
Daming Wang School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China; Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China
Xiaopu Yin School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China; Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China.
Tianyuan Hu School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China; Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China.
Tian Xie School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China; Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, Hangzhou Normal University, Hangzhou 311121, China.

Tan Y, Li M, Zhou Z, Tan P, Yu H, Fan G, Hong L. PETA: evaluating the impact of protein transfer learning with sub-word tokenization on downstream applications. J Cheminform 2024;16:92. [PMID: 39095917 PMCID: PMC11297785 DOI: 10.1186/s13321-024-00884-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Accepted: 07/13/2024] [Indexed: 08/04/2024] Open

Affiliation(s)

Yang Tan School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China; Shanghai National Center for Applied Mathematics (SJTU Center), & Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, Chongqing, 200240, China
Mingchen Li School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China; Shanghai National Center for Applied Mathematics (SJTU Center), & Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, Chongqing, 200240, China
Ziyi Zhou Shanghai National Center for Applied Mathematics (SJTU Center), & Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, 200240, China
Pan Tan Shanghai National Center for Applied Mathematics (SJTU Center), & Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China
Huiqun Yu School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
Guisheng Fan School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237, China
Liang Hong Shanghai National Center for Applied Mathematics (SJTU Center), & Institute of Natural Science, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, Chongqing, 200240, China

Ozkan S, Padilla N, de la Cruz X. QAFI: a novel method for quantitative estimation of missense variant impact using protein-specific predictors and ensemble learning. Hum Genet 2024:10.1007/s00439-024-02692-z. [PMID: 39048855 DOI: 10.1007/s00439-024-02692-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Accepted: 07/14/2024] [Indexed: 07/27/2024]

Wirnsberger G, Pritišanac I, Oberdorfer G, Gruber K. Flattening the curve-How to get better results with small deep-mutational-scanning datasets. Proteins 2024;92:886-902. [PMID: 38501649 DOI: 10.1002/prot.26686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 02/24/2024] [Accepted: 03/07/2024] [Indexed: 03/20/2024]

Abstract

Proteins are used in various biotechnological applications, often requiring the optimization of protein properties by introducing specific amino-acid exchanges. Deep mutational scanning (DMS) is an effective high-throughput method for evaluating the effects of these exchanges on protein function. DMS data can then inform the training of a neural network to predict the impact of mutations. Most approaches use some representation of the protein sequence for training and prediction. As proteins are characterized by complex structures and intricate residue interaction networks, directly providing structural information as input reduces the need to learn these features from the data. We introduce a method for encoding protein structures as stacked 2D contact maps, which capture residue interactions, their evolutionary conservation, and mutation-induced interaction changes. Furthermore, we explored techniques to augment neural network training performance on smaller DMS datasets. To validate our approach, we trained three neural network architectures originally used for image analysis on three DMS datasets, and we compared their performances with networks trained solely on protein sequences. The results confirm the effectiveness of the protein structure encoding in machine learning efforts on DMS data. Using structural representations as direct input to the networks, along with data augmentation and pretraining, significantly reduced demands on training data size and improved prediction performance, especially on smaller datasets, while performance on large datasets was on par with state-of-the-art sequence convolutional neural networks. The methods presented here have the potential to provide the same workflow as DMS without the experimental and financial burden of testing thousands of mutants. Additionally, we present an open-source, user-friendly software tool to make these data analysis techniques accessible, particularly to biotechnology and protein engineering researchers who wish to apply them to their mutagenesis data.

Livesey BJ, Badonyi M, Dias M, Frazer J, Kumar S, Lindorff-Larsen K, McCandlish DM, Orenbuch R, Shearer CA, Muffley L, Foreman J, Glazer AM, Lehner B, Marks DS, Roth FP, Rubin AF, Starita LM, Marsh JA. Guidelines for releasing a variant effect predictor. ARXIV 2024:arXiv:2404.10807v1. [PMID: 38699161 PMCID: PMC11065047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]

Affiliation(s)

Benjamin J. Livesey MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Mihaly Badonyi MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Mafalda Dias Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Jonathan Frazer Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
Sushant Kumar Department of Medical Biophysics, University of Toronto; Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
Kresten Lindorff-Larsen Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
David M. McCandlish Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
Rose Orenbuch Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Courtney A. Shearer Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Lara Muffley Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
Julia Foreman European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
Andrew M. Glazer Vanderbilt University Medical Center, Nashville, TN, USA
Ben Lehner Wellcome Sanger Institute, Cambridge, UK; Universitat Pompeu Fabra (UPF), Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Debora S. Marks Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Broad Institute of MIT and Harvard, Boston, MA, USA
Frederick P. Roth Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
Alan F. Rubin Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research; Department of Medical Biology, University of Melbourne, Parkville, Australia
Lea M. Starita Department of Genome Sciences, University of Washington and the Brotman Baty Institute for Precision Medicine, Seattle, WA, USA
Joseph A. Marsh MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK

Wang X, Li A, Li X, Cui H. Empowering Protein Engineering through Recombination of Beneficial Substitutions. Chemistry 2024;30:e202303889. [PMID: 38288640 DOI: 10.1002/chem.202303889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Indexed: 02/24/2024]

Landerer C, Poehls J, Toth-Petroczy A. Fitness Effects of Phenotypic Mutations at Proteome-Scale Reveal Optimality of Translation Machinery. Mol Biol Evol 2024;41:msae048. [PMID: 38421032 PMCID: PMC10939442 DOI: 10.1093/molbev/msae048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Revised: 01/30/2024] [Accepted: 02/23/2024] [Indexed: 03/02/2024] Open

Andorf CM, Haley OC, Hayford RK, Portwood JL, Harding S, Sen S, Cannon EK, Gardiner JM, Kim HS, Woodhouse MR. PanEffect: a pan-genome visualization tool for variant effects in maize. Bioinformatics 2024;40:btae073. [PMID: 38337024 PMCID: PMC10881103 DOI: 10.1093/bioinformatics/btae073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/12/2024] Open

Serghini A, Portelli S, Troadec G, Song C, Pan Q, Pires DEV, Ascher DB. Characterizing and predicting ccRCC-causing missense mutations in Von Hippel-Lindau disease. Hum Mol Genet 2024;33:224-232. [PMID: 37883464 PMCID: PMC10800015 DOI: 10.1093/hmg/ddad181] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/20/2023] [Revised: 10/19/2023] [Accepted: 10/20/2023] [Indexed: 10/28/2023] Open

Xi C, Diao J, Moon TS. Advances in ligand-specific biosensing for structurally similar molecules. Cell Syst 2023;14:1024-1043. [PMID: 38128482 PMCID: PMC10751988 DOI: 10.1016/j.cels.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 08/23/2023] [Accepted: 10/19/2023] [Indexed: 12/23/2023]

Yurtseven A, Buyanova S, Agrawal AA, Bochkareva OO, Kalinina OV. Machine learning and phylogenetic analysis allow for predicting antibiotic resistance in M. tuberculosis. BMC Microbiol 2023;23:404. [PMID: 38124060 PMCID: PMC10731705 DOI: 10.1186/s12866-023-03147-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/07/2023] [Indexed: 12/23/2023] Open

Pan Q, Portelli S, Nguyen TB, Ascher DB. Characterization on the oncogenic effect of the missense mutations of p53 via machine learning. Brief Bioinform 2023;25:bbad428. [PMID: 38018912 PMCID: PMC10685404 DOI: 10.1093/bib/bbad428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 10/13/2023] [Accepted: 11/05/2023] [Indexed: 11/30/2023] Open

Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023;13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]

Affiliation(s)

Petr Kouba Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic; Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
Pavel Kohout Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
Faraneh Haddadi Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
Anton Bushuiev Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
Raman Samusevich Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic; Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
Jiri Sedlar Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
Jiri Damborsky Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
Tomas Pluskal Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
Josef Sivic Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
Stanislav Mazurenko Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic

Al-Jarf R, Karmakar M, Myung Y, Ascher DB. Uncovering the Molecular Drivers of NHEJ DNA Repair-Implicated Missense Variants and Their Functional Consequences. Genes (Basel) 2023;14:1890. [PMID: 37895239 PMCID: PMC10606680 DOI: 10.3390/genes14101890] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/20/2023] [Revised: 09/24/2023] [Accepted: 09/27/2023] [Indexed: 10/29/2023] Open

Chen J, Woldring DR, Huang F, Huang X, Wei GW. Topological deep learning based deep mutational scanning. Comput Biol Med 2023;164:107258. [PMID: 37506452 PMCID: PMC10528359 DOI: 10.1016/j.compbiomed.2023.107258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Revised: 06/28/2023] [Accepted: 07/08/2023] [Indexed: 07/30/2023]

Chen L, Zhang Z, Li Z, Li R, Huo R, Chen L, Wang D, Luo X, Chen K, Liao C, Zheng M. Learning protein fitness landscapes with deep mutational scanning data from multiple sources. Cell Syst 2023;14:706-721.e5. [PMID: 37591206 DOI: 10.1016/j.cels.2023.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 08/19/2023]

Affiliation(s)

Lin Chen Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
Zehong Zhang Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
Zhenghao Li Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
Rui Li Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
Ruifeng Huo School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
Lifan Chen Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
Dingyan Wang Lingang Laboratory, Shanghai 200031, China
Xiaomin Luo Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China
Kaixian Chen Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
Cangsong Liao University of Chinese Academy of Sciences, Beijing 100049, China; Chemical Biology Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Science, Shanghai 201203, China.
Mingyue Zheng Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China; University of Chinese Academy of Sciences, Beijing 100049, China; School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China; School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China.

Livesey BJ, Marsh JA. Updated benchmarking of variant effect predictors using deep mutational scanning. Mol Syst Biol 2023;19:e11474. [PMID: 37310135 PMCID: PMC10407742 DOI: 10.15252/msb.202211474] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/23/2022] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/14/2023] Open

Jagota M, Ye C, Albors C, Rastogi R, Koehl A, Ioannidis N, Song YS. Cross-protein transfer learning substantially improves disease variant prediction. Genome Biol 2023;24:182. [PMID: 37550700 PMCID: PMC10408151 DOI: 10.1186/s13059-023-03024-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 12/06/2022] [Accepted: 07/27/2023] [Indexed: 08/09/2023] Open

Fowler DM, Adams DJ, Gloyn AL, Hahn WC, Marks DS, Muffley LA, Neal JT, Roth FP, Rubin AF, Starita LM, Hurles ME. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol 2023;24:147. [PMID: 37394429 PMCID: PMC10316620 DOI: 10.1186/s13059-023-02986-x] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 06/02/2023] [Accepted: 06/13/2023] [Indexed: 07/04/2023] Open

Jessen-Howard D, Pan Q, Ascher DB. Identifying the Molecular Drivers of Pathogenic Aldehyde Dehydrogenase Missense Mutations in Cancer and Non-Cancer Diseases. Int J Mol Sci 2023;24:10157. [PMID: 37373306 DOI: 10.3390/ijms241210157] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/30/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/29/2023] Open

Fabo T, Khavari P. Functional characterization of human genomic variation linked to polygenic diseases. Trends Genet 2023;39:462-490. [PMID: 36997428 PMCID: PMC11025698 DOI: 10.1016/j.tig.2023.02.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/01/2022] [Revised: 02/22/2023] [Accepted: 02/23/2023] [Indexed: 03/30/2023]

Dunham AS, Beltrao P, AlQuraishi M. High-throughput deep learning variant effect prediction with Sequence UNET. Genome Biol 2023;24:110. [PMID: 37161576 PMCID: PMC10169183 DOI: 10.1186/s13059-023-02948-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 04/20/2023] [Indexed: 05/11/2023] Open

Molecular Property Prediction by Combining LSTM and GAT. Biomolecules 2023;13:biom13030503. [PMID: 36979438 PMCID: PMC10046625 DOI: 10.3390/biom13030503] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/06/2023] [Revised: 02/10/2023] [Accepted: 03/06/2023] [Indexed: 03/12/2023] Open

Chan MC, Chan KK, Procko E, Shukla D. Machine Learning Guided Design of High-Affinity ACE2 Decoys for SARS-CoV-2 Neutralization. J Phys Chem B 2023;127:1995-2001. [PMID: 36827526 PMCID: PMC9999943 DOI: 10.1021/acs.jpcb.3c00469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 02/03/2023] [Indexed: 02/26/2023]

Diaz DJ, Kulikova AV, Ellington AD, Wilke CO. Using machine learning to predict the effects and consequences of mutations in proteins. Curr Opin Struct Biol 2023;78:102518. [PMID: 36603229 PMCID: PMC9908841 DOI: 10.1016/j.sbi.2022.102518] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 08/09/2022] [Revised: 11/07/2022] [Accepted: 11/20/2022] [Indexed: 01/05/2023]

Why does the X chromosome lag behind autosomes in GWAS findings? PLoS Genet 2023;19:e1010472. [PMID: 36848382 PMCID: PMC9997976 DOI: 10.1371/journal.pgen.1010472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 03/09/2023] [Accepted: 02/15/2023] [Indexed: 03/01/2023] Open

Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023;14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open

Landau J, Tsaban L, Yaacov A, Ben Cohen G, Rosenberg S. Shared Cancer Dataset Analysis Identifies and Predicts the Quantitative Effects of Pan-Cancer Somatic Driver Variants. Cancer Res 2023;83:74-88. [PMID: 36264175 DOI: 10.1158/0008-5472.can-22-1038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 08/02/2022] [Accepted: 10/18/2022] [Indexed: 02/03/2023]

Shea A, Bartz J, Zhang L, Dong X. Predicting mutational function using machine learning. MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2023;791:108457. [PMID: 36965820 PMCID: PMC10239318 DOI: 10.1016/j.mrrev.2023.108457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Revised: 03/11/2023] [Accepted: 03/20/2023] [Indexed: 03/27/2023]

Harmalkar A, Rao R, Richard Xie Y, Honer J, Deisting W, Anlahr J, Hoenig A, Czwikla J, Sienz-Widmann E, Rau D, Rice AJ, Riley TP, Li D, Catterall HB, Tinberg CE, Gray JJ, Wei KY. Toward generalizable prediction of antibody thermostability using machine learning on sequence and structure features. MAbs 2023;15:2163584. [PMID: 36683173 PMCID: PMC9872953 DOI: 10.1080/19420862.2022.2163584] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 06/16/2022] [Revised: 12/14/2022] [Accepted: 12/26/2022] [Indexed: 01/24/2023] Open

Abstract

Over the last three decades, the appeal for monoclonal antibodies (mAbs) as therapeutics has been steadily increasing, as evident with the FDA's recent landmark approval of the 100th mAb. Unlike mAbs that bind to single targets, multispecific biologics (msAbs) have garnered particular interest owing to the advantage of engaging distinct targets. One important modular component of msAbs is the single-chain variable fragment (scFv). Despite the exquisite specificity and affinity of these scFv modules, their relatively poor thermostability often hampers their development as a potential therapeutic drug. In recent years, engineering antibody sequences to enhance their stability by mutations has gained considerable momentum. As experimental methods for antibody engineering are time-intensive, laborious and expensive, computational methods serve as a fast and inexpensive alternative to conventional routes. In this work, we show two machine learning approaches - one with pre-trained language models (PTLM) capturing functional effects of sequence variation, and a second, a supervised convolutional neural network (CNN) trained with Rosetta energetic features - to better classify thermostable scFv variants from sequence. Both of these models are trained over temperature-specific data (TS50 measurements) derived from multiple libraries of scFv sequences. On out-of-distribution sequences (that is, sequences to which the algorithm is blind), we show that a sufficiently simple CNN model performs better than general pre-trained language models trained on diverse protein sequences (average Spearman correlation coefficient, ρ, of 0.4 as opposed to 0.15). On the other hand, an antibody-specific language model performs comparatively better than the CNN model on the same task (ρ = 0.52). Further, we demonstrate that for an independent mAb with available thermal melting temperatures for 20 experimentally characterized thermostable mutations, these models trained on TS50 data could identify 18 residue positions and 5 identical amino-acid mutations, showing remarkable generalizability. Our results suggest that such models can be broadly applicable for improving the biological characteristics of antibodies. Further, transferring such models for alternative physicochemical properties of scFvs can have potential applications in optimizing large-scale production and delivery of mAbs or bsAbs.

Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. Gigascience 2022;12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023] Open

Dace P, Findlay GM. Reducing uncertainty in genetic testing with Saturation Genome Editing. MED GENET-BERLIN 2022;34:297-304. [PMID: 38836089 PMCID: PMC11006300 DOI: 10.1515/medgen-2022-2159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/06/2024]

Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022;56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022;14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open

High-throughput approaches to functional characterization of genetic variation in yeast. Curr Opin Genet Dev 2022;76:101979. [PMID: 36075138 DOI: 10.1016/j.gde.2022.101979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/20/2022]

Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022;141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]

Marquet C, Heinzinger M, Olenyi T, Dallago C, Erckert K, Bernhofer M, Nechaev D, Rost B. Embeddings from protein language models predict conservation and variant effects. Hum Genet 2022;141:1629-1647. [PMID: 34967936 PMCID: PMC8716573 DOI: 10.1007/s00439-021-02411-y] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 12/06/2021] [Indexed: 12/13/2022]

Abstract

The emergence of SARS-CoV-2 variants stressed the demand for tools that allow interpreting the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we used pLM representations (embeddings) to predict sequence conservation and SAV effects without multiple sequence alignments (MSAs). Embeddings alone predicted residue conservation almost as accurately from single sequences as ConSeq using MSAs (two-state Matthews Correlation Coefficient, MCC, for ProtT5 embeddings of 0.596 ± 0.006 vs. 0.608 ± 0.006 for ConSeq). Inputting the conservation prediction along with BLOSUM62 substitution scores and pLM mask reconstruction probabilities into a simplistic logistic regression (LR) ensemble for Variant Effect Score Prediction without Alignments (VESPA) predicted SAV effect magnitude without any optimization on DMS data. Comparing predictions for a standard set of 39 DMS experiments to other methods (incl. ESM-1v, DeepSequence, and GEMME) revealed our approach as competitive with the state-of-the-art (SOTA) methods using MSA input. No method outperformed all others, either consistently or statistically significantly, independently of the performance measure applied (Spearman and Pearson correlation). Finally, we investigated binary effect predictions on DMS experiments for four human proteins. Overall, embedding-based methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the costs in computing/energy.
Our method predicted SAV effects for the entire human proteome (~20 k proteins) within 40 min on one Nvidia Quadro RTX 8000. All methods and data sets are freely available for local and online execution through bioembeddings.com, https://github.com/Rostlab/VESPA, and PredictProtein.

Affiliation(s)

Céline Marquet Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany. TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany.
Michael Heinzinger Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Tobias Olenyi Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Christian Dallago Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Kyra Erckert Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Michael Bernhofer Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Dmitrii Nechaev Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
Burkhard Rost Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, Garching, 85748, Munich, Germany TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany

Capel H, Weiler R, Dijkstra M, Vleugels R, Bloem P, Feenstra KA. ProteinGLUE multi-task benchmark suite for self-supervised protein modeling. Sci Rep 2022;12:16047. [PMID: 36163232 PMCID: PMC9512797 DOI: 10.1038/s41598-022-19608-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Accepted: 08/31/2022] [Indexed: 11/09/2022] Open

Olinger E, Schaeffer C, Kidd K, Elhassan EAE, Cheng Y, Dufour I, Schiano G, Mabillard H, Pasqualetto E, Hofmann P, Fuster DG, Kistler AD, Wilson IJ, Kmoch S, Raymond L, Robert T, Eckardt KU, Bleyer AJ, Köttgen A, Conlon PJ, Wiesener M, Sayer JA, Rampoldi L, Devuyst O. An intermediate-effect size variant in UMOD confers risk for chronic kidney disease. Proc Natl Acad Sci U S A 2022;119:e2114734119. [PMID: 35947615 PMCID: PMC9388113 DOI: 10.1073/pnas.2114734119] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2021] [Accepted: 05/04/2022] [Indexed: 12/12/2022] Open

Affiliation(s)

Eric Olinger Institute of Physiology, University of Zurich, CH-8057 Zurich, Switzerland Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE1 3BZ, United Kingdom
Céline Schaeffer Molecular Genetics of Renal Disorders, Division of Genetics and Cell Biology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale San Raffaele, Milan, 20132 Italy
Kendrah Kidd Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC 27101 Department of Pediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University, 128 08 Prague, Czech Republic
Elhussein A. E. Elhassan Division of Nephrology, Beaumont General Hospital, 1297 Dublin, Ireland Department of Medicine, Royal College of Surgeons in Ireland, 1297 Dublin, Ireland
Yurong Cheng Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, D-79106 Freiburg, Germany Faculty of Biology, University of Freiburg, D-79106 Freiburg, Germany
Inès Dufour Institute of Physiology, University of Zurich, CH-8057 Zurich, Switzerland Division of Nephrology, Cliniques Universitaires Saint-Luc, 1200 Brussels, Belgium
Guglielmo Schiano Institute of Physiology, University of Zurich, CH-8057 Zurich, Switzerland
Holly Mabillard Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE1 3BZ, United Kingdom Renal Services, Newcastle Upon Tyne Hospitals National Health Service Trust, Newcastle upon Tyne NE7 7DN, United Kingdom
Elena Pasqualetto Molecular Genetics of Renal Disorders, Division of Genetics and Cell Biology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale San Raffaele, Milan, 20132 Italy
Patrick Hofmann Institute of Physiology, University of Zurich, CH-8057 Zurich, Switzerland
Daniel G. Fuster Department of Nephrology and Hypertension, Inselspital, Bern University Hospital, University of Bern, 3010 Bern, Switzerland
Andreas D. Kistler Department of Medicine, Cantonal Hospital Frauenfeld, 8501 Frauenfeld, Switzerland
Ian J. Wilson Biosciences Institute, Newcastle University, Newcastle upon Tyne NE1 3BZ, United Kingdom
Stanislav Kmoch Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC 27101 Department of Pediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University, 128 08 Prague, Czech Republic
Laure Raymond Genetics Department, Laboratoire Eurofins Biomnis, Lyon, 69007 France
Thomas Robert Centre de Néphrologie et Transplantation Rénale, Centre Hospitalier Universitaire (CHU) la Conception, Assistance Publique - Hôpitaux de Marseille (AP-HM), Marseille, 13005 France Marseille Medical Genetics, Bioinformatics & Genetics, Unité Mixte de Recherche (UMR)_S910, Aix-Marseille Université, Marseille, 13005 France
Genomics England Research Consortium
Kai-Uwe Eckardt Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, 10117 Berlin, Germany Department of Nephrology and Hypertension, University Hospital Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
Anthony J. Bleyer Section on Nephrology, Wake Forest School of Medicine, Winston-Salem, NC 27101 Department of Pediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University, 128 08 Prague, Czech Republic
Anna Köttgen Institute of Genetic Epidemiology, Faculty of Medicine and Medical Center, University of Freiburg, D-79106 Freiburg, Germany Centre for Integrative Biological Signalling Studies, University of Freiburg, D-79106 Freiburg, Germany
Peter J. Conlon Division of Nephrology, Beaumont General Hospital, 1297 Dublin, Ireland Department of Medicine, Royal College of Surgeons in Ireland, 1297 Dublin, Ireland
Michael Wiesener Department of Nephrology and Hypertension, University Hospital Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
John A. Sayer Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE1 3BZ, United Kingdom Renal Services, Newcastle Upon Tyne Hospitals National Health Service Trust, Newcastle upon Tyne NE7 7DN, United Kingdom National Institute for Health and Care Research (NIHR) Newcastle Biomedical Research Centre, Newcastle upon Tyne NE4 5PL, United Kingdom
Luca Rampoldi Molecular Genetics of Renal Disorders, Division of Genetics and Cell Biology, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale San Raffaele, Milan, 20132 Italy
Olivier Devuyst Institute of Physiology, University of Zurich, CH-8057 Zurich, Switzerland Division of Nephrology, Cliniques Universitaires Saint-Luc, 1200 Brussels, Belgium

Wang B, Gamazon ER. Modeling mutational effects on biochemical phenotypes using convolutional neural networks: application to SARS-CoV-2. iScience 2022;25:104500. [PMID: 35669036 PMCID: PMC9159778 DOI: 10.1016/j.isci.2022.104500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2021] [Revised: 11/15/2021] [Accepted: 05/26/2022] [Indexed: 11/29/2022] Open

Loss-of-function, gain-of-function and dominant-negative mutations have profoundly different effects on protein structure. Nat Commun 2022;13:3895. [PMID: 35794153 PMCID: PMC9259657 DOI: 10.1038/s41467-022-31686-6] [Citation(s) in RCA: 74] [Impact Index Per Article: 37.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2021] [Accepted: 06/29/2022] [Indexed: 12/12/2022] Open

Abstract

Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Taking protein structure into account has therefore provided great insight into the molecular mechanisms underlying human genetic disease. While there has been much focus on how mutations can disrupt protein structure and thus cause a loss of function (LOF), alternative mechanisms, specifically dominant-negative (DN) and gain-of-function (GOF) effects, are less understood. Here, we investigate the protein-level effects of pathogenic missense mutations associated with different molecular mechanisms. We observe striking differences between recessive vs dominant, and LOF vs non-LOF mutations, with dominant, non-LOF disease mutations having much milder effects on protein structure, and DN mutations being highly enriched at protein interfaces. We also find that nearly all computational variant effect predictors, even those based solely on sequence conservation, underperform on non-LOF mutations. However, we do show that non-LOF mutations could potentially be identified by their tendency to cluster in three-dimensional space. Overall, our work suggests that many pathogenic mutations that act via DN and GOF mechanisms are likely being missed by current variant prioritisation strategies, but that there is considerable scope to improve computational predictions through consideration of molecular disease mechanisms.

Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Here the authors analyse the locations of thousands of human disease mutations and their predicted effects on protein structure and show that, while loss-of-function mutations tend to be highly disruptive, non-loss-of-function mutations are in general much milder at a protein structural level.

Anderson CL, Munawar S, Reilly L, Kamp TJ, January CT, Delisle BP, Eckhardt LL. How Functional Genomics Can Keep Pace With VUS Identification. Front Cardiovasc Med 2022;9:900431. [PMID: 35859585 PMCID: PMC9291992 DOI: 10.3389/fcvm.2022.900431] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2022] [Accepted: 06/09/2022] [Indexed: 01/03/2023] Open

Benjamin R, Giacoletto CJ, FitzHugh ZT, Eames D, Buczek L, Wu X, Newsome J, Han MV, Pearson T, Wei Z, Banerjee A, Brown L, Valente LJ, Shen S, Deng HW, Schiller MR. GigaAssay - An adaptable high-throughput saturation mutagenesis assay platform. Genomics 2022;114:110439. [PMID: 35905834 PMCID: PMC9420302 DOI: 10.1016/j.ygeno.2022.110439] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2022] [Revised: 07/12/2022] [Accepted: 07/24/2022] [Indexed: 11/17/2022]

Affiliation(s)

Ronald Benjamin Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Christopher J Giacoletto Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
Zachary T FitzHugh Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Danielle Eames Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Lindsay Buczek Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Xiaogang Wu Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Jacklyn Newsome Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Mira V Han Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Tony Pearson School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
Zhi Wei Department of Computer Science, New Jersey Institute of Technology, GITC 4214C, University Heights, Newark, NJ 07102, USA
Atoshi Banerjee Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Lancer Brown Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
Liz J Valente Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA
Shirley Shen Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA
Hong-Wen Deng Center for Biomedical Informatics & Genomics Tulane University, 1440 Canal Street, Suite 1621, New Orleans, LA 70112, USA
Martin R Schiller Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; School of Life Sciences, University of Nevada, Las Vegas, 4505 S. Maryland Parkway, Las Vegas, Nevada 89154, USA; Heligenics Inc., 833 Las Vegas Blvd. North, Suite B, Las Vegas, NV 89101, USA.

Kuntz CP, Woods H, McKee AG, Zelt NB, Mendenhall JL, Meiler J, Schlebach JP. Towards generalizable predictions for G protein-coupled receptor variant expression. Biophys J 2022;121:2712-2720. [PMID: 35715957 DOI: 10.1016/j.bpj.2022.06.018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2022] [Revised: 05/31/2022] [Accepted: 06/13/2022] [Indexed: 11/30/2022] Open

Livesey BJ, Marsh JA. Interpreting protein variant effects with computational predictors and deep mutational scanning. Dis Model Mech 2022;15:275742. [PMID: 35736673 PMCID: PMC9235876 DOI: 10.1242/dmm.049510] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open

Horne J, Shukla D. Recent Advances in Machine Learning Variant Effect Prediction Tools for Protein Engineering. Ind Eng Chem Res 2022;61:6235-6245. [PMID: 36051311 PMCID: PMC9432854 DOI: 10.1021/acs.iecr.1c04943] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Integration of machine learning with computational structural biology of plants. Biochem J 2022;479:921-928. [PMID: 35484946 DOI: 10.1042/bcj20200942] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Revised: 04/01/2022] [Accepted: 04/06/2022] [Indexed: 11/17/2022]

Anglès F, Wang C, Balch WE. Spatial covariance analysis reveals the residue-by-residue thermodynamic contribution of variation to the CFTR fold. Commun Biol 2022;5:356. [PMID: 35418593 PMCID: PMC9008016 DOI: 10.1038/s42003-022-03302-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2021] [Accepted: 03/21/2022] [Indexed: 12/21/2022] Open

Spielmann M, Kircher M. Computational and experimental methods for classifying variants of unknown clinical significance. Cold Spring Harb Mol Case Stud 2022;8:mcs.a006196. [PMID: 35483875 PMCID: PMC9059783 DOI: 10.1101/mcs.a006196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open

Abstract

The increase in sequencing capacity, reduction in costs, and national and international coordinated efforts have led to the widespread introduction of next-generation sequencing (NGS) technologies in patient care. More generally, human genetics and genomic medicine are gaining importance for more and more patients. Some communities are already discussing the prospect of sequencing each individual's genome at time of birth. Together with digital health records, this shall enable individualized treatments and preventive measures, so-called precision medicine. A central step in this process is the identification of disease causal mutations or variant combinations that make us more susceptible for diseases. Although various technological advances have improved the identification of genetic alterations, the interpretation and ranking of the identified variants remains a major challenge. Based on our knowledge of molecular processes or previously identified disease variants, we can identify potentially functional genetic variants and, using different lines of evidence, we are sometimes able to demonstrate their pathogenicity directly. However, the vast majority of variants are classified as variants of uncertain clinical significance (VUSs) with not enough experimental evidence to determine their pathogenicity. In these cases, computational methods may be used to improve the prioritization and an increasing toolbox of experimental methods is emerging that can be used to assay the molecular effects of VUSs. Here, we discuss how computational and experimental methods can be used to create catalogs of variant effects for a variety of molecular and cellular phenotypes. We discuss the prospects of integrating large-scale functional data with machine learning and clinical knowledge for the development of accurate pathogenicity predictions for clinical applications.
