1
Zhou L, Tao C, Shen X, Sun X, Wang J, Yuan Q. Unlocking the potential of enzyme engineering via rational computational design strategies. Biotechnol Adv 2024; 73:108376. [PMID: 38740355 DOI: 10.1016/j.biotechadv.2024.108376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 04/27/2024] [Accepted: 05/08/2024] [Indexed: 05/16/2024]
Abstract
Enzymes play a pivotal role in various industries by enabling efficient, eco-friendly, and sustainable chemical processes. However, the low turnover rates and poor substrate selectivity of enzymes limit their large-scale applications. Rational computational enzyme design, facilitated by computational algorithms, offers a more targeted and less labor-intensive approach. Although there has been notable progress in applying rational computational protein engineering strategies to overcome these issues, this progress has not yet been comprehensively reviewed. This article reviews recent developments in rational computational enzyme design, categorizing them into three types: structure-based, sequence-based, and data-driven machine learning computational design. Case studies are presented to demonstrate successful enhancements in catalytic activity, stability, and substrate selectivity. Lastly, the article provides a thorough analysis of these approaches, highlights existing challenges and potential solutions, and offers insights into future development directions.
Affiliation(s)
- Lei Zhou
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Chunmeng Tao
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xiaolin Shen
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Xinxiao Sun
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Jia Wang
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
- Qipeng Yuan
  - State Key Laboratory of Chemical Resource Engineering, Beijing University of Chemical Technology, Beijing 100029, China
2
Wang J, Chen S, Yuan Q, Chen J, Li D, Wang L, Yang Y. Predicting the effects of mutations on protein solubility using graph convolution network and protein language model representation. J Comput Chem 2024; 45:436-445. [PMID: 37933773 DOI: 10.1002/jcc.27249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 10/11/2023] [Accepted: 10/21/2023] [Indexed: 11/08/2023]
Abstract
Solubility is one of the most important properties of proteins. Protein solubility can be greatly changed by single amino acid mutations, and reduced protein solubility can lead to disease. Since experimental methods to determine solubility are time-consuming and expensive, in-silico methods have been developed to predict the protein solubility changes caused by mutations, mostly through protein evolution information. However, these methods are slow because obtaining evolution information through multiple sequence alignment takes a long time. In addition, these methods perform poorly because they do not fully utilize protein 3D structures, owing to a lack of experimental structures for most proteins. Here, we proposed a sequence-based method, DeepMutSol, to predict solubility change from residue mutations based on the Graph Convolutional Neural Network (GCN), where the protein graph was initiated according to the protein structure predicted by AlphaFold2, and the nodes (residues) were represented by protein language embeddings. To circumvent the small amount of data on solubility changes, we further pretrained the model on absolute protein solubility. DeepMutSol was shown to outperform state-of-the-art methods in benchmark tests. In addition, we applied the method to clinically relevant genes from the ClinVar database, and the predicted solubility changes were shown to be able to separate pathogenic mutations. All of the data sets and the source code are available at https://github.com/biomed-AI/DeepMutSol.
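The architecture described in this abstract can be illustrated with a minimal Python sketch. It is not the DeepMutSol implementation: the toy coordinates stand in for an AlphaFold2-predicted structure, the random vectors stand in for protein language model embeddings, and the 8 Å Cα contact cutoff, the layer sizes, and the function names `contact_adjacency` and `gcn_layer` are all assumptions made for illustration. It shows only how residues can become graph nodes and how one graph-convolution step mixes each residue's features with those of its spatial neighbours.

```python
import numpy as np


def contact_adjacency(coords, cutoff=8.0):
    """Binary residue graph: 1 where two C-alpha atoms lie within `cutoff` angstroms.

    The 8 angstrom cutoff is an assumed, commonly used value, not one taken from the paper.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist < cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added later, inside the layer
    return adj


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)


# Toy input: four residues on a line; the last one is far from the others.
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [30.0, 0.0, 0.0]])
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 16))   # placeholder for language-model embeddings
weights = rng.normal(size=(16, 8))      # placeholder for learned parameters

adj = contact_adjacency(coords)
hidden = gcn_layer(adj, embeddings, weights)
print(adj)
print(hidden.shape)
```

In a real model the weights would be learned and several such layers would feed a prediction head; the sketch stops at a single untrained layer.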
Affiliation(s)
- Jing Wang
  - Guangzhou Institute of Technology, Xidian University, Guangzhou, China
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Jianwen Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
- Danping Li
  - School of Telecommunications Engineering, Xidian University, Xi'an, China
- Lei Wang
  - School of Electronic Engineering, Xidian University, Xi'an, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
3
Wee J, Chen J, Xia K, Wei GW. Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation. Comput Biol Med 2024; 169:107918. [PMID: 38194782 PMCID: PMC10922365 DOI: 10.1016/j.compbiomed.2024.107918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 12/21/2023] [Accepted: 01/01/2024] [Indexed: 01/11/2024]
Abstract
Protein mutations can significantly influence protein solubility, which results in altered protein functions and leads to various diseases. Despite tremendous effort, machine learning prediction of protein solubility changes upon mutation remains a challenging task, as indicated by the poor scores of normalized Correct Prediction Ratio (CPR). Part of the challenge stems from the fact that there are no three-dimensional (3D) structures for the wild-type and mutant proteins. This work integrates persistent Laplacians and a pre-trained Transformer for the task. The Transformer, pretrained with hundreds of millions of protein sequences, embeds wild-type and mutant sequences, while persistent Laplacians track the topological invariant change and homotopic shape evolution induced by mutations in 3D protein structures, which are rendered from AlphaFold2. The resulting machine learning model was trained on an extensive data set labeled with three solubility types. Our model outperforms all existing predictive methods and improves the state of the art by up to 15%.
Affiliation(s)
- JunJie Wee
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Jiahui Chen
  - Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
4
Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
- Pavel Kohout
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Faraneh Haddadi
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Anton Bushuiev
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Raman Samusevich
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Jiri Sedlar
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Tomas Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Josef Sivic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Stanislav Mazurenko
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
5
Wee J, Chen J, Xia K, Wei GW. Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation. ARXIV 2023:arXiv:2310.18760v2. [PMID: 37961732 PMCID: PMC10635294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Protein mutations can significantly influence protein solubility, which results in altered protein functions and leads to various diseases. Despite tremendous effort, machine learning prediction of protein solubility changes upon mutation remains a challenging task, as indicated by the poor scores of normalized Correct Prediction Ratio (CPR). Part of the challenge stems from the fact that there are no three-dimensional (3D) structures for the wild-type and mutant proteins. This work integrates persistent Laplacians and a pre-trained Transformer for the task. The Transformer, pretrained with hundreds of millions of protein sequences, embeds wild-type and mutant sequences, while persistent Laplacians track the topological invariant change and homotopic shape evolution induced by mutations in 3D protein structures, which are rendered from AlphaFold2. The resulting machine learning model was trained on an extensive data set labeled with three solubility types. Our model outperforms all existing predictive methods and improves the state of the art by up to 15%.
Affiliation(s)
- JunJie Wee
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
- Jiahui Chen
  - Department of Mathematical Sciences, University of Arkansas, Fayetteville, AR 72701, USA
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, MI 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, USA
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI 48824, USA
6
Zhou Y, Huang Z, Li W, Wei J, Jiang Q, Yang W, Huang J. Deep learning in preclinical antibody drug discovery and development. Methods 2023; 218:57-71. [PMID: 37454742 DOI: 10.1016/j.ymeth.2023.07.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 03/20/2023] [Accepted: 07/10/2023] [Indexed: 07/18/2023]
Abstract
Antibody drugs have become a key part of biotherapeutics. Patients suffering from various diseases have benefited from antibody therapies. However, their development process is rather long, expensive and risky. To speed up the process, reduce cost and improve the success rate, artificial intelligence, especially deep learning, has been widely used in all aspects of preclinical antibody drug development, from library generation to hit identification, developability screening, lead selection and optimization. In this review, we systematically summarize antibody encodings, deep learning architectures and models used in preclinical antibody drug discovery and development. We also critically discuss challenges and opportunities, problems and possible solutions, current applications and future directions of deep learning in antibody drug development.
Affiliation(s)
- Yuwei Zhou
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Ziru Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wenzhen Li
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jinyi Wei
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Qianhu Jiang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Wei Yang
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jian Huang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
7
Yang Y, Chong Z, Vihinen M. PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate. Int J Mol Sci 2023; 24:13023. [PMID: 37629203 PMCID: PMC10455311 DOI: 10.3390/ijms241613023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/27/2023]
Abstract
Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used them to train a gradient boosting algorithm starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison on the blind test data set; our method, called PON-Fold, showed superior performance on all measures used. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins, open and closed forms of a protein kinase, and apo and holo forms of an enzyme indicated that the choice of the structure had a large impact on the outcome. PON-Fold is freely available.
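For readers unfamiliar with the three regression measures quoted in this abstract, the following Python sketch computes the Pearson correlation coefficient, mean absolute error, and mean squared error. The function name `regression_metrics` and the toy values are hypothetical and chosen only for illustration; they are not PON-Fold data or code.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return Pearson correlation, mean absolute error, and mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    return {
        "pcc": float(np.corrcoef(y_true, y_pred)[0, 1]),
        "mae": float(np.abs(err).mean()),
        "mse": float((err ** 2).mean()),
    }


# Made-up measured and predicted values, for illustration only.
measured = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 1.5, 3.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

A high correlation can coexist with large absolute errors, which is why abstracts of this kind usually report an error measure alongside the correlation.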
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Zhang Chong
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
8
Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023; 21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024]
Abstract
Therapeutic proteins, represented by antibodies, are of increasing interest in human medicine. However, clinical translation of therapeutic proteins is still largely hindered by different aspects of developability, including affinity and selectivity, stability and aggregation prevention, solubility and viscosity reduction, and deimmunization. Conventional optimization of developability with widely used methods, like display technologies and library screening approaches, is a time- and cost-intensive endeavor, and the efficiency in finding suitable solutions is still not enough to meet clinical needs. In recent years, the accelerated advancement of computational methodologies has ushered in a transformative era in the field of therapeutic protein design. Owing to the remarkable capabilities of cutting-edge computational strategies in feature extraction and modeling, integrating them with conventional techniques presents a promising avenue to accelerate the progression of therapeutic protein design and optimization toward clinical implementation. Here, we compared therapeutic proteins and small molecules in terms of developability and provided an overview of the computational approaches applicable to the design or optimization of therapeutic proteins for several developability issues.
Affiliation(s)
- Zhidong Chen
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xinpei Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Xu Chen
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Juyang Huang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Chenglin Wang
  - Shenzhen Qiyu Biotechnology Co., Ltd, Shenzhen 518107, China
- Junqing Wang
  - School of Pharmaceutical Sciences, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
- Zhe Wang
  - Department of Pathology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen 518033, China
9
Zhang W, Wang H, Feng N, Li Y, Gu J, Wang Z. Developability assessment at early-stage discovery to enable development of antibody-derived therapeutics. Antib Ther 2022; 6:13-29. [PMID: 36683767 PMCID: PMC9847343 DOI: 10.1093/abt/tbac029] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/29/2022] [Revised: 11/01/2022] [Accepted: 11/02/2022] [Indexed: 11/13/2022]
Abstract
Developability refers to the likelihood that an antibody candidate will become a manufacturable, safe and efficacious drug. Although the safety and efficacy of a drug candidate will be well considered by sponsors and regulatory agencies, developability in the narrow sense can be defined as the likelihood that an antibody candidate will go smoothly through the chemistry, manufacturing and control (CMC) process at a reasonable cost and within a reasonable timeline. Developability in this sense is the focus of this review. To lower the risk that an antibody candidate with poor developability will move to the CMC stage, the candidate's developability-related properties should be screened, assessed and optimized as early as possible. Assessment of developability at the early discovery stage should be performed in a rapid and high-throughput manner while consuming small amounts of testing materials. In addition to monoclonal antibodies, bispecific antibodies, multispecific antibodies and antibody-drug conjugates, as the derivatives of monoclonal antibodies, should also be assessed for developability. Moreover, we propose that the criterion of developability is relative: expected clinical indication, and the dosage and administration route of the antibody could affect this criterion. We also recommend a general screening process during the early discovery stage of antibody-derived therapeutics. With the advance of artificial intelligence-aided prediction of protein structures and features, computational tools can be used to predict, screen and optimize the developability of antibody candidates and greatly reduce the risk of moving a suboptimal candidate to the development stage.
Affiliation(s)
- Weijie Zhang
  - Biologicals Innovation and Discovery, WuXi Biologicals, 1951 Huifeng West Road, Fengxian District, Shanghai 201400, China
- Hao Wang
  - Biologicals Innovation and Discovery, WuXi Biologicals, 1951 Huifeng West Road, Fengxian District, Shanghai 201400, China
- Nan Feng
  - Biologicals Innovation and Discovery, WuXi Biologicals, 1951 Huifeng West Road, Fengxian District, Shanghai 201400, China
- Yifeng Li
  - Technology and Process Development, WuXi Biologicals, 288 Fute Zhong Road, Waigaoqiao Free Trade Zone, Shanghai 200131, China
- Jijie Gu
  - Biologicals Innovation and Discovery, WuXi Biologicals, 1951 Huifeng West Road, Fengxian District, Shanghai 201400, China
- Zhuozhi Wang
  - To whom correspondence should be addressed. Biologics Innovation and Discovery, WuXi Biologicals, 1951 Huifeng West Road, Fengxian District, Shanghai 201400, China. Phone number: +86-21-50518899
10
Velecký J, Hamsikova M, Stourac J, Musil M, Damborsky J, Bednar D, Mazurenko S. SoluProtMutDB: a manually curated database of protein solubility changes upon mutations. Comput Struct Biotechnol J 2022; 20:6339-6347. [DOI: 10.1016/j.csbj.2022.11.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/11/2022]
11
Yang Y, Shao A, Vihinen M. PON-All: Amino Acid Substitution Tolerance Predictor for All Organisms. Front Mol Biosci 2022; 9:867572. [PMID: 35782867 PMCID: PMC9245922 DOI: 10.3389/fmolb.2022.867572] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/01/2022] [Accepted: 05/02/2022] [Indexed: 01/08/2023]
Abstract
Genetic variations are investigated in humans and many other organisms for many purposes (e.g., to aid in clinical diagnosis). Interpretation of the identified variations can be challenging. Although some dedicated prediction methods have been developed and some tools for human variants can also be used for other organisms, their performance and species range have been limited. We developed a novel variant pathogenicity/tolerance predictor for amino acid substitutions in any organism. The method, PON-All, is a machine learning tool trained on human, animal, and plant variants. Two versions are provided, one with Gene Ontology (GO) annotations and another without them, because GO annotations are unavailable or only partial for many organisms of interest. The methods provide predictions for three classes: pathogenic, benign, and variants of unknown significance. On the blind test, when using GO annotations, accuracy was 0.913 and MCC 0.827. When GO features were not used, accuracy was 0.856 and MCC 0.712. The performance is best for human and plant variants and somewhat lower for animal variants because the number of known disease-causing variants in animals is rather small. The method was compared to several other tools and was found to have superior performance. PON-All is freely available at http://structure.bmc.lu.se/PON-All and http://8.133.174.28:8999/.
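As a reminder of how the two scores quoted in this abstract are defined, the sketch below computes accuracy and the Matthews correlation coefficient (MCC) for a two-class problem. It is a simplified illustration, not the PON-All evaluation code: the function name `accuracy_and_mcc`, the binary encoding (1 = pathogenic, 0 = benign), and the toy labels are assumptions, and the tool's third output class is not modelled here.

```python
import math


def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Made-up labels, for illustration only.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
preds = [1, 1, 0, 0, 0, 1, 1, 0]
acc, mcc = accuracy_and_mcc(truth, preds)
print(acc, mcc)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the classes are imbalanced.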
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, China
- Aibin Shao
  - School of Computer Science and Technology, Soochow University, Suzhou, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, Sweden
  - *Correspondence: Mauno Vihinen
12
Vasina M, Velecký J, Planas-Iglesias J, Marques SM, Skarupova J, Damborsky J, Bednar D, Mazurenko S, Prokop Z. Tools for computational design and high-throughput screening of therapeutic enzymes. Adv Drug Deliv Rev 2022; 183:114143. [PMID: 35167900 DOI: 10.1016/j.addr.2022.114143] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/26/2021] [Revised: 02/04/2022] [Accepted: 02/09/2022] [Indexed: 12/16/2022]
Abstract
Therapeutic enzymes are valuable biopharmaceuticals in various biomedical applications. They have been successfully applied for fibrinolysis, cancer treatment, enzyme replacement therapies, and the treatment of rare diseases. Still, there is a constant demand for new or better therapeutic enzymes that are sufficiently soluble, stable, and active to meet specific medical needs. Here, we highlight the benefits of coupling computational approaches with high-throughput experimental technologies, which significantly accelerate the identification and engineering of catalytic therapeutic agents. New enzymes can be identified in genomic and metagenomic databases, which are growing exponentially thanks to next-generation sequencing technologies. Computational design and machine learning methods are being developed to improve catalytically potent enzymes and predict their properties to guide the selection of target enzymes. High-throughput experimental pipelines, increasingly relying on microfluidics, ensure functional screening and biochemical characterization of target enzymes to reach efficient therapeutic enzymes.
Affiliation(s)
- Michal Vasina
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Jan Velecký
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
- Joan Planas-Iglesias
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Sergio M Marques
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Jana Skarupova
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
  - Enantis, INBIT, Kamenice 34, Brno, Czech Republic
- David Bednar
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Stanislav Mazurenko
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic
- Zbynek Prokop
  - Loschmidt Laboratories, Department of Experimental Biology, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - Loschmidt Laboratories, RECETOX, Faculty of Science, Masaryk University, Kotlarska 2, Brno, Czech Republic
  - International Clinical Research Centre, St. Anne's University Hospital, Pekarska 53, Brno, Czech Republic