1
Elste J, Saini A, Mejia-Alvarez R, Mejía A, Millán-Pacheco C, Swanson-Mungerson M, Tiwari V. Significance of Artificial Intelligence in the Study of Virus-Host Cell Interactions. Biomolecules 2024; 14:911. [PMID: 39199298 PMCID: PMC11352483 DOI: 10.3390/biom14080911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Revised: 07/11/2024] [Accepted: 07/23/2024] [Indexed: 09/01/2024] Open
Abstract
A highly critical event in a virus's life cycle is successfully entering a given host. This process begins when a viral glycoprotein interacts with a target cell receptor, which provides the molecular basis for target virus-host cell interactions for novel drug discovery. Over the years, extensive research has been carried out in the field of virus-host cell interaction, generating a massive number of genetic and molecular data sources. These datasets are an asset for predicting virus-host interactions at the molecular level using machine learning (ML), a subset of artificial intelligence (AI). In this direction, ML tools are now being applied to recognize patterns in these massive datasets to predict critical interactions between virus and host cells at the protein-protein and protein-sugar levels, as well as to perform transcriptional and translational analysis. On the other end, deep learning (DL) algorithms-a subfield of ML-can extract high-level features from very large datasets to recognize the hidden patterns within genomic sequences and images to develop models for rapid drug discovery predictions that address pathogenic viruses displaying heightened affinity for receptor docking and enhanced cell entry. ML and DL are pivotal forces, driving innovation with their ability to perform analysis of enormous datasets in a highly efficient, cost-effective, accurate, and high-throughput manner. This review focuses on the complexity of virus-host cell interactions at the molecular level in light of the current advances of ML and AI in viral pathogenesis to improve new treatments and prevention strategies.
Affiliation(s)
- James Elste
  - Department of Microbiology & Immunology, College of Graduate Studies, Midwestern University, Downers Grove, IL 60515, USA; (J.E.); (M.S.-M.)
- Akash Saini
  - Hinsdale Central High School, 5500 S Grant St, Hinsdale, IL 60521, USA
- Rafael Mejia-Alvarez
  - Department of Physiology, College of Graduate Studies, Midwestern University, Downers Grove, IL 60515, USA
- Armando Mejía
  - Departamento de Biotechnology, Universidad Autónoma Metropolitana-Iztapalapa, Ciudad de Mexico 09340, Mexico
- Cesar Millán-Pacheco
  - Facultad de Farmacia, Universidad Autónoma del Estado de Morelos, Av. Universidad No. 1001, Col Chamilpa, Cuernavaca 62209, Mexico
- Michelle Swanson-Mungerson
  - Department of Microbiology & Immunology, College of Graduate Studies, Midwestern University, Downers Grove, IL 60515, USA; (J.E.); (M.S.-M.)
- Vaibhav Tiwari
  - Department of Microbiology & Immunology, College of Graduate Studies, Midwestern University, Downers Grove, IL 60515, USA; (J.E.); (M.S.-M.)
2
Martinusen SG, Denard CA. Leveraging yeast sequestration to study and engineer posttranslational modification enzymes. Biotechnol Bioeng 2024; 121:903-914. [PMID: 38079116 PMCID: PMC11229454 DOI: 10.1002/bit.28621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 11/04/2023] [Accepted: 11/27/2023] [Indexed: 02/20/2024]
Abstract
Enzymes that catalyze posttranslational modifications (PTMs) of peptides and proteins (PTM-enzymes)-proteases, protein ligases, oxidoreductases, kinases, and other transferases-are foundational to our understanding of health and disease and empower applications in chemical biology, synthetic biology, and biomedicine. To fully harness the potential of PTM-enzymes, there is a critical need to decipher their enzymatic and biological mechanisms, develop molecules that can probe and modulate them, and endow them with improved and novel functions. These objectives are contingent upon implementation of high-throughput functional screens and selections that interrogate large sequence libraries to isolate desired PTM-enzyme properties. This review discusses the principles of Saccharomyces cerevisiae organelle sequestration to study and engineer PTM-enzymes. These include outer membrane sequestration, specifically methods that modify yeast surface display, and cytoplasmic sequestration based on enzyme-mediated transcription activation. Furthermore, we present a detailed discussion of yeast endoplasmic reticulum sequestration for the first time. Where appropriate, we highlight the major features and limitations of different systems, specifically how they can measure and control enzyme catalytic efficiencies. Taken together, yeast-based high-throughput sequestration approaches significantly lower the barrier to understanding how PTM-enzymes function and how to reprogram them.
Affiliation(s)
- Samantha G Martinusen
  - Department of Chemical Engineering, University of Florida, Gainesville, Florida, USA
- Carl A Denard
  - Department of Chemical Engineering, University of Florida, Gainesville, Florida, USA
3
Martinusen SG, Slaton EW, Nelson SE, Pulgar MA, Besu JT, Simas CF, Denard CA. Modular and integrative activity reporters enhance biochemical studies in the yeast ER. Protein Eng Des Sel 2024; 37:gzae008. [PMID: 38696722 DOI: 10.1093/protein/gzae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 03/31/2024] [Accepted: 05/01/2024] [Indexed: 05/04/2024] Open
Abstract
The yeast endoplasmic reticulum sequestration and screening (YESS) system is a broadly applicable platform to perform high-throughput biochemical studies of post-translational modification enzymes (PTM-enzymes). This system enables researchers to profile and engineer the activity and substrate specificity of PTM-enzymes and to discover inhibitor-resistant enzyme mutants. In this study, we expand the capabilities of YESS by transferring its functional components to integrative plasmids. The YESS integrative system yields uniform protein expression and protease activities in various configurations, allows one to integrate activity reporters at two independent loci and to split the system between integrative and centromeric plasmids. We characterize these integrative reporters with two viral proteases, Tobacco etch virus (TEVp) and 3-chymotrypsin like protease (3CLpro), in terms of coefficient of variance, signal-to-noise ratio and fold-activation. Overall, we provide a framework for chromosomal-based studies that is modular, enabling rigorous high-throughput assays of PTM-enzymes in yeast.
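The three reporter metrics named in this abstract (coefficient of variance, signal-to-noise ratio, and fold-activation) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions below are common conventions assumed for illustration, not the authors' method; the function names and the fluorescence values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


def fold_activation(active, inactive):
    """Ratio of mean signal with active enzyme to mean signal without it (assumed definition)."""
    return mean(active) / mean(inactive)


def signal_to_noise(active, inactive):
    """Difference of means scaled by the spread of the inactive control (assumed definition)."""
    return (mean(active) - mean(inactive)) / stdev(inactive)


# Hypothetical per-replicate reporter fluorescence, arbitrary units.
active_signal = [1000.0, 1100.0, 900.0]
inactive_signal = [100.0, 110.0, 90.0]

cv = coefficient_of_variation(active_signal)
fold = fold_activation(active_signal, inactive_signal)
snr = signal_to_noise(active_signal, inactive_signal)
print(cv, fold, snr)
```

A lower coefficient of variation indicates more uniform expression across cells or replicates, which is the property the abstract attributes to the integrative configuration. Other signal-to-noise definitions exist, so the one used here should be swapped for whatever the assay at hand specifies.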
Affiliation(s)
- Ethan W Slaton
  - Department of Chemical Engineering, University of Florida, Gainesville, 32611, USA
- Sage E Nelson
  - Department of Chemical Engineering, University of Florida, Gainesville, 32611, USA
- Marian A Pulgar
  - Department of Chemical Engineering, University of Florida, Gainesville, 32611, USA
- Julia T Besu
  - Department of Biology, University of Florida, Gainesville, 32611, USA
- Cassidy F Simas
  - J. Crayton Pruitt Family Department of Biomedical Engineering, University of Florida, Gainesville, 32611, USA
- Carl A Denard
  - Department of Chemical Engineering, University of Florida, Gainesville, 32611, USA
  - UF Health Cancer Center, University of Florida, Gainesville, 32611, USA
4
Xi C, Diao J, Moon TS. Advances in ligand-specific biosensing for structurally similar molecules. Cell Syst 2023; 14:1024-1043. [PMID: 38128482 PMCID: PMC10751988 DOI: 10.1016/j.cels.2023.10.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 08/23/2023] [Accepted: 10/19/2023] [Indexed: 12/23/2023]
Abstract
The specificity of biological systems makes it possible to develop biosensors targeting specific metabolites, toxins, and pollutants in complex medical or environmental samples without interference from structurally similar compounds. For the last two decades, great efforts have been devoted to creating proteins or nucleic acids with novel properties through synthetic biology strategies. Beyond augmenting biocatalytic activity, expanding target substrate scopes, and enhancing enzymes' enantioselectivity and stability, an increasing research area is the enhancement of molecular specificity for genetically encoded biosensors. Here, we summarize recent advances in the development of highly specific biosensor systems and their essential applications. First, we describe the rational design principles required to create libraries containing potential mutants with less promiscuity or better specificity. Next, we review the emerging high-throughput screening techniques to engineer biosensing specificity for the desired target. Finally, we examine the computer-aided evaluation and prediction methods to facilitate the construction of ligand-specific biosensors.
Affiliation(s)
- Chenggang Xi
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Jinjin Diao
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Tae Seok Moon
  - Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, USA; Division of Biology and Biomedical Sciences, Washington University in St. Louis, St. Louis, MO, USA.
5
Lu C, Lubin JH, Sarma VV, Stentz SZ, Wang G, Wang S, Khare SD. Prediction and design of protease enzyme specificity using a structure-aware graph convolutional network. Proc Natl Acad Sci U S A 2023; 120:e2303590120. [PMID: 37729196 PMCID: PMC10523478 DOI: 10.1073/pnas.2303590120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/05/2023] [Accepted: 08/14/2023] [Indexed: 09/22/2023] Open
Abstract
Site-specific proteolysis by the enzymatic cleavage of small linear sequence motifs is a key posttranslational modification involved in physiology and disease. The ability to robustly and rapidly predict protease-substrate specificity would also enable targeted proteolytic cleavage by designed proteases. Current methods for predicting protease specificity are limited to sequence pattern recognition in experimentally derived cleavage data obtained for libraries of potential substrates and generated separately for each protease variant. We reasoned that a more semantically rich and robust model of protease specificity could be developed by incorporating the energetics of molecular interactions between protease and substrates into machine learning workflows. We present Protein Graph Convolutional Network (PGCN), which develops a physically grounded, structure-based molecular interaction graph representation that describes molecular topology and interaction energetics to predict enzyme specificity. We show that PGCN accurately predicts the specificity landscapes of several variants of two model proteases. Node and edge ablation tests identified key graph elements for specificity prediction, some of which are consistent with known biochemical constraints for protease:substrate recognition. We used a pretrained PGCN model to guide the design of protease libraries for cleaving two noncanonical substrates, and found good agreement with experimental cleavage results. Importantly, the model can accurately assess designs featuring diversity at positions not present in the training data. The described methodology should enable the structure-based prediction of specificity landscapes of a wide variety of proteases and the construction of tailor-made protease editors for site-selectively and irreversibly modifying chosen target proteins.
Affiliation(s)
- Changpeng Lu
  - Institute for Quantitative Biomedicine, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
- Joseph H. Lubin
  - Department of Chemistry and Chemical Biology, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
- Vidur V. Sarma
  - Institute for Quantitative Biomedicine, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
- Guanyang Wang
  - Department of Statistics, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
- Sijian Wang
  - Institute for Quantitative Biomedicine, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
  - Department of Statistics, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
- Sagar D. Khare
  - Institute for Quantitative Biomedicine, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
  - Department of Chemistry and Chemical Biology, Rutgers–The State University of New Jersey, Piscataway, NJ 08854
6
Martinusen SG, Slaton EW, Nelson SE, Pulgar MA, Besu JT, Simas CF, Denard CA. Modular and integrative activity reporters enhance biochemical studies in the yeast ER. bioRxiv 2023:2023.07.12.548713. [PMID: 37502857 PMCID: PMC10369952 DOI: 10.1101/2023.07.12.548713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
The yeast endoplasmic reticulum sequestration and screening (YESS) system is a generalizable platform that has become highly useful to investigate post-translational modification enzymes (PTM-enzymes). This system enables researchers to profile and engineer the activity and substrate specificity of PTM-enzymes and to discover inhibitor-resistant enzyme mutants. In this study, we expand the capabilities of YESS by transferring its functional components to integrative plasmids. The YESS integrative system yields uniform protein expression and protease activities in various configurations, allows one to integrate activity reporters at two independent loci and to split the system between integrative and centromeric plasmids. We characterize these integrative reporters with two viral proteases, Tobacco etch virus (TEVp) and 3-chymotrypsin like protease (3CLpro), in terms of coefficient of variance, signal-to-noise ratio and fold-activation. Overall, we provide a framework for chromosomal-based studies that is modular, enabling rigorous high-throughput assays of PTM-enzymes in yeast.
7
Lu C, Lubin JH, Sarma VV, Stentz SZ, Wang G, Wang S, Khare SD. Prediction and Design of Protease Enzyme Specificity Using a Structure-Aware Graph Convolutional Network. bioRxiv 2023:2023.02.16.528728. [PMID: 36824945 PMCID: PMC9949123 DOI: 10.1101/2023.02.16.528728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
Site-specific proteolysis by the enzymatic cleavage of small linear sequence motifs is a key post-translational modification involved in physiology and disease. The ability to robustly and rapidly predict protease substrate specificity would also enable targeted proteolytic cleavage - editing - of a target protein by designed proteases. Current methods for predicting protease specificity are limited to sequence pattern recognition in experimentally-derived cleavage data obtained for libraries of potential substrates and generated separately for each protease variant. We reasoned that a more semantically rich and robust model of protease specificity could be developed by incorporating the three-dimensional structure and energetics of molecular interactions between protease and substrates into machine learning workflows. We present Protein Graph Convolutional Network (PGCN), which develops a physically-grounded, structure-based molecular interaction graph representation that describes molecular topology and interaction energetics to predict enzyme specificity. We show that PGCN accurately predicts the specificity landscapes of several variants of two model proteases: the NS3/4 protease from the Hepatitis C virus (HCV) and the Tobacco Etch Virus (TEV) proteases. Node and edge ablation tests identified key graph elements for specificity prediction, some of which are consistent with known biochemical constraints for protease:substrate recognition. We used a pre-trained PGCN model to guide the design of TEV protease libraries for cleaving two non-canonical substrates, and found good agreement with experimental cleavage results. Importantly, the model can accurately assess designs featuring diversity at positions not present in the training data. The described methodology should enable the structure-based prediction of specificity landscapes of a wide variety of proteases and the construction of tailor-made protease editors for site-selectively and irreversibly modifying chosen target proteins.
Affiliation(s)
- Changpeng Lu
  - Institute for Quantitative Biomedicine, Rutgers - The State University of New Jersey, Piscataway, NJ
- Joseph H. Lubin
  - Department of Chemistry & Chemical Biology, Rutgers - The State University of New Jersey, Piscataway, NJ
- Vidur V. Sarma
  - Institute for Quantitative Biomedicine, Rutgers - The State University of New Jersey, Piscataway, NJ
- Guanyang Wang
  - Department of Statistics, Rutgers - The State University of New Jersey, Piscataway, NJ
- Sijian Wang
  - Institute for Quantitative Biomedicine, Rutgers - The State University of New Jersey, Piscataway, NJ
  - Department of Statistics, Rutgers - The State University of New Jersey, Piscataway, NJ
- Sagar D. Khare
  - Institute for Quantitative Biomedicine, Rutgers - The State University of New Jersey, Piscataway, NJ
  - Department of Chemistry & Chemical Biology, Rutgers - The State University of New Jersey, Piscataway, NJ
8
Leander M, Liu Z, Cui Q, Raman S. Deep mutational scanning and machine learning reveal structural and molecular rules governing allosteric hotspots in homologous proteins. eLife 2022; 11:e79932. [PMID: 36226916 PMCID: PMC9662819 DOI: 10.7554/elife.79932] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/03/2022] [Accepted: 10/13/2022] [Indexed: 01/29/2023] Open
Abstract
A fundamental question in protein science is where allosteric hotspots - residues critical for allosteric signaling - are located, and what properties differentiate them. We carried out deep mutational scanning (DMS) of four homologous bacterial allosteric transcription factors (aTFs) to identify hotspots and built a machine learning model with this data to glean the structural and molecular properties of allosteric hotspots. We found hotspots to be distributed protein-wide rather than being restricted to 'pathways' linking allosteric and active sites as is commonly assumed. Despite structural homology, the location of hotspots was not superimposable across the aTFs. However, common signatures emerged when comparing hotspots coincident with long-range interactions, suggesting that the allosteric mechanism is conserved among the homologs despite differences in molecular details. Machine learning with our large DMS datasets revealed global structural and dynamic properties to be a stronger predictor of whether a residue is a hotspot than local and physicochemical properties. Furthermore, a model trained on one protein can predict hotspots in a homolog. In summary, the overall allosteric mechanism is embedded in the structural fold of the aTF family, but the finer, molecular details are sequence-specific.
Affiliation(s)
- Megan Leander
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, United States
- Zhuang Liu
  - Department of Physics, Boston University, Boston, United States
- Qiang Cui
  - Department of Physics, Boston University, Boston, United States
  - Department of Chemistry, Boston University, Boston, United States
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, United States
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, United States
  - Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, United States
9
Zhang F, Zheng H, Xian Y, Song H, Wang S, Yun Y, Yi L, Zhang G. Profiling Substrate Specificity of the SUMO Protease Ulp1 by the YESS–PSSC System to Advance the Conserved Mechanism for Substrate Cleavage. Int J Mol Sci 2022; 23:ijms232012188. [PMID: 36293045 PMCID: PMC9603560 DOI: 10.3390/ijms232012188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 10/03/2022] [Accepted: 10/10/2022] [Indexed: 11/29/2022] Open
Abstract
SUMO modification is a vital post-translational regulation process in eukaryotes, in which the SUMO protease is responsible for the maturation of the SUMO precursor and the deconjugation of the SUMO protein from modified proteins by accurately cleaving behind the C-terminal Gly–Gly motif. To promote the understanding of the high specificity of the SUMO protease against the SUMO protein as well as to clarify whether the conserved Gly–Gly motif is strictly required for the processing of the SUMO precursor, we systematically profiled the specificity of the S. cerevisiae SUMO protease (Ulp1) on Smt3 at the P2–P1↓P1’ (Gly–Gly↓Ala) position using the YESS–PSSC system. Our results demonstrated that Ulp1 was able to cleave Gly–Gly↓ motif-mutated substrates, indicating that the diglycine motif is not strictly required for Ulp1 cleavage. A structural-modeling analysis indicated that it is the special tapered active pocket of Ulp1 that confers selectivity for small residues at the P1–P2 positions of Smt3, such as Gly, Ala, Ser and Cys, as only these can smoothly deliver the scissile bond into the active site for cleavage. Meanwhile, the P1’ position Ala of Smt3 was found to play a vital role in maintaining Ulp1’s precise cleavage after the Gly–Gly motif, and replacing Ala with Gly in this position could expand Ulp1 inclusivity against the P1 and P2 position residues of Smt3. All in all, our studies advanced the traditional knowledge of the SUMO protein, which may provide potential directions for the drug discovery of abnormal SUMOylation-related diseases.
Affiliation(s)
- Faying Zhang
  - School of Life Sciences, Hubei University, Wuhan 430062, China
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Hui Zheng
  - School of Life Sciences, Hubei University, Wuhan 430062, China
- Yufan Xian
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Haoyue Song
  - School of Life Sciences, Hubei University, Wuhan 430062, China
- Shengchen Wang
  - School of Life Sciences, Hubei University, Wuhan 430062, China
- Yueli Yun
  - School of Life Sciences, Hubei University, Wuhan 430062, China
- Li Yi
  - School of Life Sciences, Hubei University, Wuhan 430062, China
  - Correspondence: (L.Y.); (G.Z.)
- Guimin Zhang
  - School of Life Sciences, Hubei University, Wuhan 430062, China
  - College of Life Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
  - Correspondence: (L.Y.); (G.Z.)
10
Yamauchi K, Sato M, Osawa L, Matsuda S, Komiyama Y, Nakakuki N, Takada H, Katoh R, Muraoka M, Suzuki Y, Tatsumi A, Miura M, Takano S, Amemiya F, Fukasawa M, Nakayama Y, Yamaguchi T, Inoue T, Maekawa S, Enomoto N. Analysis of direct-acting antiviral-resistant hepatitis C virus haplotype diversity by single-molecule and long-read sequencing. Hepatol Commun 2022; 6:1634-1651. [PMID: 35357088 PMCID: PMC9234623 DOI: 10.1002/hep4.1929] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/10/2021] [Revised: 02/03/2022] [Accepted: 02/04/2022] [Indexed: 11/08/2022] Open
Abstract
The method of analyzing individual resistant hepatitis C virus (HCV) by a combination of haplotyping and resistance-associated substitution (RAS) has not been fully elucidated because conventional sequencing has only yielded short and fragmented viral genomes. We performed haplotype analysis of HCV mutations in 12 asunaprevir/daclatasvir treatment-failure cases using the Oxford Nanopore sequencer. This enabled single-molecule long-read sequencing using rolling circle amplification (RCA) for correction of the sequencing error. RCA of the circularized reverse-transcription polymerase chain reaction products successfully produced DNA longer than 30 kilobase pairs (kb) containing multiple tandem repeats of a target 3 kb HCV genome. The long-read sequencing of these RCA products could determine the original sequence of the target single molecule as the consensus nucleotide sequence of the tandem repeats and revealed the presence of multiple viral haplotypes with the combination of various mutations in each host. In addition to already known signature RASs, such as NS3-D168 and NS5A-L31/Y93, there were various RASs specific to a different haplotype after treatment failure. The distribution of viral haplotype changed over time; some haplotypes disappeared without acquiring resistant mutations, and other haplotypes, which were not observed before treatment, appeared after treatment. Conclusion: The combination of various mutations other than the known signature RAS was suggested to influence the kinetics of individual HCV quasispecies in the direct-acting antiviral treatment. HCV haplotype dynamic analysis will provide novel information on the role of HCV diversity within the host, which will be useful for elucidating the pathological mechanism of HCV-related diseases.
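The consensus step described above, recovering one molecule's sequence from the tandem repeats in a single RCA read, can be sketched as a per-position majority vote. This is a simplified illustration rather than the authors' pipeline: it assumes the repeat unit length is known and that errors are substitutions only, whereas real nanopore reads also contain insertions and deletions and would need alignment first. The function names and the toy read are hypothetical.

```python
from collections import Counter


def split_repeats(read, unit_len):
    """Cut a concatemer read into consecutive full-length repeat units."""
    return [read[i:i + unit_len]
            for i in range(0, len(read) - unit_len + 1, unit_len)]


def consensus(repeats):
    """Majority base at each position across equal-length repeat units."""
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*repeats))


# Toy read: three copies of a 6-base unit, the second carrying one substitution error.
toy_read = "ACGTAC" + "ACGAAC" + "ACGTAC"
units = split_repeats(toy_read, 6)
corrected = consensus(units)
print(units, corrected)
```

Because each repeat is an independent copy of the same template, an error that appears in only one copy is outvoted by the others, which is why more repeats per read generally give a more reliable consensus.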
Affiliation(s)
- Kozue Yamauchi
  - Department of Gastroenterology and Hepatology, Faculty of Medicine, University of Yamanashi, Yamanashi, Japan
11
Vinogradov AA, Chang JS, Onaka H, Goto Y, Suga H. Accurate Models of Substrate Preferences of Post-Translational Modification Enzymes from a Combination of mRNA Display and Deep Learning. ACS Central Science 2022; 8:814-824. [PMID: 35756369 PMCID: PMC9228559 DOI: 10.1021/acscentsci.2c00223] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/27/2022] [Indexed: 05/15/2023]
Abstract
Promiscuous post-translational modification (PTM) enzymes often display nonobvious substrate preferences by acting on diverse yet well-defined sets of peptides and/or proteins. Understanding of substrate fitness landscapes for PTM enzymes is important in many areas of contemporary science, including natural product biosynthesis, molecular biology, and biotechnology. Here, we report an integrated platform for accurate profiling of substrate preferences for PTM enzymes. The platform features (i) a combination of mRNA display with next-generation sequencing as an ultrahigh throughput technique for data acquisition and (ii) deep learning for data analysis. The high accuracy (>0.99 in each of two studies) of the resulting deep learning models enables comprehensive analysis of enzymatic substrate preferences. The models can quantify fitness across sequence space, map modification sites, and identify important amino acids in the substrate. To benchmark the platform, we performed profiling of a Ser dehydratase (LazBF) and a Cys/Ser cyclodehydratase (LazDEF), two enzymes from the lactazole biosynthesis pathway. In both studies, our results point to complex enzymatic preferences, which, particularly for LazBF, cannot be reduced to a set of simple rules. The ability of the constructed models to dissect such complexity suggests that the developed platform can facilitate a wider study of PTM enzymes.
Affiliation(s)
- Alexander A. Vinogradov
  - Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Jun Shi Chang
  - Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroyasu Onaka
  - Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
  - Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Yuki Goto
  - Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroaki Suga
  - Department of Chemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
12
Fink T, Jerala R. Designed protease-based signaling networks. Curr Opin Chem Biol 2022; 68:102146. [DOI: 10.1016/j.cbpa.2022.102146] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/25/2022] [Revised: 03/09/2022] [Accepted: 03/14/2022] [Indexed: 12/27/2022]
13
Ganjdanesh A, Zhang J, Chew EY, Ding Y, Huang H, Chen W. LONGL-Net: temporal correlation structure guided deep learning model to predict longitudinal age-related macular degeneration severity. PNAS Nexus 2022; 1:pgab003. [PMID: 35360552 PMCID: PMC8962776 DOI: 10.1093/pnasnexus/pgab003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Accepted: 11/15/2021] [Indexed: 01/28/2023]
Abstract
Age-related macular degeneration (AMD) is the principal cause of blindness in developed countries, and its prevalence will increase to 288 million people in 2040. Therefore, automated grading and prediction methods can be highly beneficial for recognizing susceptible subjects to late-AMD and enabling clinicians to start preventive actions for them. Clinically, AMD severity is quantified by Color Fundus Photographs (CFP) of the retina, and many machine-learning-based methods are proposed for grading AMD severity. However, few models were developed to predict the longitudinal progression status, i.e. predicting future late-AMD risk based on the current CFP, which is more clinically interesting. In this paper, we propose a new deep-learning-based classification model (LONGL-Net) that can simultaneously grade the current CFP and predict the longitudinal outcome, i.e. whether the subject will be in late-AMD in the future time-point. We design a new temporal-correlation-structure-guided Generative Adversarial Network model that learns the interrelations of temporal changes in CFPs in consecutive time-points and provides interpretability for the classifier's decisions by forecasting AMD symptoms in the future CFPs. We used about 30,000 CFP images from 4,628 participants in the Age-Related Eye Disease Study. Our classifier showed average 0.905 (95% CI: 0.886-0.922) AUC and 0.762 (95% CI: 0.733-0.792) accuracy on the 3-class classification problem of simultaneously grading current time-point's AMD condition and predicting late AMD progression of subjects in the future time-point. We further validated our model on the UK Biobank dataset, where our model showed average 0.905 accuracy and 0.797 sensitivity in grading 300 CFP images.
Affiliation(s)
- Alireza Ganjdanesh
  - Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Jipeng Zhang
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Emily Y Chew
  - Division of Epidemiology and Clinical Applications, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Ying Ding
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Heng Huang
  - Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Wei Chen
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Division of Pulmonary Medicine, Department of Pediatrics, UPMC Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA 15219, USA

14
Dyer RP, Weiss GA. Making the cut with protease engineering. Cell Chem Biol 2022; 29:177-190. [PMID: 34921772 PMCID: PMC9127713 DOI: 10.1016/j.chembiol.2021.12.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/02/2020] [Revised: 07/30/2021] [Accepted: 11/29/2021] [Indexed: 12/30/2022]
Abstract
Proteases cut with enviable precision and regulate diverse molecular events in biology. Such qualities drive a seemingly inexhaustible appetite for proteases with new activities and capabilities. Comprising 25% of the total industrial enzyme market, proteases appear in consumer goods, such as detergents, textile processing, and numerous foods; additionally, proteases include 25 US Food and Drug Administration-approved medicines and various research tools. Recent advances in protease engineering strategies address target specificity, catalytic efficiency, and stability. This guide to protease engineering surveys best practices and emerging strategies. We further highlight gaps and flexibilities inherent to each system that suggest opportunities for new technology development along with engineered proteases to solve challenges in proteomics, protein sequencing, and synthetic gene circuits.
Affiliation(s)
- Rebekah P Dyer
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, 1102 NS-2, Irvine, CA 92697-2025, USA
- Gregory A Weiss
  - Department of Chemistry, University of California, Irvine, 1102 NS-2, Irvine, CA 92697-2025, USA
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, 1102 NS-2, Irvine, CA 92697-2025, USA
  - Department of Pharmaceutical Sciences, University of California, Irvine, 1102 NS-2, Irvine, CA 92697-2025, USA

15
Varga JK, Diffley K, Welker Leng KR, Fierke CA, Schueler-Furman O. Structure-based prediction of HDAC6 substrates validated by enzymatic assay reveals determinants of promiscuity and detects new potential substrates. Sci Rep 2022; 12:1788. [PMID: 35110592 PMCID: PMC8810773 DOI: 10.1038/s41598-022-05681-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2021] [Accepted: 01/17/2022] [Indexed: 01/25/2023] Open
Abstract
Histone deacetylases play important biological roles well beyond the deacetylation of histone tails. In particular, HDAC6 is involved in multiple cellular processes such as apoptosis, cytoskeleton reorganization, and protein folding, affecting substrates such as α-tubulin, Hsp90 and cortactin proteins. We have applied a biochemical enzymatic assay to measure the activity of HDAC6 on a set of candidate unlabeled peptides. These served for the calibration of a structure-based substrate prediction protocol, Rosetta FlexPepBind, previously used for the successful substrate prediction of HDAC8 and other enzymes. A proteome-wide screen of reported acetylation sites using our calibrated protocol together with the enzymatic assay provide new peptide substrates and avenues to novel potential functional regulatory roles of this promiscuous, multi-faceted enzyme. In particular, we propose novel regulatory roles of HDAC6 in tumorigenesis and cancer cell survival via the regulation of EGFR/Akt pathway activation. The calibration process and comparison of the results between HDAC6 and HDAC8 highlight structural differences that explain the established promiscuity of HDAC6.
Affiliation(s)
- Julia K Varga
  - Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada (IMRIC), The Hebrew University of Jerusalem, Faculty of Medicine, POB 12272, 9112102, Jerusalem, Israel
- Kelsey Diffley
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, MI, 48109, USA
- Katherine R Welker Leng
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, MI, 48109, USA
- Carol A Fierke
  - Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, MI, 48109, USA
  - Department of Biochemistry, Brandeis University, 415 South Street, Waltham, MA, 02453, USA
- Ora Schueler-Furman
  - Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada (IMRIC), The Hebrew University of Jerusalem, Faculty of Medicine, POB 12272, 9112102, Jerusalem, Israel

16
Nakanishi H. Protein-Based Systems for Translational Regulation of Synthetic mRNAs in Mammalian Cells. Life (Basel) 2021; 11:life11111192. [PMID: 34833067 PMCID: PMC8621430 DOI: 10.3390/life11111192] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/30/2021] [Revised: 10/31/2021] [Accepted: 11/01/2021] [Indexed: 11/16/2022] Open
Abstract
Synthetic mRNAs, which are produced by in vitro transcription, have been recently attracting attention because they can express any transgenes without the risk of insertional mutagenesis. Although current synthetic mRNA medicine is not designed for spatiotemporal or cell-selective regulation, many preclinical studies have developed the systems for the translational regulation of synthetic mRNAs. Such translational regulation systems will cope with high efficacy and low adverse effects by producing the appropriate amount of therapeutic proteins, depending on the context. Protein-based regulation is one of the most promising approaches for the translational regulation of synthetic mRNAs. As synthetic mRNAs can encode not only output proteins but also regulator proteins, all components of protein-based regulation systems can be delivered as synthetic mRNAs. In addition, in the protein-based regulation systems, the output protein can be utilized as the input for the subsequent regulation to construct multi-layered gene circuits, which enable complex and sophisticated regulation. In this review, I introduce what types of proteins have been used for translational regulation, how to combine them, and how to design effective gene circuits.
Affiliation(s)
- Hideyuki Nakanishi
  - Department of Biofunction Research, Institute of Biomaterials and Bioengineering, Tokyo Medical and Dental University (TMDU), 2-3-10 Kanda-Surugadai, Chiyoda-ku, Tokyo 101-0062, Japan

17
Alegre-Cebollada J. Protein nanomechanics in biological context. Biophys Rev 2021; 13:435-454. [PMID: 34466164 PMCID: PMC8355295 DOI: 10.1007/s12551-021-00822-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/24/2021] [Accepted: 07/05/2021] [Indexed: 12/20/2022] Open
Abstract
How proteins respond to pulling forces, or protein nanomechanics, is a key contributor to the form and function of biological systems. Indeed, the conventional view that proteins are able to diffuse in solution does not apply to the many polypeptides that are anchored to rigid supramolecular structures. These tethered proteins typically have important mechanical roles that enable cells to generate, sense, and transduce mechanical forces. To fully comprehend the interplay between mechanical forces and biology, we must understand how protein nanomechanics emerge in living matter. This endeavor is definitely challenging and only recently has it started to appear tractable. Here, I introduce the main in vitro single-molecule biophysics methods that have been instrumental to investigate protein nanomechanics over the last 2 decades. Then, I present the contemporary view on how mechanical force shapes the free energy of tethered proteins, as well as the effect of biological factors such as post-translational modifications and mutations. To illustrate the contribution of protein nanomechanics to biological function, I review current knowledge on the mechanobiology of selected muscle and cell adhesion proteins including titin, talin, and bacterial pilins. Finally, I discuss emerging methods to modulate protein nanomechanics in living matter, for instance by inducing specific mechanical loss-of-function (mLOF). By interrogating biological systems in a causative manner, these new tools can contribute to further place protein nanomechanics in a biological context.

18
Serafim MSM, Dos Santos Júnior VS, Gertrudes JC, Maltarollo VG, Honorio KM. Machine learning techniques applied to the drug design and discovery of new antivirals: a brief look over the past decade. Expert Opin Drug Discov 2021; 16:961-975. [PMID: 33957833 DOI: 10.1080/17460441.2021.1918098] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]
Abstract
Introduction: Drug design and discovery of new antivirals will always be extremely important in medicinal chemistry, taking into account known and new viral diseases that are yet to come. Although machine learning (ML) have shown to improve predictions on the biological potential of chemicals and accelerate the discovery of drugs over the past decade, new methods and their combinations have improved their performance and established promising perspectives regarding ML in the search for new antivirals. Areas covered: The authors consider some interesting areas that deal with different ML techniques applied to antivirals. Recent innovative studies on ML and antivirals were selected and analyzed in detail. Also, the authors provide a brief look at the past to the present to detect advances and bottlenecks in the area. Expert opinion: From classical ML techniques, it was possible to boost the searches for antivirals. However, from the emergence of new algorithms and the improvement in old approaches, promising results will be achieved every day, as we have observed in the case of SARS-CoV-2. Recent experience has shown that it is possible to use ML to discover new antiviral candidates from virtual screening and drug repurposing.
Affiliation(s)
- Mateus Sá Magalhães Serafim
  - Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Jadson Castro Gertrudes
  - Departamento de Computação, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, Brazil
- Vinícius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Kathia Maria Honorio
  - Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil

19
Frappier V, Keating AE. Data-driven computational protein design. Curr Opin Struct Biol 2021; 69:63-69. [PMID: 33910104 DOI: 10.1016/j.sbi.2021.03.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/06/2020] [Revised: 03/18/2021] [Accepted: 03/19/2021] [Indexed: 01/28/2023]
Abstract
Computational protein design can generate proteins not found in nature that adopt desired structures and perform novel functions. Although proteins could, in theory, be designed with ab initio methods, practical success has come from using large amounts of data that describe the sequences, structures, and functions of existing proteins and their variants. We present recent creative uses of multiple-sequence alignments, protein structures, and high-throughput functional assays in computational protein design. Approaches range from enhancing structure-based design with experimental data to building regression models to training deep neural nets that generate novel sequences. Looking ahead, deep learning will be increasingly important for maximizing the value of data for protein design.
Affiliation(s)
- Vincent Frappier
  - Generate Biomedicines, 26 Landsdowne Street, Cambridge, MA, 02139, USA
- Amy E Keating
  - MIT Departments of Biology and Biological Engineering, 77 Massachusetts Ave., Cambridge, MA, 02139, USA

20
Mahajan SP, Srinivasan Y, Labonte JW, DeLisa MP, Gray JJ. Structural basis for peptide substrate specificities of glycosyltransferase GalNAc-T2. ACS Catal 2021; 11:2977-2991. [PMID: 34322281 DOI: 10.1021/acscatal.0c04609] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
The polypeptide N-acetylgalactosaminyl transferase (GalNAc-T) enzyme family initiates O-linked mucin-type glycosylation. The family constitutes 20 isoenzymes in humans. GalNAc-Ts exhibit both redundancy and finely tuned specificity for a wide range of peptide substrates. In this work, we deciphered the sequence and structural motifs that determine the peptide substrate preferences for the GalNAc-T2 isoform. Our approach involved sampling and characterization of peptide-enzyme conformations obtained from Rosetta Monte Carlo-minimization-based flexible docking. We computationally scanned 19 amino acid residues at positions -1 and +1 of an eight-residue peptide substrate, which comprised a dataset of 361 (19x19) peptides with previously characterized experimental GalNAc-T2 glycosylation efficiencies. The calculations recapitulated experimental specificity data, successfully discriminating between glycosylatable and non-glycosylatable peptides with a probability of 96.5% (ROC-AUC score), a balanced accuracy of 85.5% and a false positive rate of 7.3%. The glycosylatable peptide substrates viz. peptides with proline, serine, threonine, and alanine at the -1 position of the peptide preferentially exhibited cognate sequon-like conformations. The preference for specific residues at the -1 position of the peptide was regulated by enzyme residues R362, K363, Q364, H365 and W331, which modulate the pocket size and specific enzyme-peptide interactions. For the +1 position of the peptide, enzyme residues K281 and K363 formed gating interactions with aromatics and glutamines at the +1 position of the peptide, leading to modes of peptide-binding sub-optimal for catalysis. Overall, our work revealed enzyme features that lead to the finely tuned specificity observed for a broad range of peptide substrates for the GalNAc-T2 enzyme. 
We anticipate that the key sequence and structural motifs can be extended to analyze specificities of other isoforms of the GalNAc-T family and can be used to guide design of variants with tailored specificity.
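The metrics quoted in this abstract can be cross-checked against one another. The sketch below assumes the standard definitions (balanced accuracy as the mean of sensitivity and specificity, with specificity equal to one minus the false positive rate); the abstract does not state its definitions, so the resulting sensitivity is an inferred value, not a reported one.

```latex
\begin{align}
\mathrm{BA}  &= \tfrac{1}{2}\bigl(\mathrm{TPR} + (1 - \mathrm{FPR})\bigr) \\
\mathrm{TPR} &= 2\,\mathrm{BA} - (1 - \mathrm{FPR})
              = 2(0.855) - (1 - 0.073)
              = 1.710 - 0.927
              = 0.783
\end{align}
```

Under that assumption, roughly 78% of the glycosylatable peptides would be classified correctly, alongside about 93% of the non-glycosylatable ones.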
Affiliation(s)
- Sai Pooja Mahajan
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Yashes Srinivasan
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, California 90095, United States
- Jason W. Labonte
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Department of Chemistry, Franklin & Marshall College, Lancaster, Pennsylvania 17604, United States
- Matthew P. DeLisa
  - Robert Frederick Smith School of Chemical and Biomolecular Engineering, Department of Microbiology, and Nancy E. and Peter C. Meinig School of Biomedical Engineering, Biochemistry, Molecular and Cell Biology, Cornell University, Ithaca, New York 14853, United States
- Jeffrey J. Gray
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland 21224, United States

21
Kanda T, Sasaki R, Masuzaki R, Moriyama M. Artificial intelligence and machine learning could support drug development for hepatitis A virus internal ribosomal entry sites. Artif Intell Gastroenterol 2021; 2:1-9. [DOI: 10.35712/aig.v2.i1.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2020] [Revised: 12/29/2020] [Accepted: 02/12/2021] [Indexed: 02/06/2023] Open
Abstract
Hepatitis A virus (HAV) infection is still an important health issue worldwide. Although several effective HAV vaccines are available, it is difficult to perform universal vaccination in certain countries. Therefore, it may be better to develop antivirals against HAV for the prevention of severe hepatitis A. We found that several drugs potentially inhibit HAV internal ribosomal entry site-dependent translation and HAV replication. Artificial intelligence and machine learning could also support screening of anti-HAV drugs, using drug repositioning and drug rescue approaches.
Affiliation(s)
- Tatsuo Kanda
  - Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Itabashi-ku 173-8610, Tokyo, Japan
- Reina Sasaki
  - Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Itabashi-ku 173-8610, Tokyo, Japan
- Ryota Masuzaki
  - Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Itabashi-ku 173-8610, Tokyo, Japan
- Mitsuhiko Moriyama
  - Division of Gastroenterology and Hepatology, Department of Medicine, Nihon University School of Medicine, Itabashi-ku 173-8610, Tokyo, Japan

22
Denard CA, Paresi C, Yaghi R, McGinnis N, Bennett Z, Yi L, Georgiou G, Iverson BL. YESS 2.0, a Tunable Platform for Enzyme Evolution, Yields Highly Active TEV Protease Variants. ACS Synth Biol 2021; 10:63-71. [PMID: 33401904 DOI: 10.1021/acssynbio.0c00452] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 12/13/2022]
Abstract
Here we describe YESS 2.0, a highly versatile version of the yeast endoplasmic sequestration screening (YESS) system suitable for engineering and characterizing protein/peptide modifying enzymes such as proteases with desired new activities. By incorporating features that modulate gene transcription as well as substrate and enzyme spatial sequestration, YESS 2.0 achieves a significantly higher operational and dynamic range compared with the original YESS. To showcase the new advantages of YESS 2.0, we improved an already efficient TEV protease variant (TEV-EAV) to obtain a variant (eTEV) with a 2.25-fold higher catalytic efficiency, derived almost entirely from an increase in turnover rate (kcat). In our analysis, eTEV specifically digests a fusion protein in 2 h at a low 1:200 enzyme to substrate ratio. Structural modeling indicates that the increase in catalytic efficiency of eTEV is likely due to an enhanced interaction between the catalytic Cys151 with the P1 substrate residue (Gln). Furthermore, the modeling showed that the ENLYFQS peptide substrate is buried to a larger extent in the active site of eTEV compared with WT TEV. The new eTEV variant is functionally the fastest TEV variant reported to date and could potentially improve efficiency in any TEV application.
Affiliation(s)
- Carl A. Denard
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- Chelsea Paresi
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- Rasha Yaghi
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- Natalie McGinnis
  - Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, Texas 78712, United States
- Zachary Bennett
  - Department of Biomedical Engineering, University of Texas at Austin, Austin, Texas 78712, United States
- Li Yi
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- George Georgiou
  - Department of Chemical Engineering, University of Texas at Austin, Austin, Texas 78712, United States
- Brent L. Iverson
  - Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States

23
Ochoa R, Magnitov M, Laskowski RA, Cossio P, Thornton JM. An automated protocol for modelling peptide substrates to proteases. BMC Bioinformatics 2020; 21:586. [PMID: 33375946 PMCID: PMC7771086 DOI: 10.1186/s12859-020-03931-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/07/2020] [Accepted: 12/09/2020] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND Proteases are key drivers in many biological processes, in part due to their specificity towards their substrates. However, depending on the family and molecular function, they can also display substrate promiscuity which can also be essential. Databases compiling specificity matrices derived from experimental assays have provided valuable insights into protease substrate recognition. Despite this, there are still gaps in our knowledge of the structural determinants. Here, we compile a set of protease crystal structures with bound peptide-like ligands to create a protocol for modelling substrates bound to protease structures, and for studying observables associated to the binding recognition. RESULTS As an application, we modelled a subset of protease-peptide complexes for which experimental cleavage data are available to compare with informational entropies obtained from protease-specificity matrices. The modelled complexes were subjected to conformational sampling using the Backrub method in Rosetta, and multiple observables from the simulations were calculated and compared per peptide position. We found that some of the calculated structural observables, such as the relative accessible surface area and the interaction energy, can help characterize a protease's substrate recognition, giving insights for the potential prediction of novel substrates by combining additional approaches. CONCLUSION Overall, our approach provides a repository of protease structures with annotated data, and an open source computational protocol to reproduce the modelling and dynamic analysis of the protease-peptide complexes.
Affiliation(s)
- Rodrigo Ochoa
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, 050010, Medellín, Colombia
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Mikhail Magnitov
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Department of Biological and Medical Physics, Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Russia, 141701
- Roman A Laskowski
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Pilar Cossio
  - Biophysics of Tropical Diseases, Max Planck Tandem Group, University of Antioquia, 050010, Medellín, Colombia
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438, Frankfurt am Main, Germany
- Janet M Thornton
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

24
Wheeler LC, Perkins A, Wong CE, Harms MJ. Learning peptide recognition rules for a low-specificity protein. Protein Sci 2020; 29:2259-2273. [PMID: 32979254 PMCID: PMC7586891 DOI: 10.1002/pro.3958] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/03/2020] [Revised: 09/18/2020] [Accepted: 09/18/2020] [Indexed: 12/18/2022]
Abstract
Many proteins interact with short linear regions of target proteins. For some proteins, however, it is difficult to identify a well-defined sequence motif that defines its target peptides. To overcome this difficulty, we used supervised machine learning to train a model that treats each peptide as a collection of easily-calculated biochemical features rather than as an amino acid sequence. As a test case, we dissected the peptide-recognition rules for human S100A5 (hA5), a low-specificity calcium binding protein. We trained a Random Forest model against a recently released, high-throughput phage display dataset collected for hA5. The model identifies hydrophobicity and shape complementarity, rather than polar contacts, as the primary determinants of peptide binding specificity in hA5. We tested this hypothesis by solving a crystal structure of hA5 and through computational docking studies of diverse peptides onto hA5. These structural studies revealed that peptides exhibit multiple binding modes at the hA5 peptide interface-all of which have few polar contacts with hA5. Finally, we used our trained model to predict new, plausible binding targets in the human proteome. This revealed a fragment of the protein α-1-syntrophin that binds to hA5. Our work helps better understand the biochemistry and biology of hA5, as well as demonstrating how high-throughput experiments coupled with machine learning of biochemical features can reveal the determinants of binding specificity in low-specificity proteins.
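The idea of describing a peptide by easily calculated biochemical features, rather than by its raw sequence, can be sketched as follows. This is a minimal illustration only: the four features, the function name `peptide_features`, and the example peptide are assumptions chosen for clarity and are not the feature set used in the cited study. Hydropathy values follow the Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def peptide_features(seq):
    """Map a peptide sequence to a small dict of numeric features.

    Hypothetical helper: the feature choice is illustrative only.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty peptide")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    n = len(seq)
    return {
        "length": n,
        # Average hydropathy over all residues.
        "mean_hydropathy": sum(KYTE_DOOLITTLE[a] for a in seq) / n,
        # Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu (His ignored).
        "net_charge": sum(a in "KR" for a in seq) - sum(a in "DE" for a in seq),
        # Fraction of aromatic residues (Phe, Trp, Tyr).
        "aromatic_fraction": sum(a in "FWY" for a in seq) / n,
    }
```

Feature dictionaries of this kind can be stacked into a table and passed to any supervised learner, such as a random forest, so that the model learns from physicochemical properties instead of positional sequence motifs.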
Affiliation(s)
- Lucas C. Wheeler
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USA
- Arden Perkins
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
- Caitlyn E. Wong
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA
- Michael J. Harms
  - Institute of Molecular Biology, University of Oregon, Eugene, Oregon, USA
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon, USA

25
Chen S, Yim JJ, Bogyo M. Synthetic and biological approaches to map substrate specificities of proteases. Biol Chem 2020; 401:165-182. [PMID: 31639098 DOI: 10.1515/hsz-2019-0332] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/01/2019] [Accepted: 10/11/2019] [Indexed: 02/07/2023]
Abstract
Proteases are regulators of diverse biological pathways including protein catabolism, antigen processing and inflammation, as well as various disease conditions, such as malignant metastasis, viral infection and parasite invasion. The identification of substrates of a given protease is essential to understand its function and this information can also aid in the design of specific inhibitors and active site probes. However, the diversity of putative protein and peptide substrates makes connecting a protease to its downstream substrates technically difficult and time-consuming. To address this challenge in protease research, a range of methods have been developed to identify natural protein substrates as well as map the overall substrate specificity patterns of proteases. In this review, we highlight recent examples of both synthetic and biological methods that are being used to define the substrate specificity of protease so that new protease-specific tools and therapeutic agents can be developed.
Affiliation(s)
- Shiyu Chen
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Joshua J Yim
  - Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matthew Bogyo
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA

26
On the cutting edge: protease-based methods for sensing and controlling cell biology. Nat Methods 2020; 17:885-896. [PMID: 32661424 DOI: 10.1038/s41592-020-0891-z] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/13/2018] [Accepted: 06/09/2020] [Indexed: 02/06/2023]
Abstract
Sequence-specific proteases have proven to be versatile building blocks for tools that report or control cellular function. Reporting methods link protease activity to biochemical signals, whereas control methods rely on engineering proteases to respond to exogenous inputs such as light or chemicals. In turn, proteases have inherent control abilities, as their native functions are to release, activate or destroy proteins by cleavage, with the irreversibility of proteolysis allowing sustained downstream effects. As a result, protease-based synthetic circuits have been created for diverse uses such as reporting cellular signaling, tuning protein expression, controlling viral replication and detecting cancer states. Here, we comprehensively review the development and application of protease-based methods for reporting and controlling cellular function in eukaryotes.

27
Robinson SL, Smith MD, Richman JE, Aukema KG, Wackett LP. Machine learning-based prediction of activity and substrate specificity for OleA enzymes in the thiolase superfamily. Synth Biol (Oxf) 2020. [DOI: 10.1093/synbio/ysaa004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/07/2023] Open
Abstract
Enzymes in the thiolase superfamily catalyze carbon–carbon bond formation for the biosynthesis of polyhydroxyalkanoate storage molecules, membrane lipids and bioactive secondary metabolites. Natural and engineered thiolases have applications in synthetic biology for the production of high-value compounds, including personal care products and therapeutics. A fundamental understanding of thiolase substrate specificity is lacking, particularly within the OleA protein family. The ability to predict substrates from sequence would advance (meta)genome mining efforts to identify active thiolases for the production of desired metabolites. To gain a deeper understanding of substrate scope within the OleA family, we measured the activity of 73 diverse bacterial thiolases with a library of 15 p-nitrophenyl ester substrates to build a training set of 1095 unique enzyme–substrate pairs. We then used machine learning to predict thiolase substrate specificity from physicochemical and structural features. The area under the receiver operating characteristic curve was 0.89 for random forest classification of enzyme activity, and our regression model had a test set root mean square error of 0.22 (R² = 0.75) to quantitatively predict enzyme activity levels. Substrate aromaticity, oxygen content and molecular connectivity were the strongest predictors of enzyme–substrate pairing. Key amino acid residues A173, I284, V287, T292 and I316 in the Xanthomonas campestris OleA crystal structure lining the substrate binding pockets were important for thiolase substrate specificity and are attractive targets for future protein engineering studies. The predictive framework described here is generalizable and demonstrates how machine learning can be used to quantitatively understand and predict enzyme substrate specificity.
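Two quantities in this abstract lend themselves to a short worked example: the size of the full enzyme-by-substrate grid, and the area under the ROC curve. The sketch below computes AUC as the fraction of (active, inactive) pairs that the classifier ranks correctly, counting ties as half. The function name and the toy labels and scores are illustrative assumptions, not the authors' data or code.

```python
# Every enzyme is assayed against every substrate, so the training set
# is the full grid of pairs.
N_ENZYMES, N_SUBSTRATES = 73, 15
N_PAIRS = N_ENZYMES * N_SUBSTRATES  # 1095, matching the abstract


def roc_auc(labels, scores):
    """ROC AUC via pairwise ranking (equivalent to the Mann-Whitney U statistic).

    labels: iterable of 0/1 (1 = active pair); scores: predicted scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.89 means that a randomly chosen active pair outscores a randomly chosen inactive pair about 89% of the time.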
Affiliation(s)
- Serina L Robinson
  - Graduate Program in Bioinformatics and Computational Biology, University of Minnesota, 111 S. Broadway, Suite 300, Rochester, MN 55904, USA
  - Graduate Program in Microbiology, Immunology, and Cancer Biology, University of Minnesota, 689 23rd Ave SE, Minneapolis, MN 55455, USA
  - BioTechnology Institute, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Megan D Smith
  - Graduate Program in Microbiology, Immunology, and Cancer Biology, University of Minnesota, 689 23rd Ave SE, Minneapolis, MN 55455, USA
  - BioTechnology Institute, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Jack E Richman
  - BioTechnology Institute, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Kelly G Aukema
  - BioTechnology Institute, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
- Lawrence P Wackett
  - BioTechnology Institute, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN 55108, USA
28
Fan X, Li X, Zhou Y, Mei M, Liu P, Zhao J, Peng W, Jiang ZB, Yang S, Iverson BL, Zhang G, Yi L. Quantitative Analysis of the Substrate Specificity of Human Rhinovirus 3C Protease and Exploration of Its Substrate Recognition Mechanisms. ACS Chem Biol 2020; 15:63-73. [PMID: 31613083 DOI: 10.1021/acschembio.9b00539] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023]
Abstract
Human rhinovirus 3C protease (HRV 3C-P) is a high-value commercial cysteine protease that specifically recognizes the short peptide sequence LEVLFQ↓GP. Here, a strategy based on our previous Yeast Endoplasmic Reticulum Sequestration Screening (YESS) approach was developed in Saccharomyces cerevisiae, a model microorganism, to fully characterize the substrate specificity of a typical human virus protease, HRV 3C-P, in a quantitative and fast manner. Our results demonstrated that HRV 3C-P had very high specificity at the P1 and P1' positions, recognizing only Gln/Glu at the P1 position and Gly/Ala/Cys/Ser at the P1' position. By comparison, it efficiently recognized most residues at the P2' position, except Trp. Further biochemical characterization through site mutagenesis, enzyme structural modeling, and comparison with other 3C proteases indicated that the S1 pocket of HRV 3C-P was formed by neutral and basic amino acids, in which His160 and Thr141 specifically interacted with Gln or Glu residues at the substrate P1 position. Additionally, the stringent S1' pocket accounted for its unique property of accommodating only residues with short or no side chains. Based on our characterization, LEVLFQ↓GM was identified as a more favorable substrate than the original LEVLFQ↓GP at high temperature, which might be caused by the conversion of random coils to β-turns in HRV 3C-P as the temperature increases. Our studies advance the understanding of the substrate specificity and recognition mechanism of HRV 3C-P. In addition, the YESS-PSSC combined with the enzyme modeling strategy in this study provides a general approach for deciphering the substrate specificities of proteases.
Affiliation(s)
- Xian Fan
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Xinzhi Li
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Yu Zhou
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Meng Mei
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Pi Liu
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- Jing Zhao
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Wenfang Peng
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Zheng-Bing Jiang
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Shihui Yang
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Brent L Iverson
  - Department of Chemistry, University of Texas, Austin, Texas 78712, United States
- Guimin Zhang
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Li Yi
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, Hubei Collaborative Innovation Center for Green Transformation of Bio-resources, Hubei Key Laboratory of Industrial Biotechnology, School of Life Sciences, Hubei University, Wuhan, 430062, China
29
Sanusi ZK, Lawal MM, Govender T, Maguire GEM, Honarparvar B, Kruger HG. Theoretical Model for HIV-1 PR That Accounts for Substrate Recognition and Preferential Cleavage of Natural Substrates. J Phys Chem B 2019; 123:6389-6400. [DOI: 10.1021/acs.jpcb.9b02207] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
Affiliation(s)
- Zainab K. Sanusi
  - Catalysis and Peptide Research Unit, School of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Monsurat M. Lawal
  - Catalysis and Peptide Research Unit, School of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Glenn E. M. Maguire
  - Catalysis and Peptide Research Unit, School of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
  - School of Chemistry and Physics, University of KwaZulu-Natal, Durban 4041, South Africa
- Bahareh Honarparvar
  - Catalysis and Peptide Research Unit, School of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa
- Hendrik G. Kruger
  - Catalysis and Peptide Research Unit, School of Health Sciences, University of KwaZulu-Natal, Durban 4041, South Africa