1
Zhu Y, Zhao X, Xiang C, Liu X, Li J. Evaluation of Essential Dynamics and Fixed-Length Coarse Graining for Multidomain Proteins. J Phys Chem B 2024; 128:5147-5156. [PMID: 38758598 DOI: 10.1021/acs.jpcb.3c08198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
For multiscale modeling of biomolecules, reliable coarse-grained (CG) models can offer great potential to simulate larger temporal and spatial scales than traditional all-atom (AA) models. In this study, we explore the essential dynamics coarse graining (EDCG) and fixed-length coarse graining (FLCG) approaches for constructing highly coarse-grained models for multidomain proteins (MDPs), with 1 to 10 amino acid residues per CG site. In the studies of 13 MDPs, our data indicate that both EDCG and FLCG can preserve the protein dynamics of MDPs. FLCG, which restricts an equal number of residues in each CG site, represents an excellent approximation to EDCG and a straightforward approach for coarse-graining MDPs. Furthermore, FLCG is tested with a class B G-protein-coupled receptor protein, and the agreement with prior experiments suggests its general application to various MDPs in different environments or conditions. Finally, we demonstrate another application of FLCG through progressive backmapping, showcasing the ability to recover from lower-resolution CG models (6 residues/CG site) to higher-resolution ones (1 residue/CG site). These promising outcomes underscore the broad applicability of FLCG to construct highly or ultra-coarse-grained models of complex biomolecules for multiscale simulations.
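The fixed-length idea in the abstract above (an equal number of residues in each CG site) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, placing each site at the centroid of its Cα coordinates is an assumption, and letting the final site hold fewer residues when the chain length is not a multiple of the site size is also an assumption.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def fixed_length_sites(n_residues: int, residues_per_site: int) -> List[range]:
    """Split residue indices 0..n_residues-1 into consecutive equal-size blocks.

    Assumption: a trailing block may be shorter if the chain length is not
    a multiple of residues_per_site.
    """
    if n_residues <= 0 or residues_per_site <= 0:
        raise ValueError("n_residues and residues_per_site must be positive")
    return [
        range(start, min(start + residues_per_site, n_residues))
        for start in range(0, n_residues, residues_per_site)
    ]


def site_centroids(ca_coords: Sequence[Coord], residues_per_site: int) -> List[Coord]:
    """Place one CG site at the centroid of each block of C-alpha coordinates.

    Assumption: an unweighted centroid is used; a mass-weighted center
    would be an equally plausible choice.
    """
    centroids: List[Coord] = []
    for site in fixed_length_sites(len(ca_coords), residues_per_site):
        n = len(site)
        centroids.append(
            tuple(sum(ca_coords[i][axis] for i in site) / n for axis in range(3))
        )
    return centroids


if __name__ == "__main__":
    # Toy chain: six residues spaced 1 unit apart along x, 3 residues per site.
    toy = [(float(i), 0.0, 0.0) for i in range(6)]
    print(site_centroids(toy, 3))
```

Varying `residues_per_site` from 1 to 10 would correspond to the range of resolutions mentioned in the abstract.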
Affiliation(s)
- Yu Zhu
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Xiaochuan Zhao
  - Department of Chemistry, University of Vermont, Burlington, Vermont 05405, United States
- Chijian Xiang
  - Department of Horticulture & Landscape Architecture, Purdue University, West Lafayette, Indiana 47907, United States
- Xianshi Liu
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Jianing Li
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
  - Department of Chemistry, University of Vermont, Burlington, Vermont 05405, United States

2
Hsia O, Hinterndorfer M, Cowan AD, Iso K, Ishida T, Sundaramoorthy R, Nakasone MA, Imrichova H, Schätz C, Rukavina A, Husnjak K, Wegner M, Correa-Sáez A, Craigon C, Casement R, Maniaci C, Testa A, Kaulich M, Dikic I, Winter GE, Ciulli A. Targeted protein degradation via intramolecular bivalent glues. Nature 2024; 627:204-211. [PMID: 38383787 PMCID: PMC10917667 DOI: 10.1038/s41586-024-07089-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 01/18/2024] [Indexed: 02/23/2024]
Abstract
Targeted protein degradation is a pharmacological modality that is based on the induced proximity of an E3 ubiquitin ligase and a target protein to promote target ubiquitination and proteasomal degradation. This has been achieved either via proteolysis-targeting chimeras (PROTACs), bifunctional compounds composed of two separate moieties that individually bind the target and E3 ligase, or via molecular glues that monovalently bind either the ligase or the target [1-4]. Here, using orthogonal genetic screening, biophysical characterization and structural reconstitution, we investigate the mechanism of action of bifunctional degraders of BRD2 and BRD4, termed intramolecular bivalent glues (IBGs), and find that instead of connecting target and ligase in trans as PROTACs do, they simultaneously engage and connect two adjacent domains of the target protein in cis. This conformational change 'glues' BRD4 to the E3 ligases DCAF11 or DCAF16, leveraging intrinsic target-ligase affinities that do not translate to BRD4 degradation in the absence of compound. Structural insights into the ternary BRD4-IBG1-DCAF16 complex guided the rational design of improved degraders of low picomolar potency. We thus introduce a new modality in targeted protein degradation, which works by bridging protein domains in cis to enhance surface complementarity with E3 ligases for productive ubiquitination and degradation.
Affiliation(s)
- Oliver Hsia
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Matthias Hinterndorfer
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Angus D Cowan
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Kentaro Iso
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
  - Tsukuba Research Laboratory, Eisai Co., Ibaraki, Japan
- Tasuku Ishida
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
  - Tsukuba Research Laboratory, Eisai Co., Ibaraki, Japan
- Mark A Nakasone
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Hana Imrichova
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Caroline Schätz
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Andrea Rukavina
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Koraljka Husnjak
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
- Martin Wegner
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
- Alejandro Correa-Sáez
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Conner Craigon
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Ryan Casement
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
- Chiara Maniaci
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
  - Medical Research Council (MRC) Protein Phosphorylation and Ubiquitylation Unit, School of Life Sciences, University of Dundee, Dundee, UK
- Andrea Testa
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK
  - Amphista Therapeutics, Cambridge, UK
- Manuel Kaulich
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
- Ivan Dikic
  - Institute of Biochemistry II, Faculty of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
- Georg E Winter
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Alessio Ciulli
  - Centre for Targeted Protein Degradation, School of Life Sciences, University of Dundee, Dundee, UK

3
Wuyun Q, Chen Y, Shen Y, Cao Y, Hu G, Cui W, Gao J, Zheng W. Recent Progress of Protein Tertiary Structure Prediction. Molecules 2024; 29:832. [PMID: 38398585 PMCID: PMC10893003 DOI: 10.3390/molecules29040832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/06/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has facilitated the rise of structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in the field of protein structure prediction for researchers, this review describes various methodologies, assessments, and databases in protein structure prediction, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to provide researchers with insights through which to understand the limitations, contexts, and effective selections of protein structure prediction methods in protein-related fields.
Affiliation(s)
- Qiqige Wuyun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yihan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Yifeng Shen
  - Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Kanagawa, Japan
- Yang Cao
  - College of Life Sciences, Sichuan University, Chengdu 610065, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Wei Cui
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Jianzhao Gao
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA

4
Rao S, Skulsuppaisarn M, Strong LM, Ren X, Lazarou M, Hurley JH, Hummer G. Three-step docking by WIPI2, ATG16L1, and ATG3 delivers LC3 to the phagophore. Sci Adv 2024; 10:eadj8027. [PMID: 38324698 PMCID: PMC10851258 DOI: 10.1126/sciadv.adj8027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 01/05/2024] [Indexed: 02/09/2024]
Abstract
The covalent attachment of ubiquitin-like LC3 proteins (microtubule-associated proteins 1A/1B light chain 3) prepares the autophagic membrane for cargo recruitment. We resolve key steps in LC3 lipidation by combining molecular dynamics simulations and experiments in vitro and in cellulo. We show how the E3-like ligase autophagy-related 12 (ATG12)-ATG5-ATG16L1 in complex with the E2-like conjugase ATG3 docks LC3 onto the membrane in three steps by (i) the phosphatidylinositol 3-phosphate effector protein WD repeat domain phosphoinositide-interacting protein 2 (WIPI2), (ii) helix α2 of ATG16L1, and (iii) a membrane-interacting surface of ATG3. Phosphatidylethanolamine (PE) lipids concentrate in a region around the thioester bond between ATG3 and LC3, highlighting residues with a possible role in the catalytic transfer of LC3 to PE, including two conserved histidines. In a near-complete pathway from the initial membrane recruitment to the LC3 lipidation reaction, the three-step targeting of the ATG12-ATG5-ATG16L1 machinery establishes a high level of regulatory control.
Affiliation(s)
- Shanlin Rao
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Marvin Skulsuppaisarn
  - Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia
  - Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
- Lisa M. Strong
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA
- Xuefeng Ren
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA
- Michael Lazarou
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia
  - Department of Biochemistry and Molecular Biology, Biomedicine Discovery Institute, Monash University, Melbourne, Victoria, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia
- James H. Hurley
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA
  - Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
- Gerhard Hummer
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
  - Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
  - Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany

5
Zhang Z, Cai Y, Zhang B, Zheng W, Freddolino L, Zhang G, Zhou X. DEMO-EM2: assembling protein complex structures from cryo-EM maps through intertwined chain and domain fitting. Brief Bioinform 2024; 25:bbae113. [PMID: 38517699 PMCID: PMC10959074 DOI: 10.1093/bib/bbae113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 02/10/2024] [Accepted: 02/25/2024] [Indexed: 03/24/2024]
Abstract
The breakthrough in cryo-electron microscopy (cryo-EM) technology has led to an increasing number of density maps of biological macromolecules. However, constructing accurate protein complex atomic structures from cryo-EM maps remains a challenge. In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure modeling problems.
Affiliation(s)
- Ziying Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yaxian Cai
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Biao Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Lydia Freddolino
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaogen Zhou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

6
Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024; 64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be conducted.
Affiliation(s)
- Chun-Xiang Peng
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Fang Liang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yu-Hao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kai-Long Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ming-Hua Hou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China

7
Roy BG, Choi J, Fuchs MF. Predictive Modeling of Proteins Encoded by a Plant Virus Sheds a New Light on Their Structure and Inherent Multifunctionality. Biomolecules 2024; 14:62. [PMID: 38254661 PMCID: PMC10813169 DOI: 10.3390/biom14010062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 12/29/2023] [Accepted: 12/30/2023] [Indexed: 01/24/2024]
Abstract
Plant virus genomes encode proteins that are involved in replication, encapsidation, cell-to-cell, and long-distance movement, avoidance of host detection, counter-defense, and transmission from host to host, among other functions. Even though the multifunctionality of plant viral proteins is well documented, contemporary functional repertoires of individual proteins are incomplete. However, these can be enhanced by modeling tools. Here, predictive modeling of proteins encoded by the two genomic RNAs, i.e., RNA1 and RNA2, of grapevine fanleaf virus (GFLV) and their satellite RNAs by a suite of protein prediction software confirmed not only previously validated functions (suppressor of RNA silencing [VSR], viral genome-linked protein [VPg], protease [Pro], symptom determinant [Sd], homing protein [HP], movement protein [MP], coat protein [CP], and transmission determinant [Td]) and previously identified putative functions (helicase [Hel] and RNA-dependent RNA polymerase [Pol]), but also predicted novel functions with varying levels of confidence. These include a T3/T7-like RNA polymerase domain for protein 1A^VSR, a short-chain reductase for protein 1B^Hel/VSR, a parathyroid hormone family domain for protein 1E^Pol/Sd, overlapping domains of unknown function and an ABC transporter domain for protein 2B^MP, and DNA topoisomerase domains, transcription factor FBXO25 domain, or DNA Pol subunit cdc27 domain for the satellite RNA protein. Structural predictions for proteins 2A^HP/Sd, 2B^MP, and 3A^? had low confidence, while predictions for proteins 1A^VSR, 1B^Hel*/VSR, 1C^VPg, 1D^Pro, 1E^Pol*/Sd, and 2C^CP/Td retained higher confidence in at least one prediction. This research provided new insights into the structure and functions of GFLV proteins and their satellite protein. Future work is needed to validate these findings.
Affiliation(s)
- Brandon G. Roy
  - Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, 15 Castle Creek Drive, Geneva, NY 14456, USA; (J.C.); (M.F.F.)

8
Bhardwaj K, Rajawat NK, Mathur N. Development of Alpha-Synuclein protein model against therapeutic aspects of Parkinson's disease. Indian J Pharmacol 2024; 56:37-41. [PMID: 38454587 PMCID: PMC11001172 DOI: 10.4103/ijp.ijp_325_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Revised: 08/26/2023] [Accepted: 01/29/2024] [Indexed: 03/09/2024]
Abstract
Parkinson's disease (PD) is the most common neurodegenerative disease caused by the steady depletion of dopamine in the striatum due to the loss of dopaminergic neurons. Most of the current therapeutics work on rebuilding the striatal dopamine level through oral administration of levodopa which stops the symptoms of PD. But there is a long-term motor complication with these dopamine precursors. Moreover, no preventive treatment is available for PD. Thus, before finding a therapeutic treatment for PD, it is necessary to first understand the basic cause of PD. Moreover, alpha-synuclein oligomerization can be the major factor in PD. From the UniProt database, protein information was extracted, and the model was designed by homology modeling technique and validated by the model validation server. Hence, the designed model has 96.5% most favored region and 0% disallowed region. Therefore, the model is stable based on RC plot parameters.
Affiliation(s)
- Kanika Bhardwaj
  - Department of Life Science and Zoology, IIS (Deemed to be University), Jaipur, Rajasthan, India
- Neelu Kanwar Rajawat
  - Department of Life Science and Zoology, IIS (Deemed to be University), Jaipur, Rajasthan, India
- Nupur Mathur
  - Department of Microbiology, University of Rajasthan, Jaipur, Rajasthan, India

9
Xia Y, Zhao K, Liu D, Zhou X, Zhang G. Multi-domain and complex protein structure prediction using inter-domain interactions from deep learning. Commun Biol 2023; 6:1221. [PMID: 38040847 PMCID: PMC10692239 DOI: 10.1038/s42003-023-05610-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/19/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023]
Abstract
Accurately capturing domain-domain interactions is key to understanding protein function and designing structure-based drugs. Although AlphaFold2 has made a breakthrough on single domain, it should be noted that the structure modeling for multi-domain protein and complex remains a challenge. In this study, we developed a multi-domain and complex structure assembly protocol, named DeepAssembly, based on domain segmentation and single domain modeling algorithms. Firstly, DeepAssembly uses a population-based evolutionary algorithm to assemble multi-domain proteins by inter-domain interactions inferred from a developed deep learning network. Secondly, protein complexes are assembled by means of domains rather than chains using DeepAssembly. Experimental results show that on 219 multi-domain proteins, the average inter-domain distance precision by DeepAssembly is 22.7% higher than that of AlphaFold2. Moreover, DeepAssembly improves accuracy by 13.1% for 164 multi-domain structures with low confidence deposited in AlphaFold database. We apply DeepAssembly for the prediction of 247 heterodimers. We find that DeepAssembly successfully predicts the interface (DockQ ≥ 0.23) for 32.4% of the dimers, suggesting a lighter way to assemble complex structures by treating domains as assembly units and using inter-domain interactions learned from monomer structures.
Affiliation(s)
- Yuhao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Kailong Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Dong Liu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Xiaogen Zhou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China

10
Koçkaya ES, Can H, Yaman Y, Ün C. In silico discovery of epitopes of gag and env proteins for the development of a multi-epitope vaccine candidate against Maedi Visna Virus using reverse vaccinology approach. Biologicals 2023; 84:101715. [PMID: 37793308 DOI: 10.1016/j.biologicals.2023.101715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Revised: 08/28/2023] [Accepted: 09/25/2023] [Indexed: 10/06/2023]
Abstract
Maedi Visna Virus (MVV) causes a chronic viral disease in sheep. Since there is no specific therapeutic drug that targets MVV, development of a vaccine against the MVV is inevitable. This study aimed to analyze the gag and env proteins as vaccine candidate proteins and to identify epitopes in these proteins. In addition, it was aimed to construct a multi-epitope vaccine candidate. According to the obtained results, the gag protein was detected to be more conserved and had a higher antigenicity value. Also, the number of alpha helix in the secondary structure was higher and transmembrane helices were not detected. Although many B cell and MHC-I/II epitopes were predicted, only 19 of them were detected to have the properties of antigenic, non-allergenic, non-toxic, soluble, and non-hemolytic. Of these epitopes, five were remarkable due to having the highest antigenicity value. However, the final multi-epitope vaccine was constructed with 19 epitopes. A strong affinity was shown between the final multi-epitope vaccine and TLR-2/4. In conclusion, the gag protein was a better antigen. However, both proteins had epitopes with high antigenicity value. Also, the final multi-epitope vaccine construct had a potential to be used as a peptide vaccine due to its immuno-informatics results.
Affiliation(s)
- Ecem Su Koçkaya
  - Ege University Faculty of Science Department of Biology Molecular Biology Section, İzmir, Türkiye
- Hüseyin Can
  - Ege University Faculty of Science Department of Biology Molecular Biology Section, İzmir, Türkiye
- Yalçın Yaman
  - Siirt University Faculty of Veterinary Medicine, Department of Genetics, Siirt, Türkiye
- Cemal Ün
  - Ege University Faculty of Science Department of Biology Molecular Biology Section, İzmir, Türkiye

11
Kang J, Gu L, Guo B, Rong W, Xu S, Yang G, Ren W. Molecular evolution of wound healing-related genes during cetacean secondary aquatic adaptation. Integr Zool 2023. [PMID: 37897119 DOI: 10.1111/1749-4877.12781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2023]
Abstract
The marine environment presents challenges for wound healing in cetaceans, despite their remarkable recovery abilities with minimal infections or complications. However, the molecular mechanism underlying this efficient wound healing remains underexplored. To better understand the molecular mechanisms behind wound healing in cetaceans, we investigated the evolutionary patterns of 37 wound healing-related genes in representative mammals. We found wound healing-related genes experience adaptive evolution in cetaceans: (1) Three extrinsic coagulation pathway-related genes, tissue factor (F3), coagulation factor VII (F7), and coagulation factor X (F10), are subject to positive selection in cetaceans, which might promote efficient hemostasis after injury; positive selection in transforming growth factor-beta 2 (TGF-β2), transforming growth factor-beta 3 (TGF-β3), and platelet-derived growth factor D (PDGFD), which play immunological roles in wound healing, may help cetaceans enhance inflammatory response and tissue debridement. (2) Coagulation factor XII (F12) is the initiation factor in the intrinsic coagulation pathway. It had a premature stop codon mutation and was subjected to selective stress relaxation in cetaceans, suggesting that the early termination of F12 may help cetaceans avoid the risk of vascular blockage during diving. (3) Fibrinogen alpha chain (FGA) and FIII were detected to contain specific amino acid substitutions in marine mammals, indicating that similar evolutionary mechanisms might exist among marine mammals to maintain strong wound-healing ability. Thus, our research provides further impetus to study the evolution of the wound healing system in cetaceans and other marine mammals, extending knowledge of preventing coagulation disorder and atherosclerosis in humans.
Affiliation(s)
- Jieqiong Kang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Long Gu
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Boxiong Guo
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Wenqi Rong
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Shixia Xu
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Guang Yang
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China
- Wenhua Ren
  - Jiangsu Key Laboratory for Biodiversity and Biotechnology, College of Life Sciences, Nanjing Normal University, Nanjing, China

12
Zhu HT, Xia YH, Zhang GJ. E2EDA: Protein Domain Assembly Based on End-to-End Deep Learning. J Chem Inf Model 2023; 63:6451-6461. [PMID: 37788318 DOI: 10.1021/acs.jcim.3c01387] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/05/2023]
Abstract
With the development of deep learning, almost all single-domain proteins can be predicted at experimental resolution. However, the structure prediction of multi-domain proteins remains a challenge. Achieving end-to-end protein domain assembly and further improving the accuracy of the full-chain modeling by accurately predicting inter-domain orientation while improving the assembly efficiency will provide significant insights into structure-based drug discovery. In this work, we propose an End-to-End Domain Assembly method based on deep learning, named E2EDA. We first develop RMNet, an EfficientNetV2-based deep learning model that fuses multiple features using an attention mechanism to predict inter-domain rigid motion. Then, the predicted rigid motions are transformed into inter-domain spatial transformations to directly assemble the full-chain model. Finally, the scoring strategy RMscore is designed to select the best model from multiple assembled models. The experimental results show that the average TM-score of the model assembled by E2EDA on the benchmark set (282) is 0.827, which is better than those of other domain assembly methods SADA (0.792) and DEMO (0.730). Meanwhile, on our constructed multi-domain data set from AlphaFold DB, the model reassembled by E2EDA is 7.0% higher in TM-score compared to the full-chain model predicted by AlphaFold2, indicating that E2EDA can capture more accurate inter-domain orientations to improve the quality of the model predicted by AlphaFold2. Furthermore, compared to SADA and AlphaFold2, E2EDA reduced the average runtime on the benchmark by 64.7% and 19.2%, respectively, indicating that E2EDA can significantly improve assembly efficiency through an end-to-end approach. The online server is available at http://zhanglab-bioinf.com/E2EDA.
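The assembly step described in this abstract, turning a predicted rigid motion into a spatial transformation of one domain, can be illustrated with a minimal sketch. It is not the E2EDA implementation: the function names, the choice of a rotation about the z axis, and the toy coordinates are all assumptions made for illustration.

```python
import math

def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z axis (illustrative choice)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid_motion(coords, rotation, translation):
    """Return R @ x + t for every 3D point x in coords."""
    moved = []
    for x in coords:
        rotated = [sum(rotation[i][j] * x[j] for j in range(3)) for i in range(3)]
        moved.append([rotated[i] + translation[i] for i in range(3)])
    return moved

# Hypothetical second domain (two points) placed relative to a first domain
# by a 90-degree rotation followed by a 10 Å shift along x.
domain2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
placed = apply_rigid_motion(domain2, rotation_z(math.pi / 2), [10.0, 0.0, 0.0])
```

A rigid motion preserves all intra-domain distances, so only the relative placement of the domains changes.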
Affiliation(s)
- Hai-Tao Zhu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
| | - Yu-Hao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
| | - Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
| |
|
13
|
Hassan M, Shahzadi S, Yasir M, Chun W, Kloczkowski A. Computational prognostic evaluation of Alzheimer's drugs from FDA-approved database through structural conformational dynamics and drug repositioning approaches. Sci Rep 2023; 13:18022. [PMID: 37865690 PMCID: PMC10590448 DOI: 10.1038/s41598-023-45347-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/24/2023] [Accepted: 10/18/2023] [Indexed: 10/23/2023] Open
Abstract
Drug design is a costly and time-consuming process with a low success rate. To overcome this limitation, computational drug repositioning is increasingly used to predict the possible therapeutic effects of FDA-approved drugs against multiple diseases. In this computational study, protein modeling, shape-based screening, molecular docking, pharmacogenomics, and molecular dynamics (MD) simulation approaches were used to identify FDA-approved drugs against Alzheimer's disease (AD). The MADD protein structure was predicted by homology modeling and characterized using different computational resources. Donepezil and galantamine were used as standard drugs, and candidate drugs were screened based on structural similarity. These drugs were then evaluated based on their binding energy (kcal/mol) profiles against MADD using the PyRx tool. Moreover, pharmacogenomics analysis showed possible associations with AD-mediated genes, which were confirmed through a detailed literature survey. The best six drugs (darifenacin, astemizole, tubocurarine, elacridar, sertindole and tariquidar) were further docked, and their interaction behavior was analyzed through hydrogen bonding. Finally, MD simulations were carried out on these drugs, and their stability was evaluated by generating root mean square deviation and fluctuation (RMSD/RMSF), radius of gyration (Rg) and solvent-accessible surface area (SASA) graphs. Taken together, darifenacin, astemizole, tubocurarine, elacridar, sertindole and tariquidar displayed good lead-like profiles compared with the standards and could be used as possible therapeutic agents in the treatment of AD after in vitro and in vivo assessment.
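Two of the stability measures named in this abstract, RMSD and radius of gyration, follow from simple formulas. The sketch below is illustrative only: it assumes the two coordinate sets are already superimposed and that all atoms have equal mass, and the function names and toy coordinates are hypothetical rather than taken from the study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two already-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum((a[i] - b[i]) ** 2 for a, b in zip(coords_a, coords_b) for i in range(3))
    return math.sqrt(squared / len(coords_a))

def radius_of_gyration(coords):
    """Unweighted radius of gyration (every atom is assumed to have the same mass)."""
    n = len(coords)
    center = [sum(p[i] for p in coords) / n for i in range(3)]
    squared = sum((p[i] - center[i]) ** 2 for p in coords for i in range(3))
    return math.sqrt(squared / n)

# Toy two-atom "structure": the second frame is shifted by 1 Å along z.
reference = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
frame = [[0.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
frame_rmsd = rmsd(reference, frame)
reference_rg = radius_of_gyration(reference)
```

In a trajectory analysis these quantities would be computed per frame and plotted against time; a production workflow would also superimpose each frame on the reference and use atomic masses.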
Affiliation(s)
- Mubashir Hassan
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA.
| | - Saba Shahzadi
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA
| | - Muhammad Yasir
- Department of Pharmacology, College of Medicine, Kangwon National University, Chuncheon, South Korea
| | - Wanjoo Chun
- Department of Pharmacology, College of Medicine, Kangwon National University, Chuncheon, South Korea
| | - Andrzej Kloczkowski
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA.
- Department of Pediatrics, The Ohio State University, Columbus, OH, 43205, USA.
| |
|
14
|
Watanabe N, Kuriya Y, Murata M, Yamamoto M, Shimizu M, Araki M. Different Recognition of Protein Features Depending on Deep Learning Models: A Case Study of Aromatic Decarboxylase UbiD. BIOLOGY 2023; 12:795. [PMID: 37372080 DOI: 10.3390/biology12060795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 05/17/2023] [Accepted: 05/29/2023] [Indexed: 06/29/2023]
Abstract
The number of unannotated protein sequences is increasing explosively due to genome sequencing technology. A more comprehensive understanding of protein functions for protein annotation requires the discovery of new features that cannot be captured by conventional methods. Deep learning can extract important features from input data and predict protein functions based on those features. Here, protein feature vectors generated by three deep learning models are analyzed using Integrated Gradients to explore important features of amino acid sites. As a case study, prediction and feature extraction models for UbiD enzymes were built using these models. The important amino acid residues extracted from the models differed from the secondary structures, conserved regions and active sites known for UbiD. Interestingly, different amino acid residues within UbiD sequences were regarded as important depending on the type of model and sequence. The Transformer models focused on more specific regions than the other models. These results suggest that each deep learning model captures aspects of protein features that differ from existing knowledge and has the potential to discover new laws of protein function. This study will help to extract new protein features for other protein annotations.
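Integrated Gradients, the attribution method named in this abstract, scales each input's distance from a baseline by the average gradient along the straight path from baseline to input. The sketch below is a generic illustration, not the study's code: it assumes a midpoint Riemann sum and central-difference gradients, and uses a hypothetical two-feature toy model in place of a deep network.

```python
def integrated_gradients(f, x, baseline, steps=50, eps=1e-5):
    """Approximate Integrated Gradients attributions for a scalar function f.

    Uses a midpoint Riemann sum over the straight path from baseline to x and
    central finite differences for the gradient (both are illustrative choices).
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up, down = point[:], point[:]
            up[i] += eps
            down[i] -= eps
            avg_grad[i] += (f(up) - f(down)) / (2 * eps) / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

def toy_model(v):
    """Hypothetical stand-in for a model output: quadratic in v[0], linear in v[1]."""
    return v[0] ** 2 + 3.0 * v[1]

attributions = integrated_gradients(toy_model, [2.0, 1.0], [0.0, 0.0])
```

A useful sanity check is the completeness property: the attributions should sum to approximately `f(x) - f(baseline)`.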
Affiliation(s)
- Naoki Watanabe
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
| | - Yuki Kuriya
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
| | - Masahiro Murata
- Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai, Nada-Ku, Kobe 657-8501, Japan
| | - Masaki Yamamoto
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
| | - Masayuki Shimizu
- Bacchus Bio Innovation Co., Ltd., 6-3-7 Minatojima minami-machi, Kobe 650-0047, Japan
| | - Michihiro Araki
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, 3-17 Senrioka-shinmachi, Settsu 566-0002, Japan
- Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai, Nada-Ku, Kobe 657-8501, Japan
- Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan
- National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita 564-8565, Japan
| |
|
15
|
Zhao K, Xia Y, Zhang F, Zhou X, Li SZ, Zhang G. Protein structure and folding pathway prediction based on remote homologs recognition using PAthreader. Commun Biol 2023; 6:243. [PMID: 36871126 PMCID: PMC9985440 DOI: 10.1038/s42003-023-04605-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/09/2022] [Accepted: 02/16/2023] [Indexed: 03/06/2023] Open
Abstract
Recognition of remote homologous structures is a necessary module in AlphaFold2 and is also essential for the exploration of protein folding pathways. Here, we propose a method, PAthreader, to recognize remote templates and explore folding pathways. Firstly, we design a three-track alignment between predicted distance profiles and structure profiles extracted from PDB and AlphaFold DB, to improve the recognition accuracy of remote templates. Secondly, we improve the performance of AlphaFold2 using the templates identified by PAthreader. Thirdly, we explore protein folding pathways based on our conjecture that the dynamic folding information of a protein is implicitly contained in its remote homologs. The results show that the average accuracy of PAthreader templates is 11.6% higher than that of HHsearch. In terms of structure modelling, PAthreader outperforms AlphaFold2 and ranks first on the CAMEO blind test for the latest three months. Furthermore, we predict protein folding pathways for 37 proteins; the results for 7 proteins are largely consistent with those of biological experiments, and those for the other 30 human proteins have yet to be verified by biological experiments, suggesting that folding information can be extracted from remote homologous structures.
Affiliation(s)
- Kailong Zhao
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Yuhao Xia
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Fujin Zhang
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Xiaogen Zhou
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Stan Z Li
- AI Lab, Research Center for Industries of the Future, Westlake University, Hangzhou, 310030, Zhejiang, China.
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China.
| |
|
16
|
Yu ZZ, Peng CX, Liu J, Zhang B, Zhou XG, Zhang GJ. DomBpred: Protein Domain Boundary Prediction Based on Domain-Residue Clustering Using Inter-Residue Distance. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:912-922. [PMID: 35594218 DOI: 10.1109/tcbb.2022.3175905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Domain boundary prediction is one of the most important problems in the study of protein structure and function, especially for large proteins. At present, most domain boundary prediction methods have low accuracy and limitations in dealing with multi-domain proteins. In this study, we develop a sequence-based protein domain boundary prediction method, named DomBpred. In DomBpred, the input sequence is first classified as either a single-domain protein or a multi-domain protein through a designed sequence metric based on a constructed single-domain sequence library. For a multi-domain protein, a domain-residue clustering algorithm inspired by the Ising model is proposed to cluster spatially close residues according to inter-residue distance. The unclassified residues and the residues at the edge of each cluster are then tuned by the secondary structure to form potential cut points. Finally, a domain boundary scoring function is proposed to recursively evaluate the potential cut points to generate the domain boundary. DomBpred is tested on a large-scale test set of FUpred comprising 2549 proteins. Experimental results show that DomBpred performs better than the state-of-the-art methods in classifying whether protein sequences are composed of single or multiple domains, with a Matthews correlation coefficient of 0.882. Moreover, on 849 multi-domain proteins, the domain boundary distance and normalised domain overlap scores of DomBpred are 0.523 and 0.824, respectively, which are 5.0% and 4.2% higher than those of the best comparison method. Comparison with other methods on the given test set shows that DomBpred outperforms most state-of-the-art sequence-based methods and even achieves better results than the top-level template-based method. The executable program is freely available at https://github.com/iobio-zjut/DomBpred and the online server at http://zhanglab-bioinf.com/DomBpred/.
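The Matthews correlation coefficient reported for the single-domain versus multi-domain classification is computed from the four cells of a binary confusion matrix. The following sketch shows the standard formula; the function name is illustrative, and returning 0.0 when the denominator vanishes is a common convention assumed here, not something stated in the abstract.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: report 0.0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator

# Hypothetical counts: "multi-domain" is treated as the positive class.
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect classification), and it remains informative when the two classes are imbalanced.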
|
17
|
Zheng C, Wei Y, Zhang P, Xu L, Zhang Z, Lin K, Hou J, Lv X, Ding Y, Chiu Y, Jain A, Islam N, Malovannaya A, Wu Y, Ding F, Xu H, Sun M, Chen X, Chen Y. CRISPR/Cas9 screen uncovers functional translation of cryptic lncRNA-encoded open reading frames in human cancer. J Clin Invest 2023; 133:e159940. [PMID: 36856111 PMCID: PMC9974104 DOI: 10.1172/jci159940] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 03/16/2022] [Accepted: 01/19/2023] [Indexed: 03/02/2023] Open
Abstract
Emerging evidence suggests that cryptic translation within long noncoding RNAs (lncRNAs) may produce novel proteins with important developmental/physiological functions. However, the role of this cryptic translation in complex diseases (e.g., cancer) remains elusive. Here, we applied an integrative strategy combining ribosome profiling and CRISPR/Cas9 screening with large-scale analysis of molecular/clinical data for breast cancer (BC) and identified estrogen receptor α-positive (ER+) BC dependency on the cryptic ORFs encoded by lncRNA genes that were upregulated in luminal tumors. We confirmed the in vivo tumor-promoting function of an unannotated protein, GATA3-interacting cryptic protein (GT3-INCP) encoded by LINC00992, the expression of which was associated with poor prognosis in luminal tumors. GT3-INCP was upregulated by estrogen/ER and regulated estrogen-dependent cell growth. Mechanistically, GT3-INCP interacted with GATA3, a master transcription factor key to mammary gland development/BC cell proliferation, and coregulated a gene expression program that involved many BC susceptibility/risk genes and impacted estrogen response/cell proliferation. GT3-INCP/GATA3 bound to common cis regulatory elements and upregulated the expression of the tumor-promoting and estrogen-regulated BC susceptibility/risk genes MYB and PDZK1. Our study indicates that cryptic lncRNA-encoded proteins can be an important integrated component of the master transcriptional regulatory network driving aberrant transcription in cancer, and suggests that the "hidden" lncRNA-encoded proteome might be a new space for therapeutic target discovery.
Affiliation(s)
- Caishang Zheng
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Yanjun Wei
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Peng Zhang
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Longyong Xu
- Department of Molecular and Cellular Biology
- Lester and Sue Smith Breast Center, and
- Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
| | - Zhenzhen Zhang
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina, USA
| | - Kangyu Lin
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Jiakai Hou
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Xiangdong Lv
- Department of Molecular and Cellular Biology
- Lester and Sue Smith Breast Center, and
- Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
| | - Yao Ding
- Department of Molecular and Cellular Biology
- Lester and Sue Smith Breast Center, and
- Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
| | - Yulun Chiu
- Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | | | | | - Anna Malovannaya
- Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
- Mass Spectrometry Proteomics Core and
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
| | - Yun Wu
- Department of Pathology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Feng Ding
- Department of Physics and Astronomy, Clemson University, Clemson, South Carolina, USA
| | - Han Xu
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center
- Genetics and Epigenetics Program, and
- Quantitative Sciences Program, MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, Texas, USA
| | - Ming Sun
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Xi Chen
- Department of Molecular and Cellular Biology
- Lester and Sue Smith Breast Center, and
- Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, Texas, USA
| | - Yiwen Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Quantitative Sciences Program, MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, Texas, USA
| |
|
18
|
AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms. Commun Biol 2023; 6:160. [PMID: 36755055 PMCID: PMC9908985 DOI: 10.1038/s42003-023-04488-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 07/06/2022] [Accepted: 01/16/2023] [Indexed: 02/10/2023] Open
Abstract
Deep-learning (DL) methods like DeepMind's AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining models cluster into 2367 putative novel superfamilies. Detailed manual analysis of 618 of these, each having at least one human relative, reveals extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67% and increase the number of unique 'global' folds by 36%, and will provide valuable insights into structure-function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.
|
19
|
Jensen LE, Rao S, Schuschnig M, Cada AK, Martens S, Hummer G, Hurley JH. Membrane curvature sensing and stabilization by the autophagic LC3 lipidation machinery. SCIENCE ADVANCES 2022; 8:eadd1436. [PMID: 36516251 PMCID: PMC9750143 DOI: 10.1126/sciadv.add1436] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/23/2022] [Accepted: 11/10/2022] [Indexed: 05/28/2023]
Abstract
How the highly curved phagophore membrane is stabilized during autophagy initiation is a major open question in autophagosome biogenesis. Here, we use in vitro reconstitution on membrane nanotubes and molecular dynamics simulations to investigate how core autophagy proteins in the LC3 (Microtubule-associated proteins 1A/1B light chain 3) lipidation cascade interact with curved membranes, providing insight into their possible roles in regulating membrane shape during autophagosome biogenesis. ATG12(Autophagy-related 12)-ATG5-ATG16L1 was up to 100-fold enriched on highly curved nanotubes relative to flat membranes. At high surface density, ATG12-ATG5-ATG16L1 binding increased the curvature of the nanotubes. While WIPI2 (WD repeat domain phosphoinositide-interacting protein 2) binding directs membrane recruitment, the amphipathic helix α2 of ATG16L1 is responsible for curvature sensitivity. Molecular dynamics simulations revealed that helix α2 of ATG16L1 inserts shallowly into the membrane, explaining its curvature-sensitive binding to the membrane. These observations show how the binding of the ATG12-ATG5-ATG16L1 complex to the early phagophore rim could stabilize membrane curvature and facilitate autophagosome growth.
Affiliation(s)
- Liv E. Jensen
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
- Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
| | - Shanlin Rao
- Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
| | - Martina Schuschnig
- Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter, Vienna, Austria
| | - A. King Cada
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
| | - Sascha Martens
- Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter, Vienna, Austria
| | - Gerhard Hummer
- Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Institute of Biophysics, Goethe University Frankfurt, Frankfurt am Main 60438, Germany
| | - James H. Hurley
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
- Aligning Science Across Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD 20815, USA
- Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| |
|
20
|
Vymětal J, Mertová K, Boušová K, Šulc J, Tripsianes K, Vondrasek J. Fusion of two unrelated protein domains in a chimera protein and its 3D prediction: Justification of the x-ray reference structures as a prediction benchmark. Proteins 2022; 90:2067-2079. [PMID: 35833233 PMCID: PMC9796088 DOI: 10.1002/prot.26398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2021] [Revised: 05/20/2022] [Accepted: 07/08/2022] [Indexed: 12/30/2022]
Abstract
Proteins are naturally formed by domains edging their functional and structural properties. A domain taken out of the context of an entire protein can retain its structure and, to some extent, its function. These properties rationalize the construction of artificial fusion multidomain proteins with unique combinations of functions. Information on the specific functional and structural characteristics of individual domains in the context of new artificial fusion proteins is inevitably encoded in the sequential order of the composing domains, which defines their mutual spatial positions. The challenges in designing new proteins with new domain combinations therefore lie mainly in structure/function prediction and its context dependency. Despite the enormous body of publications on artificial fusion proteins, the task of their structure/function prediction is complex and nontrivial. The degree of spatial freedom provided by a linker between domains, and their mutual orientation driven by noncovalent interactions, is beyond what a simple and straightforward methodology can predict with reasonable accuracy. In the presented manuscript, we tested a methodology using available modeling tools and computational methods. We show that the process and methodology of such prediction are not straightforward and must be applied with care even when the recently introduced AlphaFold II is used. We also addressed the question of benchmarking standards for the prediction of multidomain protein structures: x-ray or nuclear magnetic resonance experiments. In a study of six two-domain protein chimeras, their composing domains, and their x-ray structures selected from the PDB, we conclude that the major obstacle to justified prediction is inappropriate sampling of the conformational space by the explored methods. On the other hand, we can still address particular steps of the methodology and improve the process of chimera protein prediction.
Affiliation(s)
- Jiří Vymětal
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 6, Czech Republic
| | - Kateřina Mertová
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 6, Czech Republic
- Faculty of Natural Sciences, Charles University, Praha 2, Czech Republic
| | - Kristýna Boušová
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 6, Czech Republic
| | - Josef Šulc
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 6, Czech Republic
- Faculty of Natural Sciences, Charles University, Praha 2, Czech Republic
| | | | - Jiri Vondrasek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 6, Czech Republic
| |
|
21
|
Peng CX, Zhou XG, Xia YH, Liu J, Hou MH, Zhang GJ. Structural analogue-based protein structure domain assembly assisted by deep learning. Bioinformatics 2022; 38:4513-4521. [PMID: 35962986 DOI: 10.1093/bioinformatics/btac553] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/06/2022] [Revised: 07/27/2022] [Accepted: 08/08/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION With the breakthrough of AlphaFold2, the protein structure prediction problem has made remarkable progress through deep learning end-to-end techniques, in which correct folds could be built for nearly all single-domain proteins. However, full-chain modelling appears to have lower average accuracy than modelling of the constituent domains and places higher demands on computing hardware, indicating that the performance of full-chain modelling still needs to be improved. In this study, we investigate whether the predicted accuracy of the full-chain model can be further improved by domain assembly assisted by deep learning. RESULTS In this article, we developed a structural analogue-based protein structure domain assembly method assisted by deep learning, named SADA. In SADA, a multi-domain protein structure database was constructed for full-chain analogue detection using individual domain models. Starting from the initial model constructed from the analogue, the domain assembly simulation was performed to generate the full-chain model through a two-stage differential evolution algorithm guided by an energy function with an inter-residue distance potential predicted by deep learning. SADA was compared with the state-of-the-art domain assembly methods on 356 benchmark proteins, and the average TM-score of SADA models is 8.1% and 27.0% higher than that of DEMO and AIDA, respectively. We also assembled 293 human multi-domain proteins, where the average TM-score of the full-chain model after assembly by SADA is 1.1% higher than that of the model by AlphaFold2. To conclude, we find that domains often interact in a similar way in their quaternary orientations if they have similar tertiary structures. Furthermore, homologous templates and structural analogues are complementary for multi-domain protein full-chain modelling. AVAILABILITY AND IMPLEMENTATION http://zhanglab-bioinf.com/SADA.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
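The TM-score used for these comparisons is a length-normalised similarity measure. The sketch below evaluates the standard TM-score formula for one fixed superposition; the full TM-score additionally maximises over superpositions, which is omitted here. The function names are illustrative, and the lower bound of 0.5 Å on d0 mirrors a convention in common implementations and should be read as an assumption.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (Å): 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        return 0.5  # assumed floor for very short chains
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)  # assumed floor, as in common implementations

def tm_score_fixed_superposition(distances, target_length):
    """TM-score term for one given superposition.

    distances: deviations (Å) of the aligned residue pairs; residues of the
    target that are not aligned contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical cases for a 100-residue target.
identical = tm_score_fixed_superposition([0.0] * 100, 100)
half_aligned = tm_score_fixed_superposition([0.0] * 50, 100)
```

Because the sum is divided by the target length rather than the number of aligned residues, a model that covers only half of the chain perfectly scores 0.5 rather than 1.0.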
Affiliation(s)
- Chun-Xiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Xiao-Gen Zhou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Yu-Hao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jun Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Ming-Hua Hou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| |
|
22
|
A New Structural Model of Apolipoprotein B100 Based on Computational Modeling and Cross Linking. Int J Mol Sci 2022; 23:11480. [PMID: 36232786 PMCID: PMC9569473 DOI: 10.3390/ijms231911480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 09/17/2022] [Accepted: 09/18/2022] [Indexed: 12/02/2022] Open
Abstract
ApoB-100 is a member of a large lipid transfer protein superfamily and is one of the main apolipoproteins found on low-density lipoprotein (LDL) and very low-density lipoprotein (VLDL) particles. Despite its clinical significance for the development of cardiovascular disease, there is limited information on apoB-100 structure. We have developed a novel method based on the “divide and conquer” algorithm, using PSIPRED software, by dividing apoB-100 into five subunits and 11 domains. Models of each domain were prepared using I-TASSER, DEMO, RoseTTAFold, Phyre2, and MODELLER. Subsequently, we used disuccinimidyl sulfoxide (DSSO), a new mass spectrometry cleavable cross-linker, and the known position of disulfide bonds to experimentally validate each model. We obtained 65 unique DSSO cross-links, of which 87.5% were within a 26 Å threshold in the final model. We also evaluated the positions of cysteine residues involved in the eight known disulfide bonds in apoB-100, and each pair was measured within the expected 5.6 Å constraint. Finally, multiple domains were combined by applying constraints based on detected long-range DSSO cross-links to generate five subunits, which were subsequently merged to achieve an uninterrupted architecture for apoB-100 around a lipoprotein particle. Moreover, the dynamics of apoB-100 during particle size transitions was examined by comparing VLDL and LDL computational models and using experimental cross-linking data. In addition, the proposed model of receptor ligand binding of apoB-100 provides new insights into some of its functions.
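The cross-link validation described in this abstract amounts to checking what fraction of cross-linked residue pairs lie within a distance threshold in the model. A minimal sketch of that check follows; the function name, the residue numbering, and the toy coordinates are hypothetical, and only the 26 Å threshold is taken from the abstract.

```python
import math

def fraction_within_threshold(coords, crosslinks, threshold=26.0):
    """Fraction of cross-linked residue pairs whose distance is <= threshold (Å).

    coords: mapping from residue number to an (x, y, z) position.
    crosslinks: list of (residue_i, residue_j) pairs.
    """
    if not crosslinks:
        raise ValueError("at least one cross-link is required")
    satisfied = sum(
        1 for i, j in crosslinks if math.dist(coords[i], coords[j]) <= threshold
    )
    return satisfied / len(crosslinks)

# Hypothetical positions for three cross-linked residues.
model_coords = {10: (0.0, 0.0, 0.0), 25: (20.0, 0.0, 0.0), 300: (0.0, 30.0, 0.0)}
links = [(10, 25), (10, 300)]
satisfied_fraction = fraction_within_threshold(model_coords, links)
```

The same function can be reused with a tighter threshold for other distance restraints, for example the 5.6 Å constraint the abstract cites for disulfide-bonded cysteine pairs.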
|
23
|
Murph M, Singh S, Schvarzstein M. A combined in silico and in vivo approach to the structure-function annotation of SPD-2 provides mechanistic insight into its functional diversity. Cell Cycle 2022; 21:1958-1979. [PMID: 35678569 PMCID: PMC9415446 DOI: 10.1080/15384101.2022.2078458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2021] [Revised: 04/10/2022] [Accepted: 05/04/2022] [Indexed: 11/03/2022] Open
Abstract
Centrosomes are organelles that function as hubs of microtubule nucleation and organization, with key roles in organelle positioning, asymmetric cell division, ciliogenesis, and signaling. Aberrant centrosome number, structure or function is linked to neurodegenerative diseases, developmental abnormalities, ciliopathies, and tumor development. A major regulator of centrosome biogenesis and function in C. elegans is the conserved Spindle-defective protein 2 (SPD-2), a homolog of the human CEP-192 protein. CeSPD-2 is required for centrosome maturation, centriole duplication, spindle assembly and possibly cell polarity establishment. Despite its importance, the specific molecular mechanism of CeSPD-2 regulation and function is poorly understood. Here, we combined computational analysis with cell biology approaches to uncover possible structure-function relationships of CeSPD-2 that may shed mechanistic light on its function. Domain prediction analysis corroborated and refined previously identified coiled-coils and ASH (Aspm-SPD-2 Hydin) domains and identified new domains: a GEF domain, an Ig-like domain, and a PDZ-like domain. In addition to these predicted structural features, CeSPD-2 is also predicted to be intrinsically disordered. Surface electrostatic maps identified a large basic region unique to the ASH domain of CeSPD-2. This basic region overlaps with most of the residues predicted to be involved in protein-protein interactions. In vivo, ASH::GFP localized to centrosomes and centrosome-associated microtubules. Our analysis groups ASH domains, PapD, Usher chaperone domains, and Major Sperm Protein (MSP) domains into a single superfold within the larger Immunoglobulin superfamily. 
This study lays the groundwork for designing rational hypothesis-based experiments to uncover the mechanisms of CeSPD-2 function in vivo.
Abbreviations: AIR, Aurora kinase; ASH, Aspm-SPD-2 Hydin; ASP, Abnormal Spindle Protein; ASPM, Abnormal Spindle-like Microcephaly-associated Protein; CC, coiled-coil; CDK, Cyclin-dependent Kinase; Ce, Caenorhabditis elegans; CEP, Centrosomal Protein; CPAP, centrosomal P4.1-associated protein; D, Drosophila; GAP, GTPase activating protein; GEF, GTPase guanine nucleotide exchange factor; Hs, Homo sapiens/Human; Ig, Immunoglobulin; MAP, Microtubule associated Protein; MSP, Major Sperm Protein; MDP, Major Sperm Domain-Containing Protein; OCRL-1, Golgi endocytic trafficking protein Inositol polyphosphate 5-phosphatase; PAR, abnormal embryonic PARtitioning of the cytosol; PCM, Pericentriolar material; PCMD, pericentriolar matrix deficient; PDZ, PSD95/Dlg-1/zo-1; PLK, Polo like kinase; RMSD, Root Mean Square Deviation; SAS, Spindle assembly abnormal proteins; SPD, Spindle-defective protein; TRAPP, TRAnsport Protein Particle; Xe, Xenopus; ZYG, zygote defective protein.
Affiliation(s)
- Mikaela Murph
- Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
| | - Shaneen Singh
- Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
- Department of Biology, The Graduate Center at City University of New York, New York, NY, USA
- Department of Biochemistry, The Graduate Center at City University of New York, New York, NY, USA
| | - Mara Schvarzstein
- Department of Biology, City University of New York, Brooklyn College, New York, NY, USA
- Department of Biology, The Graduate Center at City University of New York, New York, NY, USA
- Department of Biochemistry, The Graduate Center at City University of New York, New York, NY, USA
|
24
|
Zhang C, Shine M, Pyle AM, Zhang Y. US-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes. Nat Methods 2022; 19:1109-1115. [PMID: 36038728 DOI: 10.1038/s41592-022-01585-1] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 01/24/2022] [Accepted: 07/19/2022] [Indexed: 11/09/2022]
Abstract
Structure comparison and alignment are of fundamental importance in structural biology studies. We developed the first universal platform, US-align, to uniformly align monomer and complex structures of different macromolecules-proteins, RNAs and DNAs. The pipeline is built on a uniform TM-score objective function coupled with a heuristic alignment searching algorithm. Large-scale benchmarks demonstrated consistent advantages of US-align over state-of-the-art methods in pairwise and multiple structure alignments of different molecules. Detailed analyses showed that the main advantage of US-align lies in the extensive optimization of the unified objective function powered by efficient heuristic search iterations, which substantially improve the accuracy and speed of the structural alignment process. Meanwhile, the universal protocol fusing different molecular and structural types helps facilitate the heterogeneous oligomer structure comparison and template-based protein-protein and protein-RNA/DNA docking.
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT, USA; Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Morgan Shine
- Yale Combined Program in the Biological and Biomedical Sciences, Yale University, New Haven, CT, USA
| | - Anna Marie Pyle
- Howard Hughes Medical Institute, Chevy Chase, MD, USA; Yale Combined Program in the Biological and Biomedical Sciences, Yale University, New Haven, CT, USA; Department of Chemistry, Yale University, New Haven, CT, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
|
25
|
I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc 2022; 17:2326-2353. [PMID: 35931779 DOI: 10.1038/s41596-022-00728-0] [Citation(s) in RCA: 96] [Impact Index Per Article: 48.0] [Received: 02/04/2022] [Accepted: 05/24/2022] [Indexed: 01/17/2023]
Abstract
Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.
|
26
|
Wang C, Wang Y, Chen J, Liu L, Yang M, Li Z, Wang C, Pichersky E, Xu H. Synthesis of 4-methylvaleric acid, a precursor of pogostone, involves a 2-isobutylmalate synthase related to 2-isopropylmalate synthase of leucine biosynthesis. THE NEW PHYTOLOGIST 2022; 235:1129-1145. [PMID: 35485988 DOI: 10.1111/nph.18186] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/02/2022] [Accepted: 04/19/2022] [Indexed: 06/14/2023]
Abstract
We show here that the side chain of pogostone, one of the major components of patchouli oil obtained from Pogostemon cablin and possessing a variety of pharmacological activities, is derived from 4-methylvaleric acid. We also show that 4-methylvaleric acid is produced through the one-carbon α-ketoacid elongation pathway with the involvement of the key enzyme 2-isobutylmalate synthase (IBMS), a newly identified enzyme related to isopropylmalate synthase (IPMS) of leucine (Leu) biosynthesis. Site-directed mutagenesis identified Met132 in the N-terminal catalytic region as affecting the substrate specificity of PcIBMS1. Even though PcIBMS1 possesses the C-terminal domain that in IPMS serves to mediate Leu inhibition, it is insensitive to Leu. The observation of the evolution of IBMS from IPMS, as well as previously reported examples of IPMS-related genes involved in making glucosinolates in Brassicaceae, acylsugars in Solanaceae, and flavour compounds in apple, indicate that IPMS genes represent an important pool for the independent evolution of genes for specialised metabolism.
Affiliation(s)
- Chu Wang
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
| | - Ying Wang
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
| | - Jing Chen
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
| | - Lang Liu
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
| | - Mingxia Yang
- The Center for Microbes, Development and Health, Institute Pasteur of Shanghai, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, 200031, China
| | - Zhengguo Li
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
| | - Chengyuan Wang
- The Center for Microbes, Development and Health, Institute Pasteur of Shanghai, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, 200031, China
| | - Eran Pichersky
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Haiyang Xu
- School of Life Sciences, Chongqing University, Chongqing, 401331, China
- Center of Plant Functional Genomics, Institute of Advanced Interdisciplinary Studies, Chongqing University, Chongqing, 401331, China
|
27
|
Vo NNQ, Nomura Y, Kinugasa K, Takagi H, Takahashi S. Identification and Characterization of Bifunctional Drimenol Synthases of Marine Bacterial Origin. ACS Chem Biol 2022; 17:1226-1238. [PMID: 35446557 PMCID: PMC9128629 DOI: 10.1021/acschembio.2c00163] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Natural drimane-type sesquiterpenes, including drimenol, display diverse biological activities. These active compounds are distributed in plants and fungi; however, their accumulation in bacteria remains unknown. Consequently, bacterial drimane-type sesquiterpene synthases remain to be characterized. Here, we report five drimenol synthases (DMSs) of marine bacterial origin, all belonging to the haloacid dehalogenase (HAD)-like hydrolase superfamily with the conserved DDxxE motif typical of class I terpene synthases and the DxDTT motif found in class II diterpene synthases. They catalyze two continuous reactions: the cyclization of farnesyl pyrophosphate (FPP) into drimenyl pyrophosphate and dephosphorylation of drimenyl pyrophosphate into drimenol. Protein structure modeling of the characterized Aquimarina spongiae DMS (AsDMS) suggests that the FPP substrate is located within the interdomain created by the DDxxE motif of N-domain and DxDTT motif of C-domain. Biochemical analysis revealed two aspartate residues of the DDxxE motif that might contribute to the capture of the pyrophosphate moiety of FPP inside the catalytic site of AsDMS, which is essential for efficient cyclization and subsequent dephosphorylation reactions. The middle aspartate residue of the DxDTT motif is also critical for cyclization. Thus, AsDMS utilizes both motifs in the reactions. Remarkably, the unique protein architecture of AsDMS, which is characterized by the fusion of a HAD-like domain (N-domain) and a terpene synthase β domain (C-domain), significantly differentiates this new enzyme. Our findings of the first examples of bacterial DMSs suggest the biosynthesis of drimane sesquiterpenes in bacteria and shed light on the divergence of the structures and functions of terpene synthases.
Affiliation(s)
- Nhu Ngoc Quynh Vo
- Natural Product Biosynthesis Research Unit, RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Yuhta Nomura
- Biomolecular Characterization Unit, RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Kiyomi Kinugasa
- Natural Product Biosynthesis Research Unit, RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama 338-8570, Japan
| | - Hiroshi Takagi
- Natural Product Biosynthesis Research Unit, RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Shunji Takahashi
- Natural Product Biosynthesis Research Unit, RIKEN Center for Sustainable Resource Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama 338-8570, Japan
|
28
|
Zhou X, Peng C, Zheng W, Li Y, Zhang G, Zhang Y. DEMO2: Assemble multi-domain protein structures by coupling analogous template alignments with deep-learning inter-domain restraint prediction. Nucleic Acids Res 2022; 50:W235-W245. [PMID: 35536281 PMCID: PMC9252800 DOI: 10.1093/nar/gkac340] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/23/2022] [Revised: 04/13/2022] [Accepted: 04/22/2022] [Indexed: 01/19/2023] Open
Abstract
Most proteins in nature contain multiple folding units (or domains). The revolutionary success of AlphaFold2 in single-domain structure prediction showed potential to extend deep-learning techniques for multi-domain structure modeling. This work presents a significantly improved method, DEMO2, which integrates analogous template structural alignments with deep-learning techniques for high-accuracy domain structure assembly. Starting from individual domain models, inter-domain spatial restraints are first predicted with deep residual convolutional networks, where full-length structure models are assembled using L-BFGS simulations under the guidance of a hybrid energy function combining deep-learning restraints and analogous multi-domain template alignments searched from the PDB. The output of DEMO2 contains deep-learning inter-domain restraints, top-ranked multi-domain structure templates, and up to five full-length structure models. DEMO2 was tested on a large-scale benchmark and the blind CASP14 experiment, where DEMO2 was shown to significantly outperform its predecessor and the state-of-the-art protein structure prediction methods. By integrating with new deep-learning techniques, DEMO2 should help fill the rapidly increasing gap between the improved ability of tertiary structure determination and the high demand for the high-quality multi-domain protein structures. The DEMO2 server is available at https://zhanggroup.org/DEMO/.
Affiliation(s)
- Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Chunxiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
|
29
|
Zheng W, Wuyun Q, Zhou X, Li Y, Freddolino PL, Zhang Y. LOMETS3: integrating deep learning and profile alignment for advanced protein template recognition and function annotation. Nucleic Acids Res 2022; 50:W454-W464. [PMID: 35420129 PMCID: PMC9252734 DOI: 10.1093/nar/gkac248] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/06/2022] [Revised: 03/29/2022] [Accepted: 03/31/2022] [Indexed: 11/25/2022] Open
Abstract
Deep learning techniques have significantly advanced the field of protein structure prediction. LOMETS3 (https://zhanglab.ccmb.med.umich.edu/LOMETS/) is a new generation meta-server approach to template-based protein structure prediction and function annotation, which integrates newly developed deep learning threading methods. For the first time, we have extended LOMETS3 to handle multi-domain proteins and to construct full-length models with gradient-based optimizations. Starting from a FASTA-formatted sequence, LOMETS3 performs four steps of domain boundary prediction, domain-level template identification, full-length template/model assembly and structure-based function prediction. The output of LOMETS3 contains (i) top-ranked templates from LOMETS3 and its component threading programs, (ii) up to 5 full-length structure models constructed by L-BFGS (limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm) optimization, (iii) the 10 closest Protein Data Bank (PDB) structures to the target, (iv) structure-based functional predictions, (v) domain partition and assembly results, and (vi) the domain-level threading results, including items (i)–(iii) for each identified domain. LOMETS3 was tested in large-scale benchmarks and the blind CASP14 (14th Critical Assessment of Structure Prediction) experiment, where the overall template recognition and function prediction accuracy is significantly beyond its predecessors and other state-of-the-art threading approaches, especially for hard targets without homologous templates in the PDB. Based on the improved developments, LOMETS3 should help significantly advance the capability of broader biomedical community for template-based protein structure and function modelling.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Qiqige Wuyun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
|
30
|
Zhou X, Li Y, Zhang C, Zheng W, Zhang G, Zhang Y. Progressive assembly of multi-domain protein structures from cryo-EM density maps. NATURE COMPUTATIONAL SCIENCE 2022; 2:265-275. [PMID: 35844960 PMCID: PMC9281201 DOI: 10.1038/s43588-022-00232-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/19/2021] [Accepted: 03/21/2022] [Indexed: 05/20/2023]
Abstract
Progress in cryo-electron microscopy has provided the potential for large-size protein structure determination. However, the success rate for solving multi-domain proteins remains low because of the difficulty in modelling inter-domain orientations. Here we developed domain enhanced modeling using cryo-electron microscopy (DEMO-EM), an automatic method to assemble multi-domain structures from cryo-electron microscopy maps through a progressive structural refinement procedure combining rigid-body domain fitting and flexible assembly simulations with deep-neural-network inter-domain distance profiles. The method was tested on a large-scale benchmark set of proteins containing up to 12 continuous and discontinuous domains with medium- to low-resolution density maps, where DEMO-EM produced models with correct inter-domain orientations (template modeling score (TM-score) >0.5) for 97% of cases and outperformed state-of-the-art methods. DEMO-EM was applied to the severe acute respiratory syndrome coronavirus 2 genome and generated models with average TM-score and root-mean-square deviation of 0.97 and 1.3 Å, respectively, with respect to the deposited structures. These results demonstrate an efficient pipeline that enables automated and reliable large-scale multi-domain protein structure modelling from cryo-electron microscopy maps.
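For readers unfamiliar with the TM-score cutoff quoted in this abstract, the standard TM-score definition can be sketched in a few lines of Python. The sketch assumes the two structures are already superposed and that per-residue distances for the aligned pairs are given; real tools such as TM-score or US-align additionally search for the superposition that maximizes the score. The function names and the fallback for very short chains are assumptions of this sketch.

```python
import numpy as np

def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) of the TM-score.

    The 0.5 angstrom floor for short chains is an assumption of this
    sketch, added to keep the cube root well defined.
    """
    if target_length <= 15:
        return 0.5
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)

def tm_score(distances, target_length):
    """TM-score for a FIXED superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    d = np.asarray(distances, dtype=float)
    d0 = tm_d0(target_length)
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / target_length)

# A perfect match scores 1.0; every pair sitting exactly d0 apart scores 0.5.
L = 100
perfect = tm_score(np.zeros(L), L)
at_d0 = tm_score(np.full(L, tm_d0(L)), L)
print(perfect, at_d0)
```

Because unaligned residues contribute nothing to the sum while the normalization stays at the full reference length, partial alignments are penalized automatically.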
Affiliation(s)
- Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
- Correspondence and requests for materials should be addressed to Yang Zhang.
|
31
|
Guo SS, Liu J, Zhou XG, Zhang GJ. DeepUMQA: ultrafast shape recognition-based protein model quality assessment using deep learning. Bioinformatics 2022; 38:1895-1903. [PMID: 35134108 DOI: 10.1093/bioinformatics/btac056] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/02/2021] [Revised: 12/26/2021] [Accepted: 01/27/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Protein model quality assessment is a key component of protein structure prediction. In recent research, the voxelization feature was used to characterize the local structural information of residues, but it may be insufficient for describing residue-level topological information. Designing features that can further reflect residue-level topology when combined with deep learning methods is therefore crucial to improve the performance of model quality assessment. RESULTS: We developed a deep-learning method, DeepUMQA, based on Ultrafast Shape Recognition (USR) for residue-level single-model quality assessment. In the framework of the deep residual neural network, the residue-level USR feature was introduced to describe the topological relationship between the residue and the overall structure by calculating the first moment of a set of residue distance sets, and was then combined with 1D, 2D and voxelization features to assess the quality of the model. Experimental results on the CASP13 and CASP14 test datasets and the CAMEO blind test show that USR could supplement the voxelization features to comprehensively characterize residue structure information and significantly improve model assessment accuracy. The performance of DeepUMQA ranks among the top of the state-of-the-art single-model quality assessment methods, including ProQ2, ProQ3, ProQ3D, Ornate, VoroMQA, ProteinGCN, ResNetQA, QDeep, GraphQA, ModFOLD6, ModFOLD7, ModFOLD8, QMEAN3, QMEANDisCo3 and DeepAccNet. AVAILABILITY AND IMPLEMENTATION: The DeepUMQA server is freely available at http://zhanglab-bioinf.com/DeepUMQA/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
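As a rough illustration of what a "first moment of a residue distance set" can look like, the Python sketch below computes, for every residue, the mean distance to all other residues. This is one plausible reading only: the abstract does not specify which distance sets or reference points DeepUMQA uses, so the choice of Cα coordinates, the all-against-all distance set, and the function name `residue_first_moments` are assumptions.

```python
import numpy as np

def residue_first_moments(ca_coords):
    """Per-residue first moment (mean) of distances to all other residues.

    ca_coords: array-like of shape (n_residues, 3), n_residues >= 2,
               assumed here to hold C-alpha coordinates in angstroms.
    Returns an array of shape (n_residues,).
    """
    x = np.asarray(ca_coords, dtype=float)
    # Pairwise distance matrix via broadcasting: entry (i, j) is |x_i - x_j|.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    # The self-distance is zero, so dividing the row sum by n - 1
    # gives the mean over the other residues only.
    return dist.sum(axis=1) / (len(x) - 1)

# Toy chain of three collinear residues spaced 1 angstrom apart.
moments = residue_first_moments([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(moments)
```

Residues near the middle of a compact structure get small values and peripheral residues get large ones, which is the kind of topological signal a per-residue descriptor of this sort can carry.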
Affiliation(s)
- Sai-Sai Guo
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jun Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Xiao-Gen Zhou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
32
|
Feng Q, Hou M, Liu J, Zhao K, Zhang G. Construct a variable-length fragment library for de novo protein structure prediction. Brief Bioinform 2022; 23:6547572. [PMID: 35284936 DOI: 10.1093/bib/bbac086] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/03/2022] [Revised: 02/10/2022] [Accepted: 02/20/2022] [Indexed: 11/12/2022] Open
Abstract
Although remarkable achievements, such as AlphaFold2, have been made in end-to-end structure prediction, fragment libraries remain essential for de novo protein structure prediction, which can help explore and understand the protein-folding mechanism. In this work, we developed a variable-length fragment library (VFlib). In VFlib, a master structure database was first constructed from the Protein Data Bank through sequence clustering. The hidden Markov model (HMM) profile of each protein in the master structure database was generated by HHsuite, and the secondary structure of each protein was calculated by DSSP. For the query sequence, the HMM profile was first constructed. Then, variable-length fragments were retrieved from the master structure database through dynamic variable-length profile-profile comparison. A complete method for chopping the query HMM profile during this process was proposed to obtain fragments with increased diversity. Finally, secondary structure information was used to further screen the retrieved fragments to generate the final fragment library for the specific query sequence. The experimental results obtained with a set of 120 nonredundant proteins show that the global precision and coverage of the fragment library generated by VFlib were 55.04% and 94.95%, respectively, at the RMSD cutoff of 1.5 Å. Compared with the benchmark method NNMake, the global precision of our fragment library increased by 62.89% with equivalent coverage. Furthermore, the fragments generated by VFlib and NNMake were used to predict structure models through fragment assembly. Controlled experimental results demonstrate that the average TM-score of VFlib was 16.00% higher than that of NNMake.
Affiliation(s)
- Qiongqiong Feng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Minghua Hou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jun Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Kailong Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
33
|
GalaxyDomDock: An ab initio domain–domain docking web server for multi-domain protein structure prediction. J Mol Biol 2022; 434:167508. [DOI: 10.1016/j.jmb.2022.167508] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/29/2021] [Revised: 02/14/2022] [Accepted: 02/16/2022] [Indexed: 11/18/2022]
|
34
|
Peng CX, Zhou XG, Zhang GJ. De novo Protein Structure Prediction by Coupling Contact With Distance Profile. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:395-406. [PMID: 32750861 DOI: 10.1109/tcbb.2020.3000758] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
De novo protein structure prediction is a challenging problem that requires both an accurate energy function and an efficient conformation sampling method. In this study, a de novo structure prediction method, named CoDiFold, is proposed. In CoDiFold, contacts and distance profiles are organically combined into the Rosetta low-resolution energy function to improve the accuracy of energy function. As a result, the correlation between energy and root mean square deviation (RMSD) is improved. In addition, a population-based multi-mutation strategy is designed to balance the exploration and exploitation of conformation space sampling. The average RMSD of the models generated by the proposed protocol is decreased by 49.24 and 45.21 percent in the test set with 43 proteins compared with those of Rosetta and QUARK de novo protocols, respectively. The results also demonstrate that the structures predicted by proposed CoDiFold are comparable to the state-of-the-art methods for the 10 FM targets of CASP13. The source code and executable versions are freely available at http://github.com/iobio-zjut/CoDiFold.
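The RMSD figures quoted in this abstract are conventionally computed after optimal rigid-body superposition of the model onto the reference. The Python sketch below uses the Kabsch algorithm for that superposition; it is a generic illustration, not code from CoDiFold, and the function name and toy coordinates are assumptions.

```python
import numpy as np

def kabsch_rmsd(model, reference):
    """RMSD between two equally sized coordinate sets after optimal superposition.

    model, reference: array-likes of shape (n_atoms, 3) with matching atom order.
    """
    p = np.asarray(model, dtype=float)
    q = np.asarray(reference, dtype=float)
    # Remove translation by centering both sets on their centroids.
    p = p - p.mean(axis=0)
    q = q - q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix (Kabsch).
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = (rotation @ p.T).T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))

# Toy check: a rotated and translated copy of a structure has RMSD ~ 0.
reference = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0],
                      [0.0, 1.5, 1.0], [2.0, 2.0, 2.0]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
model = (rot_z @ reference.T).T + np.array([3.0, -2.0, 5.0])
rmsd_same = kabsch_rmsd(model, reference)

# Perturbing one atom produces a nonzero RMSD.
perturbed = model.copy()
perturbed[0] += np.array([1.0, 0.0, 0.0])
rmsd_perturbed = kabsch_rmsd(perturbed, reference)
print(rmsd_same, rmsd_perturbed)
```

Unlike the TM-score, RMSD weights all atoms equally, so a single badly placed region can dominate the value.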
|
35
|
Heo L, Janson G, Feig M. Physics-based protein structure refinement in the era of artificial intelligence. Proteins 2021; 89:1870-1887. [PMID: 34156124 PMCID: PMC8616793 DOI: 10.1002/prot.26161] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/07/2021] [Revised: 05/31/2021] [Accepted: 06/08/2021] [Indexed: 12/21/2022]
Abstract
Protein structure refinement is the last step in protein structure prediction pipelines. Physics-based refinement via molecular dynamics (MD) simulations has made significant progress during recent years. During CASP14, we tested a new refinement protocol based on an improved sampling strategy via MD simulations. MD simulations were carried out at an elevated temperature (360 K). An optimized use of biasing restraints and the use of multiple starting models led to enhanced sampling. The new protocol generally improved the model quality. In comparison with our previous protocols, the CASP14 protocol showed clear improvements. Our approach was successful with most initial models, many based on deep learning methods. However, we found that our approach was not able to refine machine-learning models from the AlphaFold2 group, often decreasing already high initial qualities. To better understand the role of refinement given new types of models based on machine-learning, a detailed analysis via MD simulations and Markov state modeling is presented here. We continue to find that MD-based refinement has the potential to improve AI predictions. We also identified several practical issues that make it difficult to realize that potential. Increasingly important is the consideration of inter-domain and oligomeric contacts in simulations; the presence of large kinetic barriers in refinement pathways also continues to present challenges. Finally, we provide a perspective on how physics-based refinement could continue to play a role in the future for improving initial predictions based on machine learning-based methods.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Giacomo Janson
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
|
36
|
Hou M, Peng C, Zhou X, Zhang B, Zhang G. Multi contact-based folding method for de novo protein structure prediction. Brief Bioinform 2021; 23:6445108. [PMID: 34849573 DOI: 10.1093/bib/bbab463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Revised: 09/21/2021] [Accepted: 10/10/2021] [Indexed: 11/12/2022] Open
Abstract
Meta contact, which combines different contact maps into one to improve contact prediction accuracy and effectively reduce the noise from a single contact map, is a widely used method. However, protein structure prediction using meta contact cannot fully exploit the information carried by the original contact maps. In this work, a multi contact-based folding method under the evolutionary algorithm framework, MultiCFold, is proposed. In MultiCFold, the full information of the different contact maps is used directly by populations to guide protein structure folding. In addition, noncontact is considered an effective supplement to contact information and can further assist protein folding. MultiCFold is tested on a set of 120 nonredundant proteins, and the average TM-score and average RMSD reach 0.617 and 5.815 Å, respectively. Compared with the meta contact-based method, MetaCFold, the average TM-score and average RMSD improve by 6.62% and 8.82%, respectively. In particular, the introduction of noncontact information increases the average TM-score by 6.30%. Furthermore, MultiCFold is compared with four state-of-the-art methods of CASP13 on the 24 FM targets, and results show that MultiCFold is significantly better than the other methods after the full-atom relax procedure.
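The two quality measures quoted in this abstract, RMSD and TM-score, can be illustrated with a minimal sketch. It assumes the model and the native structure are already superimposed and are given as equal-length lists of Cα coordinates in Å. The function names are hypothetical, and the floor of 0.5 Å on d0 is an assumption of this sketch. A full TM-score calculation also searches for the superposition that maximizes the score, which is omitted here, so this is not the evaluation code used by the authors.

```python
import math

def rmsd(model, native):
    """RMSD in Å between two equal-length lists of (x, y, z) points.

    No superposition is performed; the inputs are assumed to be aligned.
    """
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((m - n) ** 2
                  for pm, pn in zip(model, native)
                  for m, n in zip(pm, pn))
    return math.sqrt(squared / len(model))

def tm_score_fixed(model, native):
    """TM-score for one fixed superposition (no search over superpositions).

    Uses TM = (1/L) * sum_i 1 / (1 + (d_i / d0)^2) with
    d0 = 1.24 * (L - 15)^(1/3) - 1.8, where L is the native length.
    """
    if len(model) != len(native) or not native:
        raise ValueError("need two non-empty coordinate lists of equal length")
    length = len(native)
    d0 = 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8 if length > 15 else 0.5
    d0 = max(d0, 0.5)  # assumption of this sketch: keep d0 positive for short chains
    total = 0.0
    for pm, pn in zip(model, native):
        d = math.dist(pm, pn)  # per-residue deviation in Å
        total += 1.0 / (1.0 + (d / d0) ** 2)
    return total / length

# Toy example: a straight 100-residue chain and a copy shifted by 3 Å along x.
native = [(3.8 * i, 0.0, 0.0) for i in range(100)]
shifted = [(x + 3.0, y, z) for x, y, z in native]
print(rmsd(shifted, native), tm_score_fixed(shifted, native))
```

Because the TM-score down-weights large deviations through d0, it is less sensitive than RMSD to a few badly placed residues, which is one reason abstracts in this list report both.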
Affiliation(s)
- Minghua Hou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Chunxiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Biao Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
37
|
Born A, Soetbeer J, Breitgoff F, Henen MA, Sgourakis N, Polyhach Y, Nichols PJ, Strotz D, Jeschke G, Vögeli B. Reconstruction of Coupled Intra- and Interdomain Protein Motion from Nuclear and Electron Magnetic Resonance. J Am Chem Soc 2021; 143:16055-16067. [PMID: 34579531 DOI: 10.1021/jacs.1c06289] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/21/2022]
Abstract
Proteins composed of multiple domains allow for structural heterogeneity and interdomain dynamics that may be vital for function. Intradomain structures and dynamics can influence interdomain conformations and vice versa. However, no established structure determination method is currently available that can probe the coupling of these motions. The protein Pin1 contains separate regulatory and catalytic domains that sample "extended" and "compact" states, and ligand binding changes this equilibrium. Ligand binding and interdomain distance have been shown to impact the activity of Pin1, suggesting interdomain allostery. In order to characterize the conformational equilibrium of Pin1, we describe a novel method to model the coupling between intra- and interdomain dynamics at atomic resolution using multistate ensembles. The method uses time-averaged nuclear magnetic resonance (NMR) restraints and double electron-electron resonance (DEER) data that resolve distance distributions. While the intradomain calculation is primarily driven by exact nuclear Overhauser enhancements (eNOEs), J couplings, and residual dipolar couplings (RDCs), the relative domain distribution is driven by paramagnetic relaxation enhancement (PREs), RDCs, interdomain NOEs, and DEER. Our data support a 70:30 population of the compact and extended states in apo Pin1. A multistate ensemble describes these conformations simultaneously, with distinct conformational differences located in the interdomain interface stabilizing the compact or extended states. We also describe correlated conformations between the catalytic site and interdomain interface that may explain allostery driven by interdomain contact.
Affiliation(s)
- Alexandra Born
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, 12801 East 17th Avenue, Aurora, Colorado 80045, United States
| | - Janne Soetbeer
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, ETH-Hönggerberg, Zürich CH-8093, Switzerland
| | - Frauke Breitgoff
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, ETH-Hönggerberg, Zürich CH-8093, Switzerland
| | - Morkos A Henen
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, 12801 East 17th Avenue, Aurora, Colorado 80045, United States; Faculty of Pharmacy, Mansoura University, Mansoura 35516, Egypt
| | - Nikolaos Sgourakis
- Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
| | - Yevhen Polyhach
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, ETH-Hönggerberg, Zürich CH-8093, Switzerland
| | - Parker J Nichols
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, 12801 East 17th Avenue, Aurora, Colorado 80045, United States
| | - Dean Strotz
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, ETH-Hönggerberg, Zürich CH-8093, Switzerland
| | - Gunnar Jeschke
- Laboratory of Physical Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, ETH-Hönggerberg, Zürich CH-8093, Switzerland
| | - Beat Vögeli
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, 12801 East 17th Avenue, Aurora, Colorado 80045, United States
|
39
|
Wang L, Liu J, Xia Y, Xu J, Zhou X, Zhang G. Distance-guided protein folding based on generalized descent direction. Brief Bioinform 2021; 22:6341661. [PMID: 34355233 DOI: 10.1093/bib/bbab296] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/17/2021] [Revised: 06/30/2021] [Accepted: 07/12/2021] [Indexed: 12/25/2022] Open
Abstract
Advances in the prediction of inter-residue distances for a protein sequence have improved the accuracy of predicting the correct folds of proteins with distance information. Here, we propose a distance-guided protein folding algorithm based on generalized descent direction, named GDDfold, which achieves effective structural perturbation and potential minimization in two stages. In the global stage, a random-based direction is designed using evolutionary knowledge, which guides the conformation population to cross potential barriers and explore conformational space rapidly over a large range. In the local stage, the locally rugged potential landscape can be explored with the aid of a conjugate-based direction integrated into a specific search strategy, which can improve the exploitation ability. GDDfold is tested on 347 proteins of a benchmark set, 24 template-free modeling (FM) targets of CASP13 and 20 FM targets of CASP14. Results show that GDDfold correctly folds [template modeling (TM) score ≥ 0.5] 316 out of 347 proteins, where 65 proteins have TM scores that are greater than 0.8, and significantly outperforms Rosetta-dist (distance-assisted fragment assembly method) and L-BFGSfold (distance geometry optimization method). On CASP FM targets, GDDfold is comparable with five state-of-the-art full-version methods, namely, Quark, RaptorX, Rosetta, MULTICOM and trRosetta, in the CASP 13 and 14 server groups.
Affiliation(s)
- Liujing Wang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jun Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Yuhao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jiakang Xu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
40
|
Zheng W, Li Y, Zhang C, Zhou X, Pearce R, Bell EW, Huang X, Zhang Y. Protein structure prediction using deep learning distance and hydrogen-bonding restraints in CASP14. Proteins 2021; 89:1734-1751. [PMID: 34331351 DOI: 10.1002/prot.26193] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 04/28/2021] [Revised: 07/06/2021] [Accepted: 07/22/2021] [Indexed: 11/10/2022]
Abstract
In this article, we report 3D structure prediction results by two of our best server groups ("Zhang-Server" and "QUARK") in CASP14. These two servers were built based on the D-I-TASSER and D-QUARK algorithms, which integrated four newly developed components into the classical protein folding pipelines, I-TASSER and QUARK, respectively. The new components include: (a) a new multiple sequence alignment (MSA) collection tool, DeepMSA2, which is extended from the DeepMSA program; (b) a contact-based domain boundary prediction algorithm, FUpred, to detect protein domain boundaries; (c) a residual convolutional neural network-based method, DeepPotential, to predict multiple spatial restraints by co-evolutionary features derived from the MSA; and (d) optimized spatial restraint energy potentials to guide the structure assembly simulations. For 37 FM targets, the average TM-scores of the first models produced by D-I-TASSER and D-QUARK were 96% and 112% higher than those constructed by I-TASSER and QUARK, respectively. The data analysis indicates noticeable improvements produced by each of the four new components, especially for the newly added spatial restraints from DeepPotential and the well-tuned force field that combines spatial restraints, threading templates, and generic knowledge-based potentials. However, challenges still exist in the current pipelines. These include difficulties in modeling multi-domain proteins due to low accuracy in inter-domain distance prediction and modeling protein domains from oligomer complexes, as the co-evolutionary analysis cannot distinguish inter-chain and intra-chain distances. Specifically tuning the deep learning-based predictors for multi-domain targets and protein complexes may be helpful to address these issues.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Xiaogen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Eric W Bell
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Xiaoqiang Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, USA
|
41
|
Zheng W, Zhang C, Li Y, Pearce R, Bell EW, Zhang Y. Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Reports Methods 2021; 1:100014. [PMID: 34355210 PMCID: PMC8336924 DOI: 10.1016/j.crmeth.2021.100014] [Citation(s) in RCA: 227] [Impact Index Per Article: 75.7] [Received: 01/25/2021] [Revised: 04/22/2021] [Accepted: 05/03/2021] [Indexed: 12/23/2022]
Abstract
Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. We developed a protocol, C-I-TASSER, to integrate interresidue contact maps from deep neural-network learning with the cutting-edge I-TASSER fragment assembly simulations. Large-scale benchmark tests showed that C-I-TASSER can fold more than twice as many non-homologous proteins as I-TASSER, which does not use contacts. When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. Furthermore, it created correct folds for 85% of proteins in the SARS-CoV-2 genome, despite the high mutation rate of the virus and sparse sequence profiles. The results demonstrated the critical importance of coupling whole-genome and metagenome-based evolutionary information with optimal structure assembly simulations for solving the problem of non-homologous protein structure prediction.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Eric W. Bell
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
|
42
|
Mulnaes D, Golchin P, Koenig F, Gohlke H. TopDomain: Exhaustive Protein Domain Boundary Metaprediction Combining Multisource Information and Deep Learning. J Chem Theory Comput 2021; 17:4599-4613. [PMID: 34161735 DOI: 10.1021/acs.jctc.1c00129] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Protein domains are independent, functional, and stable structural units of proteins. Accurate protein domain boundary prediction plays an important role in understanding protein structure and evolution, as well as for protein structure prediction. Current domain boundary prediction methods differ in terms of boundary definition, methodology, and training databases resulting in disparate performance for different proteins. We developed TopDomain, an exhaustive metapredictor, that uses deep neural networks to combine multisource information from sequence- and homology-based features of over 50 primary predictors. For this purpose, we developed a new domain boundary data set termed the TopDomain data set, in which the true annotations are informed by SCOPe annotations, structural domain parsers, human inspection, and deep learning. We benchmark TopDomain against 2484 targets with 3354 boundaries from the TopDomain test set and achieve F1 scores of 78.4% and 73.8% for multidomain boundary prediction within ±20 residues and ±10 residues of the true boundary, respectively. When examined on targets from CASP11-13 competitions, TopDomain achieves F1 scores of 47.5% and 42.8% for multidomain proteins. TopDomain significantly outperforms 15 widely used, state-of-the-art ab initio and homology-based domain boundary predictors. Finally, we implemented TopDomainTMC, which accurately predicts whether domain parsing is necessary for the target protein.
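The tolerance-based F1 scores reported in this abstract can be illustrated with a minimal sketch. It assumes a greedy one-to-one matching in which each true boundary can be claimed by at most one prediction lying within ±tol residues of it. The function name and the matching rule are assumptions of this sketch and are not taken from TopDomain, whose benchmark may match boundaries differently.

```python
def boundary_f1(predicted, true, tol):
    """F1 score for domain boundary prediction with a residue tolerance.

    predicted, true: lists of boundary positions (residue indices).
    tol: a prediction counts as correct if it lies within ±tol residues of a
         true boundary that has not already been matched (assumed rule).
    """
    unmatched = sorted(true)
    true_positives = 0
    for p in sorted(predicted):
        # Pick the closest true boundary that is still unmatched.
        best = min(unmatched, key=lambda t: abs(t - p), default=None)
        if best is not None and abs(best - p) <= tol:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true) if true else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Toy example: three predicted boundaries against two true boundaries.
predicted = [100, 250, 400]
true = [110, 245]
print(boundary_f1(predicted, true, tol=20))  # looser tolerance
print(boundary_f1(predicted, true, tol=5))   # stricter tolerance
```

Tightening the tolerance can only remove matches, so the score at ±10 residues is never higher than the score at ±20 residues for the same predictions, consistent with the pairs of values quoted in the abstract.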
Affiliation(s)
- Daniel Mulnaes
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany
| | - Pegah Golchin
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany
| | - Filip Koenig
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany
| | - Holger Gohlke
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany; John von Neumann Institute for Computing (NIC), Jülich Supercomputing Centre (JSC), Institute of Biological Information Processing (IBI-7: Structural Biochemistry) & Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
|
43
|
Xia YH, Peng CX, Zhou XG, Zhang GJ. A Sequential Niche Multimodal Conformational Sampling Algorithm for Protein Structure Prediction. Bioinformatics 2021; 37:4357-4365. [PMID: 34245242 DOI: 10.1093/bioinformatics/btab500] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/22/2020] [Revised: 06/23/2021] [Accepted: 07/05/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Massive local minima on the protein energy landscape often cause traditional conformational sampling algorithms to be easily trapped in local basin regions, because they find it difficult to overcome high-energy barriers. Also, the lowest energy conformation may not correspond to the native structure due to the inaccuracy of energy models. This study investigates whether these two problems can be alleviated by a sequential niche technique without loss of accuracy. RESULTS: A sequential niche multimodal conformational sampling algorithm for protein structure prediction (SNfold) is proposed in this study. In SNfold, a derating function is designed based on the knowledge learned from the previous sampling and used to construct a series of sampling-guided energy functions. These functions then help the sampling algorithm overcome high-energy barriers and avoid the re-sampling of the explored regions. In inaccurate protein energy models, the high-energy conformation that may correspond to the native structure can be sampled with successively updated sampling-guided energy functions. The proposed SNfold is tested on 300 benchmark proteins, 24 CASP13 and 19 CASP14 FM targets. Results show that SNfold correctly folds (TM-score ≥ 0.5) 231 out of 300 proteins. In particular, compared with Rosetta restrained by distance (Rosetta-dist), SNfold achieves higher average TM-score and improves the sampling efficiency by more than 100 times. On several CASP FM targets, SNfold also shows good performance compared with four state-of-the-art servers in CASP. As a plug-in conformational sampling algorithm, SNfold can be extended to other protein structure prediction methods. AVAILABILITY: The source code and executable versions are freely available at https://github.com/iobio-zjut/SNfold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yu-Hao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
| | - Chun-Xiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
| | - Xiao-Gen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
| | - Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
|
44
|
Zhao KL, Liu J, Zhou XG, Su JZ, Zhang Y, Zhang GJ. MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure prediction. Bioinformatics 2021; 37:4350-4356. [PMID: 34185079 DOI: 10.1093/bioinformatics/btab484] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/01/2021] [Revised: 06/22/2021] [Accepted: 06/28/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: The mathematically optimal solution in computational protein folding simulations does not always correspond to the native structure, due to the imperfection of the energy force fields. There is therefore a need to search for more diverse suboptimal solutions in order to identify the states close to the native. We propose a novel multimodal optimization protocol to improve the conformation sampling efficiency and modeling accuracy of de novo protein structure folding simulations. RESULTS: A distance-assisted multimodal optimization sampling algorithm, MMpred, is proposed for de novo protein structure prediction. The protocol consists of three stages. In the first modal exploration stage, a structural similarity evaluation model DMscore is designed to control the diversity of conformations, generating a population of diverse structures in different low-energy basins. In the second modal maintaining stage, an adaptive clustering algorithm MNDcluster is proposed to divide the populations and merge the modal by adjusting the annealing temperature to locate the promising basins. In the last stage of modal exploitation, a greedy search strategy is used to accelerate the convergence of the modal. Distance constraint information is used to construct the conformation scoring model to guide sampling. MMpred is tested on 320 non-redundant proteins, where MMpred obtains models with TM-score ≥ 0.5 on 268 cases, which is 20.3% higher than that of Rosetta guided with the same distance constraints. In addition, on 320 benchmark proteins, the average TM-score of the enhanced version of MMpred (E-MMpred) is 0.732 on the best model, which is comparable to trRosetta (0.730). AVAILABILITY: The source code and executable are freely available at https://github.com/iobio-zjut/MMpred. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kai-Long Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Jun Liu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
| | - Xiao-Gen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, MI 48109-2218, USA
| | - Jian-Zhong Su
- School of Biomedical Engineering, School of Ophthalmology and Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325011, Zhejiang, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, MI 48109-2218, USA
| | - Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
45
|
Zhou Y, Chen H, Li S, Chen M. mPPI: a database extension to visualize structural interactome in a one-to-many manner. Database: The Journal of Biological Databases and Curation 2021; 2021:6307707. [PMID: 34156447 DOI: 10.1093/database/baab036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2021] [Revised: 05/10/2021] [Accepted: 05/28/2021] [Indexed: 01/02/2023]
Abstract
Protein-protein interaction (PPI) databases with structural information are useful for investigating biological functions at both systematic and atomic levels. However, most existing PPI databases only curate binary interactomes. Related databases and resources are summarized from the perspective of the display and function of PPIs, as well as the structural binding interface. We developed a database extension, named mPPI, for PPI structural visualization. Compared with existing structural interactomes that curate resolved PPI conformations in pairs, mPPI can visualize a target protein and its multiple interactors simultaneously, which facilitates multi-target drug discovery and structure prediction of protein macro-complexes. By employing a protein-protein docking algorithm, mPPI largely extends the coverage of the structural interactome beyond experimentally resolved complexes. mPPI is designed to be a customizable and convenient plugin for PPI databases. It has wide potential applications for various PPI databases, and it has been used for a neurodegenerative disease-related PPI database as a demonstration. Scripts and implementation guidelines of mPPI are documented at the database tool website. Database URL: http://bis.zju.edu.cn/mppi/.
Affiliation(s)
- Yekai Zhou
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China; Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
| | - Hongjun Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
| | - Sida Li
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ming Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou 310058, China; Bioinformatics Center, The First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou 310058, China
|
46
|
Basit A, Qadir S, Qureshi S, Rehman SU. Cloning and expression analysis of fused holin-endolysin from RL bacteriophage; Exhibits broad activity against multi drug resistant pathogens. Enzyme Microb Technol 2021; 149:109846. [PMID: 34311883 DOI: 10.1016/j.enzmictec.2021.109846] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/13/2021] [Revised: 06/02/2021] [Accepted: 06/06/2021] [Indexed: 01/20/2023]
Abstract
Antibiotic resistance has become a major risk to community health over the last few years because of antibiotic overuse around the globe and a lack of new antibiotic development. Phages and their lytic enzymes are considered an effective alternative to antibiotics for controlling drug-resistant bacterial pathogens. Endolysins are a promising class of antibacterials because of their specificity and the low chance of resistance developing in bacterial pathogens. Although a large number of endolysins have been reported against gram-positive bacteria, very few have been reported against gram-negative bacteria because of the outer membrane, which acts as a physical barrier against endolysin attack on peptidoglycan. In the current study, we expressed an endolysin (RL_Lys) and a holin fused at the N terminus of the endolysin (RL_Hlys) from the RL phage infecting multidrug-resistant (MDR) Pseudomonas aeruginosa. Both endolysin variants were found to be active against a wide range of MDR strains of P. aeruginosa, Klebsiella pneumoniae, Salmonella sp. and methicillin-resistant Staphylococcus aureus (MRSA). A broth reduction assay showed that RL_Hlys is more active than RL_Lys because of the holin, which assists endolysin access to the cell wall. The protein-ligand docking and molecular dynamics simulation results showed that the C-terminal region of the endolysin plays a vital role in cell wall binding and that, even in the absence of holin, the endolysin hydrolyzes a broad range of gram-negative bacterial pathogens. The significant activity of RL_Lys and RL_Hlys against a broad range of MDR gram-negative and gram-positive bacterial pathogens makes them good candidates as antibiotic alternatives.
Affiliation(s)
- Abdul Basit
- Institute of Microbiology and Molecular Genetics, University of the Punjab, Lahore, 54590, Pakistan.
| | - Sania Qadir
- Institute of Microbiology and Molecular Genetics, University of the Punjab, Lahore, 54590, Pakistan.
| | - Sara Qureshi
- Institute of Microbiology and Molecular Genetics, University of the Punjab, Lahore, 54590, Pakistan.
| | - Shafiq Ur Rehman
- Institute of Microbiology and Molecular Genetics, University of the Punjab, Lahore, 54590, Pakistan.
|
47
|
Abstract
Usher syndrome type 1B (USH1B) is a genetic disorder caused by mutations in the unconventional Myosin VIIa (MYO7A) protein. USH1B is characterized by hearing loss due to abnormalities in the inner ear and vision loss due to retinitis pigmentosa. Here, we present a model of the human MYO7A homodimer, built using homology modeling and refined using 5 ns of molecular dynamics in water. Global computational mutagenesis was applied to evaluate the effect of missense mutations that are critical for maintaining protein structure and stability of MYO7A in inherited eye disease. We found that 43.26% (77 out of 178 in HGMD) and 41.9% (221 out of 528 in ClinVar) of the disease-related missense mutations were associated with higher protein structure destabilizing effects. Overall, most mutations destabilizing the MYO7A protein were found to be associated with USH1 and USH1B. In particular, the motor domain and MyTH4 domains were found to be most susceptible to mutations causing the USH1B phenotype. Our work contributes to the understanding of inherited disease at the atomic level of protein structure and to the analysis of the impact of genetic mutations on protein stability and genotype-to-phenotype relationships in human disease.
Affiliation(s)
- Annapurna Kuppa
  - Ophthalmic Genetics and Visual Function Branch, National Eye Institute, National Institutes of Health, Bethesda, United States
- Yuri V Sergeev
  - Ophthalmic Genetics and Visual Function Branch, National Eye Institute, National Institutes of Health, Bethesda, United States
48
Hansen EB, Marcatili P. Modeled Structure of the Cell Envelope Proteinase of Lactococcus lactis. Front Bioeng Biotechnol 2021; 8:613986. [PMID: 33415101 PMCID: PMC7783315 DOI: 10.3389/fbioe.2020.613986] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/04/2020] [Accepted: 12/02/2020] [Indexed: 11/23/2022] Open
Abstract
The cell envelope proteinase (CEP) of Lactococcus lactis is a large extracellular protease covalently linked to the peptidoglycan of the cell wall. Strains of L. lactis are typically auxotrophic for several amino acids, and they need an extracellular protease in order to grow to high cell densities in milk. The structure of the entire CEP enzyme is difficult to determine experimentally because of its large size and its attachment to the cell surface. Here we describe the use of a combination of structure prediction tools to create a structural model for the entire CEP enzyme of Lactococcus lactis. The model has implications for how the bacterium interacts with casein micelles during growth in milk and for the energetics of the proteolytic system. Our model for the CEP indicates that the catalytic triad is activated through a structural change caused by interaction with the substrate. The CEP of L. lactis might become a useful model for the mode of action of enzymes belonging to the large class of S8 proteinases with a PA (protease-associated) domain and a downstream fibronectin-like domain.
Affiliation(s)
- Egon Bech Hansen
  - National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
- Paolo Marcatili
  - Department of Health Technology, Technical University of Denmark, Kongens Lyngby, Denmark
49
Valle J, Fang X, Lasa I. Revisiting Bap Multidomain Protein: More Than Sticking Bacteria Together. Front Microbiol 2020; 11:613581. [PMID: 33424817 PMCID: PMC7785521 DOI: 10.3389/fmicb.2020.613581] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/02/2020] [Accepted: 12/03/2020] [Indexed: 12/21/2022] Open
Abstract
One of the major components of the staphylococcal biofilm is surface proteins that assemble as scaffold components of the biofilm matrix. Among the different surface proteins able to contribute to biofilm formation, this review is dedicated to the Biofilm Associated Protein (Bap). Bap is part of the accessory genome of Staphylococcus aureus, but orthologs of Bap in other staphylococcal species belong to the core genome. When present, Bap promotes adhesion to abiotic surfaces and induces strong intercellular adhesion by self-assembling into amyloid-like aggregates in response to the levels of calcium and the pH in the environment. During infection, Bap enhances adhesion to epithelial cells, where it binds directly to the host receptor Gp96 and inhibits the entry of the bacteria into the cells. To perform such a diverse range of functions, Bap comprises several domains, some of which include several motifs associated with distinct functions. Based on the knowledge accumulated on the Bap protein of S. aureus, this review aims to summarize the current knowledge of the structure and properties of each domain of Bap and their contribution to Bap functionality.
Affiliation(s)
- Jaione Valle
  - Instituto de Agrobiotecnología, CSIC-Gobierno de Navarra, Mutilva, Spain
- Xianyang Fang
  - Beijing Advanced Innovation Center for Structural Biology, School of Life Sciences, Tsinghua University, Beijing, China
- Iñigo Lasa
  - Laboratory of Microbial Pathogenesis, Navarrabiomed-Universidad Pública de Navarra-Departamento de Salud, IDISNA, Pamplona, Spain
50
Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J 2020; 18:3494-3506. [PMID: 33304450 PMCID: PMC7695898 DOI: 10.1016/j.csbj.2020.11.007] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 08/14/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/12/2022] Open
Abstract
Homology modeling is a method for building protein 3D structures from the protein primary sequence, using prior knowledge gained from structural similarities with other proteins. The homology modeling process is carried out in sequential steps: the sequence/structure alignment is optimized, then a backbone is built, and later the side chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives have allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community shares computational resources, whereas RosettaCommons is an example of an initiative where a community shares a codebase for the development of computational algorithms. Foldit is another initiative, in which participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning on contact maps was introduced into the 3D structure prediction process, adding more information and significantly increasing the accuracy of models. In this review, we take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
Affiliation(s)
- Tareq Hameduh
  - Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Yazan Haddad
  - Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
  - Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Vojtech Adam
  - Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
  - Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Zbynek Heger
  - Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
  - Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic