1
|
Han Y, Lu Y, Yan X, Cui H, Cheng S, Zheng J, Zhou Y, Wang S, Li Z. Atom-ProteinQA: Atom-level protein model quality assessment through fine-grained joint learning. Computer Methods and Programs in Biomedicine 2024; 249:108078. [PMID: 38537495 DOI: 10.1016/j.cmpb.2024.108078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 12/26/2023] [Accepted: 02/10/2024] [Indexed: 04/21/2024]
Abstract
MOTIVATION Protein model quality assessment (ProteinQA) is a fundamental task that is essential for biologically relevant applications, e.g., protein structure refinement and protein design. Previous works conducted ProteinQA only at the global-structure or per-residue level, ignoring potentially usable and precise cues from a fine-grained per-atom perspective. In this study, we propose an atom-level ProteinQA model, named Atom-ProteinQA, in which two innovative modules are designed to extract geometric and topological atom-level relationships, respectively. On the one hand, a geometric perception module exploits 3D sparse convolution to capture the geometric features of the input protein, generating fine-grained atom-level predictions. On the other hand, natural chemical bonds are used to construct an atom-level graph, and message passing in a topological perception module is then applied to output residue-level predictions in parallel. Finally, through a cross-model aggregation module, features from the different modules interact, enhancing performance at both the atom and residue levels.
RESULTS Extensive experiments show that our proposed Atom-ProteinQA outperforms previous methods by a large margin in both residue-level and atom-level assessment. Concretely, we achieved state-of-the-art performance on CATH-2084, Decoy-8000, the public benchmarks CASP13 and CASP14, and CAMEO.
AVAILABILITY The repository of this project is released at: https://github.com/luyfcandy/Atom_ProteinQA.
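The idea of passing messages along chemical bonds and then reading out residue-level values, as described in the abstract above, can be illustrated with a very small sketch. This is not the authors' architecture: the function names `message_passing_step` and `pool_to_residues`, the use of plain neighbour averaging with self-loops, and the mean pooling to residues are all assumptions chosen for illustration, and no learned weights are involved.

```python
import numpy as np

def message_passing_step(features, bonds):
    """One round of mean aggregation over bonded neighbours (hypothetical helper).

    features: (n_atoms, d) array of per-atom features.
    bonds: iterable of (i, j) atom-index pairs, treated as undirected edges.
    """
    features = np.asarray(features, dtype=float)
    n_atoms = features.shape[0]
    adjacency = np.zeros((n_atoms, n_atoms))
    for i, j in bonds:
        adjacency[i, j] = 1.0
        adjacency[j, i] = 1.0
    adjacency += np.eye(n_atoms)  # self-loops so each atom keeps its own signal
    degree = adjacency.sum(axis=1, keepdims=True)
    return adjacency @ features / degree

def pool_to_residues(atom_values, residue_index, n_residues):
    """Average per-atom values within each residue (hypothetical helper)."""
    atom_values = np.asarray(atom_values, dtype=float)
    residue_index = np.asarray(residue_index, dtype=int)
    sums = np.zeros((n_residues, atom_values.shape[1]))
    counts = np.zeros(n_residues)
    np.add.at(sums, residue_index, atom_values)  # unbuffered scatter-add
    np.add.at(counts, residue_index, 1.0)
    return sums / np.maximum(counts, 1.0)[:, None]

# Toy chain of three atoms: 0-1-2, where atoms 0 and 1 belong to residue 0.
atom_features = [[1.0], [3.0], [5.0]]
updated = message_passing_step(atom_features, bonds=[(0, 1), (1, 2)])
per_residue = pool_to_residues(updated, residue_index=[0, 0, 1], n_residues=2)
```

A trained model would replace the plain average with learned transformations and stack several such rounds; the sketch only shows how bond connectivity determines which atoms exchange information.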
Affiliation(s)
- Yatong Han
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yingfeng Lu
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Xu Yan
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Hannah Cui
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Jiayou Zheng
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yuzhe Zhou
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Sheng Wang
  - Shanghai Zelixir Biotech Company Ltd., Shanghai, 200030, China
- Zhen Li
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
|
2
|
Palamiuc L, Johnson JL, Haratipour Z, Loughran RM, Choi WJ, Arora GK, Tieu V, Ly K, Llorente A, Crabtree S, Wong JCY, Ravi A, Wiederhold T, Murad R, Blind RD, Emerling BM. Hippo and PI5P4K signaling intersect to control the transcriptional activation of YAP. Sci Signal 2024; 17:eado6266. [PMID: 38805583 DOI: 10.1126/scisignal.ado6266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 05/09/2024] [Indexed: 05/30/2024]
Abstract
Phosphoinositides are essential signaling molecules. The PI5P4K family of phosphoinositide kinases and their substrates and products, PI5P and PI4,5P2, respectively, are emerging as intracellular metabolic and stress sensors. We performed an unbiased screen to investigate the signals that these kinases relay and the specific upstream regulators controlling this signaling node. We found that the core Hippo pathway kinases MST1/2 phosphorylated PI5P4Ks and inhibited their signaling in vitro and in cells. We further showed that PI5P4K activity regulated several Hippo- and YAP-related phenotypes, specifically decreasing the interaction between the key Hippo proteins MOB1 and LATS and stimulating the YAP-mediated genetic program governing epithelial-to-mesenchymal transition. Mechanistically, we showed that PI5P interacted with MOB1 and enhanced its interaction with LATS, thereby providing a signaling connection between the Hippo pathway and PI5P4Ks. These findings reveal how these two important evolutionarily conserved signaling pathways are integrated to regulate metazoan development and human disease.
Affiliation(s)
- Jared L Johnson
  - Weill Cornell Medicine, Meyer Cancer Center, New York, NY 10021, USA
  - Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Zeinab Haratipour
  - Department of Medicine, Division of Diabetes, Endocrinology and Metabolism, Vanderbilt University Medical Center, Nashville, TN 37232, USA
  - Austin Peay State University, Clarksville, TN 37044, USA
- Woong Jae Choi
  - Department of Medicine, Division of Diabetes, Endocrinology and Metabolism, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Vivian Tieu
  - Sanford Burnham Prebys, La Jolla, CA 92037, USA
- Kyanh Ly
  - Sanford Burnham Prebys, La Jolla, CA 92037, USA
- Jenny C Y Wong
  - Weill Cornell Medicine, Meyer Cancer Center, New York, NY 10021, USA
  - Department of Cell Biology, New York University Grossman School of Medicine, New York, NY 10016, USA
- Archna Ravi
  - Sanford Burnham Prebys, La Jolla, CA 92037, USA
- Rabi Murad
  - Sanford Burnham Prebys, La Jolla, CA 92037, USA
- Raymond D Blind
  - Department of Medicine, Division of Diabetes, Endocrinology and Metabolism, Vanderbilt University Medical Center, Nashville, TN 37232, USA
|
3
|
Sawhney A, Li J, Liao L. Improving AlphaFold Predicted Contacts for Alpha-Helical Transmembrane Proteins Using Structural Features. Int J Mol Sci 2024; 25:5247. [PMID: 38791287 PMCID: PMC11121315 DOI: 10.3390/ijms25105247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 05/06/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024] Open
Abstract
Residue contact maps provide a condensed two-dimensional representation of three-dimensional protein structures, serving as a foundational framework in structural modeling but also as an effective tool in their own right in identifying inter-helical binding sites and drawing insights about protein function. Treating contact maps primarily as an intermediate step for 3D structure prediction, contact prediction methods have limited themselves exclusively to sequential features. Now that AlphaFold2 predicts 3D structures with good accuracy in general, we examine (1) how well predicted 3D structures can be directly used for deciding residue contacts, and (2) whether features from 3D structures can be leveraged to further improve residue contact prediction. With a well-known benchmark dataset, we tested predicting inter-helical residue contact based on AlphaFold2's predicted structures, which gave an 83% average precision, already outperforming a sequential features-based state-of-the-art model. We then developed a procedure to extract features from atomic structure in the neighborhood of a residue pair, hypothesizing that these features will be useful in determining if the residue pair is in contact, provided the structure is decently accurate, such as predicted by AlphaFold2. Training on features generated from experimentally determined structures, we leveraged knowledge from known structures to significantly improve residue contact prediction, when testing using the same set of features but derived using AlphaFold2 structures. Our results demonstrate a remarkable improvement over AlphaFold2, achieving over 91.9% average precision for a held-out subset and over 89.5% average precision in cross-validation experiments.
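To make the notion of deciding residue contacts directly from a 3D structure concrete, the following is a minimal sketch of a distance-thresholded contact map and a precision score against a reference map. The 8 Å cutoff, the use of one representative coordinate per residue, the `min_separation` parameter, and the function names are assumptions for illustration; the abstract does not state the contact definition or the exact averaging used for its reported precision.

```python
import numpy as np

def contact_map(coords, threshold=8.0, min_separation=1):
    """Boolean contact map from one representative coordinate per residue.

    coords: (n_residues, 3) array, e.g. C-beta positions (an assumption).
    threshold: distance cutoff in angstroms (8.0 is a common convention,
        not a value taken from the abstract).
    min_separation: minimum sequence separation |i - j| for a pair to count.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]  # pairwise displacement
    dist = np.sqrt((diff ** 2).sum(axis=-1))        # pairwise distance
    idx = np.arange(len(coords))
    separated = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    return (dist < threshold) & separated

def precision(predicted, reference):
    """Fraction of predicted contacts that are also reference contacts."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    n_predicted = predicted.sum()
    if n_predicted == 0:
        return 0.0
    return float((predicted & reference).sum() / n_predicted)

# Toy example: four residues on a line; the last one is far from the others.
predicted_coords = [[0.0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [20.0, 0, 0]]
predicted_map = contact_map(predicted_coords)
```

In practice the predicted map would come from a predicted structure and the reference map from an experimentally determined one, with the precision then averaged over a benchmark set.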
Affiliation(s)
- Aman Sawhney
  - Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE 19716, USA
- Jiefu Li
  - School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, 516 Jun Gong Road, Shanghai 200093, China
- Li Liao
  - Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE 19716, USA
|
4
|
Xie T, Huang J. Can Protein Structure Prediction Methods Capture Alternative Conformations of Membrane Transporters? J Chem Inf Model 2024; 64:3524-3536. [PMID: 38564295 DOI: 10.1021/acs.jcim.3c01936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Understanding the conformational dynamics of proteins, such as the inward-facing (IF) and outward-facing (OF) transition observed in transporters, is vital for elucidating their functional mechanisms. Despite significant advances in protein structure prediction (PSP) over the past three decades, most efforts have been focused on single-state prediction, leaving multistate or alternative conformation prediction (ACP) relatively unexplored. This discrepancy has led to the development of highly accurate PSP methods such as AlphaFold, yet their capabilities for ACP remain limited. To investigate the performance of current PSP methods in ACP, we curated a data set, named IOMemP, consisting of 32 experimentally determined high-resolution IF and OF structures of 16 membrane proteins with substantial conformational changes. We benchmarked 12 representative PSP methods, along with two recent multistate methods based on AlphaFold, against this data set. Our findings reveal a remarkably consistent preference for specific states across various PSP methods. We elucidated how coevolution information in MSAs influences state preference. Moreover, we showed that AlphaFold, when excluding coevolution information, estimated similar energies between the experimental IF and OF conformations, indicating that the energy model learned by AlphaFold is not biased toward any particular state. Our IOMemP data set and benchmark results are anticipated to advance the development of robust ACP methods.
Affiliation(s)
- Tengyu Xie
  - College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310058, China
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Jing Huang
  - College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310058, China
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
|
5
|
Jing X, Wu F, Luo X, Xu J. Single-sequence protein structure prediction by integrating protein language models. Proc Natl Acad Sci U S A 2024; 121:e2308788121. [PMID: 38507445 PMCID: PMC10990103 DOI: 10.1073/pnas.2308788121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 02/05/2024] [Indexed: 03/22/2024] Open
Abstract
Protein structure prediction has been greatly improved by deep learning in the past few years. However, the most successful methods rely on multiple sequence alignment (MSA) of the sequence homologs of the protein under prediction. In nature, a protein folds in the absence of its sequence homologs, and thus an MSA-free structure prediction method is desired. Here, we develop a single-sequence-based protein structure prediction method, RaptorX-Single, by integrating several protein language models and a structure generation module, and then study its advantage over MSA-based methods. Our experimental results indicate that, in addition to running much faster than MSA-based methods such as AlphaFold2, RaptorX-Single outperforms AlphaFold2 and other MSA-free methods in predicting the structure of antibodies (after fine-tuning on antibody data), proteins with very few sequence homologs, and single mutation effects. By comparing different protein language models, our results show that not only the scale but also the training data of protein language models will impact the performance. RaptorX-Single also compares favorably to MSA-based AlphaFold2 when the protein under prediction has a large number of sequence homologs.
Affiliation(s)
- Fandi Wu
  - MoleculeMind Ltd., Beijing 100084, China
  - Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
- Xiao Luo
  - Toyota Technological Institute at Chicago, Chicago, IL 60637
  - Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
- Jinbo Xu
  - MoleculeMind Ltd., Beijing 100084, China
  - Toyota Technological Institute at Chicago, Chicago, IL 60637
|
6
|
Jänes J, Beltrao P. Deep learning for protein structure prediction and design-progress and applications. Mol Syst Biol 2024; 20:162-169. [PMID: 38291232 PMCID: PMC10912668 DOI: 10.1038/s44320-024-00016-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 12/21/2023] [Accepted: 01/11/2024] [Indexed: 02/01/2024] Open
Abstract
Proteins are the key molecular machines that orchestrate all biological processes of the cell. Most proteins fold into three-dimensional shapes that are critical for their function. Studying the 3D shape of proteins can inform us of the mechanisms that underlie biological processes in living cells and can have practical applications in the study of disease mutations or the discovery of novel drug treatments. Here, we review the progress made in sequence-based prediction of protein structures with a focus on applications that go beyond the prediction of single monomer structures. This includes the application of deep learning methods for the prediction of structures of protein complexes, different conformations, the evolution of protein structures and the application of these methods to protein design. These developments create new opportunities for research that will have impact across many areas of biomedical research.
Affiliation(s)
- Jürgen Jänes
  - Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Pedro Beltrao
  - Institute of Molecular Systems Biology, ETH Zürich, 8093, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
7
|
Wuyun Q, Chen Y, Shen Y, Cao Y, Hu G, Cui W, Gao J, Zheng W. Recent Progress of Protein Tertiary Structure Prediction. Molecules 2024; 29:832. [PMID: 38398585 PMCID: PMC10893003 DOI: 10.3390/molecules29040832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/06/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024] Open
Abstract
The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has facilitated the rise of structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in the field of protein structure prediction for researchers, this review describes various methodologies, assessments, and databases in protein structure prediction, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to provide researchers with insights through which to understand the limitations, contexts, and effective selections of protein structure prediction methods in protein-related fields.
Affiliation(s)
- Qiqige Wuyun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yihan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Yifeng Shen
  - Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Kanagawa, Japan
- Yang Cao
  - College of Life Sciences, Sichuan University, Chengdu 610065, China
- Gang Hu
  - NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Wei Cui
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Jianzhao Gao
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
8
|
Ran C, Pu K. Molecularly generated light and its biomedical applications. Angew Chem Int Ed Engl 2024; 63:e202314468. [PMID: 37955419 DOI: 10.1002/anie.202314468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 11/01/2023] [Accepted: 11/10/2023] [Indexed: 11/14/2023]
Abstract
Molecularly generated light, referred to here as "molecular light", mainly includes bioluminescence, chemiluminescence, and Cerenkov luminescence. Molecular light possesses unique dual features of being both a molecule and a source of light. Its molecular nature enables it to be delivered as molecules to regions deep within the body, overcoming the limitations of natural sunlight and physically generated light sources like lasers and LEDs. Simultaneously, its light properties make it valuable for applications such as imaging, photodynamic therapy, photo-oxidative therapy, and photobiomodulation. In this review article, we provide an updated overview of the diverse applications of molecular light and discuss the strengths and weaknesses of molecular light across various domains. Lastly, we present forward-looking perspectives on the potential of molecular light in the realms of molecular imaging, photobiological mechanisms, therapeutic applications, and photobiomodulation. While some of these perspectives may be considered bold and contentious, our intent is to inspire further innovations in the field of molecular light applications.
Affiliation(s)
- Chongzhao Ran
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Kanyi Pu
  - School of Chemistry, Chemical Engineering and Biotechnology, Nanyang Technological University, 637459, Singapore, Singapore
  - Lee Kong Chian School of Medicine, Nanyang Technological University, 308232, Singapore, Singapore
|
9
|
Darden C, Donkor JE, Korolkova O, Barozai MYK, Chaudhuri M. Distinct structural motifs are necessary for targeting and import of Tim17 in Trypanosoma brucei mitochondrion. mSphere 2024; 9:e0055823. [PMID: 38193679 PMCID: PMC10871166 DOI: 10.1128/msphere.00558-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 11/28/2023] [Indexed: 01/10/2024] Open
Abstract
Nuclear-encoded mitochondrial proteins are correctly translocated to their proper sub-mitochondrial destination using location-specific mitochondrial targeting signals and via multi-protein import machineries (translocases) in the outer and inner mitochondrial membranes (TOM and TIMs, respectively). However, targeting signals of multi-pass Tims are less defined. Here, we report the characterization of the targeting signals of Trypanosoma brucei Tim17 (TbTim17), an essential component of the most divergent TIM complex. TbTim17 possesses a characteristic secondary structure including four predicted transmembrane (TM) domains in the center with hydrophilic N- and C-termini. After examining mitochondrial localization of various deletion and site-directed mutants of TbTim17 in T. brucei using subcellular fractionation and confocal microscopy, we located at least two internal targeting signals (ITS): (i) within TM1 (31-50 AAs) and (ii) TM4 + loop 3 (120-136 AAs). Both signals are required for proper targeting and integration of TbTim17 in the membrane. Furthermore, a positively charged residue (K122) is critical for mitochondrial localization of TbTim17. This is the first report characterizing the ITS for a multipass inner membrane protein in a divergent eukaryote such as T. brucei.
IMPORTANCE African trypanosomiasis (AT) is a deadly disease in humans and domestic animals, caused by the parasitic protozoan Trypanosoma brucei. Therefore, AT is not only a concern for human health but also for economic development in the vast area of sub-Saharan Africa. T. brucei possesses a single mitochondrion per cell that imports hundreds of nuclear-encoded mitochondrial proteins for its functions. T. brucei Tim17 (TbTim17), an essential component of the TbTIM17 complex, is a nuclear-encoded protein; thus, it must be imported from the cytosol to form the TbTIM17 complex. Here, we demonstrated that the internal targeting signals within transmembrane 1 (TM1) and TM4 with loop 3, and residue K122, are required collectively for import and integration of TbTim17 in the T. brucei mitochondrion. This information could be utilized to block TbTim17 function and parasite growth.
Affiliation(s)
- Chauncey Darden
  - Department of Biochemistry, Cancer Biology, Neuroscience, and Pharmacology, Meharry Medical College, Nashville, Tennessee, USA
- Joseph E. Donkor
  - Department of Microbiology, Immunology, and Physiology, Meharry Medical College, Nashville, Tennessee, USA
- Olga Korolkova
  - The Consolidated Research Instrumentation, Informatics, Statistics, and Learning Integration Suite (CRISALIS), Meharry Medical College, Nashville, Tennessee, USA
- Minu Chaudhuri
  - Department of Microbiology, Immunology, and Physiology, Meharry Medical College, Nashville, Tennessee, USA
|
10
|
Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024; 64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/20/2023]
Abstract
Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be conducted.
Affiliation(s)
- Chun-Xiang Peng
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Fang Liang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yu-Hao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kai-Long Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ming-Hua Hou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
|
11
|
Li J, Wang L, Zhu Z, Song C. Exploring the Alternative Conformation of a Known Protein Structure Based on Contact Map Prediction. J Chem Inf Model 2024; 64:301-315. [PMID: 38117138 PMCID: PMC10777399 DOI: 10.1021/acs.jcim.3c01381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
The rapid development of deep learning-based methods has considerably advanced the field of protein structure prediction. The accuracy of predicting the 3D structures of simple proteins is comparable to that of experimentally determined structures, providing broad possibilities for structure-based biological studies. Another critical question is whether and how multistate structures can be predicted from a given protein sequence. In this study, analysis of tens of two-state proteins demonstrated that deep learning-based contact map predictions contain structural information on both states, which suggests that it is probably appropriate to change the target of deep learning-based protein structure prediction from one specific structure to multiple likely structures. Furthermore, by combining deep learning- and physics-based computational methods, we developed a protocol for exploring alternative conformations from a known structure of a given protein, by which we successfully approached the holo-state conformations of multiple representative proteins from their apo-state structures.
Affiliation(s)
- Jiaxuan Li
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Lei Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Zefeng Zhu
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Chen Song
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
|
12
|
Krokidis MG, Dimitrakopoulos GN, Vrahatis AG, Exarchos TP, Vlamos P. Challenges and limitations in computational prediction of protein misfolding in neurodegenerative diseases. Front Comput Neurosci 2024; 17:1323182. [PMID: 38250244 PMCID: PMC10796696 DOI: 10.3389/fncom.2023.1323182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 12/19/2023] [Indexed: 01/23/2024] Open
Affiliation(s)
- Panagiotis Vlamos
  - Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, Corfu, Greece
|
13
|
Salgado B, Rivas RB, Pinto D, Sonstegard TS, Carlson DF, Martins K, Bostrom JR, Sinebo Y, Rowland RRR, Brandariz-Nuñez A. Genetically modified pigs lacking CD163 PSTII-domain-coding exon 13 are completely resistant to PRRSV infection. Antiviral Res 2024; 221:105793. [PMID: 38184111 DOI: 10.1016/j.antiviral.2024.105793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 12/18/2023] [Accepted: 01/02/2024] [Indexed: 01/08/2024]
Abstract
CD163 expressed on cell surface of porcine alveolar macrophages (PAMs) serves as a cellular entry receptor for porcine reproductive and respiratory syndrome virus (PRRSV). The extracellular portion of CD163 contains nine scavenger receptor cysteine-rich (SRCR) and two proline-serine-threonine (PST) domains. Genomic editing of pigs to remove the entire CD163 or just the SRCR5 domain confers resistance to infection with both PRRSV-1 and PRRSV-2 viruses. By performing a mutational analysis of CD163, previous in vitro infection experiments showed resistance to PRRSV infection following deletion of exon 13 which encodes the first 12 amino acids of the 16 amino acid PSTII domain. These findings predicted that removal of exon 13 can be used as a strategy to produce gene-edited pigs fully resistant to PRRSV infection. In this study, to determine whether the deletion of exon 13 is sufficient to confer resistance of pigs to PRRSV infection, we produced pigs possessing a defined CD163 exon 13 deletion (ΔExon13 pigs) and evaluated their susceptibility to viral infection. Wild type (WT) and CD163 modified pigs, placed in the same room, were infected with PRRSV-2. The modified pigs remained PCR and serologically negative for PRRSV throughout the study; whereas the WT pigs supported PRRSV infection and showed PRRSV related pathology. Importantly, our data also suggested that removal of exon 13 did not affect the main physiological function associated with CD163 in vivo. These results demonstrate that a modification of CD163 through a precise deletion of exon 13 provides a strategy for protection against PRRSV infection.
Affiliation(s)
- Brianna Salgado
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Rafael Bautista Rivas
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Derek Pinto
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Raymond R R Rowland
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Alberto Brandariz-Nuñez
- Department of Pathobiology, College of Veterinary Medicine, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
| |
|
14
|
Curatolo AI, Kimchi O, Goodrich CP, Krueger RK, Brenner MP. A computational toolbox for the assembly yield of complex and heterogeneous structures. Nat Commun 2023; 14:8328. [PMID: 38097568 PMCID: PMC10721878 DOI: 10.1038/s41467-023-43168-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Accepted: 11/02/2023] [Indexed: 12/17/2023] Open
Abstract
The self-assembly of complex structures from a set of non-identical building blocks is a hallmark of soft matter and biological systems, including protein complexes, colloidal clusters, and DNA-based assemblies. Predicting the dependence of the equilibrium assembly yield on the concentrations and interaction energies of building blocks is highly challenging, owing to the difficulty of computing the entropic contributions to the free energy of the many structures that compete with the ground state configuration. While these calculations yield well known results for spherically symmetric building blocks, they do not hold when the building blocks have internal rotational degrees of freedom. Here we present an approach for solving this problem that works with arbitrary building blocks, including proteins with known structure and complex colloidal building blocks. Our algorithm combines classical statistical mechanics with recently developed computational tools for automatic differentiation. Automatic differentiation allows efficient evaluation of equilibrium averages over configurations that would otherwise be intractable. We demonstrate the validity of our framework by comparison to molecular dynamics simulations of simple examples, and apply it to calculate the yield curves for known protein complexes and for the assembly of colloidal shells.
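As a point of reference for what an equilibrium yield curve is, the following is a minimal sketch for the simplest possible case, a heterodimer A + B <-> AB at equal total concentrations. It is not the authors' toolbox: the function name `dimer_yield`, the reduced units, and the choice to fold all entropic factors into a single `binding_energy` are assumptions made for illustration, and no automatic differentiation is used here.

```python
import math

def dimer_yield(total_conc, binding_energy, kT=1.0):
    """Equilibrium yield of AB for A + B <-> AB with equal total concentrations.

    Concentrations are measured in units of a reference concentration, so the
    equilibrium constant is K = exp(-binding_energy / kT). Folding every
    entropic factor into binding_energy is a simplifying assumption.
    """
    K = math.exp(-binding_energy / kT)
    # Mass action: K * (c - x)^2 = x with x = [AB]; keep the physical root x <= c.
    x = ((2 * K * total_conc + 1) - math.sqrt(4 * K * total_conc + 1)) / (2 * K)
    return x / total_conc

# Yield curve over four decades of concentration for a hypothetical binding energy.
concentrations = (0.01, 0.1, 1.0, 10.0)
yields = [dimer_yield(c, binding_energy=-math.log(2.0)) for c in concentrations]
```

In this toy case the yield rises monotonically with concentration and reaches one half when the product of K and the total concentration equals 2. For larger, heterogeneous structures the hard part is the partition function of each competing structure, which is the quantity the abstract describes computing.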
Affiliation(s)
- Agnese I Curatolo
- School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Ofer Kimchi
- Lewis-Sigler Institute, Princeton University, Princeton, NJ, 08544, USA
| | - Carl P Goodrich
- Institute of Science and Technology Austria, A-3400, Klosterneuburg, Austria
| | - Ryan K Krueger
- School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Michael P Brenner
- School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA.
- Department of Physics, Harvard University, Cambridge, MA, 02138, USA.
| |
|
15
|
Zhou Y, Litfin T, Zhan J. 3 = 1 + 2: how the divide conquered de novo protein structure prediction and what is next? Natl Sci Rev 2023; 10:nwad259. [PMID: 38033736 PMCID: PMC10684263 DOI: 10.1093/nsr/nwad259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 09/18/2023] [Indexed: 12/02/2023] Open
Affiliation(s)
- Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, China
- Institute for Glycomics, Griffith University, Australia
| | - Thomas Litfin
- Institute for Glycomics, Griffith University, Australia
| | - Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, China
| |
|
16
|
Zhang J, Liu S, Chen M, Chu H, Wang M, Wang Z, Yu J, Ni N, Yu F, Chen D, Yang YI, Xue B, Yang L, Liu Y, Gao YQ. Unsupervisedly Prompting AlphaFold2 for Accurate Few-Shot Protein Structure Prediction. J Chem Theory Comput 2023; 19:8460-8471. [PMID: 37947474 DOI: 10.1021/acs.jctc.3c00528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Data-driven predictive methods that can efficiently and accurately transform protein sequences into biologically active structures are highly valuable for scientific research and medical development. Determining an accurate folding landscape using coevolutionary information is fundamental to the success of modern protein structure prediction methods. As the state of the art, AlphaFold2 has dramatically raised the accuracy without performing explicit coevolutionary analysis. Nevertheless, its performance still shows strong dependence on available sequence homologues. Based on the interrogation on the cause of such dependence, we presented EvoGen, a meta generative model, to remedy the underperformance of AlphaFold2 for poor MSA targets. By prompting the model with calibrated or virtually generated homologue sequences, EvoGen helps AlphaFold2 fold accurately in the low-data regime and even achieve encouraging performance with single-sequence predictions. Being able to make accurate predictions with few-shot MSA not only generalizes AlphaFold2 better for orphan sequences but also democratizes its use for high-throughput applications. Besides, EvoGen combined with AlphaFold2 yields a probabilistic structure generation method that could explore alternative conformations of protein sequences, and the task-aware differentiable algorithm for sequence generation will benefit other related tasks including protein design.
Affiliation(s)
- Jun Zhang
- Changping Laboratory, Beijing 102200, China
| | - Sirui Liu
- Changping Laboratory, Beijing 102200, China
| | - Mengyun Chen
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Haotian Chu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Min Wang
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Zidong Wang
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Jialiang Yu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Ningxi Ni
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Fan Yu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
| | - Dechin Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
| | - Yi Isaac Yang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
| | - Boxin Xue
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Lijiang Yang
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Yuan Liu
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Yi Qin Gao
- Changping Laboratory, Beijing 102200, China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
| |
|
17
|
Pérez S. Computational modeling of protein-carbohydrate interactions: Current trends and future challenges. Adv Carbohydr Chem Biochem 2023; 83:133-149. [PMID: 37968037 DOI: 10.1016/bs.accb.2023.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2023]
Abstract
The article leads the reader through an up-to-date presentation of the concepts, developments, and main applications of computational modeling to study protein-carbohydrate interactions. It follows with the presentation of some current issues and perspectives arising from the expected evolution of generic methodological developments in deep learning, immersive analytics, and virtual reality for molecular visualization and data management. Such methodological developments for macromolecular interactions would greatly benefit a wide range of scientific endeavors in the field of carbohydrate chemistry and biochemistry, including the following interrelated efforts dealing with highly crowded media, with examples concerning glycoside transferases, the extracellular matrix, and the exploration of interactions between complex carbohydrates and intrinsically disordered proteins.
Affiliation(s)
- Serge Pérez
- Centre de Recherches sur les Macromolécules Végétales, CNRS, Université Grenoble Alpes, Grenoble, France.
| |
|
18
|
Ramya L, Helina Hilda S. Structural dynamics of moonlighting intrinsically disordered proteins - A black box in multiple sclerosis. J Mol Graph Model 2023; 124:108572. [PMID: 37494873 DOI: 10.1016/j.jmgm.2023.108572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/19/2023] [Accepted: 07/20/2023] [Indexed: 07/28/2023]
Abstract
Multiple Sclerosis (MS) is a demyelinating disease of the central nervous system that disturbs the flow of brain signals to other parts of the body. The actual cause of the disease is still not apparent. Intrinsically disordered proteins (IDPs) play a crucial role in neurodegenerative diseases such as Alzheimer's, Lewy bodies, Parkinson's, Amyotrophic Lateral Sclerosis, and Multiple Sclerosis. In MS, the immune system is known to attack proteins such as Myelin Basic Protein (MBP), Myelin-associated Oligodendrocyte Basic Protein (MOBP), Myelin-Associated Glycoprotein (MAG), and Myelin Proteolipid Protein (PLP), which leads to the demyelination that causes MS. MBP and MOBP are both moonlighting intrinsically disordered proteins and exist between the myelin sheath, unlike MAG, which is a transmembrane protein. The main focus of the article was to examine the significant role of these proteins' intrinsically disordered regions (IDRs) in maintaining their function. Molecular dynamics simulation studies were performed to study the conformational dynamics of these protein IDRs both in water and near the myelin sheath. The results suggest that the IDR dominates the structural dynamics of these proteins and that the IDR in both proteins is responsible for their interaction with the myelin sheath. Interestingly, the known epitopes MBP83-96 and MOBP65-87 in the IDR have no interaction with the myelin sheath. Thus, when the protein remains intrinsically disordered, it maintains proper function and myelin integrity; if it adopts a fold, the region is identified as an epitope by the immune system, leading to the demyelination that causes MS.
Affiliation(s)
- L Ramya
- Department of Bioinformatics, School of Chemical and Biotechnology, SASTRA Deemed University, Thirumalaisamudram, Thanjavur, 613401, Tamil Nadu, India.
| | - S Helina Hilda
- Department of Bioinformatics, School of Chemical and Biotechnology, SASTRA Deemed University, Thirumalaisamudram, Thanjavur, 613401, Tamil Nadu, India
| |
|
19
|
Larrea-Sebal A, Jebari-Benslaiman S, Galicia-Garcia U, Jose-Urteaga AS, Uribe KB, Benito-Vicente A, Martín C. Predictive Modeling and Structure Analysis of Genetic Variants in Familial Hypercholesterolemia: Implications for Diagnosis and Protein Interaction Studies. Curr Atheroscler Rep 2023; 25:839-859. [PMID: 37847331 PMCID: PMC10618353 DOI: 10.1007/s11883-023-01154-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/15/2023] [Indexed: 10/18/2023]
Abstract
PURPOSE OF REVIEW Familial hypercholesterolemia (FH) is a hereditary condition characterized by elevated levels of low-density lipoprotein cholesterol (LDL-C), which increases the risk of cardiovascular disease if left untreated. This review aims to discuss the role of bioinformatics tools in evaluating the pathogenicity of missense variants associated with FH. Specifically, it highlights the use of predictive models based on protein sequence, structure, evolutionary conservation, and other relevant features in identifying genetic variants within LDLR, APOB, and PCSK9 genes that contribute to FH. RECENT FINDINGS In recent years, various bioinformatics tools have emerged as valuable resources for analyzing missense variants in FH-related genes. Tools such as REVEL, Varity, and CADD use diverse computational approaches to predict the impact of genetic variants on protein function. These tools consider factors such as sequence conservation, structural alterations, and receptor binding to aid in interpreting the pathogenicity of identified missense variants. While these predictive models offer valuable insights, the accuracy of predictions can vary, especially for proteins with unique characteristics that might not be well represented in the databases used for training. This review emphasizes the significance of utilizing bioinformatics tools for assessing the pathogenicity of FH-associated missense variants. Despite their contributions, a definitive diagnosis of a genetic variant necessitates functional validation through in vitro characterization or cascade screening. This step ensures the precise identification of FH-related variants, leading to more accurate diagnoses. Integrating genetic data with reliable bioinformatics predictions and functional validation can enhance our understanding of the genetic basis of FH, enabling improved diagnosis, risk stratification, and personalized treatment for affected individuals. 
The comprehensive approach outlined in this review promises to advance the management of this inherited disorder, potentially leading to better health outcomes for those affected by FH.
Affiliation(s)
- Asier Larrea-Sebal
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- Fundación Biofisika Bizkaia, 48940, Leioa, Spain
| | - Shifa Jebari-Benslaiman
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - Unai Galicia-Garcia
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - Ane San Jose-Urteaga
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
| | - Kepa B Uribe
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
| | - Asier Benito-Vicente
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
| | - César Martín
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain.
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain.
| |
|
20
|
Sawhney A, Li J, Liao L. Improving AlphaFold predicted contacts in alpha-helical transmembrane proteins structures using structural features. RESEARCH SQUARE 2023:rs.3.rs-3475769. [PMID: 37961476 PMCID: PMC10635369 DOI: 10.21203/rs.3.rs-3475769/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Background Residue contact maps offer a 2-D reduced representation of 3-D protein structures and constitute a structural constraint and scaffold in structural modeling. In addition, contact maps are an effective tool for identifying interhelical binding sites and drawing insights about protein function. While most works predict contact maps using features derived from sequences, we believe information from known structures can be leveraged to improve prediction for unknown structures where decent approximate structures, such as ones predicted by AlphaFold2, are available. Results AlphaFold2's predicted structures are found to be quite accurate at the inter-helical residue contact prediction task, achieving 83% average precision. We adopt an unconventional approach, extracting features from atomic structures in the neighborhood of a residue pair and using them to predict residue contacts. We trained on features derived from experimentally determined structures and predicted on features derived from AlphaFold2's predicted structures. Our results demonstrate a remarkable improvement over AlphaFold2, achieving over 91.9% average precision for held-out and over 89.5% average precision in cross-validation experiments. Conclusion Training on features generated from experimentally determined structures, we were able to leverage knowledge from known structures to significantly improve the contacts predicted using AlphaFold2 structures. We demonstrated that using coordinates directly (instead of the proposed features) does not lead to an improvement in contact prediction performance.
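To make the idea of reading contacts off a structure concrete, here is a minimal sketch that derives a binary contact map from Cα coordinates and scores a predicted map against a reference one. The 8 Å cutoff, the sequence-separation filter, the toy coordinates, and the function names are assumptions for illustration. The score shown is plain precision at a fixed cutoff, not the average precision reported in the abstract, and none of this reproduces the authors' feature-based model.

```python
import math

def contact_map(ca_coords, cutoff=8.0, min_separation=2):
    """Binary contact map from C-alpha coordinates (cutoff in angstroms, assumed)."""
    n = len(ca_coords)
    contacts = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts[i][j] = contacts[j][i] = 1
    return contacts

def contact_precision(predicted, reference):
    """Fraction of predicted contacts (i < j) that are also reference contacts."""
    n = len(predicted)
    predicted_pairs = [(i, j) for i in range(n) for j in range(i + 1, n) if predicted[i][j]]
    if not predicted_pairs:
        return 0.0
    correct = sum(1 for i, j in predicted_pairs if reference[i][j])
    return correct / len(predicted_pairs)

# Hypothetical four-residue toy structures: an "experimental" chain and a "predicted" one.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.0, 3.0, 0.0)]

precision = contact_precision(contact_map(predicted), contact_map(experimental))
```

In the toy data the predicted chain bends its last residue back toward the first, which creates one contact that the reference structure does not have, so two of its three predicted contacts are correct.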
Affiliation(s)
- Aman Sawhney
- Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE, 19716, United States
| | - Jiefu Li
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, 516 Jun Gong Road, Shanghai 200093, P. R. China
| | - Li Liao
- Department of Computer and Information Sciences, University of Delaware, Smith Hall, 18 Amstel Avenue, Newark, DE, 19716, United States
| |
|
21
|
Fathollahi M, Motamedi H, Hossainpour H, Abiri R, Shahlaei M, Moradi S, Dashtbin S, Moradi J, Alvandi A. Designing a novel multi-epitopes pan-vaccine against SARS-CoV-2 and seasonal influenza: in silico and immunoinformatics approach. J Biomol Struct Dyn 2023:1-24. [PMID: 37723861 DOI: 10.1080/07391102.2023.2258420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 09/07/2023] [Indexed: 09/20/2023]
Abstract
The merger of COVID-19 and seasonal influenza infections is considered a potentially serious threat to public health. These two viral agents can cause extensive and severe lower and upper respiratory tract infections with lung damage with host factors. Today, the development of vaccination has been shown to reduce the risk of hospitalization and mortality from the COVID-19 virus and influenza epidemics. Therefore, this study contributes to an immunoinformatics approach to producing a vaccine that can elicit strong and specific immune responses against COVID-19 and influenza A and B viruses. The NCBI, GISAID, and UniProt databases were used to retrieve sequences. Linear B cell, Cytotoxic T lymphocyte, and Helper T lymphocyte epitopes were predicted using online servers. Population coverage of MHC I epitopes worldwide for SARS-CoV-2, Influenza virus H3N2, H3N2, and Yamagata/Victoria was 99.93%, 68.67%, 68.38%, and 85.45%, respectively. Candidate epitopes were linked by GGGGS, GPGPG, and KK linkers. Different epitopes were permutated several times to form different peptides and then screened for antigenicity, allergenicity, and toxicity. The vaccine construct was analyzed for physicochemical properties, conformational B-cell epitopes, interaction with Toll-like receptors, and IFN-gamma induction. The immune stimulation response of the final construct was evaluated using C-IMMSIM. Eventually, the final construct sequence was codon-optimized for Escherichia coli K12 and Homo sapiens to design a multi-epitope vaccine and mRNA vaccine. In conclusion, due to the variable nature of SARS-CoV-2 and influenza proteins, the design of a multi-epitope vaccine can protect against all their standard variants, but laboratory validation is required. Communicated by Ramaswamy H. Sarma.
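The linker step can be illustrated with a short sketch that joins a list of epitopes with one of the three linkers named in the abstract. The epitope strings below are placeholders rather than sequences from the study, the function name is hypothetical, and the abstract does not say which linker was used for which epitope class, so any pairing shown here is an assumption.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def link_epitopes(epitopes, linker):
    """Concatenate epitope peptides with a fixed linker between neighbours."""
    for peptide in list(epitopes) + [linker]:
        unknown = set(peptide) - VALID_RESIDUES
        if unknown:
            raise ValueError(f"non-standard residues in {peptide!r}: {sorted(unknown)}")
    return linker.join(epitopes)

# Placeholder peptides (hypothetical, not taken from the paper).
placeholder_epitopes = ["ACDEFGHIK", "LMNPQRSTV"]

# One candidate peptide per linker named in the abstract.
candidates = {linker: link_epitopes(placeholder_epitopes, linker)
              for linker in ("GGGGS", "GPGPG", "KK")}
```

Permuting the epitope order before joining, as the abstract describes, would amount to calling `link_epitopes` on each ordering and screening the resulting peptides downstream.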
Affiliation(s)
- Matin Fathollahi
- Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Hamid Motamedi
- Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Hadi Hossainpour
- Student Research Committee, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Ramin Abiri
- Fertility and Infertility Research Center, Research Institute for Health Technology, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Mohsen Shahlaei
- Nano Drug Delivery Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Sajad Moradi
- Nano Drug Delivery Research Center, Health Technology Institute, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Shirin Dashtbin
- Department of Microbiology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
| | - Jale Moradi
- Department of Microbiology, School of Medicine, Kermanshah University of Medical Sciences, Kermanshah, Iran
| | - Amirhooshang Alvandi
- Medical Technology Research Center, Research Institute for Health Technology, Kermanshah University of Medical Sciences, Kermanshah, Iran
| |
|
22
|
Cohen S, Schneidman-Duhovny D. A deep learning model for predicting optimal distance range in crosslinking mass spectrometry data. Proteomics 2023; 23:e2200341. [PMID: 37070547 DOI: 10.1002/pmic.202200341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 04/02/2023] [Accepted: 04/03/2023] [Indexed: 04/19/2023]
Abstract
Macromolecular assemblies play an important role in all cellular processes. While there has recently been significant progress in protein structure prediction based on deep learning, large protein complexes cannot be predicted with these approaches. The integrative structure modeling approach characterizes multi-subunit complexes by computational integration of data from fast and accessible experimental techniques. Crosslinking mass spectrometry is one such technique that provides spatial information about the proximity of crosslinked residues. One of the challenges in interpreting crosslinking datasets is designing a scoring function that, given a structure, can quantify how well it fits the data. Most approaches set an upper bound on the distance between Cα atoms of crosslinked residues and calculate a fraction of satisfied crosslinks. However, the distance spanned by the crosslinker greatly depends on the neighborhood of the crosslinked residues. Here, we design a deep learning model for predicting the optimal distance range for a crosslinked residue pair based on the structures of their neighborhoods. We find that our model can predict the distance range with the area under the receiver-operator curve of 0.86 and 0.7 for intra- and inter-protein crosslinks, respectively. Our deep scoring function can be used in a range of structure modeling applications.
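The baseline scoring scheme mentioned in the abstract, a fixed upper bound on the Cα-Cα distance and the fraction of crosslinks that satisfy it, can be sketched as follows. The 30 Å default, the residue numbering, and the coordinates are assumptions chosen for illustration. This is the conventional score that the paper aims to improve on, not the authors' deep learning model.

```python
import math

def satisfied_fraction(ca_coords, crosslinks, max_dist=30.0):
    """Fraction of crosslinks whose C-alpha distance is within max_dist angstroms.

    ca_coords maps a residue identifier to its (x, y, z) C-alpha position;
    crosslinks is a list of (residue_i, residue_j) pairs.
    """
    if not crosslinks:
        return 0.0
    satisfied = sum(
        1 for i, j in crosslinks
        if math.dist(ca_coords[i], ca_coords[j]) <= max_dist
    )
    return satisfied / len(crosslinks)

# Hypothetical structure with three crosslinked residues.
ca_coords = {10: (0.0, 0.0, 0.0), 42: (12.0, 0.0, 0.0), 77: (0.0, 35.0, 0.0)}
crosslinks = [(10, 42), (10, 77), (42, 77)]

fraction = satisfied_fraction(ca_coords, crosslinks)
```

Only the 10-42 pair (12 Å) falls under the bound here; the other two pairs span 35 Å and 37 Å. A single global bound treats every residue pair identically, which is the limitation the abstract attributes to neighborhood-dependent crosslinker spans.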
Affiliation(s)
- Shon Cohen
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Dina Schneidman-Duhovny
- The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
| |
|
23
|
Ho W, Huang H, Huang J. IFF: Identifying key residues in intrinsically disordered regions of proteins using machine learning. Protein Sci 2023; 32:e4739. [PMID: 37498545 PMCID: PMC10443345 DOI: 10.1002/pro.4739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 06/21/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023]
Abstract
Conserved residues in protein homolog sequence alignments are structurally or functionally important. For intrinsically disordered proteins or proteins with intrinsically disordered regions (IDRs), however, alignment often fails because they lack a steric structure to constrain evolution. Although sequences vary, the physicochemical features of IDRs may be preserved in maintaining function. Therefore, a method to retrieve common IDR features may help identify functionally important residues. We applied unsupervised contrastive learning to train a model with self-attention neuronal networks on human IDR orthologs. Parameters in the model were trained to match sequences in ortholog pairs but not in other IDRs. The trained model successfully identifies previously reported critical residues from experimental studies, especially those with an overall pattern (e.g., multiple aromatic residues or charged blocks) rather than short motifs. This predictive model can be used to identify potentially important residues in other proteins, improving our understanding of their functions. The trained model can be run directly from the Jupyter Notebook in the GitHub repository using Binder (mybinder.org). The only required input is the primary sequence. The training scripts are available on GitHub (https://github.com/allmwh/IFF). The training datasets have been deposited in an Open Science Framework repository (https://osf.io/jk29b).
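The training objective described above, matching sequences within an ortholog pair but not across other IDRs, resembles a standard contrastive loss. The sketch below uses an InfoNCE-style formulation with cosine similarity. The abstract does not state the exact loss, so this choice, the temperature value, the two-dimensional toy embeddings, and the function names are all assumptions and not the IFF implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def contrastive_loss(anchors, partners, temperature=0.5):
    """InfoNCE-style loss: partners[i] is the ortholog of anchors[i]; the rest are negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, partner) / temperature for partner in partners]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy embeddings for two IDRs and their orthologs (hypothetical values).
anchors = [(1.0, 0.0), (0.0, 1.0)]
matched = [(1.0, 0.0), (0.0, 1.0)]    # each ortholog embeds next to its anchor
shuffled = [(0.0, 1.0), (1.0, 0.0)]   # ortholog pairs are mismatched

loss_matched = contrastive_loss(anchors, matched)
loss_shuffled = contrastive_loss(anchors, shuffled)
```

The loss is small when each anchor is most similar to its own ortholog and large when the pairing is scrambled, which is the pressure that would push a model to encode features shared within ortholog pairs.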
Affiliation(s)
- Wen‐Lin Ho
- Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Hsuan‐Cheng Huang
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
| | - Jie‐rong Huang
- Institute of Biochemistry and Molecular Biology, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Department of Life Sciences and Institute of Genome Sciences, National Yang Ming Chiao Tung University, Taipei, Taiwan
| |
|
24
|
Kandathil SM, Lau AM, Jones DT. Machine learning methods for predicting protein structure from single sequences. Curr Opin Struct Biol 2023; 81:102627. [PMID: 37320955 DOI: 10.1016/j.sbi.2023.102627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 05/17/2023] [Accepted: 05/17/2023] [Indexed: 06/17/2023]
Abstract
Recent breakthroughs in protein structure prediction have increasingly relied on the use of deep neural networks. These recent methods are notable in that they produce 3-D atomic coordinates as a direct output of the networks, a feature which presents many advantages. Although most techniques of this type make use of multiple sequence alignments as their primary input, a new wave of methods have attempted to use just single sequences as the input. We discuss the make-up and operating principles of these models, and highlight new developments in these areas, as well as areas for future development.
Affiliation(s)
- Shaun M Kandathil
- Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
| | - Andy M Lau
- Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
| | - David T Jones
- Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom.
| |
|
25
|
Darden C, Donkor J, Korolkova O, Khan Barozai MY, Chaudhuri M. Distinct structural motifs are necessary for targeting and import of Tim17 in Trypanosoma brucei mitochondrion. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.07.548172. [PMID: 37461662 PMCID: PMC10350046 DOI: 10.1101/2023.07.07.548172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2024]
Abstract
Nuclear-encoded mitochondrial proteins are correctly translocated to their proper sub-mitochondrial destination using location specific mitochondrial targeting signals (MTSs) and via multi-protein import machineries (translocases) in the outer and inner mitochondrial membranes (TOM and TIMs, respectively). However, MTSs of multi-pass Tims are less defined. Here we report the characterization of the MTSs of Trypanosoma brucei Tim17 (TbTim17), an essential component of the most divergent TIM complex. TbTim17 possesses a characteristic secondary structure including four predicted transmembrane (TM) domains in the center with hydrophilic N- and C-termini. After examining mitochondrial localization of various deletion and site-directed mutants of TbTim17 in T. brucei using subcellular fractionation and confocal microscopy we located at least two internal signals, 1) within TM1 (31-50 AAs) and 2) TM4 + Loop 3 (120-136 AAs). Both signals are required for proper targeting and integration of TbTim17 in the membrane. Furthermore, a positively charged residue (K122) is critical for mitochondrial localization of TbTim17. This is the first report of characterizing the internal mitochondrial targeting signals (ITS) for a multipass inner membrane protein in a divergent eukaryote, like T. brucei. Summary Internal targeting signals within the TM1, TM4 with Loop 3, and residue K122 are required collectively for import and integration of TbTim17 in the T. brucei mitochondrion. This information could be utilized to block parasite growth.
|
26
|
Raghavachari K, Maier S, Collins EM, Debnath S, Sengupta A. Approaching Coupled Cluster Accuracy with Density Functional Theory Using the Generalized Connectivity-Based Hierarchy. J Chem Theory Comput 2023. [PMID: 37338997 DOI: 10.1021/acs.jctc.3c00301] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/22/2023]
Abstract
This Perspective reviews connectivity-based hierarchy (CBH), a systematic hierarchy of error-cancellation schemes developed in our group with the goal of achieving chemical accuracy using inexpensive computational techniques ("coupled cluster accuracy with DFT"). The hierarchy is a generalization of Pople's isodesmic bond separation scheme that is based only on the structure and connectivity and is applicable to any organic and biomolecule consisting of covalent bonds. It is formulated as a series of rungs involving increasing levels of error cancellation on progressively larger fragments of the parent molecule. The method and our implementation are discussed briefly. Examples are given for the applications of CBH involving (1) energies of complex organic rearrangement reactions, (2) bond energies of biofuel molecules, (3) redox potentials in solution, (4) pKa predictions in the aqueous medium, and (5) theoretical thermochemistry combining CBH with machine learning. They clearly show that near-chemical accuracy (1-2 kcal/mol) is achieved for a variety of applications with DFT methods irrespective of the underlying density functional used. They demonstrate conclusively that seemingly disparate results, often seen with different density functionals in many chemical applications, are due to an accumulation of systematic errors in the smaller local molecular fragments that can be easily corrected with higher-level calculations on those small units. This enables the method to achieve the accuracy of the high level of theory (e.g., coupled cluster) while the cost remains that of DFT. The advantages and limitations of the method are discussed along with areas of ongoing developments.
Affiliation(s)
- Krishnan Raghavachari
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Sarah Maier
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Eric M Collins
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Sibali Debnath
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
- Arkajyoti Sengupta
- Department of Chemistry, Indiana University, Bloomington, Indiana 47405, United States
|
27
|
Cheng Y, Wang H, Xu H, Liu Y, Ma B, Chen X, Zeng X, Wang X, Wang B, Shiau C, Ovchinnikov S, Su XD, Wang C. Co-evolution-based prediction of metal-binding sites in proteomes by machine learning. Nat Chem Biol 2023; 19:548-555. [PMID: 36593274 DOI: 10.1038/s41589-022-01223-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/11/2022] [Accepted: 11/08/2022] [Indexed: 01/03/2023]
Abstract
Metal ions have various important biological roles in proteins, including structural maintenance, molecular recognition and catalysis. Previous methods of predicting metal-binding sites in proteomes were based on either sequence or structural motifs. Here we developed a co-evolution-based pipeline named 'MetalNet' to systematically predict metal-binding sites in proteomes. We applied MetalNet to proteomes of four representative prokaryotic species and predicted 4,849 potential metalloproteins, which substantially expands the currently annotated metalloproteomes. We biochemically and structurally validated previously unannotated metal-binding sites in several proteins, including apo-citrate lyase phosphoribosyl-dephospho-CoA transferase citX, an Escherichia coli enzyme lacking structural or sequence homology to any known metalloprotein (Protein Data Bank (PDB) codes: 7DCM and 7DCN). MetalNet also successfully recapitulated all known zinc-binding sites from the human spliceosome complex. The pipeline of MetalNet provides a unique and enabling tool for interrogating the hidden metalloproteome and studying metal biology.
Affiliation(s)
- Yao Cheng
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Haobo Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Hua Xu
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Yuan Liu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China.
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
- Bin Ma
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xuemin Chen
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xin Zeng
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Xianghe Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Bo Wang
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Sergey Ovchinnikov
- John Harvard Distinguished Science Fellow, Harvard University, Cambridge, MA, USA
- Xiao-Dong Su
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China.
- Chu Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China.
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China.
|
28
|
Jafari Najaf Abadi MH, Abyaneh FA, Zare N, Zamani J, Abdoli A, Aslanbeigi F, Hamblin MR, Tarrahimofrad H, Rahimi M, Hashemian SM, Mirzaei H. In silico design and immunoinformatics analysis of a chimeric vaccine construct based on Salmonella pathogenesis factors. Microb Pathog 2023; 180:106130. [PMID: 37121524 DOI: 10.1016/j.micpath.2023.106130] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/21/2023] [Revised: 04/26/2023] [Accepted: 04/27/2023] [Indexed: 05/02/2023]
Abstract
Currently, there are two vaccines based on killed and/or weakened Salmonella bacteria, but no recombinant vaccine is available for preventing or treating the disease. We used an in silico approach to design a multi-epitope vaccine against Salmonella using OmpA, OmpS, SopB, SseB, SthA and FilC antigens. We predicted helper T lymphocyte, cytotoxic T lymphocyte, and IFN-γ epitopes. The FilC sequence was used as a bovine TLR5 agonist, and the linkers KK, AAY, GPGPG and EAAAK were used to connect epitopes. The final sequence consisted of 747 amino acid residues, and the expressed soluble protein (∼79.6 kDa) was predicted to be both non-allergenic and antigenic. The tertiary structure of modeled protein was refined and validated, and the interactions of vaccine 3D structure were evaluated using molecular docking, and molecular dynamics simulation (RMSD, RMSF and Gyration). This structurally stable protein could interact with human TLR5. The C-ImmSim server predicted that this proposed vaccine likely induces an immune response by stimulating T and B cells, making it a potential candidate for further evaluation for the prevention and treatment of Salmonella infection.
Affiliation(s)
- Noushid Zare
- Faculty of Pharmacy, International Campus, Tehran University of Medical Science, Tehran, Iran
- Javad Zamani
- Department of Animal Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Amirhossein Abdoli
- School of Medicine, Kashan University of Medical Sciences, Kashan, Iran; Student Research Committee, Kashan University of Medical Sciences, Kashan, Iran
- Fatemeh Aslanbeigi
- School of Medicine, Kashan University of Medical Sciences, Kashan, Iran; Student Research Committee, Kashan University of Medical Sciences, Kashan, Iran
- Michael R Hamblin
- Laser Research Centre, Faculty of Health Science, University of Johannesburg, Doornfontein, 2028, South Africa
- Hossein Tarrahimofrad
- Department of Animal Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran.
- Mohammadreza Rahimi
- Infectious Diseases Research Center, Faculty of Medicine, Kashan University of Medical Sciences, Kashan, Iran; Department of Microbiology and Immunology, Faculty of Medicine, Kashan University of Medical Sciences, Kashan, Iran.
- Seyed Mohammadreza Hashemian
- Chronic Respiratory Diseases Research Center, National Research Institute of Tuberculosis and Lung Disease, Shahid Beheshti University of Medical Sciences, Tehran, 1983535511, Iran.
- Hamed Mirzaei
- Research Center for Biochemistry and Nutrition in Metabolic Diseases, Institute for Basic Sciences, Kashan University of Medical Sciences, Kashan, Iran.
|
29
|
Choudhury A, Saha S, Maiti NC, Datta S. Exploring structural features and potential lipid interactions of Pseudomonas aeruginosa type three secretion effector PemB by spectroscopic and calorimetric experiments. Protein Sci 2023; 32:e4627. [PMID: 36916835 PMCID: PMC10044109 DOI: 10.1002/pro.4627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 03/06/2023] [Accepted: 03/10/2023] [Indexed: 03/15/2023]
Abstract
Type Three Secretion System (T3SS) is a sophisticated nano-scale weapon utilized by several gram negative bacteria under stringent spatio-temporal regulation to manipulate and evade host immune systems in order to cause infection. To the best of our knowledge, this present study is the first report where we embark upon characterizing inherent features of native type three secretion effector protein PemB through biophysical techniques. Herein, first, we demonstrate binding affinity of PemB for phosphoinositides through isothermal calorimetric titrations. Second, we shed light on its strong homo-oligomerization propensity in aqueous solution through multiple biophysical methods. Third, we also employ several spectroscopic techniques to delineate its disordered and helical conformation. Lastly, we perform a phylogenetic analysis of this new effector to elucidate evolutionary relationship with other organisms. Taken together, our results shall surely contribute to our existing knowledge of Pseudomonas aeruginosa secretome.
Affiliation(s)
- Arkaprabha Choudhury
- Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology (CSIR-IICB), Kolkata 700032, India
- Biological Sciences, Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Saumen Saha
- Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology (CSIR-IICB), Kolkata 700032, India
- Nakul Chandra Maiti
- Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology (CSIR-IICB), Kolkata 700032, India
- Biological Sciences, Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Saumen Datta
- Department of Structural Biology and Bioinformatics, CSIR-Indian Institute of Chemical Biology (CSIR-IICB), Kolkata 700032, India
- Biological Sciences, Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
|
30
|
Leiva S, Bugnon Valdano M, Gardiol D. Unravelling the epidemiological diversity of Zika virus by analyzing key protein variations. Arch Virol 2023; 168:115. [PMID: 36943525 DOI: 10.1007/s00705-023-05726-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 01/19/2023] [Indexed: 03/23/2023]
Abstract
The consequences of Zika virus (ZIKV) infections were limited to sporadic mild diseases until almost a decade ago, when epidemic outbreaks took place, with quick spread into the Americas. Simultaneously, novel severe neurological manifestations of ZIKV infections were identified, including congenital microcephaly. However, why the epidemic strains behave differently is not yet completely understood, and many questions remain about the actual significance of genetic variations in the epidemiology and biology of ZIKV. In this study, we analysed a large number of viral sequences to identify genes with different levels of variability and patterns of genomic variations that could be associated with ZIKV diversity. We compared numerous epidemic strains with pre-epidemic strains, using the BWA-mem algorithm, and we also examined specific variations among the epidemic ZIKV strains derived from microcephaly cases. We identified several viral genes with dissimilar mutation rates among the ZIKV strain groups and novel protein variation profiles that might be associated with epidemiological particularities. Finally, we assessed the impact of the detected changes on the structure and stability of the NS1, NS5, and E proteins using the I-TASSER, trRosetta, and RaptorX modelling algorithms, and we found some interesting variations that might help to explain the heterogeneous features of the diverse ZIKA strains. This work contributes to the identification of genetic differences in the ZIKV genome that might have a phenotypic impact, providing a basis for future experimental analysis to elucidate the genetic causes of the recent ZIKV emergency.
Affiliation(s)
- Santiago Leiva
- Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina
- Marina Bugnon Valdano
- Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina.
- Daniela Gardiol
- Facultad de Ciencias Bioquímicas y Farmacéuticas, Instituto de Biología Molecular y Celular de Rosario-CONICET, Universidad Nacional de Rosario, Suipacha 531, 2000, Rosario, Argentina.
|
31
|
Li T, Li Y, Zhu X, He Y, Wu Y, Ying T, Xie Z. Artificial intelligence in cancer immunotherapy: Applications in neoantigen recognition, antibody design and immunotherapy response prediction. Semin Cancer Biol 2023; 91:50-69. [PMID: 36870459 DOI: 10.1016/j.semcancer.2023.02.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Revised: 02/13/2023] [Accepted: 02/28/2023] [Indexed: 03/06/2023]
Abstract
Cancer immunotherapy is a method of controlling and eliminating tumors by reactivating the body's cancer-immunity cycle and restoring its antitumor immune response. The increased availability of data, combined with advancements in high-performance computing and innovative artificial intelligence (AI) technology, has resulted in a rise in the use of AI in oncology research. State-of-the-art AI models for functional classification and prediction in immunotherapy research are increasingly used to support laboratory-based experiments. This review offers a glimpse of the current AI applications in immunotherapy, including neoantigen recognition, antibody design, and prediction of immunotherapy response. Advancing in this direction will result in more robust predictive models for developing better targets, drugs, and treatments, and these advancements will eventually make their way into the clinical setting, pushing AI forward in the field of precision oncology.
Affiliation(s)
- Tong Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yupeng Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Xiaoyi Zhu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
- Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Yanling Wu
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China
- Tianlei Ying
- MOE/NHC Key Laboratory of Medical Molecular Virology, Shanghai Institute of Infectious Disease and Biosecurity, School of Basic Medical Sciences, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Engineering Research Center for Synthetic Immunology, Shanghai, China.
- Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Center for Precision Medicine, Sun Yat-sen University, Guangzhou, China.
|
32
|
In silico design of a polypeptide as a vaccine candidate against ascariasis. Sci Rep 2023; 13:3504. [PMID: 36864139 PMCID: PMC9981566 DOI: 10.1038/s41598-023-30445-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 02/23/2023] [Indexed: 03/04/2023] Open
Abstract
Ascariasis is the most prevalent zoonotic helminthic disease worldwide, and is responsible for nutritional deficiencies, particularly hindering the physical and neurological development of children. The appearance of anthelmintic resistance in Ascaris is a risk for the target of eliminating ascariasis as a public health problem by 2030 set by the World Health Organisation. The development of a vaccine could be key to achieving this target. Here we have applied an in silico approach to design a multi-epitope polypeptide that contains T-cell and B-cell epitopes of reported novel potential vaccination targets, alongside epitopes from established vaccination candidates. An artificial toll-like receptor-4 (TLR4) adjuvant (RS09) was added to improve immunogenicity. The constructed peptide was found to be non-allergic, non-toxic, with adequate antigenic and physicochemical characteristics, such as solubility and potential expression in Escherichia coli. A tertiary structure of the polypeptide was used to predict the presence of discontinuous B-cell epitopes and to confirm the molecular binding stability with TLR2 and TLR4 molecules. Immune simulations predicted an increase in B-cell and T-cell immune response after injection. This polypeptide can now be validated experimentally and compared to other vaccine candidates to assess its possible impact in human health.
|
33
|
Huh E, Agosto MA, Wensel TG, Lichtarge O. Coevolutionary signals in metabotropic glutamate receptors capture residue contacts and long-range functional interactions. J Biol Chem 2023; 299:103030. [PMID: 36806686 PMCID: PMC10060750 DOI: 10.1016/j.jbc.2023.103030] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/30/2022] [Revised: 02/09/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023] Open
Abstract
Upon ligand binding to a G protein-coupled receptor, extracellular signals are transmitted into a cell through sets of residue interactions that translate ligand binding into structural rearrangements. These interactions needed for functions impose evolutionary constraints so that, on occasion, mutations in one position may be compensated by other mutations at functionally coupled positions. To quantify the impact of amino acid substitutions in the context of major evolutionary divergence in the G protein-coupled receptor subfamily of metabotropic glutamate receptors (mGluRs), we combined two phylogenetic-based algorithms, Evolutionary Trace and covariation Evolutionary Trace, to infer potential structure-function couplings and roles in mGluRs. We found a subset of evolutionarily important residues at known functional sites and evidence of coupling among distinct structural clusters in mGluR. In addition, experimental mutagenesis and functional assays confirmed that some highly covariant residues are coupled, revealing their synergy. Collectively, these findings inform a critical step toward understanding the molecular and structural basis of amino acid variation patterns within mGluRs and provide insight for drug development, protein engineering, and analysis of naturally occurring variants.
Affiliation(s)
- Eunna Huh
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA
- Melina A Agosto
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA; Retina and Optic Nerve Research Laboratory, Department of Physiology and Biophysics, Dalhousie University, Halifax, Canada
- Theodore G Wensel
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
- Olivier Lichtarge
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
|
34
|
AlphaFold2 reveals commonalities and novelties in protein structure space for 21 model organisms. Commun Biol 2023; 6:160. [PMID: 36755055 PMCID: PMC9908985 DOI: 10.1038/s42003-023-04488-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 07/06/2022] [Accepted: 01/16/2023] [Indexed: 02/10/2023] Open
Abstract
Deep-learning (DL) methods like DeepMind's AlphaFold2 (AF2) have led to substantial improvements in protein structure prediction. We analyse confident AF2 models from 21 model organisms using a new classification protocol (CATH-Assign) which exploits novel DL methods for structural comparison and classification. Of ~370,000 confident models, 92% can be assigned to 3253 superfamilies in our CATH domain superfamily classification. The remaining models cluster into 2367 putative novel superfamilies. Detailed manual analysis of 618 of these, each having at least one human relative, reveals extremely remote homologies and further unusual features. Only 25 novel superfamilies could be confirmed. Although most models map to existing superfamilies, AF2 domains expand CATH by 67% and increase the number of unique 'global' folds by 36%, and will provide valuable insights into structure-function relationships. CATH-Assign will harness the huge expansion in structural data provided by DeepMind to rationalise evolutionary changes driving functional divergence.
|
35
|
Liu J, Zhao K, Zhang G. Improved model quality assessment using sequence and structural information by enhanced deep neural networks. Brief Bioinform 2023; 24:6865134. [PMID: 36460624 DOI: 10.1093/bib/bbac507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 10/02/2022] [Accepted: 10/24/2022] [Indexed: 12/04/2022] Open
Abstract
Protein model quality assessment plays an important role in protein structure prediction, protein design and drug discovery. In this work, DeepUMQA2, a substantially improved version of DeepUMQA for protein model quality assessment, is proposed. First, sequence features containing protein co-evolution information and structural features reflecting family information are extracted to complement model-dependent features. Second, a novel backbone network based on triangular multiplication update and axial attention mechanism is designed to enhance information exchange between inter-residue pairs. On the CASP13 and CASP14 datasets, the performance of DeepUMQA2 improves by 20.5% and 20.4%, respectively, compared with DeepUMQA (measured by top 1 loss). Moreover, on the three-month CAMEO dataset (11 March to 04 June 2022), DeepUMQA2 outperforms DeepUMQA by 15.5% (measured by local AUC_{0,0.2}) and ranks first among all competing server methods in the CAMEO blind test. Experimental results show that DeepUMQA2 outperforms state-of-the-art model quality assessment methods, such as ProQ3D-LDDT, ModFOLD8, and DeepAccNet, and that DeepUMQA2 can select better top models than the self-assessments provided by state-of-the-art protein structure prediction methods such as AlphaFold2, RoseTTAFold and I-TASSER.
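The abstract reports gains in "top 1 loss" without defining it. As the term is commonly used in model quality assessment benchmarks, it is the gap, for one target, between the true quality of the best available model and the true quality of the model the predictor ranks first. The sketch below assumes that definition; the function name and the scores are hypothetical, and the choice of true quality measure (for example GDT-TS or lDDT) is left open because the abstract does not specify it.

```python
def top1_loss(predicted, true):
    """Top-1 loss for a single target (assumed definition).

    predicted: predicted quality score per candidate model
    true:      true quality score per candidate model, same order
    Returns the best true score minus the true score of the model
    that the predictor ranks first. Zero means the best model was picked.
    """
    if len(predicted) != len(true) or not predicted:
        raise ValueError("need two non-empty score lists of equal length")
    # Index of the model the predictor would select.
    picked = max(range(len(predicted)), key=predicted.__getitem__)
    return max(true) - true[picked]


# Hypothetical scores for three candidate models of one target.
predicted_scores = [0.62, 0.80, 0.71]
true_scores = [0.70, 0.75, 0.90]
loss = top1_loss(predicted_scores, true_scores)
```

Here the predictor selects the second model (true score 0.75) while the third is actually best (0.90), so the loss is about 0.15. A benchmark figure would typically average this value over all targets.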
Affiliation(s)
- Jun Liu
- College of Information Engineering, Zhejiang University of Technology
- Kailong Zhao
- College of Information Engineering, Zhejiang University of Technology
- Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology
|
36
|
Velazquez MB, Busi MV, Gomez-Casati DF, Nag-Dasgupta C, Barchiesi J. Molecular insight into cellulose degradation by the phototrophic green alga Scenedesmus. Proteins 2023; 91:750-770. [PMID: 36607613 DOI: 10.1002/prot.26464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 12/29/2022] [Accepted: 01/03/2023] [Indexed: 01/07/2023]
Abstract
Lignocellulose is the most abundant natural biopolymer on earth and a potential raw material for the production of fuels and chemicals. However, only some organisms such as bacteria and fungi produce enzymes that metabolize this polymer. In this work we have demonstrated the presence of cellulolytic activity in the supernatant of Scenedesmus quadricauda cultures and we identified the presence of extracellular cellulases in the genome of five Scenedesmus species. Scenedesmus is a green alga which grows in both freshwater and saltwater regions as well as in soils, showing highly flexible metabolic properties. Sequence comparison of the different identified cellulases with hydrolytic enzymes from other organisms using multisequence alignments and phylogenetic trees showed that these proteins belong to the families of glycosyl hydrolases 1, 5, 9, and 10. In addition, most of the Scenedesmus cellulases showed greater sequence similarity with those from invertebrates, fungi, bacteria, and other microalgae than with the plant homologs. Furthermore, the data obtained from the three-dimensional structures showed that both their global structure and the main amino acid residues involved in catalysis and substrate binding are well conserved. Based on our results, we propose that different species of Scenedesmus could act as biocatalysts for the hydrolysis of cellulosic biomass produced from sunlight.
Affiliation(s)
- María B Velazquez
- Centro de Estudios Fotosintéticos y Bioquímicos (CEFOBI-CONICET), Universidad Nacional de Rosario, Rosario, Argentina
- María V Busi
- Centro de Estudios Fotosintéticos y Bioquímicos (CEFOBI-CONICET), Universidad Nacional de Rosario, Rosario, Argentina
- Diego F Gomez-Casati
- Centro de Estudios Fotosintéticos y Bioquímicos (CEFOBI-CONICET), Universidad Nacional de Rosario, Rosario, Argentina
- Julieta Barchiesi
- Centro de Estudios Fotosintéticos y Bioquímicos (CEFOBI-CONICET), Universidad Nacional de Rosario, Rosario, Argentina
|
37
|
Bhattacharya S, Roche R, Shuvo MH, Moussad B, Bhattacharya D. Contact-Assisted Threading in Low-Homology Protein Modeling. Methods Mol Biol 2023; 2627:41-59. [PMID: 36959441 DOI: 10.1007/978-1-0716-2974-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
The ability to successfully predict the three-dimensional structure of a protein from its amino acid sequence has made considerable progress in the recent past. The progress is propelled by the improved accuracy of deep learning-based inter-residue contact map predictors coupled with the rapid growth of protein sequence databases. A contact map encodes interatomic interaction information that can be exploited for highly accurate prediction of protein structures via contact map threading, even for query proteins that are not amenable to direct homology modeling. As such, contact-assisted threading has garnered considerable research effort. In this chapter, we provide an overview of existing contact-assisted threading methods while highlighting the recent advances and discussing some of the current limitations and future prospects in the application of contact-assisted threading for improving the accuracy of low-homology protein modeling.
Affiliation(s)
- Sutanu Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
|
38
|
Chen C, Chen X, Morehead A, Wu T, Cheng J. 3D-equivariant graph neural networks for protein model quality assessment. Bioinformatics 2023; 39:6986970. [PMID: 36637199 PMCID: PMC10089647 DOI: 10.1093/bioinformatics/btad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 11/28/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023]
Abstract
MOTIVATION Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. With the recent development of deep learning end-to-end protein structure prediction techniques for generating highly confident tertiary structures for most proteins, it is important to explore corresponding QA strategies to evaluate and select the structural models predicted by them since these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods. RESULTS We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects to estimate the accuracy of protein structural models by leveraging the structural features acquired from the state-of-the-art tertiary structure prediction method-AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for the proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method-AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA. AVAILABILITY AND IMPLEMENTATION The source code is available at https://github.com/BioinfoMachineLearning/EnQA. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chen Chen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Xiao Chen
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Alex Morehead
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Tianqi Wu
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
  - Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
|
39
|
Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022; 21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open
Abstract
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.
Affiliation(s)
- Janani Durairaj
  - Biozentrum, University of Basel, Basel, Switzerland
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Dick de Ridder
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
- Aalt D.J. van Dijk
  - Bioinformatics Group, Department of Plant Sciences, Wageningen University and Research, Wageningen, the Netherlands
|
40
|
Chang Y, Hawkins BA, Du JJ, Groundwater PW, Hibbs DE, Lai F. A Guide to In Silico Drug Design. Pharmaceutics 2022; 15:pharmaceutics15010049. [PMID: 36678678 PMCID: PMC9867171 DOI: 10.3390/pharmaceutics15010049] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/09/2022] [Revised: 12/16/2022] [Accepted: 12/17/2022] [Indexed: 12/28/2022] Open
Abstract
The drug discovery process is a rocky path that is full of challenges, with the result that very few candidates progress from hit compound to a commercially available product, often due to factors, such as poor binding affinity, off-target effects, or physicochemical properties, such as solubility or stability. This process is further complicated by high research and development costs and time requirements. It is thus important to optimise every step of the process in order to maximise the chances of success. As a result of the recent advancements in computer power and technology, computer-aided drug design (CADD) has become an integral part of modern drug discovery to guide and accelerate the process. In this review, we present an overview of the important CADD methods and applications, such as in silico structure prediction, refinement, modelling and target validation, that are commonly used in this area.
Affiliation(s)
- Yiqun Chang
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Bryson A. Hawkins
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Jonathan J. Du
  - Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Paul W. Groundwater
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- David E. Hibbs
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Felcia Lai
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
|
41
|
Miah MM, Tabassum N, Afroj Zinnia M, Islam ABMMK. Drug and Anti-Viral Peptide Design to Inhibit the Monkeypox Virus by Restricting A36R Protein. Bioinform Biol Insights 2022; 16:11779322221141164. [PMID: 36570327 PMCID: PMC9772960 DOI: 10.1177/11779322221141164] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/20/2022] [Accepted: 11/06/2022] [Indexed: 12/24/2022] Open
Abstract
Most recently, monkeypox virus (MPXV) has emerged as a global public health threat. The unavailability of effective medication against MPXV escalates the demand for new therapeutic agents. In this study, in silico strategies were used to identify novel drugs against the A36R protein of MPXV. The A36R protein of MPXV is responsible for viral migration, adhesion, and vesicle trafficking to the host cell. To block the A36R protein, 4893 potential antiviral peptides (AVPs) were retrieved from the DRAMP and SATPdb databases. Finally, 57 sequences were screened based on peptide filtering criteria and then modeled. Likewise, 31 monkeypox virus A36R protein sequences were collected from the NCBI protein database to find a consensus sequence and to predict a 3D protein model. The refined and validated models of the A36R protein and the AVPs were used to predict receptor-ligand interactions using the DINC 2 server. The three peptides that showed the best interactions were SATPdb10193, SATPdb21850, and SATPdb26811, with binding energies of -6.10, -6.10, and -6.30 kcal/mol, respectively. Small molecules from drug databases were also used to perform virtual screening against the A36R protein. Among the databases, Enamine-HTSC showed strong affinity, with docking scores ranging from -8.8 to 9.8 kcal/mol. The interaction of the target protein A36R with the top 3 peptides and the most probable drug (Z55287118) was examined by molecular dynamics (MD) simulation. Trajectory analyses (RMSD, RMSF, SASA, and Rg) confirmed the stable nature of the protein-ligand and protein-peptide complexes. This work suggests that the identified top AVPs and small molecules might interfere with the function of the A36R protein of MPXV.
Affiliation(s)
- Nuzhat Tabassum
  - Department of Pharmacy, East West University, Dhaka, Bangladesh
- Abul Bashar Mir Md. Khademul Islam
  - Department of Genetic Engineering & Biotechnology, University of Dhaka, Dhaka, Bangladesh
  - Correspondence: Abul Bashar Mir Md. Khademul Islam, Department of Genetic Engineering and Biotechnology, University of Dhaka, Nilkhet Rd, Dhaka 1000, Bangladesh
|
42
|
Mufassirin MMM, Newton MAH, Sattar A. Artificial intelligence for template-free protein structure prediction: a comprehensive review. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10350-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
|
43
|
Buehler MJ. Multiscale Modeling at the Interface of Molecular Mechanics and Natural Language through Attention Neural Networks. Acc Chem Res 2022; 55:3387-3403. [PMID: 36378952 DOI: 10.1021/acs.accounts.2c00330] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Humans are continually bombarded with massive amounts of data. To deal with this influx of information, we use the concept of attention in order to perceive the most relevant input from vision, hearing, touch, and others. Thereby, the complex ensemble of signals is used to generate output by querying the processed data in appropriate ways. Attention is also the hallmark of the development of scientific theories, where we elucidate which parts of a problem are critical, often expressed through differential equations. In this Account we review the emergence of attention-based neural networks as a class of approaches that offer many opportunities to describe materials across scales and modalities, including how universal building blocks interact to yield a set of material properties. In fact, the self-assembly of hierarchical, structurally complex, and multifunctional biomaterials remains a grand challenge in modeling, theory, and experiment. Expanding from the process by which material building blocks physically interact to form a type of material, in this Account we view self-assembly as both the functional emergence of properties from interacting building blocks as well as the physical process by which elementary building blocks interact and yield structure and, thereby, functions. This perspective, integrated through the theory of materiomics, allows us to solve multiscale problems with a first-principles-based computational approach based on attention-based neural networks that transform information to feature to property while providing a flexible modeling approach that can integrate theory, simulation, and experiment. Since these models are based on a natural language framework, they offer various benefits including incorporation of general domain knowledge via general-purpose pretraining, which can be accomplished without labeled data or large amounts of lower-quality data. 
Pretrained models then offer a general-purpose platform that can be fine-tuned to adapt these models to make specific predictions, often with relatively little labeled data. The transferrable power of the language-based modeling approach realizes a neural olog description, where mathematical categorization is learned by multiheaded attention, without domain knowledge in its formulation. It can hence be applied to a range of complex modeling tasks, such as physical field predictions, molecular properties, or structure predictions, all using an identical formulation. This offers a complementary modeling approach that is already finding numerous applications, with great potential to solve complex assembly problems, enabling us to learn, build, and utilize functional categorization of how building blocks yield a range of material functions. In this Account, we demonstrate the approach in various application areas, including protein secondary structure prediction and prediction of normal-mode frequencies as well as predicting mechanical fields near cracks. Unifying these diverse problem areas is the building block approach, where the models are based on a universally applicable platform that offers benefits including transferability, interpretability, and cross-domain pollination of knowledge, as exemplified through a transformer model applied to predict how musical compositions infer de novo protein structures. We discuss future potentialities of this approach for a variety of material phenomena across scales, including the use in multiparadigm modeling schemes.
Affiliation(s)
- Markus J Buehler
  - Laboratory for Atomistic and Molecular Mechanics, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, Massachusetts 02139, United States
  - Center for Computational Science and Engineering, Schwarzman College of Computing, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, Massachusetts 02139, United States
|
44
|
Robert PA, Akbar R, Frank R, Pavlović M, Widrich M, Snapkov I, Slabodkin A, Chernigovskaya M, Scheffer L, Smorodina E, Rawat P, Mehta BB, Vu MH, Mathisen IF, Prósz A, Abram K, Olar A, Miho E, Haug DTT, Lund-Johansen F, Hochreiter S, Haff IH, Klambauer G, Sandve GK, Greiff V. Unconstrained generation of synthetic antibody-antigen structures to guide machine learning methodology for antibody specificity prediction. NATURE COMPUTATIONAL SCIENCE 2022; 2:845-865. [PMID: 38177393 DOI: 10.1038/s43588-022-00372-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/16/2021] [Accepted: 11/09/2022] [Indexed: 01/06/2024]
Abstract
Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic datasets to benchmark real-world relevant ML methods and dataset design. Here we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthetic lattice-based three-dimensional antibody-antigen-binding structures with ground-truth access to conformational paratope, epitope and affinity. We formalized common immunological antibody-specificity prediction problems as ML tasks and confirmed that for both sequence- and structure-based tasks, accuracy-based rankings of ML methods trained on experimental data hold for ML methods trained on Absolut!-generated data. The Absolut! framework has the potential to enable real-world relevant development and benchmarking of ML strategies for biotherapeutics design.
Affiliation(s)
- Philippe A Robert
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Rahmad Akbar
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Robert Frank
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Michael Widrich
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Igor Snapkov
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Andrei Slabodkin
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Maria Chernigovskaya
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Eva Smorodina
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Puneet Rawat
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Brij Bhushan Mehta
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mai Ha Vu
  - Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
- Aurél Prósz
  - Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
- Krzysztof Abram
  - The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
- Alex Olar
  - Department of Complex Systems in Physics, Eötvös Loránd University, Budapest, Hungary
- Enkelejda Miho
  - Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
  - aiNET GmbH, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Sepp Hochreiter
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
  - Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
- Günter Klambauer
  - ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
- Victor Greiff
  - Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
|
45
|
Roche R, Bhattacharya S, Shuvo MH, Bhattacharya D. rrQNet: Protein contact map quality estimation by deep evolutionary reconciliation. Proteins 2022; 90:2023-2034. [PMID: 35751651 PMCID: PMC9633355 DOI: 10.1002/prot.26394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Revised: 05/31/2022] [Accepted: 06/21/2022] [Indexed: 11/10/2022]
Abstract
Protein contact maps have proven to be a valuable tool in the deep learning revolution of protein structure prediction, ushering in the recent breakthrough by AlphaFold2. However, self-assessment of the quality of predicted structures is typically performed at the granularity of three-dimensional coordinates as opposed to directly exploiting the rotation- and translation-invariant two-dimensional (2D) contact maps. Here, we present rrQNet, a deep learning method for self-assessment in 2D by contact map quality estimation. Our approach is based on the intuition that for a contact map to be of high quality, the residue pairs predicted to be in contact should be mutually consistent with the evolutionary context of the protein. The deep neural network architecture of rrQNet implements this intuition by cascading two deep modules, one encoding the evolutionary context and the other performing evolutionary reconciliation. The penultimate stage of rrQNet estimates the quality scores at the interacting residue-pair level, which are then aggregated for estimating the quality of a contact map. This design choice offers versatility at varied resolutions from individual residue pairs to full-fledged contact maps. Trained on multiple complementary sources of contact predictors, rrQNet facilitates generalizability across various contact maps. By rigorously testing using publicly available datasets and comparing against several in-house baseline approaches, we show that rrQNet accurately reproduces the true quality score of a predicted contact map and successfully distinguishes between accurate and inaccurate contact maps predicted by a wide variety of contact predictors. The open-source rrQNet software package is freely available at https://github.com/Bhattacharya-Lab/rrQNet.
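The two resolutions described in this abstract (per-residue-pair scores and a single map-level score) can be sketched as follows. This is only an illustration: the abstract does not state how rrQNet aggregates pair scores, so the plain mean used here is an assumption, and the scores and the function name are invented.

```python
def aggregate_map_quality(pair_scores):
    """Average per-pair quality scores into one map-level score.

    `pair_scores` maps a residue pair (i, j) to an estimated quality in [0, 1].
    Using the arithmetic mean is an assumption made for this sketch.
    """
    if not pair_scores:
        raise ValueError("need at least one predicted contact")
    return sum(pair_scores.values()) / len(pair_scores)


# Hypothetical quality estimates for four predicted contacts.
pair_scores = {(3, 27): 0.91, (5, 40): 0.75, (12, 58): 0.32, (20, 66): 0.62}

# Map-level resolution: one number for the whole contact map.
map_score = aggregate_map_quality(pair_scores)

# Pair-level resolution: the same scores can flag the least reliable contact.
worst_pair = min(pair_scores, key=pair_scores.get)

print(f"map-level quality: {map_score:.2f}")
print(f"least reliable predicted contact: {worst_pair}")
```

Keeping the pair-level scores around, rather than only the aggregate, is what allows the same estimator to be used both for ranking whole maps and for filtering individual contacts.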
Affiliation(s)
- Rahmatullah Roche
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
- Sutanu Bhattacharya
  - Department of Computer Science, Florida Polytechnic University, Lakeland, FL 33805, USA
- Md Hossain Shuvo
  - Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
|
46
|
Protein structure prediction in the deep learning era. Curr Opin Struct Biol 2022; 77:102495. [PMID: 36371845 DOI: 10.1016/j.sbi.2022.102495] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/11/2022] [Revised: 10/03/2022] [Accepted: 10/04/2022] [Indexed: 11/11/2022]
Abstract
Significant advances have been achieved in protein structure prediction, especially with the recent development of the AlphaFold2 and RoseTTAFold systems. This article reviews the progress in deep learning-based protein structure prediction methods in the past two years. First, we divide the representative methods into two categories: the two-step approach and the end-to-end approach. Then, we show that the two-step approach can achieve accuracy similar to that of the state-of-the-art end-to-end approach AlphaFold2. Compared to the end-to-end approach, the two-step approach requires fewer computing resources. We conclude that it is valuable to keep developing both approaches. Finally, a few outstanding challenges in function-orientated protein structure prediction are pointed out for future development.
|
47
|
Hasanzadeh A, Hamblin MR, Kiani J, Noori H, Hardie JM, Karimi M, Shafiee H. Could artificial intelligence revolutionize the development of nanovectors for gene therapy and mRNA vaccines? NANO TODAY 2022; 47:101665. [PMID: 37034382 PMCID: PMC10081506 DOI: 10.1016/j.nantod.2022.101665] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/19/2023]
Abstract
Gene therapy enables the introduction of nucleic acids like DNA and RNA into host cells, and is expected to revolutionize the treatment of a wide range of diseases. This growth has been further accelerated by the discovery of CRISPR/Cas technology, which allows accurate genomic editing in a broad range of cells and organisms in vitro and in vivo. Despite many advances in gene delivery and the development of various viral and non-viral gene delivery vectors, the lack of highly efficient non-viral systems with low cellular toxicity remains a challenge. The application of cutting-edge technologies such as artificial intelligence (AI) has great potential to find new paradigms to solve this issue. Herein, we review AI and its major subfields including machine learning (ML), neural networks (NNs), expert systems, deep learning (DL), computer vision and robotics. We discuss the potential of AI-based models and algorithms in the design of targeted gene delivery vehicles capable of crossing extracellular and intracellular barriers by viral mimicry strategies. We finally discuss the role of AI in improving the function of CRISPR/Cas systems, developing novel nanobots, and mRNA vaccine carriers.
Affiliation(s)
- Akbar Hasanzadeh
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Michael R Hamblin
  - Laser Research Centre, Faculty of Health Science, University of Johannesburg, Doornfontein 2028, South Africa
  - Radiation Biology Research Center, Iran University of Medical Sciences, Tehran, Iran
- Jafar Kiani
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Molecular Medicine, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran, Iran
- Hamid Noori
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Joseph M. Hardie
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02139 USA
- Mahdi Karimi
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Research Center for Science and Technology in Medicine, Tehran University of Medical Sciences, Tehran 141556559, Iran
  - Applied Biotechnology Research Centre, Tehran Medical Science, Islamic Azad University, Tehran 1584743311, Iran
- Hadi Shafiee
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02139 USA
|
48
|
Katase N, Nishimatsu SI, Yamauchi A, Okano S, Fujita S. Establishment of anti-DKK3 peptide for the cancer control in head and neck squamous cell carcinoma (HNSCC). Cancer Cell Int 2022; 22:352. [PMID: 36376957 PMCID: PMC9664703 DOI: 10.1186/s12935-022-02783-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/21/2022] [Accepted: 11/04/2022] [Indexed: 11/16/2022] Open
Abstract
Background Head and neck squamous cell carcinoma (HNSCC) is the most common malignant tumor of the head and neck. We identified cancer-specific genes in HNSCC and focused on DKK3 expression. The DKK3 gene encodes two protein isoforms (secreted and non-secreted) with two distinct cysteine-rich domains (CRDs). It is reported that DKK3 functions as a negative regulator of oncogenic Wnt signaling and is therefore considered to be a tumor suppressor gene. However, our series of studies have demonstrated that DKK3 expression is specifically high in HNSCC tissues and cells, and that DKK3 might determine the malignant potential of HNSCC cells via the activation of Akt. Further analyses strongly suggested that both secreted DKK3 and non-secreted DKK3 could activate Akt signaling in discrete ways, and consequently exert tumor-promoting effects. We hypothesized that DKK3 might be a specific druggable target, and that it is necessary to establish a DKK3 inhibitor that can inhibit both secreted and non-secreted isoforms of DKK3. Methods Using inverse polymerase chain reaction, we generated mutant expression plasmids that express DKK3 without CRD1, CRD2, or both CRD1 and CRD2 (DKK3ΔC1, DKK3ΔC2, and DKK3ΔC1ΔC2, respectively). These plasmids were then transfected into HNSCC-derived cells to determine the domain responsible for DKK3-mediated Akt activation. We designed antisense peptides using the MIMETEC program, targeting DKK3-specific amino acid sequences within CRD1 and CRD2. The structural models for peptides and DKK3 were generated using Raptor X, and then a docking simulation was performed using CluPro2. Afterward, the best set of the peptides was applied to HNSCC-derived cells, and the effects on Akt phosphorylation, cellular proliferation, invasion, and migration were assessed. We also investigated the therapeutic effects of the peptides in xenograft models.
Results Transfection of mutant expression plasmids and subsequent functional analyses revealed that it is necessary to delete both CRD1 and CRD2 to inhibit Akt activation and to suppress proliferation, migration, and invasion. The inhibitory peptides for CRD1 and CRD2 of DKK3 significantly reduced the phosphorylation of Akt, and consequently suppressed cellular proliferation, migration, invasion and in vivo tumor growth at very low doses. Conclusions This inhibitory peptide represents a promising new therapeutic strategy for HNSCC treatment. Supplementary Information The online version contains supplementary material available at 10.1186/s12935-022-02783-9.
|
49
|
AmyJ33, a truncated amylase with improved catalytic properties. Biotechnol Lett 2022; 44:1447-1463. [DOI: 10.1007/s10529-022-03311-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Accepted: 10/10/2022] [Indexed: 11/06/2022]
|
50
|
Barger J, Adhikari B. New Labeling Methods for Deep Learning Real-Valued Inter-Residue Distance Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3586-3594. [PMID: 34559660 DOI: 10.1109/tcbb.2021.3115053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
BACKGROUND Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction, a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions to reflect inter-residue distances in nature. Despite these promises, the accurate prediction of real-valued distances remains relatively unexplored, possibly due to classification being better suited to machine and deep learning algorithms. METHODS Can regression methods be designed to predict real-valued distances as precisely as binary contacts? To investigate this, we propose multiple novel methods of input label engineering, which is different from feature engineering, with the goal of optimizing the distribution of distances to cater to the loss function of the deep-learning model. Since an important utility of predicted contacts or distances is to build three-dimensional models, we also tested if predicted distances can reconstruct more accurate models than contacts. RESULTS Our results demonstrate, for the first time, that deep learning methods for real-valued protein distance prediction can deliver distances as precise as binary classification methods. When using an optimal distance transformation function on the standard PSICOV dataset consisting of 150 representative proteins, the precision of 'top-all' long-range contacts improves from 60.9% to 61.4% when predicting real-valued distances instead of contacts. 
When building three-dimensional models we observed an average TM-score increase from 0.61 to 0.72, highlighting the advantage of predicting real-valued distances.
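The comparison in this abstract relies on turning predicted real-valued distances back into contacts so that contact precision can be computed. A minimal sketch of that conversion follows. It is not the authors' evaluation code: the 8 angstrom cutoff and the sequence separation of at least 24 residues for "long-range" pairs are common conventions assumed here, the distances are invented, and the abstract's 'top-all' selection rule is not reproduced.

```python
CONTACT_CUTOFF = 8.0  # angstroms; common convention, assumed for this sketch
LONG_RANGE_SEP = 24   # minimum sequence separation; also an assumption


def long_range_contacts(distances):
    """Return residue pairs (i, j) that are long-range and closer than the cutoff."""
    return {
        pair
        for pair, dist in distances.items()
        if abs(pair[0] - pair[1]) >= LONG_RANGE_SEP and dist < CONTACT_CUTOFF
    }


def precision(predicted, true):
    """Fraction of predicted contacts that are also true contacts."""
    return len(predicted & true) / len(predicted) if predicted else 0.0


# Hypothetical predicted and experimental inter-residue distances (angstroms).
predicted_dist = {(1, 30): 6.2, (2, 40): 7.9, (5, 50): 9.5, (10, 12): 4.0, (3, 60): 7.0}
true_dist = {(1, 30): 5.8, (2, 40): 8.6, (5, 50): 7.5, (10, 12): 3.9, (3, 60): 6.1}

pred = long_range_contacts(predicted_dist)
true = long_range_contacts(true_dist)
p = precision(pred, true)

print(f"predicted long-range contacts: {sorted(pred)}")
print(f"precision: {p:.3f}")
```

Note that the pair (10, 12) is dropped despite its short distance because it is not long-range, and that a small distance error near the cutoff, as for (2, 40), is enough to turn a predicted contact into a false positive.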
|