1
Khalil HB. Genome-Wide Characterization and Expression Profiling of Phytosulfokine Receptor Genes (PSKRs) in Triticum aestivum with Docking Simulations of Their Interactions with Phytosulfokine (PSK): A Bioinformatics Study. Genes (Basel) 2024; 15:1306. [PMID: 39457430 PMCID: PMC11507999 DOI: 10.3390/genes15101306] [Received: 08/10/2024] [Revised: 09/29/2024] [Accepted: 10/08/2024] [Indexed: 10/28/2024]
Abstract
Background/Objectives: The phytosulfokine receptor (PSKR) gene family plays a crucial role in regulating plant growth, development, and stress response. Here, the PSKR gene family was characterized in Triticum aestivum L. The study aimed to bridge knowledge gaps and clarify the functional roles of TaPSKRs, creating a solid foundation for examining their structure, functions, and regulation. Methods: The investigation involved genome-wide identification of PSKRs through collection and chromosomal assignment, followed by phylogenetic analysis and gene expression profiling. Additionally, interactions with their interactors were simulated and analyzed to elucidate their function. Results: The genome-wide inspection of all TaPSKRs led to 25 genes with various homeologs, resulting in 57 TaPSKR members distributed among the A, B, and D subgenomes. Investigating the expression of 61 TaPSKR cDNAs in RNA-seq datasets generated from different growth stages (14, 21, and 60 days old) and diverse tissues (leaves, shoots, and roots) provided further insight into their functional roles. Expression profiling of the TaPSKRs yielded three key clusters. Gene cluster 1 (GC 1) is partially associated with root growth, suggesting that specific TaPSKRs control root development. GC 2 comprised genes that show high levels of expression in all tested leaf growth stages and in the early developmental stage of the shoots and roots. GC 3 was composed of genes that are constantly expressed, highlighting their crucial role in regulating various processes during the entire life cycle of wheat. Molecular docking simulations showed that phytosulfokine type α (PSK-α) interacted with all TaPSKRs and had a strong binding affinity with certain TaPSKR proteins, including TaPSKR1A, TaPSKR3B, and TaPSKR13A, which supports their involvement in PSK signaling pathways. Binding affinity may depend critically on interactions between wheat PSK-α and the PSKRs, especially in the LRR domain region. Conclusions: These findings deepen our knowledge of the role of the TaPSKR gene family in wheat growth and development, opening up possibilities for further studies to enhance wheat durability and yield via targeted innovation approaches.
Affiliation(s)
- Hala Badr Khalil
  - Department of Biological Sciences, College of Science, King Faisal University, P.O. Box 380, Al-Ahsa 31982, Saudi Arabia
  - Department of Genetics, Faculty of Agriculture, Ain Shams University, 68 Hadayek Shoubra, Cairo 11241, Egypt
2
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024]
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
- Arul S Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
  - Illumina, Foster City, CA, 94404, USA
- Steven E Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
  - Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
  - Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
3
Zhou Y, Jiang Y, Chen SJ. SPRank─A Knowledge-Based Scoring Function for RNA-Ligand Pose Prediction and Virtual Screening. J Chem Theory Comput 2024. [PMID: 39150889 DOI: 10.1021/acs.jctc.4c00681] [Indexed: 08/18/2024]
Abstract
The growing interest in RNA-targeted drugs underscores the need for computational modeling of interactions between RNA molecules and small compounds. Having a reliable scoring function for RNA-ligand interactions is essential for effective computational drug screening. An ideal scoring function should not only predict the native pose for ligand binding but also rank the binding affinity of different ligands. However, existing scoring functions are primarily designed to predict the native binding modes for a given RNA-ligand pair and have not been thoroughly assessed for virtual screening purposes. In this paper, we introduce SPRank, a combination of machine-learning and knowledge-based scoring functions developed through a weighted iterative approach, specifically designed to tackle both binding mode prediction and virtual screening challenges. Our approach incorporates third-party docking software, such as rDock and AutoDock Vina, to sample flexible ligands against an ensemble of RNA structures, capturing the conformational flexibility of both the RNA and the ligand. Through rigorous testing, SPRank demonstrates improved performance compared to the tested scoring functions across four test sets comprising 122, 42, 55, and 71 nucleic acid-ligand complexes. Furthermore, SPRank exhibits improved performance in virtual screening tests targeting the HIV-1 TAR ensemble, which highlights its advantage in drug discovery. These results underscore the advantages of SPRank as a potentially promising tool for RNA-targeted drug design. The source code of SPRank and the data sets are freely accessible at https://github.com/Vfold-RNA/SPRank.
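To make the term "knowledge-based scoring function" concrete, the sketch below builds a distance-dependent statistical potential by inverse Boltzmann weighting and uses it to rank two poses. This is the generic textbook construction, not SPRank's weighted iterative procedure, which the abstract does not detail; the bin edges, counts, pseudocount, and function names are all hypothetical values chosen for illustration.

```python
import math
from bisect import bisect_right

def statistical_potential(observed_counts, reference_counts, pseudocount=1.0):
    """Per-bin energy in units of kT: -ln(P_observed / P_reference)."""
    obs = [c + pseudocount for c in observed_counts]
    ref = [c + pseudocount for c in reference_counts]
    obs_total, ref_total = sum(obs), sum(ref)
    return [-math.log((o / obs_total) / (r / ref_total)) for o, r in zip(obs, ref)]

def score_pose(distances, bin_edges, potential):
    """Sum the potential over the atom-pair distances of one pose.

    Distances outside [bin_edges[0], bin_edges[-1]) contribute nothing.
    """
    total = 0.0
    for d in distances:
        idx = bisect_right(bin_edges, d) - 1
        if 0 <= idx < len(potential):
            total += potential[idx]
    return total

# Hypothetical data: three distance bins (in angstroms) and made-up pair counts.
bin_edges = [2.0, 4.0, 6.0, 8.0]
observed = [30, 50, 20]    # counts seen in known complexes (illustrative)
reference = [10, 40, 50]   # counts expected in a reference state (illustrative)
potential = statistical_potential(observed, reference)

poses = {"pose_a": [2.5, 3.0, 4.5], "pose_b": [6.5, 7.0, 7.5]}
# Lower score means more favourable, so the best pose comes first.
ranked = sorted(poses, key=lambda name: score_pose(poses[name], bin_edges, potential))
print(ranked)
```

Bins that are over-represented in known complexes relative to the reference state receive negative energies, so poses rich in such contacts rank first.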
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
- Yangwei Jiang
  - Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
4
Zhu Y, Yu M, Aisikaer M, Zhang C, He Y, Chen Z, Yang Y, Han R, Li Z, Zhang F, Ding J, Lu X. Contriving a novel of CHB therapeutic vaccine based on IgV_CTLA-4 and L protein via immunoinformatics approach. J Biomol Struct Dyn 2024; 42:6323-6341. [PMID: 37424209 DOI: 10.1080/07391102.2023.2234043] [Received: 01/07/2023] [Accepted: 07/01/2023] [Indexed: 07/11/2023]
Abstract
Chronic infection induced by immune tolerance to hepatitis B virus (HBV) is one of the most common causes of hepatic cirrhosis and hepatoma. Fortunately, a therapeutic vaccine can not only reverse HBV tolerance but also serve as a potentially effective therapeutic strategy for treating chronic hepatitis B (CHB). However, the clinical effect of currently developed CHB therapeutic vaccines is not encouraging because of their weak immunogenicity. Given that the human leukocyte antigen CTLA-4 binds strongly to the surface B7 molecules (CD80 and CD86) of antigen-presenting cells (APCs), the immunoglobulin variable region of CTLA-4 (IgV_CTLA-4) was fused with the L protein of HBV to construct a novel therapeutic vaccine (V_C4HBL) for CHB in this study. Immunoinformatics analysis showed that the addition of IgV_CTLA-4 did not interfere with the formation of the L protein T cell and B cell epitopes. Molecular docking and molecular dynamics (MD) simulation also showed that IgV_CTLA-4 bound strongly to B7 molecules. Notably, V_C4HBL showed good immunogenicity and antigenicity in in vitro and in vivo experiments. Therefore, V_C4HBL is a promising candidate for effectively reactivating the cellular and humoral immunity of CHB patients, and it provides a potentially effective therapeutic strategy for the future treatment of CHB. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Yuejie Zhu
  - Reproductive Medicine Center, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
  - Infectious Disease Center, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- Mingkai Yu
  - Department of Immunology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
- Maierhaba Aisikaer
  - Department of Immunology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
- Chuntao Zhang
  - Department of Microbiology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
- Yueyue He
  - Department of Immunology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
- Zhiqiang Chen
  - Department of Immunology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
- Yinyin Yang
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
- Rui Han
  - Reproductive Medicine Center, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- Zhiwei Li
  - Clinical Laboratory Center, Xinjiang Uygur Autonomous Region People's Hospital, Urumqi, China
- Fengbo Zhang
  - Department of Clinical Laboratory, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
- Jianbing Ding
  - Department of Immunology, School of Basic Medical Sciences, Xinjiang Medical University, Urumqi, China
  - Xinjiang Key Molecular Biology Laboratory of Endemic Disease, Xinjiang Medical University, Urumqi, China
- Xiaobo Lu
  - Infectious Disease Center, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
5
Tarafder S, Bhattacharya D. lociPARSE: a locality-aware invariant point attention model for scoring RNA 3D structures. bioRxiv 2024:2023.11.04.565599. [PMID: 37961488 PMCID: PMC10635153 DOI: 10.1101/2023.11.04.565599] [Indexed: 11/15/2023]
Abstract
A scoring function that can reliably assess the accuracy of a 3D RNA structural model in the absence of experimental structure is not only important for model evaluation and selection but also useful for scoring-guided conformational sampling. However, high-fidelity RNA scoring has proven to be difficult using conventional knowledge-based statistical potentials and currently-available machine learning-based approaches. Here we present lociPARSE, a locality-aware invariant point attention architecture for scoring RNA 3D structures. Unlike existing machine learning methods that estimate superposition-based root mean square deviation (RMSD), lociPARSE estimates Local Distance Difference Test (lDDT) scores capturing the accuracy of each nucleotide and its surrounding local atomic environment in a superposition-free manner, before aggregating information to predict global structural accuracy. Tested on multiple datasets including CASP15, lociPARSE significantly outperforms existing statistical potentials (rsRNASP, cgRNASP, DFIRE-RNA, and RASP) and machine learning methods (ARES and RNA3DCNN) across complementary assessment metrics. lociPARSE is freely available at https://github.com/Bhattacharya-Lab/lociPARSE.
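The lDDT quantity that lociPARSE is trained to estimate can be illustrated with a simplified reference-based calculation. The sketch below is not the lociPARSE model (which predicts the score without a reference structure); it only shows how a superposition-free, per-nucleotide lDDT could be computed when a reference is available. It assumes one representative atom per nucleotide, the commonly used 15 Å inclusion radius and 0.5/1/2/4 Å tolerance thresholds, and a plain mean as the global aggregate; all names and coordinates are hypothetical.

```python
import math

def lddt_per_site(ref_coords, model_coords, radius=15.0,
                  thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified lDDT with one representative atom per nucleotide.

    For each site, look at every other site within `radius` in the
    reference and count how many tolerance thresholds the model's
    pairwise distance satisfies. No superposition is required because
    only internal distances are compared.
    """
    n = len(ref_coords)
    scores = []
    for i in range(n):
        preserved = 0
        considered = 0
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(ref_coords[i], ref_coords[j])
            if d_ref >= radius:
                continue  # pair is outside the local neighbourhood
            d_model = math.dist(model_coords[i], model_coords[j])
            diff = abs(d_ref - d_model)
            considered += len(thresholds)
            preserved += sum(diff < t for t in thresholds)
        scores.append(preserved / considered if considered else 0.0)
    return scores

# Hypothetical three-nucleotide example: the third site is displaced by 3 angstroms.
reference = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (13.0, 0.0, 0.0)]

local_scores = lddt_per_site(reference, model)
global_score = sum(local_scores) / len(local_scores)  # assumed aggregation: mean
print(local_scores, global_score)
```

Because the displaced site keeps only the loosest tolerance for each of its pairs, its local score drops furthest, which is the per-nucleotide resolution the abstract contrasts with a single superposition-based RMSD.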
Affiliation(s)
- Sumit Tarafder
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia, 24061, USA
6
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. bioRxiv 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Indexed: 07/10/2024]
Abstract
Background Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). Results The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. Conclusions VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. Availability VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Arul S. Menon
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Currently at: Illumina, Foster City, California 94404, USA
- Steven E. Brenner
  - Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
  - Center for Computational Biology, University of California, Berkeley, California 94720, USA
  - College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
7
da Silva-Júnior AHP, de Oliveira Silva RC, Gurgel APAD, Barros-Júnior MR, Nascimento KCG, Santos DL, Pena LJ, Lima RDCP, Batista MVDA, Chagas BS, de Freitas AC. Identification and Functional Implications of the E5 Oncogene Polymorphisms of Human Papillomavirus Type 16. Trop Med Infect Dis 2024; 9:140. [PMID: 39058182 PMCID: PMC11281449 DOI: 10.3390/tropicalmed9070140] [Received: 04/27/2024] [Revised: 06/12/2024] [Accepted: 06/21/2024] [Indexed: 07/28/2024]
Abstract
The persistence of human papillomavirus type 16 (HPV16) infection in the cervical epithelium contributes to the progression of cervical cancer. Studies have demonstrated that HPV16 genetic variants may be associated with different risks of developing cervical cancer. However, the E5 oncoprotein of HPV16, which is involved in several cellular mechanisms in the initial phases of infection and thus contributes to carcinogenesis, remains little studied. Here we investigate HPV16 E5 oncogene variants to assess the effects of different mutations on the biological function of the E5 protein. We detected and analyzed the HPV16 E5 oncogene polymorphisms and their phylogenetic relationships. We then analyzed the tertiary structure of the protein variants, preferential codon usage, and the functional activity of the HPV16 E5 protein. Intra-type variants were grouped into lineages A and D using in silico analysis. The mutations in E5 were located in the T-cell epitope region. We therefore analyzed the interference of the HPV16 E5 protein in the NF-κB pathway. Our results showed that the variants HPV16E5_49PE and HPV16E5_85PE did not increase the activation potential of the pathway. This study provides additional knowledge about the mechanisms of dispersion of the HPV16 E5 variants, providing evidence that these variants may be relevant to the modulation of the NF-κB signaling pathway.
Affiliation(s)
- Antônio Humberto P. da Silva-Júnior
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Ruany Cristyne de Oliveira Silva
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Ana Pavla A. Diniz Gurgel
  - Department of Engineering and Environment, Federal University of Paraiba, João Pessoa 58033-455, Paraíba, Brazil
- Marconi Rêgo Barros-Júnior
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Kamylla Conceição Gomes Nascimento
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Daffany Luana Santos
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Lindomar J. Pena
  - Laboratory of Virology and Experimental Therapy, Instituto Aggeu Magalhães (IAM), Oswaldo Cruz Foundation, Recife 50670-901, Pernambuco, Brazil
- Rita de Cássia Pereira Lima
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Marcus Vinicius de Aragão Batista
  - Laboratory of Molecular Genetics and Biotechnology (GMBio), Department of Biology, Federal University of Sergipe, São Cristóvão 49107-230, Sergipe, Brazil
- Bárbara Simas Chagas
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
- Antonio Carlos de Freitas
  - Laboratory of Molecular Studies and Experimental Therapy (LEMTE), Department of Genetics, Federal University of Pernambuco, Recife 50670-901, Pernambuco, Brazil
8
Han Y, Lu Y, Yan X, Cui H, Cheng S, Zheng J, Zhou Y, Wang S, Li Z. Atom-ProteinQA: Atom-level protein model quality assessment through fine-grained joint learning. Comput Methods Programs Biomed 2024; 249:108078. [PMID: 38537495 DOI: 10.1016/j.cmpb.2024.108078] [Received: 08/08/2023] [Revised: 12/26/2023] [Accepted: 02/10/2024] [Indexed: 04/21/2024]
Abstract
MOTIVATION Protein model quality assessment (ProteinQA) is a fundamental task that is essential for biologically relevant applications, e.g., protein structure refinement and protein design. Previous works aimed to conduct ProteinQA only on the global structure or per-residue level, ignoring potentially usable and precise cues from a fine-grained per-atom perspective. In this study, we propose an atom-level ProteinQA model, named Atom-ProteinQA, in which two innovative modules are designed to extract geometric and topological atom-level relationships respectively. Specifically, on the one hand, a geometric perception module exploits 3D sparse convolution to capture the geometric features of the input protein, generating fine-grained atom-level predictions. On the other hand, natural chemical bonds are utilized to construct an atom-level graph, and message passing in a topological perception module is then applied to output residue-level predictions in parallel. Finally, through a cross-model aggregation module, features from the different modules interact, enhancing performance on both the atom and residue levels. RESULTS Extensive experiments show that our proposed Atom-ProteinQA outperforms previous methods by a large margin in both residue-level and atom-level assessment. Concretely, we achieved state-of-the-art performance on CATH-2084, Decoy-8000, the public benchmarks CASP13 and CASP14, and CAMEO. AVAILABILITY The repository of this project is released at https://github.com/luyfcandy/Atom_ProteinQA.
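The topological branch described above (a bond graph over atoms, message passing, then residue-level output) can be sketched in a few lines. This is only a toy illustration under strong assumptions: features are single scalars rather than learned vectors, one round of mean aggregation stands in for the learned message-passing layers, and residue values are obtained by averaging atoms. None of the names below come from the Atom-ProteinQA repository.

```python
def message_passing_step(features, bonds):
    """One round of mean aggregation over a bond graph.

    Each atom's new value is the average of its own value and the
    values of the atoms it is bonded to.
    """
    neighbours = {i: [i] for i in range(len(features))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return [
        sum(features[j] for j in neighbours[i]) / len(neighbours[i])
        for i in range(len(features))
    ]

def pool_to_residues(atom_values, atom_to_residue):
    """Average atom-level values within each residue."""
    sums, counts = {}, {}
    for value, residue in zip(atom_values, atom_to_residue):
        sums[residue] = sums.get(residue, 0.0) + value
        counts[residue] = counts.get(residue, 0) + 1
    return {residue: sums[residue] / counts[residue] for residue in sums}

# Hypothetical four-atom chain spanning two residues.
atom_features = [1.0, 0.0, 0.0, 4.0]
bonds = [(0, 1), (1, 2), (2, 3)]       # covalent bonds as undirected edges
atom_to_residue = [0, 0, 1, 1]

updated = message_passing_step(atom_features, bonds)
residue_values = pool_to_residues(updated, atom_to_residue)
print(updated, residue_values)
```

Stacking several such rounds lets information travel further along the bond graph, which is the usual motivation for multi-layer message passing.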
Affiliation(s)
- Yatong Han
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yingfeng Lu
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Xu Yan
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Hannah Cui
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Jiayou Zheng
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Yuzhe Zhou
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Sheng Wang
  - Shanghai Zelixir Biotech Company Ltd., Shanghai, 200030, China
- Zhen Li
  - Future Network of Intelligence Institute, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China; School of Science and Engineering, the Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
9
Luo D, Liu D, Qu X, Dong L, Wang B. Enhancing Generalizability in Protein-Ligand Binding Affinity Prediction with Multimodal Contrastive Learning. J Chem Inf Model 2024; 64:1892-1906. [PMID: 38441880 DOI: 10.1021/acs.jcim.3c01961] [Indexed: 03/26/2024]
Abstract
Improving the generalization ability of scoring functions remains a major challenge in protein-ligand binding affinity prediction. Many machine learning methods are limited by their reliance on single-modal representations, hindering a comprehensive understanding of protein-ligand interactions. We introduce a graph-neural-network-based scoring function that utilizes a triplet contrastive learning loss to improve protein-ligand representations. In this model, three-dimensional complex representations and the fusion of two-dimensional ligand and coarse-grained pocket representations converge while distancing from decoy representations in latent space. After rigorous validation on multiple external data sets, our model exhibits commendable generalization capabilities compared to those of other deep learning-based scoring functions, marking it as a promising tool in the realm of drug discovery. In the future, our training framework can be extended to other biophysical- and biochemical-related problems such as protein-protein interaction and protein mutation prediction.
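The triplet contrastive objective mentioned above can be illustrated with the standard margin-based triplet loss. The abstract does not state the distance function, the margin, or which representation serves as the anchor, so the sketch below makes assumptions: Euclidean distance, a margin of 1.0, the 3D complex embedding as anchor, the fused 2D-ligand/pocket embedding as positive, and a decoy embedding as negative. The two-dimensional vectors are toy values.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss on embedding vectors.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings (assumed roles, see the paragraph above).
complex_3d = (0.0, 0.0)        # anchor: 3D complex representation
fused_2d = (3.0, 4.0)          # positive: fused ligand + pocket representation
far_decoy = (6.0, 8.0)         # negative already well separated
near_decoy = (0.0, 1.0)        # negative closer to the anchor than the positive

easy = triplet_loss(complex_3d, fused_2d, far_decoy)
hard = triplet_loss(complex_3d, fused_2d, near_decoy)
print(easy, hard)
```

Minimising this quantity pulls the two views of the true complex together in latent space while pushing decoys away, which matches the "converge while distancing from decoy representations" behaviour the abstract describes.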
Affiliation(s)
- Ding Luo
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Dandan Liu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Xiaoyang Qu
  - School of Pharmacy and Medical Technology, Putian University, Putian 351100, P. R. China
  - Key Laboratory of Pharmaceutical Analysis and Laboratory Medicine (Putian University), Fujian Province University, Putian 351100, P. R. China
- Lina Dong
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Binju Wang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces and Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, P. R. China
10
Christoffer C, Harini K, Archit G, Kihara D. Assembly of Protein Complexes in and on the Membrane with Predicted Spatial Arrangement Constraints. J Mol Biol 2024; 436:168486. [PMID: 38336197 PMCID: PMC10942765 DOI: 10.1016/j.jmb.2024.168486] [Received: 11/08/2023] [Revised: 01/17/2024] [Accepted: 02/05/2024] [Indexed: 02/12/2024]
Abstract
Membrane proteins play crucial roles in various cellular processes, and their interactions with other proteins in and on the membrane are essential for their proper functioning. While structures of an increasing number of membrane proteins are being determined, the available structural data are still sparse. To gain insights into the mechanisms of membrane protein complexes, computational docking methods are necessary due to the challenge of experimental determination. Here, we introduce Mem-LZerD, a rigid-body membrane docking algorithm designed to take advantage of modern membrane modeling and protein docking techniques to facilitate the docking of membrane protein complexes. Mem-LZerD is based on the LZerD protein docking algorithm, which has consistently been among the top servers in many rounds of the CAPRI protein docking assessment. By employing a combination of geometric hashing, newly constrained by the predicted membrane height and tilt angle, and model scoring accounting for the energy of membrane insertion, we demonstrate the capability of Mem-LZerD to model diverse membrane protein-protein complexes. Mem-LZerD successfully performed unbound docking on 13 of 21 (61.9%) transmembrane complexes in an established benchmark, more than shown by previous approaches. It was additionally tested on new datasets of 44 transmembrane complexes and 92 peripheral membrane protein complexes, of which it successfully modeled 35 (79.5%) and 15 (16.3%) complexes, respectively. When non-blind orientations of peripheral targets were included, the number of successes increased to 54 (58.7%). We further demonstrate that Mem-LZerD produces complex models that are suitable for molecular dynamics simulation. Mem-LZerD is made available at https://lzerd.kiharalab.org.
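The success rates quoted in this abstract follow directly from the reported counts, and a short check makes the arithmetic explicit. The dataset labels in the dictionary are paraphrases introduced here for readability; only the counts and percentages come from the abstract.

```python
def success_rate(successes, total):
    """Percentage of successfully modelled complexes, rounded to one decimal."""
    return round(100.0 * successes / total, 1)

# (successes, complexes tested, percentage quoted in the abstract)
reported = {
    "transmembrane, established benchmark": (13, 21, 61.9),
    "transmembrane, new dataset": (35, 44, 79.5),
    "peripheral, blind orientation": (15, 92, 16.3),
    "peripheral, non-blind orientation included": (54, 92, 58.7),
}

for label, (successes, total, quoted) in reported.items():
    rate = success_rate(successes, total)
    status = "matches" if rate == quoted else "differs from"
    print(f"{label}: {successes}/{total} = {rate}% ({status} the quoted {quoted}%)")
```

The gap between the two peripheral figures (15 versus 54 of the same 92 complexes) shows how much of the difficulty lies in predicting the orientation of peripheral targets rather than in the docking step itself.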
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Kannan Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Gupta Archit
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Department of Genetic Engineering, SRM Institute of Science and Technology, Kattankulathur 603203, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN 47907, USA
|
11
|
Bernard C, Postic G, Ghannay S, Tahi F. RNAdvisor: a comprehensive benchmarking tool for the measure and prediction of RNA structural model quality. Brief Bioinform 2024; 25:bbae064. [PMID: 38436560 PMCID: PMC10939302 DOI: 10.1093/bib/bbae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/30/2024] [Accepted: 02/02/2024] [Indexed: 03/05/2024] Open
Abstract
RNA is a complex macromolecule that plays central roles in the cell. While it is well known that its structure is directly related to its functions, understanding and predicting RNA structures is challenging. Assessing the real or predictive quality of a structure is also at stake with the complex 3D possible conformations of RNAs. Metrics have been developed to measure model quality while scoring functions aim at assigning quality to guide the discrimination of structures without a known and solved reference. Throughout the years, many metrics and scoring functions have been developed, and no unique assessment is used nowadays. Each developed assessment method has its specificity and might be complementary to understanding structure quality. Therefore, to evaluate RNA 3D structure predictions, it would be important to calculate different metrics and/or scoring functions. For this purpose, we developed RNAdvisor, a comprehensive automated software that integrates and enhances the accessibility of existing metrics and scoring functions. In this paper, we present our RNAdvisor tool, as well as state-of-the-art existing metrics, scoring functions and a set of benchmarks we conducted for evaluating them. Source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr.
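To illustrate what a reference-based metric computes, the sketch below evaluates the root-mean-square deviation (RMSD) between a model and a solved reference. RMSD serves only as a familiar example here; the abstract does not list which metrics RNAdvisor integrates. The function name, the toy coordinates, and the assumption that both structures are already superimposed and atom-matched all belong to this sketch rather than to the tool.

```python
import math


def rmsd(model, reference):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the structures are already superimposed and that atom i in
    `model` corresponds to atom i in `reference`.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared_sum / len(model))


# Hypothetical toy coordinates in angstroms: the model is the reference
# shifted by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(rmsd(model, reference))
```

A scoring function, by contrast, would have to assign a quality value from the model alone, without access to `reference`.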
Affiliation(s)
- Clement Bernard
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
- Guillaume Postic
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
  - LISN - CNRS/Université Paris-Saclay, France, 91400 Orsay, France
- Fariza Tahi
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
|
12
|
Kiani YS, Jabeen I. Challenges of Protein-Protein Docking of the Membrane Proteins. Methods Mol Biol 2024; 2780:203-255. [PMID: 38987471 DOI: 10.1007/978-1-0716-3985-6_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite the recent advances in the determination of high-resolution membrane protein (MP) structures, the structural and functional characterization of MPs remains extremely challenging, mainly due to the hydrophobic nature, low abundance, poor expression, purification, and crystallization difficulties associated with MPs. Whereby the major challenges/hurdles for MP structure determination are associated with the expression, purification, and crystallization procedures. Although there have been significant advances in the experimental determination of MP structures, only a limited number of MP structures (approximately less than 1% of all) are available in the Protein Data Bank (PDB). Therefore, the structures of a large number of MPs still remain unresolved, which leads to the availability of widely unplumbed structural and functional information related to MPs. As a result, recent developments in the drug discovery realm and the significant biological contemplation have led to the development of several novel, low-cost, and time-efficient computational methods that overcome the limitations of experimental approaches, supplement experiments, and provide alternatives for the characterization of MPs. Whereby the fine tuning and optimizations of these computational approaches remains an ongoing endeavor. Computational methods offer a potential way for the elucidation of structural features and the augmentation of currently available MP information. However, the use of computational modeling can be extremely challenging for MPs mainly due to insufficient knowledge of (or gaps in) atomic structures of MPs. Despite the availability of numerous in silico methods for 3D structure determination, the applicability of these methods to MPs remains relatively low since all methods are not well-suited or adequate for MPs. However, sophisticated methods for MP structure predictions are constantly being developed and updated to integrate the modifications required for MPs.
Currently, different computational methods for (1) MP structure prediction, (2) stability analysis of MPs through molecular dynamics simulations, (3) modeling of MP complexes through docking, (4) prediction of interactions between MPs, and (5) MP interactions with its soluble partner are extensively used. Towards this end, MP docking is widely used. It is notable that the MP docking methods yet few in number might show greater potential in terms of filling the knowledge gap. In this chapter, MP docking methods and associated challenges have been reviewed to improve the applicability, accuracy, and the ability to model macromolecular complexes.
Affiliation(s)
- Yusra Sajid Kiani
  - School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Ishrat Jabeen
  - School of Interdisciplinary Engineering and Sciences (SINES), National University of Sciences and Technology (NUST), Islamabad, Pakistan
|
13
|
Chandrasekhar G, Srinivasan E, Nandhini S, Pravallika G, Sanjay G, Rajasekaran R. Computer aided therapeutic tripeptide design, in alleviating the pathogenic proclivities of nocuous α-synuclein fibrils. J Biomol Struct Dyn 2024; 42:483-494. [PMID: 36961221 DOI: 10.1080/07391102.2023.2194003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 03/15/2023] [Indexed: 03/25/2023]
Abstract
Parkinson's disorder (PD) exacerbates neuronal degeneration of motor nerves, thereby effectuating uncoordinated movements and tremors. Aberrant alpha-synuclein (α-syn) is culpable of triggering PD, wherein cytotoxic amyloid aggregates of α-syn get deposited in motor neurons to instigate neuro-degeneration. Amyloid aggregates, typically rich in beta sheets, are cardinal targets to mitigate their neurotoxic effects. In this analysis, owing to their interaction specificity, we formulated an efficacious tripeptide out of the aggregation-prone region of α-syn protein. With the help of a proficient computational pipeline, systematic peptide shortening and an adept molecular simulation platform, we formulated a tripeptide, VAV, from the α-syn structure-based hexapeptide KISVRV. Indeed, the VAV tripeptide was able to effectively mitigate the α-syn amyloid fibrils' dynamic rate of beta-sheet formation. Additional trajectory analyses of the VAV-α-syn complex indicated that, upon its dynamic interaction, VAV efficiently altered the distinct pathogenic structural dynamics of α-syn, further advocating its potential in alleviating aberrant α-syn's amyloidogenic proclivities. Consistent findings from various computational analyses have led us to surmise that VAV could potentially re-alter the pathogenic conformational orientation of α-syn, essential to mitigate its cytotoxicity. Hence, VAV tripeptide could be an efficacious therapeutic candidate to efficiently ameliorate aberrant α-syn amyloid mediated neurotoxicity, eventually attenuating the nocuous effects of PD. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- G Chandrasekhar
  - Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT, Deemed to Be University), Vellore, Tamil Nadu, India
- E Srinivasan
  - Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, Karnataka, India
- S Nandhini
  - Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT, Deemed to Be University), Vellore, Tamil Nadu, India
- G Pravallika
  - Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT, Deemed to Be University), Vellore, Tamil Nadu, India
- G Sanjay
  - Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT, Deemed to Be University), Vellore, Tamil Nadu, India
- R Rajasekaran
  - Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (VIT, Deemed to Be University), Vellore, Tamil Nadu, India
|
14
|
Wang M, Li W, Yu X, Luo Y, Han K, Wang C, Jin Q. AffinityVAE: A multi-objective model for protein-ligand affinity prediction and drug design. Comput Biol Chem 2023; 107:107971. [PMID: 37852036 DOI: 10.1016/j.compbiolchem.2023.107971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 09/23/2023] [Accepted: 10/08/2023] [Indexed: 10/20/2023]
Abstract
In the prediction of protein-ligand affinity, the traditional methods require a large amount of computing resources, and have certain limitations in predicting and simulating the structural changes. Although employing data-driven approaches can yield favorable outcomes in deep learning, it entails a lack of interpretability. Some methods may require additional structural information or domain knowledge to support the interpretation, which may limit their applicability. This paper proposes an affinity variational autoencoder (AffinityVAE) using interaction feature mapping and a variational autoencoder, which consists of a multi-objective model capable of end-to-end affinity prediction and drug discovery. In this study, the limitations of affinity prediction in terms of interpretability are tackled by proposing the concept of a protein-ligand interaction feature map. This increases the diversity and quantity of protein-ligand binding data by designing an adaptive autoencoder of target chemical properties to generate new ligands similar to known ligands and adding them to the original training set. AffinityVAE is then retrained using this extended training set to further validate the protein-ligand binding affinity prediction. Comparisons were conducted between the AffinityVAE and recent methods to demonstrate the high efficiency of the proposed model. The experimental results show that AffinityVAE has very high prediction performance, and it has the potential to enhance the diversity and the amount of protein-ligand binding data, which promotes the drug development.
Affiliation(s)
- Mengying Wang
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Weimin Li
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Xiao Yu
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Yin Luo
  - School of Life Sciences, East China Normal University, China
- Ke Han
  - Medical and Health Center, Liaocheng People's Hospital, LiaoCheng, China
- Can Wang
  - School of Information and Communication Technology, Griffith University, Australia
- Qun Jin
  - Networked Information System Laboratory, Waseda University, Tokyo, Japan
|
15
|
Li L, Sun S, Wang M, Xiang J, Shao Y, Wu G, Zhou J, Khan U, Xin Z. Improving the hydrolysis and acyltransferase activity of bifunctional feruloyl esterases DLFae4 by multiple rational predictions and directed evolution. Food Biosci 2023; 56:103140. [DOI: 10.1016/j.fbio.2023.103140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
|
16
|
Christoffer C, Harini K, Archit G, Kihara D. Assembly of Protein Complexes In and On the Membrane with Predicted Spatial Arrangement Constraints. bioRxiv 2023:2023.10.20.563303. [PMID: 37961264 PMCID: PMC10634698 DOI: 10.1101/2023.10.20.563303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Membrane proteins play crucial roles in various cellular processes, and their interactions with other proteins in and on the membrane are essential for their proper functioning. While an increasing number of structures of more membrane proteins are being determined, the available structure data is still sparse. To gain insights into the mechanisms of membrane protein complexes, computational docking methods are necessary due to the challenge of experimental determination. Here, we introduce Mem-LZerD, a rigid-body membrane docking algorithm designed to take advantage of modern membrane modeling and protein docking techniques to facilitate the docking of membrane protein complexes. Mem-LZerD is based on the LZerD protein docking algorithm, which has been constantly among the top servers in many rounds of CAPRI protein docking assessment. By employing a combination of geometric hashing, newly constrained by the predicted membrane height and tilt angle, and model scoring accounting for the energy of membrane insertion, we demonstrate the capability of Mem-LZerD to model diverse membrane protein-protein complexes. Mem-LZerD successfully performed unbound docking on 13 of 21 (61.9%) transmembrane complexes in an established benchmark, more than shown by previous approaches. It was additionally tested on new datasets of 44 transmembrane complexes and 92 peripheral membrane protein complexes, of which it successfully modeled 35 (79.5%) and 15 (16.3%) complexes respectively. When non-blind orientations of peripheral targets were included, the number of successes increased to 54 (58.7%). We further demonstrate that Mem-LZerD produces complex models which are suitable for molecular dynamics simulation. Mem-LZerD is made available at https://lzerd.kiharalab.org.
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Kannan Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Gupta Archit
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Genetic Engineering, SRM Institute of Science and Technology, Kattankulathur 603203, India
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA
|
17
|
Teruel N, Borges VM, Najmanovich R. Surfaces: a software to quantify and visualize interactions within and between proteins and ligands. Bioinformatics 2023; 39:btad608. [PMID: 37788107 PMCID: PMC10568369 DOI: 10.1093/bioinformatics/btad608] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/26/2023] [Revised: 08/23/2023] [Accepted: 09/29/2023] [Indexed: 10/05/2023] Open
Abstract
SUMMARY: Computational methods for the quantification and visualization of the relative contribution of molecular interactions to the stability of biomolecular structures and complexes are fundamental to understand, modulate and engineer biological processes. Here, we present Surfaces, an easy to use, fast and customizable software for quantification and visualization of molecular interactions based on the calculation of surface areas in contact. Surfaces calculations show correlations with experimental data equivalent to or better than those of computationally expensive methods based on molecular dynamics. AVAILABILITY AND IMPLEMENTATION: All scripts are available at https://github.com/NRGLab/Surfaces. The Surfaces documentation is available at https://surfaces-tutorial.readthedocs.io/en/latest/index.html.
Affiliation(s)
- Natália Teruel
  - Department of Pharmacology and Physiology, Faculty of Medicine, Université de Montréal, Montreal H3T 1J4, Canada
- Vinicius Magalhães Borges
  - Department of Biomedical Sciences, Joan C. Edwards School of Medicine, Marshall University, Huntington, WV, USA
- Rafael Najmanovich
  - Department of Pharmacology and Physiology, Faculty of Medicine, Université de Montréal, Montreal H3T 1J4, Canada
|
18
|
Xu G, Luo Z, Zhou R, Wang Q, Ma J. OPUS-Fold3: a gradient-based protein all-atom folding and docking framework on TensorFlow. Brief Bioinform 2023; 24:bbad365. [PMID: 37833840 DOI: 10.1093/bib/bbad365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 08/29/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
For refining and designing protein structures, it is essential to have an efficient protein folding and docking framework that generates a protein 3D structure based on given constraints. In this study, we introduce OPUS-Fold3 as a gradient-based, all-atom protein folding and docking framework, which accurately generates 3D protein structures in compliance with specified constraints, such as a potential function as long as it can be expressed as a function of positions of heavy atoms. Our tests show that, for example, OPUS-Fold3 achieves performance comparable to pyRosetta in backbone folding and significantly better in side-chain modeling. Developed using Python and TensorFlow 2.4, OPUS-Fold3 is user-friendly for any source-code level modifications and can be seamlessly combined with other deep learning models, thus facilitating collaboration between the biology and AI communities. The source code of OPUS-Fold3 can be downloaded from http://github.com/OPUS-MaLab/opus_fold3. It is freely available for academic usage.
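The core idea in this abstract, moving atom positions down the gradient of a potential written as a function of those positions, can be sketched in a few lines. This is a toy illustration and not OPUS-Fold3 code: it uses plain Python with a hand-derived gradient instead of TensorFlow automatic differentiation, and the two-atom system, the harmonic distance restraint, the 3.8 Å target, and the learning rate are all assumptions made for this example.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def restraint_energy(a, b, target):
    """Harmonic penalty on the distance between two atoms (toy potential)."""
    return (distance(a, b) - target) ** 2


def gradient_step(a, b, target, lr=0.1):
    """One gradient-descent update of both atom positions.

    For E = (d - target)^2 with d = |a - b|:
    dE/da = 2 (d - target) (a - b) / d, and dE/db = -dE/da.
    """
    d = distance(a, b)
    scale = 2.0 * (d - target) / d
    grad_a = [scale * (ai - bi) for ai, bi in zip(a, b)]
    new_a = [ai - lr * g for ai, g in zip(a, grad_a)]
    new_b = [bi + lr * g for bi, g in zip(b, grad_a)]
    return new_a, new_b


# Hypothetical starting positions (angstroms) and target distance.
atom_a, atom_b = [0.0, 0.0, 0.0], [10.0, 0.0, 0.0]
target = 3.8

for _ in range(200):
    atom_a, atom_b = gradient_step(atom_a, atom_b, target)

final_distance = distance(atom_a, atom_b)
print(final_distance)
```

With this step size the deviation from the target shrinks by a constant factor at each iteration, so the pair settles at the restrained distance. A framework built on automatic differentiation would obtain the gradient without the manual derivation, which is what allows arbitrary position-dependent potentials to be plugged in.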
Affiliation(s)
- Gang Xu
  - Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
  - Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
  - Shanghai AI Laboratory, Shanghai, 200030, China
- Zhenwei Luo
  - Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
  - Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
  - Shanghai AI Laboratory, Shanghai, 200030, China
- Ruhong Zhou
  - Institute of Quantitative Biology, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
  - Shanghai Institute for Advanced Study, Zhejiang University, Shanghai, 201203, China
- Qinghua Wang
  - Center for Biomolecular Innovation, Harcam Biomedicines, Shanghai, 200131, China
- Jianpeng Ma
  - Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
  - Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
  - Shanghai AI Laboratory, Shanghai, 200030, China
  - Shanghai Institute for Advanced Study, Zhejiang University, Shanghai, 201203, China
|
19
|
Le Naour—Vernet M, Charriat F, Gracy J, Cros-Arteil S, Ravel S, Veillet F, Meusnier I, Padilla A, Kroj T, Cesari S, Gladieux P. Adaptive evolution in virulence effectors of the rice blast fungus Pyricularia oryzae. PLoS Pathog 2023; 19:e1011294. [PMID: 37695773 PMCID: PMC10513199 DOI: 10.1371/journal.ppat.1011294] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/15/2023] [Revised: 09/21/2023] [Accepted: 08/09/2023] [Indexed: 09/13/2023] Open
Abstract
Plant pathogens secrete proteins called effectors that target host cellular processes to promote disease. Recently, structural genomics has identified several families of fungal effectors that share a similar three-dimensional structure despite remarkably variable amino-acid sequences and surface properties. To explore the selective forces that underlie the sequence variability of structurally-analogous effectors, we focused on MAX effectors, a structural family of effectors that are major determinants of virulence in the rice blast fungus Pyricularia oryzae. Using structure-informed gene annotation, we identified 58 to 78 MAX effector genes per genome in a set of 120 isolates representing seven host-associated lineages. The expression of MAX effector genes was primarily restricted to the early biotrophic phase of infection and strongly influenced by the host plant. Pangenome analyses of MAX effectors demonstrated extensive presence/absence polymorphism and identified gene loss events possibly involved in host range adaptation. However, gene knock-in experiments did not reveal a strong effect on virulence phenotypes suggesting that other evolutionary mechanisms are the main drivers of MAX effector losses. MAX effectors displayed high levels of standing variation and high rates of non-synonymous substitutions, pointing to widespread positive selection shaping the molecular diversity of MAX effectors. The combination of these analyses with structural data revealed that positive selection acts mostly on residues located in particular structural elements and at specific positions. By providing a comprehensive catalog of amino acid polymorphism, and by identifying the structural determinants of the sequence diversity, our work will inform future studies aimed at elucidating the function and mode of action of MAX effectors.
Affiliation(s)
- Marie Le Naour—Vernet
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Florian Charriat
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Jérôme Gracy
  - Centre de Biologie Structurale (CBS), Univ Montpellier, INSERM, CNRS, Montpellier, France
- Sandrine Cros-Arteil
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Sébastien Ravel
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
  - CIRAD, UMR PHIM, Montpellier, France
- Florian Veillet
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Isabelle Meusnier
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- André Padilla
  - Centre de Biologie Structurale (CBS), Univ Montpellier, INSERM, CNRS, Montpellier, France
- Thomas Kroj
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Stella Cesari
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
- Pierre Gladieux
  - PHIM Plant Health Institute, Univ Montpellier, INRAE, CIRAD, Institut Agro, IRD, Montpellier, France
|
20
|
Christoffer C, Kihara D. Modeling protein-nucleic acid complexes with extremely large conformational changes using Flex-LZerD. Proteomics 2023; 23:e2200322. [PMID: 36529945 PMCID: PMC10448949 DOI: 10.1002/pmic.202200322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 12/08/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]
Abstract
Proteins and nucleic acids are key components in many processes in living cells, and interactions between proteins and nucleic acids are often crucial pathway components. In many cases, large flexibility of proteins as they interact with nucleic acids is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D atomic structures of such protein-nucleic acid complexes. When such structures are not yet experimentally determined, protein docking can be used to computationally generate useful structure models. However, such docking has long had the limitation that the consideration of flexibility is usually limited to small movements or to small structures. We previously developed a method of flexible protein docking which could model ordered proteins which undergo large-scale conformational changes, which we also showed was compatible with nucleic acids. Here, we elaborate on the ability of that pipeline, Flex-LZerD, to model specifically interactions between proteins and nucleic acids, and demonstrate that Flex-LZerD can model more interactions and types of conformational change than previously shown.
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, Indiana, USA
|
21
|
Wang X, Yu S, Lou E, Tan YL, Tan ZJ. RNA 3D Structure Prediction: Progress and Perspective. Molecules 2023; 28:5532. [PMID: 37513407 PMCID: PMC10386116 DOI: 10.3390/molecules28145532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/05/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Ribonucleic acid (RNA) molecules play vital roles in numerous important biological functions such as catalysis and gene regulation. The functions of RNAs are strongly coupled to their structures or proper structure changes, and RNA structure prediction has been paid much attention in the last two decades. Some computational models have been developed to predict RNA three-dimensional (3D) structures in silico, and these models are generally composed of predicting RNA 3D structure ensemble, evaluating near-native RNAs from the structure ensemble, and refining the identified RNAs. In this review, we will make a comprehensive overview of the recent advances in RNA 3D structure modeling, including structure ensemble prediction, evaluation, and refinement. Finally, we will emphasize some insights and perspectives in modeling RNA 3D structures.
Affiliation(s)
- Xunxun Wang
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Shixiong Yu
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- En Lou
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
- Ya-Lan Tan
  - School of Bioengineering and Health, Wuhan Textile University, Wuhan 430200, China
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, China
- Zhi-Jie Tan
  - Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
|
22
|
Yvonnesdotter L, Rovšnik U, Blau C, Lycksell M, Howard RJ, Lindahl E. Automated simulation-based membrane protein refinement into cryo-EM data. Biophys J 2023; 122:2773-2781. [PMID: 37277992 PMCID: PMC10397807 DOI: 10.1016/j.bpj.2023.05.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 04/02/2023] [Accepted: 05/31/2023] [Indexed: 06/07/2023] Open
Abstract
The resolution revolution has increasingly enabled single-particle cryogenic electron microscopy (cryo-EM) reconstructions of previously inaccessible systems, including membrane proteins, a category that constitutes a disproportionate share of drug targets. We present a protocol for using density-guided molecular dynamics simulations to automatically refine atomistic models into membrane protein cryo-EM maps. Using adaptive force density-guided simulations as implemented in the GROMACS molecular dynamics package, we show how automated model refinement of a membrane protein is achieved without the need to manually tune the fitting force ad hoc. We also present selection criteria to choose the best-fit model that balances stereochemistry and goodness of fit. The proposed protocol was used to refine models into a new cryo-EM density of the membrane protein maltoporin, either in a lipid bilayer or detergent micelle, and we found that results do not substantially differ from fitting in solution. Fitted structures satisfied classical model-quality metrics and improved the quality and the model-to-map correlation of the x-ray starting structure. Additionally, the density-guided fitting in combination with generalized orientation-dependent all-atom potential was used to correct the pixel-size estimation of the experimental cryo-EM density map. This work demonstrates the applicability of a straightforward automated approach to fitting membrane protein cryo-EM densities. Such computational approaches promise to facilitate rapid refinement of proteins under different conditions or with various ligands present, including targets in the highly relevant superfamily of membrane proteins.
Collapse
Affiliation(s)
- Linnea Yvonnesdotter
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Urška Rovšnik
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Christian Blau
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
| | - Marie Lycksell
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Rebecca Joy Howard
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden
| | - Erik Lindahl
- Science for Life Laboratory & Swedish e-Science Research Center, Department of Applied Physics, KTH Royal Institute of Technology, Solna, Sweden; Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden.
| |
|
23
|
Chen X, Morehead A, Liu J, Cheng J. A gated graph transformer for protein complex structure quality assessment and its performance in CASP15. Bioinformatics 2023; 39:i308-i317. [PMID: 37387159 PMCID: PMC10311325 DOI: 10.1093/bioinformatics/btad203] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION Proteins interact to form complexes to carry out essential biological functions. Computational methods such as AlphaFold-multimer have been developed to predict the quaternary structures of protein complexes. An important yet largely unsolved challenge in protein complex structure prediction is to accurately estimate the quality of predicted protein complex structures without any knowledge of the corresponding native structures. Such estimations can then be used to select high-quality predicted complex structures to facilitate biomedical research such as protein function analysis and drug discovery. RESULTS In this work, we introduce a new gated neighborhood-modulating graph transformer to predict the quality of 3D protein complex structures. It incorporates node and edge gates within a graph transformer framework to control information flow during graph message passing. We trained, evaluated and tested the method (called DProQA) on newly-curated protein complex datasets before the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) and then blindly tested it in the 2022 CASP15 experiment. The method was ranked 3rd among the single-model quality assessment methods in CASP15 in terms of the ranking loss of TM-score on 36 complex targets. The rigorous internal and external experiments demonstrate that DProQA is effective in ranking protein complex structures. AVAILABILITY AND IMPLEMENTATION The source code, data, and pre-trained models are available at https://github.com/jianlin-cheng/DProQA.
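The abstract reports a "ranking loss of TM-score" without defining it. A minimal sketch of one commonly used definition follows: the TM-score of the best available model for a target minus the TM-score of the model the predictor ranks first. The function name and the example numbers are hypothetical and are not taken from DProQA or from CASP15 data.

```python
from typing import Sequence


def ranking_loss(predicted_scores: Sequence[float],
                 true_tm_scores: Sequence[float]) -> float:
    """Per-target ranking loss (assumed definition, see lead-in).

    predicted_scores: quality scores assigned by the assessment method.
    true_tm_scores:   TM-scores of the same models against the native structure.
    Returns best true TM-score minus the true TM-score of the top-ranked model.
    """
    if not predicted_scores or len(predicted_scores) != len(true_tm_scores):
        raise ValueError("inputs must be non-empty and of equal length")
    # Index of the model the predictor believes is best.
    top_pick = max(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i])
    return max(true_tm_scores) - true_tm_scores[top_pick]


# Hypothetical decoys for one target: the predictor prefers model 1,
# but model 2 is actually closest to the native structure.
example_loss = ranking_loss([0.62, 0.81, 0.47], [0.70, 0.66, 0.90])
```

A loss of zero means the method picked the best available model; averaging the per-target values over all targets gives a single figure for comparing methods.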
Affiliation(s)
- Xiao Chen
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65201, United States
| | - Alex Morehead
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65201, United States
| | - Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65201, United States
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65201, United States
| |
|
24
|
Tollefson MR, Gogal RA, Weaver AM, Schaefer AM, Marini RJ, Azaiez H, Kolbe DL, Wang D, Weaver AE, Casavant TL, Braun TA, Smith RJH, Schnieders MJ. Assessing variants of uncertain significance implicated in hearing loss using a comprehensive deafness proteome. Hum Genet 2023; 142:819-834. [PMID: 37086329 PMCID: PMC10182131 DOI: 10.1007/s00439-023-02559-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/23/2023] [Accepted: 04/11/2023] [Indexed: 04/23/2023]
Abstract
Hearing loss is the leading sensory deficit, affecting ~ 5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆GFold) for all DVD missense variants. We find that 5772 VUSs have a large, destabilizing ∆∆GFold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine 3456 VUSs are likely pathogenic at a probability of 99.0%. Of the 224 genes in the DVD, 166 genes (74%) exhibit one or more missense variants predicted to cause a pathogenic change in protein folding stability. The VUSs prioritized here affect 119 patients (~ 3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
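The two-stage prioritization described above (a large destabilizing ∆∆G of folding, then a CADD score above 25.7) can be sketched as a simple filter. Only the CADD cutoff comes from the abstract. The ∆∆G cutoff, the sign convention (positive means destabilizing), the `Variant` class, and the example variants are illustrative assumptions, not the authors' pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Variant:
    name: str        # hypothetical label, e.g. "GENE:p.X00Y"
    ddg_fold: float  # kcal/mol; ASSUMED convention: positive = destabilizing
    cadd: float      # CADD score


CADD_CUTOFF = 25.7  # stated in the abstract
DDG_CUTOFF = 1.0    # HYPOTHETICAL placeholder; the abstract gives no value


def prioritize(variants: Iterable[Variant],
               ddg_cutoff: float = DDG_CUTOFF,
               cadd_cutoff: float = CADD_CUTOFF) -> List[Variant]:
    """Keep variants that pass both the stability and the CADD filter."""
    return [v for v in variants
            if v.ddg_fold > ddg_cutoff and v.cadd > cadd_cutoff]


# Made-up variants for illustration only.
candidates = [
    Variant("VUS-1", ddg_fold=2.3, cadd=28.1),  # passes both filters
    Variant("VUS-2", ddg_fold=2.9, cadd=19.4),  # fails the CADD filter
    Variant("VUS-3", ddg_fold=0.2, cadd=31.0),  # fails the stability filter
]
likely_pathogenic = prioritize(candidates)
```

If the stability predictor reports destabilizing changes as negative values, the comparison on `ddg_fold` would need to be flipped.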
Affiliation(s)
- Mallory R Tollefson
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Rose A Gogal
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - A Monique Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Amanda M Schaefer
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Robert J Marini
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Hela Azaiez
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Diana L Kolbe
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Donghong Wang
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Amy E Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Thomas L Casavant
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - Terry A Braun
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - Richard J H Smith
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA.
| | - Michael J Schnieders
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, IA, 52242, USA.
| |
|
25
|
Yang S, Gong W, Zhou T, Sun X, Chen L, Zhou W, Li C. emPDBA: protein-DNA binding affinity prediction by combining features from binding partners and interface learned with ensemble regression model. Brief Bioinform 2023:7165253. [PMID: 37193676 DOI: 10.1093/bib/bbad192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 04/26/2023] [Accepted: 04/29/2023] [Indexed: 05/18/2023] Open
Abstract
Protein-deoxyribonucleic acid (DNA) interactions are important in a variety of biological processes. Accurately predicting protein-DNA binding affinity has been one of the most attractive and challenging issues in computational biology. However, the existing approaches still have much room for improvement. In this work, we propose an ensemble model for Protein-DNA Binding Affinity prediction (emPDBA), which combines six base models with one meta-model. The complexes are classified into four types based on the DNA structure (double-stranded or other forms) and the percentage of interface residues. For each type, emPDBA is trained with the sequence-based, structure-based and energy features from binding partners and complex structures. Through feature selection by the sequential forward selection method, it is found that there do exist considerable differences in the key factors contributing to intermolecular binding affinity. The complex classification is beneficial for the important feature extraction for binding affinity prediction. The performance comparison of our method with other peer ones on the independent testing dataset shows that emPDBA outperforms the state-of-the-art methods with the Pearson correlation coefficient of 0.53 and the mean absolute error of 1.11 kcal/mol. The comprehensive results demonstrate that our method has a good performance for protein-DNA binding affinity prediction. Availability and implementation: The source code is available at https://github.com/ChunhuaLiLab/emPDBA/.
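The two headline metrics in this abstract, the Pearson correlation coefficient and the mean absolute error, are straightforward to compute. The sketch below uses only the standard library; the function names and the example affinities are hypothetical and are not taken from the emPDBA code or data.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Average absolute difference between paired values."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty samples of equal length")
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Made-up experimental vs. predicted binding free energies (kcal/mol).
experimental = [-9.0, -10.5, -8.2, -11.1]
predicted = [-8.4, -10.0, -9.1, -10.2]
r = pearson(experimental, predicted)
mae = mean_absolute_error(experimental, predicted)
```

The two metrics are complementary: correlation rewards getting the ordering and trend right, while the mean absolute error penalizes systematic offsets that correlation ignores.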
Affiliation(s)
- Shuang Yang
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Weikang Gong
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Tong Zhou
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Xiaohan Sun
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Lei Chen
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Wenxue Zhou
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Chunhua Li
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| |
|
26
|
Wodak SJ, Vajda S, Lensink MF, Kozakov D, Bates PA. Critical Assessment of Methods for Predicting the 3D Structure of Proteins and Protein Complexes. Annu Rev Biophys 2023; 52:183-206. [PMID: 36626764 PMCID: PMC10885158 DOI: 10.1146/annurev-biophys-102622-084607] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Indexed: 01/11/2023]
Abstract
Advances in a scientific discipline are often measured by small, incremental steps. In this review, we report on two intertwined disciplines in the protein structure prediction field, modeling of single chains and modeling of complexes, that have over decades emulated this pattern, as monitored by the community-wide blind prediction experiments CASP and CAPRI. However, over the past few years, dramatic advances were observed for the accurate prediction of single protein chains, driven by a surge of deep learning methodologies entering the prediction field. We review the main scientific developments that enabled these recent breakthroughs and feature the important role of blind prediction experiments in building up and nurturing the structure prediction field. We discuss how the new wave of artificial intelligence-based methods is impacting the fields of computational and experimental structural biology and highlight areas in which deep learning methods are likely to lead to future developments, provided that major challenges are overcome.
Affiliation(s)
- Shoshana J Wodak
- VIB-VUB Center for Structural Biology, Vrije Universiteit Brussel, Brussels, Belgium;
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA;
- Department of Chemistry, Boston University, Boston, Massachusetts, USA
| | - Marc F Lensink
- Univ. Lille, CNRS, UMR 8576-UGSF-Unité de Glycobiologie Structurale et Fonctionnelle, Lille, France;
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York, USA;
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
| | - Paul A Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, London, United Kingdom;
| |
|
27
|
Tan YL, Wang X, Yu S, Zhang B, Tan ZJ. cgRNASP: coarse-grained statistical potentials with residue separation for RNA structure evaluation. NAR Genom Bioinform 2023; 5:lqad016. [PMID: 36879898 PMCID: PMC9985339 DOI: 10.1093/nargab/lqad016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 01/21/2023] [Accepted: 02/03/2023] [Indexed: 03/07/2023] Open
Abstract
Knowledge-based statistical potentials are very important for RNA 3-dimensional (3D) structure prediction and evaluation. In recent years, various coarse-grained (CG) and all-atom models have been developed for predicting RNA 3D structures, while there is still a lack of reliable CG statistical potentials not only for CG structure evaluation but also for all-atom structure evaluation at high efficiency. In this work, we have developed a series of residue-separation-based CG statistical potentials at different CG levels for RNA 3D structure evaluation, namely cgRNASP, which is composed of long-ranged and short-ranged interactions by residue separation. Compared with the newly developed all-atom rsRNASP, the short-ranged interaction in cgRNASP was involved more subtly and completely. Our examinations show that the performance of cgRNASP varies with CG levels and that, compared with rsRNASP, cgRNASP has similarly good performance for extensive types of test datasets and can have slightly better performance for the realistic RNA-Puzzles dataset. Furthermore, cgRNASP is strikingly more efficient than all-atom statistical potentials/scoring functions, and can be apparently superior to other all-atom statistical potentials and scoring functions trained from neural networks for the RNA-Puzzles dataset. cgRNASP is available at https://github.com/Tan-group/cgRNASP.
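For orientation, distance-dependent knowledge-based potentials of this kind are generally built on the inverse Boltzmann relation, sketched below. The abstract gives neither the reference state nor the cutoff separating short-ranged from long-ranged residue pairs, so the symbols $P^{\mathrm{ref}}$ and $k_0$ are placeholders and the split shown is an assumption, not the cgRNASP definition.

```latex
% Pair energy for CG bead types i and j at distance r (inverse Boltzmann relation)
\Delta E(i,j,r) = -k_{\mathrm{B}} T \,
    \ln \frac{P^{\mathrm{obs}}(i,j,r)}{P^{\mathrm{ref}}(i,j,r)}

% Total score split by residue separation k (k_0 is a hypothetical cutoff)
E_{\mathrm{total}} =
    \sum_{k \le k_0} \Delta E_{\mathrm{short}}(i,j,r)
  + \sum_{k > k_0} \Delta E_{\mathrm{long}}(i,j,r)
```

Here $P^{\mathrm{obs}}$ is the probability of observing the pair at distance $r$ in a set of native structures, and $P^{\mathrm{ref}}$ is the corresponding probability in a reference state without specific interactions; lower total energy indicates a more native-like structure.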
Affiliation(s)
- Ya-Lan Tan
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430073, China; Department of Physics and Key Laboratory of Artificial Micro & Nano-structures of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - Xunxun Wang
- Department of Physics and Key Laboratory of Artificial Micro & Nano-structures of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - Shixiong Yu
- Department of Physics and Key Laboratory of Artificial Micro & Nano-structures of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430073, China
| | - Zhi-Jie Tan
- Department of Physics and Key Laboratory of Artificial Micro & Nano-structures of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| |
|
28
|
Tollefson MR, Gogal RA, Weaver AM, Schaefer AM, Marini RJ, Azaiez H, Kolbe DL, Wang D, Weaver AE, Casavant TL, Braun TA, Smith RJH, Schnieders M. Assessing Variants of Uncertain Significance Implicated in Hearing Loss Using a Comprehensive Deafness Proteome. RESEARCH SQUARE 2023:rs.3.rs-2508462. [PMID: 36778238 PMCID: PMC9915777 DOI: 10.21203/rs.3.rs-2508462/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Hearing loss is the leading sensory deficit, affecting ~ 5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6,328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆GFold) for all DVD missense variants. We find that 5,772 VUSs have a large, destabilizing ∆∆GFold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine 3,456 VUSs are likely pathogenic at a probability of 99.0%. These VUSs affect 119 patients (~ 3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
|
29
|
Bodie NM, Hashimoto R, Connolly D, Chu J, Takayama K, Uhal BD. Design of a chimeric ACE-2/Fc-silent fusion protein with ultrahigh affinity and neutralizing capacity for SARS-CoV-2 variants. Antib Ther 2023; 6:59-74. [PMID: 36741194 PMCID: PMC9889962 DOI: 10.1093/abt/tbad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2022] [Revised: 10/14/2022] [Accepted: 01/03/2023] [Indexed: 01/22/2023] Open
Abstract
Background As SARS-CoV-2 continues to mutate into Variants of Concern (VOC), there is a growing and urgent need to develop effective antivirals to combat COVID-19. Monoclonal antibodies developed earlier are no longer capable of effectively neutralizing currently active VOCs. This report describes the design of variant-agnostic chimeric molecules consisting of an Angiotensin-Converting Enzyme 2 (ACE-2) domain mutated to retain ultrahigh affinity binding to a wide variety of SARS-CoV-2 variants, coupled to an Fc-silent immunoglobulin domain that eliminates antibody-dependent enhancement and extends biological half-life. Methods Molecular modeling, Surrogate Viral Neutralization tests (sVNTs) and infection studies of human airway organoid cultures were performed with synthetic chimeras, SARS-CoV-2 spike protein mimics and SARS-CoV-2 Omicron variants B.1.1.214, BA.1, BA.2 and BA.5. Results ACE-2 mutations L27, V34 and E90 resulted in ultrahigh affinity binding of the LVE-ACE-2 domain to the widest variety of VOCs, with KDs of 93 pM and 73 pM for binding to the Alpha B.1.1.7 and Omicron B.1.1.529 variants, and notably, 78 fM, 133 fM and 1.81 pM affinities to the Omicron BA.2, BA.2.75 and BQ.1.1 subvariants, respectively. sVNT assays revealed titers of ≥4.9 ng/ml for neutralization of recombinant viral proteins corresponding to the Alpha, Delta and Omicron variants. The values above were obtained with LVE-ACE-2/mAB chimeras containing the FcRn-binding Y-T-E sequence which extends biological half-life 3-4-fold. Conclusions The ACE-2-mutant/Fc silent fusion proteins described have ultrahigh affinity to a wide variety of SARS-CoV-2 variants including Omicron. It is proposed that these chimeric ACE-2/mABs will constitute variant-agnostic and cost-effective prophylactics against SARS-CoV-2, particularly when administered nasally.
Affiliation(s)
- Neil M Bodie
- Paradigm Immunotherapeutics Inc., Monrovia, CA 91016, USA
| | - Rina Hashimoto
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto 6068507, Japan
| | - David Connolly
- College of Osteopathic Medicine, Department of Medicine, Michigan State University, East Lansing, MI 48824, USA
| | - Jennifer Chu
- Innovation Lab, ACROBiosystems, 1 Innovation Way, Newark, DE 19711, USA
| | - Kazuo Takayama
- Center for iPS Cell Research and Application (CiRA), Kyoto University, Kyoto 6068507, Japan (corresponding author)
| | - Bruce D Uhal
- Department of Physiology, Michigan State University, 3197 Biomedical and Physical Sciences Building, 567 Wilson Road, East Lansing, MI 48824, USA (corresponding author)
| |
|
30
|
Nagaraju M, Liu H. A scoring function for the prediction of protein complex interfaces based on the neighborhood preferences of amino acids. Acta Crystallogr D Struct Biol 2023; 79:31-39. [PMID: 36601805 DOI: 10.1107/s2059798322011858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022] Open
Abstract
Proteins often assemble into functional complexes, the structures of which are more difficult to obtain than those of the individual protein molecules. Given the structures of the subunits, it is possible to predict plausible complex models via computational methods such as molecular docking. Assessing the quality of the predicted models is crucial to obtain correct complex structures. Here, an energy-scoring function was developed based on the interfacial residues of structures in the Protein Data Bank. The statistically derived energy function (Nepre) imitates the neighborhood preferences of amino acids, including the types and relative positions of neighboring residues. Based on the preference statistics, a program iNepre was implemented and its performance was evaluated with several benchmarking decoy data sets. The results show that iNepre scores are powerful in model ranking to select the best protein complex structures.
Affiliation(s)
- Mulpuri Nagaraju
- Complex Systems Division, Beijing Computational Science Research Center, Beijing 100193, People's Republic of China
| | - Haiguang Liu
- Complex Systems Division, Beijing Computational Science Research Center, Beijing 100193, People's Republic of China
| |
|
31
|
Villegas JA, Vaid TM, Johnson ME, Moore TW. Mapping the energy landscape of PROTAC-mediated protein-protein interactions. Comput Struct Biotechnol J 2023; 21:1885-1892. [PMID: 36923472 PMCID: PMC10008833 DOI: 10.1016/j.csbj.2023.02.049] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 02/27/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023] Open
Abstract
A principal challenge in computational modeling of macromolecules is the vast conformational space that arises out of large numbers of atomic degrees of freedom. Recently, growing interest in building predictive models of complexes mediated by Proteolysis Targeting Chimeras (PROTACs) has led to the application of state-of-the-art computational techniques to tackle this problem. However, repurposing existing tools to carry out protein-protein docking and linker conformer generation independently results in extensive sampling of structures incompatible with PROTAC-mediated complex formation. Here we show that it is possible to restrict the search to the space of protein-protein conformations that can be bridged by a PROTAC molecule with a given linker composition by using a cyclic coordinate descent algorithm to position PROTACs into complex-bound configurations. We use this methodology to construct potential energy and solvation energy landscapes of PROTAC-mediated interactions. Our results suggest that desolvation of amino acids at interfaces could play a dominant role in PROTAC-mediated complex formation.
Affiliation(s)
- José A Villegas
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL 60612, USA
| | - Tasneem M Vaid
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL 60612, USA
| | - Michael E Johnson
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL 60612, USA; Center for Biomolecular Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL 60606, USA
| | - Terry W Moore
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Illinois Chicago, Chicago, IL 60612, USA; University of Illinois Cancer Center, University of Illinois Chicago, Chicago, IL 60612, USA
| |
|
32
|
Abstract
Protein structure modeling is one of the most advanced and complex processes in computational biology. One of the major problems for the protein structure prediction field has been how to estimate the accuracy of the predicted 3D models, on both a local and global level, in the absence of known structures. We must be able to accurately measure the confidence that we have in the quality of predicted 3D models of proteins for them to become widely adopted by the general bioscience community. To address this major issue, it was necessary to develop new model quality assessment (MQA) methods and integrate them into our pipelines for building 3D protein models. Our MQA method, called ModFOLD, has been ranked as one of the most accurate MQA tools in independent blind evaluations. This chapter discusses model quality assessment in the protein modeling field, demonstrating both its strengths and limitations. We also present some of the best methods according to independent benchmarking data gathered in recent years.
Affiliation(s)
- Ali H A Maghrabi
- College of Applied Sciences, Umm Al Qura University, Mecca, Saudi Arabia
| | | | - Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, UK.
| |
|
33
|
Harini K, Christoffer C, Gromiha MM, Kihara D. Pairwise and Multi-chain Protein Docking Enhanced Using LZerD Web Server. Methods Mol Biol 2023; 2690:355-373. [PMID: 37450159 PMCID: PMC10561630 DOI: 10.1007/978-1-0716-3327-4_28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
Interactions of proteins with other macromolecules have important structural and functional roles in the basic processes of living cells. To understand and elucidate the mechanisms of interactions, it is important to know the 3D structures of the complexes. Proteomes contain numerous protein-protein complexes, for which experimentally determined structures often do not exist. Computational techniques can be a practical alternative to obtain useful complex structure models. Here, we present a web server that provides access to the LZerD and Multi-LZerD protein docking tools, which can perform both pairwise and multi-chain docking. The web server is user-friendly, with options to visualize the distribution and structures of binding poses of top-scoring models. The LZerD web server is available at https://lzerd.kiharalab.org. This chapter describes the algorithm and step-by-step procedure for modeling monomeric structures with AttentiveDist, and details pairwise LZerD docking and Multi-LZerD. It also provides case studies for each of the three modules.
Affiliation(s)
- Kannan Harini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | | | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA.
- Department of Computer Science, Purdue University, West Lafayette, IN, USA.
| |
|
34
|
Liu J, Zhang C, Lai L. GeoPacker: A novel deep learning framework for protein side-chain modeling. Protein Sci 2022; 31:e4484. [PMID: 36309961 PMCID: PMC9667900 DOI: 10.1002/pro.4484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2022] [Revised: 10/23/2022] [Accepted: 10/26/2022] [Indexed: 12/13/2022]
Abstract
Atomic interactions play essential roles in protein folding, structure stabilization, and function performance. Recent advances in deep learning-based methods have achieved impressive success not only in protein structure prediction, but also in protein sequence design. However, highly efficient and accurate protein side-chain prediction methods that can give detailed atomic interactions are still lacking. In the present study, we developed a deep learning based method, GeoPacker, that uses geometric deep learning coupled ResNet for protein side-chain modeling. GeoPacker explicitly represents atomic interactions with rotational and translational invariance for information extraction of relative locations. GeoPacker outperformed the state-of-the-art energy function-based methods in side-chain structure prediction accuracy and runs about 10 and 700 times faster than the deep learning-based method DLPacker and OPUS-rota4 with comparable prediction accuracy, respectively. The performance of GeoPacker does not depend on the secondary structures that the residues belong to. GeoPacker gives highly accurate predictions for buried residues in the protein core as well as protein-protein interface, making it a useful tool for protein structure modeling, protein, and interaction design.
Affiliation(s)
- Jiale Liu
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| | - Changsheng Zhang
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Luhua Lai
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
| |
|
35
|
Christoffer C, Kihara D. Domain-Based Protein Docking with Extremely Large Conformational Changes. J Mol Biol 2022; 434:167820. [PMID: 36089054 PMCID: PMC9992458 DOI: 10.1016/j.jmb.2022.167820] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/27/2022] [Revised: 08/31/2022] [Accepted: 09/03/2022] [Indexed: 11/17/2022]
Abstract
Proteins are key components in many processes in living cells, and physical interactions with other proteins and nucleic acids often form key parts of their functions. In many cases, large flexibility of proteins as they interact is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D structures of such protein complexes. When such structures are not yet experimentally determined, protein docking has long been used to computationally generate useful structure models. However, protein docking has long had the limitation that the consideration of flexibility is usually restricted to very small movements or very small structures. Methods have been developed which handle minor flexibility via normal mode or other structure sampling, but new methods are required to model ordered proteins which undergo large-scale conformational changes to elucidate their function at the molecular level. Here, we present Flex-LZerD, a framework for docking such complexes. Via partial assembly multidomain docking and an iterative normal mode analysis admitting curvilinear motions, we demonstrate the ability to model the assembly of a variety of protein-protein and protein-nucleic acid complexes.
Affiliation(s)
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN 47907, USA
|
36
|
Katsonis P, Wilhelm K, Williams A, Lichtarge O. Genome interpretation using in silico predictors of variant impact. Hum Genet 2022; 141:1549-1577. [PMID: 35488922 PMCID: PMC9055222 DOI: 10.1007/s00439-022-02457-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/02/2021] [Accepted: 04/17/2022] [Indexed: 02/06/2023]
Abstract
Estimating the effects of variants found in disease driver genes opens the door to personalized therapeutic opportunities. Clinical associations and laboratory experiments can only characterize a tiny fraction of all the available variants, leaving the majority as variants of unknown significance (VUS). In silico methods bridge this gap by providing instant estimates on a large scale, most often based on the numerous genetic differences between species. Despite concerns that these methods may lack reliability in individual subjects, their numerous practical applications over cohorts suggest they are already helpful and have a role to play in genome interpretation when used at the proper scale and context. In this review, we aim to gain insights into the training and validation of these variant effect predicting methods and illustrate representative types of experimental and clinical applications. Objective performance assessments using various datasets that are not yet published indicate the strengths and limitations of each method. These show that cautious use of in silico variant impact predictors is essential for addressing genome interpretation challenges.
Affiliation(s)
- Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Kevin Wilhelm
- Graduate School of Biomedical Sciences, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Department of Biochemistry, Human Genetics and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Department of Pharmacology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
|
37
|
Molecular and thermodynamic mechanisms for protein adaptation. Eur Biophys J 2022; 51:519-534. [DOI: 10.1007/s00249-022-01618-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2020] [Revised: 08/01/2022] [Accepted: 09/20/2022] [Indexed: 11/07/2022]
|
38
|
Aderinwale T, Christoffer C, Kihara D. RL-MLZerD: Multimeric protein docking using reinforcement learning. Front Mol Biosci 2022; 9:969394. [PMID: 36090027 PMCID: PMC9459051 DOI: 10.3389/fmolb.2022.969394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 08/08/2022] [Indexed: 11/24/2022] Open
Abstract
Numerous biological processes in a cell are carried out by protein complexes. To understand the molecular mechanisms of such processes, it is crucial to know the quaternary structures of the complexes. Although the structures of protein complexes have been determined by biophysical experiments at a rapid pace, there are still many important complex structures that are yet to be determined. To supplement experimental structure determination of complexes, many computational protein docking methods have been developed; however, most of these docking methods are designed only for docking with two chains. Here, we introduce a novel method, RL-MLZerD, which builds multiple protein complexes using reinforcement learning (RL). In RL-MLZerD a multi-chain assembly process is considered as a series of episodes of selecting and integrating pre-computed pairwise docking models in a RL framework. RL is effective in correctly selecting plausible pairwise models that fit well with other subunits in a complex. When tested on a benchmark dataset of protein complexes with three to five chains, RL-MLZerD showed better modeling performance than other existing multiple docking methods under different evaluation criteria, except against AlphaFold-Multimer in unbound docking. Also, it emerged that the docking order of multi-chain complexes can be naturally predicted by examining preferred paths of episodes in the RL computation.
Affiliation(s)
- Tunde Aderinwale
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Charles Christoffer
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN, United States
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States
- *Correspondence: Daisuke Kihara,
|
39
|
Souza FR, Moura PG, Costa RKM, Silva RS, Pimentel AS. Absolute binding free energies of mucroporin and its analog mucroporin-M1 with the heptad repeat 1 domain and RNA-dependent RNA polymerase of SARS-CoV-2. J Biomol Struct Dyn 2022:1-12. [PMID: 35993479 DOI: 10.1080/07391102.2022.2114014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
The peptide Mucroporin and its analog Mucroporin-M1 were studied using molecular docking and molecular dynamics simulation of their complexation with two protein targets, the Heptad Repeat 1 (HR1) domain and RNA-dependent RNA polymerase (RdRp) of SARS-CoV-2. The molecular docking of the peptide-protein complexes was performed using the glowworm swarm optimization algorithm. The lowest energy poses were submitted to molecular dynamics simulation. Then, the binding free energies of Mucroporin and its analog Mucroporin-M1 with these two protein targets were calculated using the Multistate Bennett Acceptance Ratio (MBAR) method. It was verified that the peptide/HR1 domain complexes showed stability in the interaction site determined by molecular docking. It was also found that Mucroporin-M1 has a much higher affinity than Mucroporin for the HR1 protein target. The peptides showed similar stability and affinity at the NTP binding site in the RdRp protein. Additional experimental studies are needed to confirm the antiviral activity of Mucroporin-M1 and a possible mechanism of action against SARS-CoV-2. However, here we indicate that Mucroporin-M1 may have potential antiviral activity against the HR1 domain with the possibility for further peptide optimization. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Felipe Rodrigues Souza
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Paloma Guimarães Moura
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- Rudielson Santos Silva
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
- André Silva Pimentel
- Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ, Brazil
|
40
|
Verburgt J, Zhang Z, Kihara D. Multi-level analysis of intrinsically disordered protein docking methods. Methods 2022; 204:55-63. [PMID: 35609776 PMCID: PMC9701586 DOI: 10.1016/j.ymeth.2022.05.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/31/2022] [Revised: 05/17/2022] [Accepted: 05/19/2022] [Indexed: 12/29/2022] Open
Abstract
Intrinsically Disordered Proteins (IDPs) are a class of proteins in which at least some region of the protein does not possess any stable structure in solution in the physiological condition but may adopt an ordered structure upon binding to a globular receptor. These IDP-receptor complexes are thus subject to protein complex modeling in which computational techniques are applied to accurately reproduce the IDP ligand-receptor interactions. This often exists in the form of protein docking, in which the 3D structures of both the subunits are known, but the position of the ligand relative to the receptor is not. Here, we evaluate the performance of three IDP-receptor modeling tools with metrics that characterize the IDP-receptor interface at various resolutions. We show that all three methods are able to properly identify the general binding site, as identified by lower resolution metrics, but begin to struggle with higher resolution metrics that capture biophysical interactions.
Affiliation(s)
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Zicong Zhang
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA; Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA; Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA; Corresponding Author
|
41
|
Kurniawan J, Ishida T. Protein Model Quality Estimation Using Molecular Dynamics Simulation. ACS Omega 2022; 7:24274-24281. [PMID: 35874260 PMCID: PMC9301944 DOI: 10.1021/acsomega.2c01475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The estimation of protein model quality remains a challenging task and is important for the use of protein structural models. In the last decade, methods ranging from machine learning to deep learning have been developed and have shown progressive improvement. Despite utilizing more sophisticated techniques and introducing new features, none of these methods employ explicit protein structure stability information. Hypothetically, the quality of a protein model might be indicated by its structural stability in an in silico system, as revealed by its structural deviation from the initial structure. One possible way to exploit such information is molecular dynamics simulation, which has been applied successfully in many research fields. We present a novel approach that introduces explicit protein structure stability information using molecular dynamics simulation. Despite using only simple features, small data with no training process required, and a short molecular dynamics simulation time, our method shows comparable performance to the state-of-the-art deep learning-based method.
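The idea of reading model quality from how far a structure drifts from its starting coordinates can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `rmsd`, `mean_drift`, and `rank_models` are hypothetical, frames are assumed to be pre-aligned lists of (x, y, z) tuples, and the plain mean RMSD stands in for whatever features the paper actually uses.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    No superposition is performed; the frames are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def mean_drift(trajectory):
    """Average RMSD of every later frame from the first frame of a trajectory."""
    initial, later = trajectory[0], trajectory[1:]
    if not later:
        raise ValueError("trajectory needs at least two frames")
    return sum(rmsd(initial, frame) for frame in later) / len(later)


def rank_models(trajectories):
    """Return model names ordered from smallest to largest mean drift.

    Assumption for illustration: a smaller drift suggests a more stable,
    and therefore more plausible, model.
    """
    return sorted(trajectories, key=lambda name: mean_drift(trajectories[name]))
```

Passing a dictionary that maps model names to short trajectories returns the names with the least-drifting model first.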
|
42
|
Structure Prediction, Evaluation, and Validation of GPR18 Lipid Receptor Using Free Programs. Int J Mol Sci 2022; 23:ijms23147917. [PMID: 35887268 PMCID: PMC9319093 DOI: 10.3390/ijms23147917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Revised: 07/04/2022] [Accepted: 07/08/2022] [Indexed: 11/30/2022] Open
Abstract
The GPR18 receptor, often referred to as the N-arachidonylglycine receptor, officially still has the status of a class A GPCR orphan, although it has been assigned (along with GPR55 and GPR119) to the new class A GPCR subfamily of lipid receptors. While its signaling pathways and biological significance have not yet been fully elucidated, increasing evidence points to the therapeutic potential of GPR18 in relation to immune, neurodegenerative, and cancer processes, to name a few. Therefore, it is necessary to understand the interactions of potential ligands with the receptor and the influence of particular structural elements on their activity. Thus, given the lack of an experimentally solved structure, the goal of the present study was to obtain a homology model of the GPR18 receptor in the inactive state, meeting all requirements in terms of protein structure quality and recognition of active ligands. To increase the reliability and precision of the predictions, different contemporary protein structure prediction methods and software were used and compared herein. To test the usability of the resulting models, we optimized and compared the selected structures and then assessed their ability to recognize known, active ligands. The stability of the predicted poses was then evaluated by means of molecular dynamics simulations. On the other hand, most of the best-ranking contemporary CADD software and platforms require rather expensive licenses for full use. To overcome this practical obstacle, the overarching goal of these studies was to test whether it is possible to perform thorough CADD experiments with high scientific confidence while using only license-free/academic software and online platforms. The obtained results indicate that a wide range of freely available software and/or academic licenses allow us to carry out meaningful molecular modelling/docking studies.
|
43
|
Akhter N, Kabir KL, Chennupati G, Vangara R, Alexandrov BS, Djidjev H, Shehu A. Improved Protein Decoy Selection via Non-Negative Matrix Factorization. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1670-1682. [PMID: 33400654 DOI: 10.1109/tcbb.2020.3049088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
A central challenge in protein modeling research and protein structure prediction in particular is known as decoy selection. The problem refers to selecting biologically-active/native tertiary structures among a multitude of physically-realistic structures generated by template-free protein structure prediction methods. Research on decoy selection is active. Clustering-based methods are popular, but they fail to identify good/near-native decoys on datasets where near-native decoys are severely under-sampled by a protein structure prediction method. Reasonable progress is reported by methods that additionally take into account the internal energy of a structure and employ it to identify basins in the energy landscape organizing the multitude of decoys. These methods, however, incur significant time costs for extracting basins from the landscape. In this paper, we propose a novel decoy selection method based on non-negative matrix factorization. We demonstrate that our method outperforms energy landscape-based methods. In particular, the proposed method addresses both the time cost issue and the challenge of identifying good decoys in a sparse dataset, successfully recognizing near-native decoys for both easy and hard protein targets.
|
44
|
García-Cebollada H, López A, Sancho J. Protposer: the web server that readily proposes protein stabilizing mutations with high PPV. Comput Struct Biotechnol J 2022; 20:2415-2433. [PMID: 35664235 PMCID: PMC9133766 DOI: 10.1016/j.csbj.2022.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/16/2022] [Revised: 05/05/2022] [Accepted: 05/05/2022] [Indexed: 01/23/2023] Open
Abstract
Protein stability is a requisite for most biotechnological and medical applications of proteins. As natural proteins tend to suffer from low conformational stability ex vivo, great efforts have been devoted toward increasing their stability through rational design and engineering of appropriate mutations. Unfortunately, even the best currently used predictors fail to compute the stability of protein variants with sufficient accuracy, and their usefulness as tools to guide the rational stabilisation of proteins is limited. We present here Protposer, a protein stabilising tool based on a different approach. Instead of quantifying changes in stability, Protposer uses structure- and sequence-based screening modules to nominate candidate mutations for subsequent evaluation by a logistic regression model, carefully trained to avoid overfitting. Thus, Protposer analyses PDB files in search of stabilisation opportunities and provides a ranked list of promising mutations with their estimated success rates (eSR), that is, their probabilities of being stabilising by at least 0.5 kcal/mol. The agreement between eSRs and actual positive predictive values (PPV) on external datasets of mutations is excellent. When Protposer is used with its Optimal kappa selection threshold, its PPV is above 0.7. Even with less stringent thresholds, Protposer largely outperforms FoldX, Rosetta and PoPMusiC. Indicating the PDB file of the protein suffices to obtain a ranked list of mutations, their eSRs and hints on the likely source of the expected stabilisation. Protposer is a distinct, straightforward and highly successful tool to design protein stabilising mutations, and it is freely available for academic use at http://webapps.bifi.es/the-protposer.
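The positive predictive value quoted above is the fraction of proposed mutations that turn out to be stabilising. The sketch below shows that arithmetic under stated assumptions: `positive_predictive_value` is a hypothetical helper, not part of Protposer, and stabilisation is taken to be a positive number in kcal/mol, with the 0.5 kcal/mol cutoff borrowed from the abstract.

```python
def positive_predictive_value(proposed, measured_gain, cutoff=0.5):
    """Fraction of proposed mutations whose measured stabilisation reaches the cutoff.

    proposed      -- list of booleans, True where the tool nominated the mutation
    measured_gain -- list of measured stability gains in kcal/mol
                     (assumed sign convention: positive means more stable)
    cutoff        -- minimum gain that counts as a true positive
    """
    if len(proposed) != len(measured_gain):
        raise ValueError("inputs must have the same length")
    nominated = [gain for flag, gain in zip(proposed, measured_gain) if flag]
    if not nominated:
        raise ValueError("no mutations were proposed")
    true_positives = sum(1 for gain in nominated if gain >= cutoff)
    return true_positives / len(nominated)
```

Mutations that were not nominated are ignored, which is what separates PPV from overall accuracy: only the quality of the proposals is measured.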
|
45
|
Sekar PC, Srinivasan E, Chandrasekhar G, Paul DM, Sanjay G, Surya S, Kumar NSAR, Rajasekaran R. Probing the competitive inhibitor efficacy of frog-skin alpha helical AMPs identified against ACE2 binding to SARS-CoV-2 S1 spike protein as therapeutic scaffold to prevent COVID-19. J Mol Model 2022; 28:128. [PMID: 35461388 PMCID: PMC9034900 DOI: 10.1007/s00894-022-05117-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/18/2021] [Accepted: 04/06/2022] [Indexed: 12/19/2022]
Abstract
In COVID-19 infection, the SARS-CoV-2 spike protein S1 interacts with the ACE2 receptor of the human host, instigating the viral infection. To examine the competitive inhibitor efficacy of broad-spectrum alpha helical AMPs extracted from frog skin, a comparative study of intermolecular interactions between viral S1 and AMPs was performed relative to S1-ACE2p interactions. The ACE2 region that binds S1 was extracted from the complex as ACE2p for ease of computation. Surprisingly, the Spike-Dermaseptin-S9 complex had more intermolecular interactions than the other peptide complexes and, importantly, than the S1-ACE2p complex. Through conformational sampling analysis, we observed how atomic displacements in docked complexes impacted the structural integrity of the receptor-binding domain in S1. Notably, this geometry-based sampling approach confirms the robust interactions that endure in the S1-Dermaseptin-S9 complex, demonstrating its conformational transition. Additionally, QM calculations revealed that the global hardness to resist chemical perturbations was higher in Dermaseptin-S9 than in ACE2p. Moreover, the conventional MD through PCA and the torsional angle analyses indicated that Dermaseptin-S9 altered the conformations of S1 considerably. Our analysis further revealed the high structural stability of the S1-Dermaseptin-S9 complex; in particular, the trajectory analysis of the secondary structural elements established that the alpha helical conformations are retained in the S1-Dermaseptin-S9 complex, as substantiated by SMD results. In conclusion, the functional dynamics proved to be significant for viral Spike S1 and Dermaseptin-S9 peptide when compared to the ACE2p complex. Hence, the Dermaseptin-S9 peptide inhibitor could be a strong candidate for a therapeutic scaffold to prevent infection by SARS-CoV-2.
Affiliation(s)
- P Chandra Sekar
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- E Srinivasan
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- Department of Bioinformatics, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences (Deemed to Be University), Chennai, Tamil Nadu, India
- G Chandrasekhar
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- D Meshach Paul
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- G Sanjay
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- S Surya
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- N S Arun Raj Kumar
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
- R Rajasekaran
- Quantitative Biology Lab, Department of Biotechnology, School of Bio Sciences and Technology, VIT (Deemed to Be University), Vellore, Tamil Nadu, India
|
46
|
Zhou P, Wen L, Lin J, Mei L, Liu Q, Shang S, Li J, Shu J. Integrated unsupervised-supervised modeling and prediction of protein-peptide affinities at structural level. Brief Bioinform 2022; 23:6555404. [PMID: 35352094 DOI: 10.1093/bib/bbac097] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/04/2022] [Revised: 02/15/2022] [Accepted: 02/23/2022] [Indexed: 12/24/2022] Open
Abstract
Cell signal networks are orchestrated directly or indirectly by various peptide-mediated protein-protein interactions, which are normally weak and transient and thus ideal for biological regulation and medicinal intervention. Here, we develop a general-purpose method for modeling and predicting the binding affinities of protein-peptide interactions (PpIs) at the structural level. The method is a hybrid strategy that employs an unsupervised approach to derive a layered PpI atom-residue interaction (ulPpI[a-r]) potential between different protein atom types and peptide residue types from thousands of solved PpI complex structures and then statistically correlates the potential descriptors with experimental affinities (KD values) over hundreds of known PpI samples in a supervised manner to create an integrated unsupervised-supervised PpI affinity (usPpIA) predictor. Although both the ulPpI[a-r] potential and the usPpIA predictor can be used to calculate PpI affinities from their complex structures, the latter seems to perform much better than the former, suggesting that the unsupervised potential can be improved substantially with a further correction by supervised statistical learning. We examine the robustness and fault tolerance of the usPpIA predictor when applied to coarse-grained PpI complex structures modeled computationally by sophisticated peptide docking and dynamics simulation. It is revealed that, despite being developed solely from solved structures, the integrated unsupervised-supervised method is also applicable to locally docked structures, where it reaches a quantitative prediction, but can only give a qualitative prediction on globally docked structures. The dynamics refinement does not seem to change (or improve) the predictive results substantially, although it is computationally expensive and time-consuming relative to peptide docking. We also extrapolate the usPpIA predictor to the indirect affinity quantities of HLA-A*0201 binding epitope peptides and NHERF PDZ binding scaffold peptides, resulting in a good and a moderate correlation of the predicted KD with experimental IC50 and BLU on the two peptide sets, with Pearson's correlation coefficients Rp = 0.635 and 0.406, respectively.
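The Rp values reported above are Pearson correlation coefficients between predicted and experimental quantities. A minimal sketch of that statistic follows; `pearson_r` is a hypothetical helper written for illustration and has no connection to the usPpIA code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the 1/n factors cancel in the ratio below.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(variance_x * variance_y)
```

A value near 1 or -1 indicates a strong linear relationship, while values such as 0.635 and 0.406 correspond to the "good" and "moderate" agreement described in the abstract.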
Affiliation(s)
- Peng Zhou
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
- Li Wen
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
- Jing Lin
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
- Li Mei
- Institute of Culinary, Sichuan Tourism University, Chengdu 610100, China
- Qian Liu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
- Shuyong Shang
- of Ecological Environment Protection, Chengdu Normal University, Chengdu 611130, China
- Juelin Li
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
- Jianping Shu
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China
|
47
|
Holland J, Grigoryan G. Structure‐conditioned amino‐acid couplings: how contact geometry affects pairwise sequence preferences. Protein Sci 2022; 31:900-917. [PMID: 35060221 PMCID: PMC8927866 DOI: 10.1002/pro.4280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2021] [Revised: 01/06/2022] [Accepted: 01/12/2022] [Indexed: 11/11/2022]
Abstract
Relating a protein's sequence to its conformation is a central challenge for both structure prediction and sequence design. Statistical contact potentials, as well as their more descriptive versions that account for side‐chain orientation and other geometric descriptors, have served as simplistic but useful means of representing second‐order contributions in sequence–structure relationships. Here we ask what happens when a pairwise potential is conditioned on the fully defined geometry of interacting backbone fragments. We show that the resulting structure‐conditioned coupling energies more accurately reflect pair preferences as a function of structural context. These structure‐conditioned energies more reliably encode native sequence information and correlate more highly with experimentally determined coupling energies. Clustering a database of interaction motifs by structure results in ensembles of similar energies, and clustering them by energy results in ensembles of similar structures. By comparing many pairs of interaction motifs and showing that structural similarity and energetic similarity go hand‐in‐hand, we provide a tangible link between modular sequence and structure elements. This link is applicable to structural modeling, and we show that scoring CASP models with structure‐conditioned energies results in substantially higher correlation with structural quality than scoring the same models with a contact potential. We conclude that structure‐conditioned coupling energies are a good way to model the impact of interaction geometry on second‐order sequence preferences.
Affiliation(s)
- Jack Holland
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
- Gevorg Grigoryan
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
|
48
|
Xu G, Wang Q, Ma J. OPUS-Rota4: a gradient-based protein side-chain modeling framework assisted by deep learning-based predictors. Brief Bioinform 2022; 23:bbab529. [PMID: 34905769 PMCID: PMC8769891 DOI: 10.1093/bib/bbab529] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/12/2021] [Revised: 10/11/2021] [Accepted: 11/15/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate protein side-chain modeling is crucial for protein folding and protein design. In the past decades, many successful methods have been proposed to address this issue. However, most of them depend on discrete samples from a rotamer library, which may limit their accuracy and range of use. In this study, we report an open-source toolkit for protein side-chain modeling, named OPUS-Rota4. It consists of three modules: OPUS-RotaNN2, which predicts protein side-chain dihedral angles; OPUS-RotaCM, which measures the distance and orientation information between the side chains of different residue pairs; and OPUS-Fold2, which applies the constraints derived from the first two modules to guide side-chain modeling. OPUS-Rota4 adopts the dihedral angles predicted by OPUS-RotaNN2 as its initial states, and uses OPUS-Fold2 to refine the side-chain conformation with the side-chain contact map constraints derived from OPUS-RotaCM. Therefore, we convert the side-chain modeling problem into a side-chain contact map prediction problem. OPUS-Fold2 is written in Python and TensorFlow 2.4, which makes it easy to include other differentiable energy terms. OPUS-Rota4 also provides a platform in which the side-chain conformation can be dynamically adjusted under the influence of other processes. We apply OPUS-Rota4 to 15 FM predictions submitted by AlphaFold2 in CASP14; the results show that the side chains modeled by OPUS-Rota4 are closer to their native counterparts than those predicted by AlphaFold2 (e.g. the residue-wise RMSD for all residues and core residues are 0.588 and 0.472 for AlphaFold2, and 0.535 and 0.407 for OPUS-Rota4).
Affiliation(s)
- Gang Xu
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
- Shanghai AI Laboratory, Shanghai, 200030, China
- Qinghua Wang
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, United States
- Jianpeng Ma
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
- Shanghai AI Laboratory, Shanghai, 200030, China
|
49
|
rsRNASP: A residue-separation-based statistical potential for RNA 3D structure evaluation. Biophys J 2022; 121:142-156. [PMID: 34798137 PMCID: PMC8758408 DOI: 10.1016/j.bpj.2021.11.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/20/2021] [Revised: 10/23/2021] [Accepted: 11/10/2021] [Indexed: 01/07/2023] Open
Abstract
Knowledge-based statistical potentials have been shown to be rather effective in protein 3-dimensional (3D) structure evaluation and prediction. Recently, several statistical potentials have been developed for RNA 3D structure evaluation, but their performance is either still low on test datasets from structure prediction models or dependent on a "black-box" process through neural networks. In this work, we have developed an all-atom distance-dependent statistical potential based on residue separation for RNA 3D structure evaluation, named rsRNASP, which is composed of short- and long-range potentials distinguished by residue separation. Extensive examinations against available RNA test datasets show that rsRNASP performs clearly better than the existing statistical potentials on realistic test datasets with large RNAs from structure prediction models, including the newly released RNA-Puzzles dataset, and is comparable to the existing top statistical potentials on test datasets with small RNAs or near-native decoys. In addition, rsRNASP is superior to RNA3DCNN, a recently developed scoring function based on 3D convolutional neural networks. rsRNASP and the relevant databases are available to the public.
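The idea of a distance-dependent statistical potential split by residue separation can be sketched as follows. This is a generic inverse-Boltzmann construction, not the rsRNASP implementation: the reference state (all pair types pooled within each range class), the separation cutoff, the bin layout, and the pseudocount are all assumptions chosen for illustration, and `build_potential` is a hypothetical name.

```python
import math
from collections import defaultdict


def build_potential(observations, n_bins, bin_width, sep_cutoff, kT=1.0, pseudo=1e-6):
    """Build short- and long-range distance potentials by inverse Boltzmann.

    observations: iterable of (pair_type, residue_separation, distance) taken
    from known structures. Pairs with separation <= sep_cutoff (an assumed,
    illustrative threshold) count as "short", the rest as "long".
    Returns {(range_class, pair_type): [energy for each distance bin]}.
    """
    counts = defaultdict(lambda: [0.0] * n_bins)  # per range class and pair type
    ref = defaultdict(lambda: [0.0] * n_bins)     # per range class, all pair types
    for pair_type, separation, distance in observations:
        if distance < 0:
            continue
        b = int(distance // bin_width)
        if b >= n_bins:
            continue  # beyond the distance cutoff
        cls = "short" if separation <= sep_cutoff else "long"
        counts[(cls, pair_type)][b] += 1
        ref[cls][b] += 1

    potential = {}
    for (cls, pair_type), hist in counts.items():
        total, ref_total = sum(hist), sum(ref[cls])
        energies = []
        for b in range(n_bins):
            p_obs = hist[b] / total + pseudo        # observed probability
            p_ref = ref[cls][b] / ref_total + pseudo  # reference-state probability
            energies.append(-kT * math.log(p_obs / p_ref))
        potential[(cls, pair_type)] = energies
    return potential
```

A candidate structure would then be scored by summing the bin energies of all its atom pairs, with lower totals indicating more native-like geometry.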
|
50
|
Anishchenko I, Baek M, Park H, Hiranuma N, Kim DE, Dauparas J, Mansoor S, Humphreys IR, Baker D. Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14. Proteins 2021; 89:1722-1733. [PMID: 34331359 PMCID: PMC8616808 DOI: 10.1002/prot.26194] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/13/2021] [Revised: 07/23/2021] [Accepted: 07/25/2021] [Indexed: 12/29/2022]
Abstract
The trRosetta structure prediction method employs deep learning to generate predicted residue-residue distance and orientation distributions from which 3D models are built. We sought to improve the method by incorporating as inputs (in addition to sequence information) both language model embeddings and template information weighted by sequence similarity to the target. We also developed a refinement pipeline that recombines models generated by template-free and template-utilizing versions of trRosetta, guided by the DeepAccNet accuracy predictor. Both benchmark tests and CASP results show that the new pipeline is a considerable improvement over the original trRosetta, and it is faster and requires fewer computing resources, completing the entire modeling process in a median of < 3 h in CASP14. Our human group improved results with this pipeline primarily by identifying additional homologous sequences for input into the network. We also used the DeepAccNet accuracy predictor to guide Rosetta high-resolution refinement for submissions in the regular and refinement categories; although performance was quite good on a CASP relative scale, the overall improvements were rather modest, in part due to missing inter-domain or inter-chain contacts.
Affiliation(s)
- Ivan Anishchenko
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Minkyung Baek
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Hahnbeom Park
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Naozumi Hiranuma
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington, USA
- David E. Kim
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA
- Justas Dauparas
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Sanaa Mansoor
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Ian R. Humphreys
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- David Baker
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA
|