51
Zheng W, Wuyun Q, Li Y, Zhang C, Freddolino PL, Zhang Y. Improving deep learning protein monomer and complex structure prediction using DeepMSA2 with huge metagenomics data. Nat Methods 2024; 21:279-289. [PMID: 38167654 PMCID: PMC10864179 DOI: 10.1038/s41592-023-02130-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2023] [Accepted: 11/13/2023] [Indexed: 01/05/2024]
Abstract
Leveraging iterative alignment search through genomic and metagenome sequence databases, we report the DeepMSA2 pipeline for uniform protein single- and multichain multiple-sequence alignment (MSA) construction. Large-scale benchmarks show that DeepMSA2 MSAs can remarkably increase the accuracy of protein tertiary and quaternary structure predictions compared with current state-of-the-art methods. An integrated pipeline with DeepMSA2 participated in the most recent CASP15 experiment and created complex structural models with considerably higher quality than the AlphaFold2-Multimer server (v.2.2.0). Detailed data analyses show that the major advantage of DeepMSA2 lies in its balanced alignment search and effective model selection, and in the power of integrating huge metagenomics databases. These results demonstrate a new avenue to improve deep learning protein structure prediction through advanced MSA construction and provide additional evidence that optimization of input information to deep learning-based structure prediction methods must be considered with as much care as the design of the predictor itself.
Affiliation(s)
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Qiqige Wuyun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
- Yang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- P Lydia Freddolino
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA
  - Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
52
Giancotti R, Lomoio U, Puccio B, Tradigo G, Vizza P, Torti C, Veltri P, Guzzi PH. The Omicron XBB.1 Variant and Its Descendants: Genomic Mutations, Rapid Dissemination and Notable Characteristics. BIOLOGY 2024; 13:90. [PMID: 38392308 PMCID: PMC10886209 DOI: 10.3390/biology13020090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 02/24/2024]
Abstract
The SARS-CoV-2 virus, which is a major threat to human health, has undergone many mutations during the replication process due to errors in the replication steps and modifications in the structure of viral proteins. The XBB variant was identified for the first time in Singapore in the fall of 2022. It was then detected in other countries, including the United States, Canada, and the United Kingdom. We study the impact of sequence changes on spike protein structure in the subvariants of XBB, with particular attention to the velocity of variant diffusion and virus activity with respect to its diffusion. We examine the structural and functional distinctions of the variants in three different conformations: (i) spike glycoprotein in complex with ACE2 (1-up state), (ii) spike glycoprotein (closed-1 state), and (iii) S protein (open-1 state). We also estimate the binding affinity between the spike protein and ACE2. The marked binding affinity observed in specific variants raises questions about the efficacy of current vaccines in preparing the immune system for virus variant recognition. This work may be useful in devising strategies to manage the ongoing COVID-19 pandemic. To stay ahead of the virus evolution, further research and surveillance should be carried out to adjust public health measures accordingly.
Affiliation(s)
- Raffaele Giancotti
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
- Ugo Lomoio
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
- Barbara Puccio
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
- Patrizia Vizza
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
- Carlo Torti
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
- Pierangelo Veltri
  - Department of Computer Engineering, Modelling, Electronics and System, University of Calabria, 87036 Rende, Italy
- Pietro Hiram Guzzi
  - Department of Surgical and Medical Sciences, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy
53
Satalkar V, Degaga GD, Li W, Pang YT, McShan AC, Gumbart JC, Mitchell JC, Torres MP. Generative β-hairpin design using a residue-based physicochemical property landscape. Biophys J 2024:S0006-3495(24)00070-5. [PMID: 38297834 DOI: 10.1016/j.bpj.2024.01.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/20/2023] [Accepted: 01/25/2024] [Indexed: 02/02/2024] Open
Abstract
De novo peptide design is a new frontier that has broad application potential in the biological and biomedical fields. Most existing models for de novo peptide design are largely based on sequence homology that can be restricted based on evolutionarily derived protein sequences and lack the physicochemical context essential in protein folding. Generative machine learning for de novo peptide design is a promising way to synthesize theoretical data that are based on, but unique from, the observable universe. In this study, we created and tested a custom peptide generative adversarial network intended to design peptide sequences that can fold into the β-hairpin secondary structure. This deep neural network model is designed to establish a preliminary foundation of the generative approach based on physicochemical and conformational properties of 20 canonical amino acids, for example, hydrophobicity and residue volume, using extant structure-specific sequence data from the PDB. The beta generative adversarial network model robustly distinguishes secondary structures of β hairpin from α helix and intrinsically disordered peptides with an accuracy of up to 96% and generates artificial β-hairpin peptide sequences with minimum sequence identities around 31% and 50% when compared against the current NCBI PDB and nonredundant databases, respectively. These results highlight the potential of generative models specifically anchored by physicochemical and conformational property features of amino acids to expand the sequence-to-structure landscape of proteins beyond evolutionary limits.
Affiliation(s)
- Vardhan Satalkar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Gemechis D Degaga
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Wei Li
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
- Yui Tik Pang
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia
- Andrew C McShan
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- James C Gumbart
  - School of Physics, Georgia Institute of Technology, Atlanta, Georgia
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
- Julie C Mitchell
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee
- Matthew P Torres
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia
54
Liu Y, Liu H. Protein sequence design on given backbones with deep learning. Protein Eng Des Sel 2024; 37:gzad024. [PMID: 38157313 DOI: 10.1093/protein/gzad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 12/08/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024] Open
Abstract
Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
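The two evaluation metrics named in this abstract can be illustrated with a minimal Python sketch. It uses the common definitions (recovery as the fraction of positions matching the native residue, perplexity as the exponential of the mean negative log-probability assigned to the native residues); the function names, the example sequences, and the probability values are hypothetical and are not taken from the paper.

```python
import math


def native_sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue equals the native one."""
    if len(designed) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def native_sequence_perplexity(native_probs: list[float]) -> float:
    """exp(mean negative log-probability) over the native residues.

    native_probs[i] is the probability a model assigns to the native
    amino acid at position i (hypothetical input for illustration).
    """
    if not native_probs:
        raise ValueError("need at least one probability")
    mean_nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(mean_nll)


# Toy inputs, invented for illustration only.
recovery = native_sequence_recovery("MKTAYIAK", "MKTVYIAR")
perplexity = native_sequence_perplexity([0.5, 0.25, 0.5, 0.25])
print(f"recovery={recovery:.2f} perplexity={perplexity:.3f}")
```

A lower perplexity means the model concentrates more probability on the native residues; a uniform guess over the 20 canonical amino acids would give a perplexity of 20.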
Affiliation(s)
- Yufeng Liu
  - MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Liu
  - MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
  - Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
  - School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215004, China
55
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. Nat Commun 2024; 15:810. [PMID: 38280868 PMCID: PMC10821953 DOI: 10.1038/s41467-024-45028-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 01/09/2024] [Indexed: 01/29/2024] Open
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with their gene ages. Surprisingly, we find little overall protein structural changes in candidates from the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates are often born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
56
Ko S, Kim J, Lim J, Lee SM, Park JY, Woo J, Scott-Nevros ZK, Kim JR, Yoon H, Kim D. Blanket antimicrobial resistance gene database with structural information, BOARDS, provides insights on historical landscape of resistance prevalence and effects of mutations in enzyme structure. mSystems 2024; 9:e0094323. [PMID: 38085058 PMCID: PMC10871167 DOI: 10.1128/msystems.00943-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 11/02/2023] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance (AMR) in pathogenic bacteria poses a significant threat to public health, yet tools for understanding AMR genes in depth from genetic or structural information still need development. In this study, we present an interactive web database named Blanket Overarching Antimicrobial-Resistance gene Database with Structural information (BOARDS, sbml.unist.ac.kr), which comprehensively covers 3,943 reported AMR genes (1,997 extended-spectrum beta-lactamase (ESBL) genes and 1,946 other genes) as well as a total of 27,395 predicted protein structures. These structures, which include both wild-type AMR genes and their mutants, were derived from 80,094 publicly available whole-genome sequences. In addition, we developed the rapid analysis and detection tool of antimicrobial-resistance (RADAR), a one-stop analysis pipeline to detect AMR genes across whole-genome sequences (WGSs). By integrating BOARDS and RADAR, the AMR prevalence landscape for eight multi-drug resistant pathogens was reconstructed, leading to unexpected findings such as the pre-existence of the MCR genes before their official reports. Enzymatic structure prediction-based analysis revealed that the occurrence of mutations in some ESBL genes was closely related to the binding affinities with their antibiotic substrates. Overall, BOARDS can play a significant role in performing in-depth analysis on AMR.
IMPORTANCE: While increasing antimicrobial resistance (AMR) in pathogens has been a burden on public health, effective tools for deep understanding of AMR based on genetic or structural information remain limited. In this study, a blanket overarching antimicrobial-resistance gene database with structure information (BOARDS), a web-based database that comprehensively collects AMR gene data with predicted protein structural information, was constructed. Additionally, we report the development of a RADAR pipeline that can analyze whole-genome sequences as well. BOARDS, which includes sequence and structural information, has shown the historical landscape and prevalence of the AMR genes and can provide insight into single-nucleotide polymorphism effects on antibiotic degrading enzymes within protein structures.
Affiliation(s)
- Seyoung Ko
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
  - School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaehyung Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jaewon Lim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Sang-Mok Lee
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Joon Young Park
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jihoon Woo
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Zoe K. Scott-Nevros
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
- Jong R. Kim
  - School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan
- Hyunjin Yoon
  - Department of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Donghyuk Kim
  - School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
  - School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea
57
Desai A, Mahajan V, Ramabhadran RO, Mukherjee R. Binding order of substrate and cofactor in sulfonamide monooxygenase during sulfa drug degradation: in silico studies. J Biomol Struct Dyn 2024:1-15. [PMID: 38263732 DOI: 10.1080/07391102.2024.2306495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 01/10/2024] [Indexed: 01/25/2024]
Abstract
For decades, sulfonamide antibiotics have been used across industries such as agriculture and animal husbandry. However, the use and inadvertent misuse of these antibiotics have resulted in the advent of sulfonamide-drug-resistant strains due to antibiotic pollution. Enzymatic bioremediation of antibiotics remains a potential emerging solution to combat antibiotic pollution. Here, we propose an enzymatic model for the degradation of sulfonamides by Microbacterium sp. We have employed a multi-pronged computational strategy involving protein structure modelling, ligand docking, and molecular dynamics simulations to decipher a plausible binding order for the enzymatic degradation of sulfonamides by the bacterial sulfonamide monooxygenase, SulX. Our results enable us to predict that this degradation is achieved through the sequential binding of the antibiotic sulfonamide followed by the reduced flavin cofactor FMNH2, thereby laying the computational foundation for further advancements in enzyme-mediated degradation of the antibiotic. We also provide a list of experiments which may be performed to verify and follow up on our in silico studies. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Amogh Desai
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati, India
- Ved Mahajan
  - Department of Chemistry, Indian Institute of Science Education and Research Tirupati, Tirupati, India
- Raghunath O Ramabhadran
  - Department of Chemistry, Indian Institute of Science Education and Research Tirupati, Tirupati, India
- Raju Mukherjee
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati, India
58
Zhang Z, Cai Y, Zhang B, Zheng W, Freddolino L, Zhang G, Zhou X. DEMO-EM2: assembling protein complex structures from cryo-EM maps through intertwined chain and domain fitting. Brief Bioinform 2024; 25:bbae113. [PMID: 38517699 PMCID: PMC10959074 DOI: 10.1093/bib/bbae113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 02/10/2024] [Accepted: 02/25/2024] [Indexed: 03/24/2024] Open
Abstract
The breakthrough in cryo-electron microscopy (cryo-EM) technology has led to an increasing number of density maps of biological macromolecules. However, constructing accurate protein complex atomic structures from cryo-EM maps remains a challenge. In this study, we extend our previously developed DEMO-EM to present DEMO-EM2, an automated method for constructing protein complex models from cryo-EM maps through an iterative assembly procedure intertwining chain- and domain-level matching and fitting for predicted chain models. The method was carefully evaluated on 27 cryo-electron tomography (cryo-ET) maps and 16 single-particle EM maps, where DEMO-EM2 models achieved an average TM-score of 0.92, outperforming those of state-of-the-art methods. The results demonstrate an efficient method that enables the rapid and reliable solution of challenging cryo-EM structure modeling problems.
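For readers unfamiliar with the TM-score reported above, the following Python sketch evaluates the standard TM-score sum for one fixed superposition. It is a simplification under stated assumptions: the full TM-score takes the maximum over superpositions, which is omitted here; the per-residue distances are assumed to be given; the function name is hypothetical; and the 0.5 Å floor on the distance scale is an assumption made for very short chains. It is not the evaluation code used by the DEMO-EM2 authors.

```python
def tm_score_fixed_superposition(distances: list[float], target_length: int) -> float:
    """TM-score sum for one given superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the reference structure.
    """
    if target_length <= 15:
        raise ValueError("length-dependent scale d0 is undefined for L <= 15")
    if len(distances) > target_length:
        raise ValueError("more aligned pairs than target residues")
    # Standard length-dependent distance scale d0(L) = 1.24 * (L - 15)^(1/3) - 1.8.
    # Clamping to 0.5 is an assumption for short chains.
    d0 = max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    # Unaligned target residues contribute zero, so normalize by target length.
    return total / target_length


# Toy case: every residue of a 100-residue target aligned at zero distance.
perfect = tm_score_fixed_superposition([0.0] * 100, 100)
print(perfect)
```

Because the sum is divided by the reference length rather than the number of aligned pairs, partial models are penalized for missing residues, which is why the score is bounded between 0 and 1.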
Affiliation(s)
- Ziying Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yaxian Cai
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Biao Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Lydia Freddolino
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Guijun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiaogen Zhou
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
59
Bernard C, Postic G, Ghannay S, Tahi F. RNAdvisor: a comprehensive benchmarking tool for the measure and prediction of RNA structural model quality. Brief Bioinform 2024; 25:bbae064. [PMID: 38436560 PMCID: PMC10939302 DOI: 10.1093/bib/bbae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 01/30/2024] [Accepted: 02/02/2024] [Indexed: 03/05/2024] Open
Abstract
RNA is a complex macromolecule that plays central roles in the cell. While it is well known that its structure is directly related to its functions, understanding and predicting RNA structures is challenging. Assessing the real or predictive quality of a structure is also at stake with the complex 3D possible conformations of RNAs. Metrics have been developed to measure model quality while scoring functions aim at assigning quality to guide the discrimination of structures without a known and solved reference. Throughout the years, many metrics and scoring functions have been developed, and no unique assessment is used nowadays. Each developed assessment method has its specificity and might be complementary to understanding structure quality. Therefore, to evaluate RNA 3D structure predictions, it would be important to calculate different metrics and/or scoring functions. For this purpose, we developed RNAdvisor, a comprehensive automated software that integrates and enhances the accessibility of existing metrics and scoring functions. In this paper, we present our RNAdvisor tool, as well as state-of-the-art existing metrics, scoring functions and a set of benchmarks we conducted for evaluating them. Source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr.
Affiliation(s)
- Clement Bernard
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
- Guillaume Postic
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
- Sahar Ghannay
  - LISN - CNRS/Université Paris-Saclay, France, 91400 Orsay, France
- Fariza Tahi
  - Université Paris Saclay, Univ Evry, IBISC, 91020 Evry-Courcouronnes, France
60
Thayyil Menambath D, Adiga U, Rai T, Adiga S, Shetty V. Identification of the SIRT1 gene's most harmful non-synonymous SNPs and their effects on functional and structural features-an in silico analysis. F1000Res 2024; 12:66. [PMID: 38283900 PMCID: PMC10822041 DOI: 10.12688/f1000research.128706.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/16/2024] [Indexed: 01/30/2024] Open
Abstract
Introduction: The sirtuin (Silent mating type information regulation 2 homolog) 1 (SIRT1) protein plays a vital role in many disorders such as diabetes, cancer, obesity, inflammation, and neurodegenerative and cardiovascular diseases. The objective of this in silico analysis of SIRT1's functional single nucleotide polymorphisms (SNPs) was to gain valuable insight into the harmful effects of non-synonymous SNPs (nsSNPs) on the protein. The objective of the study was to use bioinformatics methods to investigate the genetic variations and modifications that may have an impact on the SIRT1 gene's expression and function.
Methods: nsSNPs of the SIRT1 protein were collected from the dbSNP site, from its three different protein accession IDs. These were then fed to various bioinformatic tools such as SIFT, Provean, and I-Mutant to find the most deleterious ones. Functional and structural effects were examined using the HOPE server and I-Tasser. Gene interactions were predicted by STRING software. The SIFT, Provean, and I-Mutant tools detected the three most deleterious nsSNPs (rs769519031, rs778184510, and rs199983221).
Results: Out of 252 nsSNPs, SIFT analysis showed that 94 were deleterious, Provean listed 67 as deleterious, and I-Mutant found 58 nsSNPs resulting in lowered protein stability. HOPE modelling of rs199983221 and rs769519031 suggested reduced hydrophobicity due to Ile4Thr and Ile223Ser, resulting in decreased hydrophobic interactions. In contrast, on modelling rs778184510, the mutant protein had a higher hydrophobicity than the wild type.
Conclusions: Our study reports that three nsSNPs (D357A, I223S, I4T) are the most damaging mutations of the SIRT1 gene. Mutations may result in altered protein structure and functions. Such altered protein may be the basis for various disorders. Our findings may be a crucial guide in establishing the pathogenesis of various disorders.
Affiliation(s)
- Usha Adiga
  - Biochemistry, KS Hegde Medical Academy, NITTE (DU), Mangalore, Karnataka, 575018, India
- Tirthal Rai
  - Biochemistry, KS Hegde Medical Academy, NITTE (DU), Mangalore, Karnataka, 575018, India
- Sachidananda Adiga
  - Pharmacology, KS Hegde Medical Academy, NITTE (DU), Mangalore, Karnataka, 575018, India
- Vijith Shetty
  - Oncology, KS Hegde Medical Academy, NITTE (DU), Mangalore, Karnataka, 575018, India
61
Li J, Wang L, Zhu Z, Song C. Exploring the Alternative Conformation of a Known Protein Structure Based on Contact Map Prediction. J Chem Inf Model 2024; 64:301-315. [PMID: 38117138 PMCID: PMC10777399 DOI: 10.1021/acs.jcim.3c01381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
The rapid development of deep learning-based methods has considerably advanced the field of protein structure prediction. The accuracy of predicting the 3D structures of simple proteins is comparable to that of experimentally determined structures, providing broad possibilities for structure-based biological studies. Another critical question is whether and how multistate structures can be predicted from a given protein sequence. In this study, analysis of tens of two-state proteins demonstrated that deep learning-based contact map predictions contain structural information on both states, which suggests that it is probably appropriate to change the target of deep learning-based protein structure prediction from one specific structure to multiple likely structures. Furthermore, by combining deep learning- and physics-based computational methods, we developed a protocol for exploring alternative conformations from a known structure of a given protein, by which we successfully approached the holo-state conformations of multiple representative proteins from their apo-state structures.
Affiliation(s)
- Jiaxuan Li
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Lei Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Zefeng Zhu
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Chen Song
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
62
Zhang C, Zhang X, Freddolino P, Zhang Y. BioLiP2: an updated structure database for biologically relevant ligand-protein interactions. Nucleic Acids Res 2024; 52:D404-D412. [PMID: 37522378 PMCID: PMC10767969 DOI: 10.1093/nar/gkad630] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/01/2023] [Revised: 07/03/2023] [Accepted: 07/17/2023] [Indexed: 08/01/2023] Open
Abstract
With the progress of structural biology, the Protein Data Bank (PDB) has witnessed rapid accumulation of experimentally solved protein structures. Since many structures are determined with purification and crystallization additives that are unrelated to a protein's in vivo function, it is nontrivial to identify the subset of protein-ligand interactions that are biologically relevant. We developed the BioLiP2 database (https://zhanggroup.org/BioLiP) to extract biologically relevant protein-ligand interactions from the PDB database. BioLiP2 assesses the functional relevance of the ligands by geometric rules and experimental literature validations. The ligand binding information is further enriched with other function annotations, including Enzyme Commission numbers, Gene Ontology terms, catalytic sites, and binding affinities collected from other databases and a manual literature survey. Compared to its predecessor BioLiP, BioLiP2 offers significantly greater coverage of nucleic acid-protein interactions, and interactions involving large complexes that are unavailable in PDB format. BioLiP2 also integrates cutting-edge structural alignment algorithms with state-of-the-art structure prediction techniques, which for the first time enables composite protein structure and sequence-based searching and significantly enhances the usefulness of the database in structure-based function annotations. With these new developments, BioLiP2 will continue to be an important and comprehensive database for docking, virtual screening, and structure-based protein function analyses.
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xi Zhang
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| | - Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Computer Science, School of Computing, National University of Singapore, 117417, Singapore
- Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 117596, Singapore
| |
|
63
|
Ohno S, Manabe N, Yamaguchi Y. Prediction of protein structure and AI. J Hum Genet 2024:10.1038/s10038-023-01215-4. [PMID: 38177398 DOI: 10.1038/s10038-023-01215-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 12/10/2023] [Indexed: 01/06/2024]
Abstract
AlphaFold, an artificial intelligence (AI)-based tool for predicting the 3D structure of proteins, is now widely recognized for its high accuracy and versatility in the folding of human proteins. AlphaFold is useful for understanding structure-function relationships from protein 3D structure models and can serve as a template or a reference for experimental structural analysis including X-ray crystallography, NMR and cryo-EM analysis. Its use is expanding among researchers, not only in structural biology but also in other research fields. Researchers are currently exploring the full potential of AlphaFold-generated protein models. Predicting disease severity caused by missense mutations is one such application. This article provides an overview of the 3D structural modeling of AlphaFold based on deep learning techniques and highlights the challenges in predicting the pathogenicity of missense mutations.
Affiliation(s)
- Shiho Ohno
- Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
| | - Noriyoshi Manabe
- Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan
| | - Yoshiki Yamaguchi
- Division of Structural Glycobiology, Institute of Molecular Biomembrane and Glycobiology, Tohoku Medical and Pharmaceutical University, 4-4-1 Komatsushima, Aoba-ku, Sendai, Miyagi, 981-8558, Japan.
| |
|
64
|
Roy BG, Choi J, Fuchs MF. Predictive Modeling of Proteins Encoded by a Plant Virus Sheds a New Light on Their Structure and Inherent Multifunctionality. Biomolecules 2024; 14:62. [PMID: 38254661 PMCID: PMC10813169 DOI: 10.3390/biom14010062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 12/29/2023] [Accepted: 12/30/2023] [Indexed: 01/24/2024] Open
Abstract
Plant virus genomes encode proteins that are involved in replication, encapsidation, cell-to-cell, and long-distance movement, avoidance of host detection, counter-defense, and transmission from host to host, among other functions. Even though the multifunctionality of plant viral proteins is well documented, contemporary functional repertoires of individual proteins are incomplete. However, these can be enhanced by modeling tools. Here, predictive modeling of proteins encoded by the two genomic RNAs, i.e., RNA1 and RNA2, of grapevine fanleaf virus (GFLV) and their satellite RNAs by a suite of protein prediction software confirmed not only previously validated functions (suppressor of RNA silencing [VSR], viral genome-linked protein [VPg], protease [Pro], symptom determinant [Sd], homing protein [HP], movement protein [MP], coat protein [CP], and transmission determinant [Td]) and previously identified putative functions (helicase [Hel] and RNA-dependent RNA polymerase [Pol]), but also predicted novel functions with varying levels of confidence. These include a T3/T7-like RNA polymerase domain for protein 1AVSR, a short-chain reductase for protein 1BHel/VSR, a parathyroid hormone family domain for protein 1EPol/Sd, overlapping domains of unknown function and an ABC transporter domain for protein 2BMP, and DNA topoisomerase domains, transcription factor FBXO25 domain, or DNA Pol subunit cdc27 domain for the satellite RNA protein. Structural predictions for proteins 2AHP/Sd, 2BMP, and 3A? had low confidence, while predictions for proteins 1AVSR, 1BHel*/VSR, 1CVPg, 1DPro, 1EPol*/Sd, and 2CCP/Td retained higher confidence in at least one prediction. This research provided new insights into the structure and functions of GFLV proteins and their satellite protein. Future work is needed to validate these findings.
Affiliation(s)
- Brandon G. Roy
- Plant Pathology and Plant-Microbe Biology Section, School of Integrative Plant Science, Cornell University, 15 Castle Creek Drive, Geneva, NY 14456, USA; (J.C.); (M.F.F.)
| | | | | |
|
65
|
Pantolini L, Studer G, Pereira J, Durairaj J, Tauriello G, Schwede T. Embedding-based alignment: combining protein language models with dynamic programming alignment to detect structural similarities in the twilight-zone. Bioinformatics 2024; 40:btad786. [PMID: 38175775 PMCID: PMC10792726 DOI: 10.1093/bioinformatics/btad786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 10/27/2023] [Accepted: 12/29/2023] [Indexed: 01/06/2024] Open
Abstract
MOTIVATION Language models are routinely used for text classification and generative tasks. Recently, the same architectures were applied to protein sequences, unlocking powerful new approaches in the bioinformatics field. Protein language models (pLMs) generate high-dimensional embeddings on a per-residue level and encode a "semantic meaning" of each individual amino acid in the context of the full protein sequence. These representations have been used as a starting point for downstream learning tasks and, more recently, for identifying distant homologous relationships between proteins. RESULTS In this work, we introduce a new method that generates embedding-based protein sequence alignments (EBA) and show how these capture structural similarities even in the twilight zone, outperforming both classical methods as well as other approaches based on pLMs. The method shows excellent accuracy despite the absence of training and parameter optimization. We demonstrate that the combination of pLMs with alignment methods is a valuable approach for the detection of relationships between proteins in the twilight-zone. AVAILABILITY AND IMPLEMENTATION The code to run EBA and reproduce the analysis described in this article is available at: https://git.scicore.unibas.ch/schwede/EBA and https://git.scicore.unibas.ch/schwede/eba_benchmark.
Affiliation(s)
- Lorenzo Pantolini
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Gabriel Studer
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Joana Pereira
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Janani Durairaj
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel 4056, Switzerland
| |
|
66
|
Sudarev VV, Gette MS, Bazhenov SV, Tilinova OM, Zinovev EV, Manukhov IV, Kuklin AI, Ryzhykau YL, Vlasov AV. Ferritin-based fusion protein shows octameric deadlock state of self-assembly. Biochem Biophys Res Commun 2024; 690:149276. [PMID: 38007906 DOI: 10.1016/j.bbrc.2023.149276] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2023] [Accepted: 11/15/2023] [Indexed: 11/28/2023]
Abstract
Ferritin is a universal protein complex responsible for iron perception in almost all living organisms and has applications from fundamental biophysics to drug delivery and structure-based immunogen design. Different platforms based on ferritin share similar technological challenges limiting their development - control of self-assembling processes of ferritin itself as well as ferritin-based chimeric recombinant protein complexes. In our research, we studied self-assembly processes of ferritin-based protein complexes under different expression conditions. We fused a ferritin subunit with a SMT3 protein tag, a homolog of human Small Ubiquitin-like Modifier (SUMO-tag), which was taken to destabilize ferritin 3-fold channel contacts and increase ferritin-SUMO subunits solubility. We first obtained the octameric protein complex of ferritin-SUMO (8xFer-SUMO) and studied its structural organization by small-angle X-ray scattering (SAXS). Obtained SAXS data correspond well with the high-resolution models predicted by AlphaFold and CORAL software of an octameric assembly around the 4-fold channel of ferritin without formation of 3-fold channels. Interestingly, three copies of 8xFer-SUMO do not assemble into 24-meric globules. Thus, we first obtained and structurally characterized ferritin-based self-assembling oligomers in a deadlock state. Deadlock oligomeric states of ferritin extend the known scheme of its self-assembly process, being new potential tools for a number of applications. Finally, our results might open new directions for various biotechnological platforms utilizing ferritin-based tools.
Affiliation(s)
- V V Sudarev
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - M S Gette
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - S V Bazhenov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - O M Tilinova
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - E V Zinovev
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - I V Manukhov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation
| | - A I Kuklin
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation; Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, 141980, Russian Federation
| | - Yu L Ryzhykau
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation; Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, 141980, Russian Federation.
| | - A V Vlasov
- Research Center for Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, 141700, Russian Federation; Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, 141980, Russian Federation.
| |
|
67
|
Saeed A, Alharazi T, Alshaghdali K, Rezgui R, Elnaem I, Alreshidi BAT, Tasleem M, Saeed M. Targeting GluR3 in Depression and Alzheimer's Disease: Novel Compounds and Therapeutic Prospects. J Alzheimers Dis 2024; 97:1299-1312. [PMID: 38277291 DOI: 10.3233/jad-230821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/28/2024]
Abstract
BACKGROUND The present study investigates the interrelated pathophysiology of depression and Alzheimer's disease (AD), with the objective of elucidating common underlying mechanisms. OBJECTIVE Our objective is to identify previously undiscovered biogenic compounds from the NuBBE database that specifically interact with GluR3. This study examines the bidirectional association between depression and AD, specifically focusing on the role of depression as a risk factor in the onset and progression of the disease. METHODS In this study, we utilize pharmacokinetics, homology modeling, and molecular docking-based virtual screening techniques to examine the GluR3 AMPA receptor subunit. RESULTS The compounds, namely ZINC000002558953, ZINC000001228056, ZINC000000187911, ZINC000003954487, and ZINC000002040988, exhibited favorable pharmacokinetic profiles and drug-like characteristics, displaying high binding affinities to the GluR3 binding pocket. CONCLUSIONS These findings suggest that targeting GluR3 could hold promise for the development of therapies for depression and AD. Further validation through in vitro, in vivo, and clinical studies is necessary to explore the potential of these compounds as lead candidates for potent and selective GluR3 inhibitors. The shared molecular mechanisms between depression and AD provide an opportunity for novel treatment approaches that address both conditions simultaneously.
Affiliation(s)
- Amir Saeed
- Department of Medical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
- Department of Medical Microbiology, Faculty of Medical Laboratory Sciences, University of Medical Sciences & Technology, Khartoum, Sudan
| | - Talal Alharazi
- Department of Medical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
| | - Khalid Alshaghdali
- Department of Medical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
| | - Raja Rezgui
- Department of Medical Laboratory Sciences, College of Applied Medical Sciences, University of Hail, Hail, Saudi Arabia
| | - Ibtihag Elnaem
- Department of Oral and Maxillofacial Surgery and Diagnostic Science, College of Dentistry, University of Hail, Hail, Saudi Arabia
| | | | - Munazzah Tasleem
- School of Electronic Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Mohd Saeed
- Department of Biology, College of Science, University of Hail, Hail, Saudi Arabia
| |
|
68
|
Zhou H, Skolnick J. FRAGSITE2: A structure and fragment-based approach for virtual ligand screening. Protein Sci 2024; 33:e4869. [PMID: 38100293 PMCID: PMC10751727 DOI: 10.1002/pro.4869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 12/06/2023] [Accepted: 12/09/2023] [Indexed: 12/17/2023]
Abstract
Protein function annotation and drug discovery often involve finding small molecule binders. In the early stages of drug discovery, virtual ligand screening (VLS) is frequently applied to identify possible hits before experimental testing. While our recent ligand homology modeling (LHM)-machine learning VLS method FRAGSITE outperformed approaches that combined traditional docking to generate protein-ligand poses and deep learning scoring functions to rank ligands, a more robust approach that could identify a more diverse set of binding ligands is needed. Here, we describe FRAGSITE2 that shows significant improvement on protein targets lacking known small molecule binders and no confident LHM identified template ligands when benchmarked on two commonly used VLS datasets: For both the DUD-E set and DEKOIS2.0 set and ligands having a Tanimoto coefficient (TC) < 0.7 to the template ligands, the 1% enrichment factor (EF1%) of FRAGSITE2 is significantly better than those for FINDSITEcomb2.0, an earlier LHM algorithm. For the DUD-E set, FRAGSITE2 also shows better ROC enrichment factor and AUPR (area under the precision-recall curve) than the deep learning DenseFS scoring function. Comparison with the RF-score-VS on the 76 target subset of DEKOIS2.0 and a TC < 0.99 to training DUD-E ligands, FRAGSITE2 has double the EF1%. Its boosted tree regression method provides for more robust performance than a deep learning multiple layer perceptron method. When compared with the pretrained language model for protein target features, FRAGSITE2 also shows much better performance. Thus, FRAGSITE2 is a promising approach that can discover novel hits for protein targets. FRAGSITE2's web service is freely available to academic users at http://sites.gatech.edu/cssb/FRAGSITE2.
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
| |
|
69
|
Liuu S, Nepelska M, Pfister H, Gamelas Magalhaes J, Chevalier G, Strozzi F, Billerey C, Maresca M, Nicoletti C, Di Pasquale E, Pechard C, Bardouillet L, Girardin SE, Boneca IG, Doré J, Blottière HM, Bonny C, Chene L, Cultrone A. Identification of a muropeptide precursor transporter from gut microbiota and its role in preventing intestinal inflammation. Proc Natl Acad Sci U S A 2023; 120:e2306863120. [PMID: 38127978 PMCID: PMC10756304 DOI: 10.1073/pnas.2306863120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 10/31/2023] [Indexed: 12/23/2023] Open
Abstract
The gut microbiota is a considerable source of biologically active compounds that can promote intestinal homeostasis and improve immune responses. Here, we used large expression libraries of cloned metagenomic DNA to identify compounds able to sustain an anti-inflammatory reaction on host cells. Starting with a screen for NF-κB activation, we have identified overlapping clones harbouring a heterodimeric ATP-binding cassette (ABC)-transporter from a Firmicutes. Extensive purification of the clone's supernatant demonstrates that the ABC-transporter allows for the efficient extracellular accumulation of three muropeptide precursors with anti-inflammatory properties. They induce IL-10 secretion from human monocyte-derived dendritic cells and proved effective in reducing AIEC LF82 epithelial damage and IL-8 secretion in human intestinal resections. In addition, treatment with supernatants containing the muropeptide precursor reduces body weight loss and improves histological parameters in Dextran Sulfate Sodium (DSS)-treated mice. Until now, the source of peptidoglycan fragments was shown to come from the natural turnover of the peptidoglycan layer by endogenous peptidoglycan hydrolases. This is a report showing an ABC-transporter as a natural source of secreted muropeptide precursor and as an indirect player in epithelial barrier strengthening. The mechanism described here might represent an important component of the host immune homeostasis.
Affiliation(s)
| | - Malgorzata Nepelska
- Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), AgroParisTech, Food Microbial Ecology lab (Micalis), Université Paris-Saclay, Jouy-en-Josas 78350, France
| | | | | | | | | | | | - Marc Maresca
- CNRS, Centrale Marseille, Institut des Sciences Moléculaires (iSm2) UMR7313, Aix Marseille Université, Marseille 13013, France
| | - Cendrine Nicoletti
- CNRS, Centrale Marseille, Institut des Sciences Moléculaires (iSm2) UMR7313, Aix Marseille Université, Marseille 13013, France
| | - Eric Di Pasquale
- Institut de NeuroPhysioPathologie (INP), Aix Marseille Université, UMR 7051, Marseille 13005, France
| | | | | | - Stephen E. Girardin
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Ivo Gomperts Boneca
- Institut Pasteur, Université Paris Cité, CNRS Unité Mixte de Recherche 6047, INSERM U1306, Unité de Biologie et génétique de la paroi bactérienne, Paris 75015, France
| | - Joel Doré
- Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), AgroParisTech, Food Microbial Ecology lab (Micalis), Université Paris-Saclay, Jouy-en-Josas 78350, France
- Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), MetaGenoPolis, Université Paris-Saclay, Jouy-en-Josas 78350, France
| | - Hervé M. Blottière
- Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), AgroParisTech, Food Microbial Ecology lab (Micalis), Université Paris-Saclay, Jouy-en-Josas 78350, France
- Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), MetaGenoPolis, Université Paris-Saclay, Jouy-en-Josas 78350, France
| | | | | | | |
|
70
|
Park J, Champion JA. Development of Self-Assembled Protein Nanocage Spatially Functionalized with HA Stalk as a Broadly Cross-Reactive Influenza Vaccine Platform. ACS Nano 2023; 17:25045-25060. [PMID: 38084728 PMCID: PMC10753887 DOI: 10.1021/acsnano.3c07669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 11/29/2023] [Accepted: 12/01/2023] [Indexed: 12/27/2023]
Abstract
There remains a need for the development of a universal influenza vaccine, as current seasonal influenza vaccines exhibit limited protection against mismatched, mutated, or pandemic influenza viruses. A desirable approach to developing an effective universal influenza vaccine is the incorporation of highly conserved antigens in a multivalent scaffold that enhances their immunogenicity. Here, we develop a broadly cross-reactive influenza vaccine by functionalizing self-assembled protein nanocages (SAPNs) with multiple copies of the hemagglutinin stalk on the outer surface and matrix protein 2 ectodomain on the inner surface. SAPNs were generated by engineering short coiled coils, and the design was simulated by MD GROMACS. Due to the short sequences, off-target immune responses against empty SAPN scaffolds were not seen in immunized mice. Vaccination with the multivalent SAPNs induces high levels of broadly cross-reactive antibodies of only external antigens, demonstrating tight spatial control over the designed antigen placement. This work demonstrates the use of SAPNs as a potential influenza vaccine.
Affiliation(s)
- Jaeyoung Park
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 950 Atlantic Dr. NW, Atlanta, Georgia 30332-2000, United States
| | - Julie A. Champion
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 950 Atlantic Dr. NW, Atlanta, Georgia 30332-2000, United States
| |
|
71
|
Hakkennes MA, Buda F, Bonnet S. MetalDock: An Open Access Docking Tool for Easy and Reproducible Docking of Metal Complexes. J Chem Inf Model 2023; 63:7816-7825. [PMID: 38048559 PMCID: PMC10751784 DOI: 10.1021/acs.jcim.3c01582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/13/2023] [Accepted: 11/14/2023] [Indexed: 12/06/2023]
Abstract
Despite the proven potential of metal complexes as therapeutics, the lack of computational tools available for the high-throughput screening of their interactions with proteins is a limiting factor toward clinical developments. To address this challenge, we introduce MetalDock, an easy-to-use, open access docking software for docking metal complexes to proteins. Our tool integrates the AutoDock docking engine with three well-known quantum software packages to automate the docking of metal-organic complexes to proteins. We used a Monte Carlo sampling scheme to obtain the missing Lennard-Jones parameters for 12 metal atom types and demonstrated that these parameters generalize exceptionally well. Our results show that the poses obtained by MetalDock are highly accurate, as they predict the binding geometries experimentally determined by crystal structures with high spatial reproducibility. Three different case studies are presented that demonstrate the versatility of MetalDock for the docking of diverse metal-organic compounds to different biomacromolecules, including nucleic acids.
Affiliation(s)
- Matthijs L. A. Hakkennes
- Leiden Institute of Chemistry, Leiden University, P.O. Box 9502, 2300 RA Leiden, The Netherlands
| | - Francesco Buda
- Leiden Institute of Chemistry, Leiden University, P.O. Box 9502, 2300 RA Leiden, The Netherlands
| | - Sylvestre Bonnet
- Leiden Institute of Chemistry, Leiden University, P.O. Box 9502, 2300 RA Leiden, The Netherlands
| |
|
72
|
Ng TK, Ji J, Liu Q, Yao Y, Wang WY, Cao Y, Chen CB, Lin JW, Dong G, Cen LP, Huang C, Zhang M. Evaluation of Myocilin Variant Protein Structures Modeled by AlphaFold2. Biomolecules 2023; 14:14. [PMID: 38275755 PMCID: PMC10813463 DOI: 10.3390/biom14010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 01/27/2024] Open
Abstract
Deep neural network-based programs can be applied to protein structure modeling by inputting amino acid sequences. Here, we aimed to evaluate the AlphaFold2-modeled myocilin wild-type and variant protein structures and compare to the experimentally determined protein structures. Molecular dynamic and ligand binding properties of the experimentally determined and AlphaFold2-modeled protein structures were also analyzed. AlphaFold2-modeled myocilin variant protein structures showed high similarities in overall structure to the experimentally determined mutant protein structures, but the orientations and geometries of amino acid side chains were slightly different. The olfactomedin-like domain of the modeled missense variant protein structures showed fewer folding changes than the nonsense variant when compared to the predicted wild-type protein structure. Differences were also observed in molecular dynamics and ligand binding sites between the AlphaFold2-modeled and experimentally determined structures as well as between the wild-type and variant structures. In summary, the folding of the AlphaFold2-modeled MYOC variant protein structures could be similar to that determined by the experiments but with differences in amino acid side chain orientations and geometries. Careful comparisons with experimentally determined structures are needed before the applications of the in silico modeled variant protein structures.
Affiliation(s)
- Tsz Kin Ng
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
| | - Jie Ji
- Network & Information Centre, Shantou University, Shantou 515041, China
| | - Qingping Liu
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
- Key Laboratory of Carbohydrate and Lipid Metabolism Research, College of Life Science and Technology, Dalian University, Dalian 116622, China
| | - Yao Yao
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
- Shantou University Medical College, Shantou 515041, China
| | - Wen-Ying Wang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
- Shantou University Medical College, Shantou 515041, China
| | - Yingjie Cao
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| | - Chong-Bo Chen
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| | - Jian-Wei Lin
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| | - Geng Dong
- Shantou University Medical College, Shantou 515041, China
| | - Ling-Ping Cen
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| | - Chukai Huang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| | - Mingzhi Zhang
- Joint Shantou International Eye Center of Shantou University and The Chinese University of Hong Kong, Shantou 515041, China; (T.K.N.)
| |
|
73
|
Savinov A, Swanson S, Keating AE, Li GW. High-throughput computational discovery of inhibitory protein fragments with AlphaFold. bioRxiv 2023:2023.12.19.572389. [PMID: 38187731 PMCID: PMC10769210 DOI: 10.1101/2023.12.19.572389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Peptides can bind to specific sites on larger proteins and thereby function as inhibitors and regulatory elements. Peptide fragments of larger proteins are particularly attractive for achieving these functions due to their inherent potential to form native-like binding interactions. Recently developed experimental approaches allow for high-throughput measurement of protein fragment inhibitory activity in living cells. However, it has thus far not been possible to predict de novo which of the many possible protein fragments bind their protein targets, let alone act as inhibitors. We have developed a computational method, FragFold, that employs AlphaFold to predict protein fragment binding to full-length protein targets in a high-throughput manner. Applying FragFold to thousands of fragments tiling across diverse proteins revealed peaks of predicted binding along each protein sequence. These predictions were compared with experimentally measured peaks of inhibitory activity in E. coli. We establish that our approach is a sensitive predictor of protein fragment function: Evaluating inhibitory fragments derived from known protein-protein interaction interfaces, we found 87% were predicted by FragFold to bind in a native-like mode. Across full protein sequences, 68% of FragFold-predicted binding peaks match experimentally measured inhibitory peaks. This is true even when the underlying inhibitory mechanism is unclear from existing structural data, and we find FragFold is able to predict novel binding modes for inhibitory fragments of unknown structure, explaining previous genetic and biochemical data for these fragments. The success rate of FragFold demonstrates that this computational approach should be broadly applicable for discovering inhibitory protein fragments across proteomes.
Affiliation(s)
- Andrew Savinov
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Sebastian Swanson
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Amy E. Keating
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Koch Center for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Gene-Wei Li
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
|
74
|
Danneskiold-Samsøe NB, Kavi D, Jude KM, Nissen SB, Wat LW, Coassolo L, Zhao M, Santana-Oikawa GA, Broido BB, Garcia KC, Svensson KJ. AlphaFold2 enables accurate deorphanization of ligands to single-pass receptors. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.16.531341. [PMID: 36993313 PMCID: PMC10055078 DOI: 10.1101/2023.03.16.531341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Secreted proteins play crucial roles in paracrine and endocrine signaling; however, identifying novel ligand-receptor interactions remains challenging. Here, we benchmarked AlphaFold as a screening approach to identify extracellular ligand-binding pairs using a structural library of single-pass transmembrane receptors. Key to the approach is the optimization of AlphaFold input and output for screening ligands against receptors to predict the most probable ligand-receptor interactions. Importantly, the predictions were performed on ligand-receptor pairs not used for AlphaFold training. We demonstrate high discriminatory power and a success rate of close to 90% for known ligand-receptor pairs and 50% for a diverse set of experimentally validated interactions. These results provide proof of concept for a rapid and accurate screening platform that predicts high-confidence cell-surface receptors for a diverse set of ligands by structural binding prediction, with potentially wide applicability for the understanding of cell-cell communication.
Affiliation(s)
- Niels Banhos Danneskiold-Samsøe
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Biology, University of Copenhagen, Denmark
| | - Deniz Kavi
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Kevin M. Jude
- Department of Molecular and Cellular Physiology, Department of Structural Biology, and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Silas Boye Nissen
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- The Novo Nordisk Foundation Center for Stem Cell Medicine (reNEW), University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen N, Denmark
| | - Lianna W. Wat
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
| | - Laetitia Coassolo
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
| | - Meng Zhao
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
| | | | | | - K. Christopher Garcia
- Department of Molecular and Cellular Physiology, Department of Structural Biology, and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Katrin J. Svensson
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Diabetes Research Center, Stanford University School of Medicine, Stanford, CA, USA
- Stanford Cardiovascular Institute, Stanford University School of Medicine, CA, USA
|
75
|
Jeppesen M, André I. Accurate prediction of protein assembly structure by combining AlphaFold and symmetrical docking. Nat Commun 2023; 14:8283. [PMID: 38092742 PMCID: PMC10719378 DOI: 10.1038/s41467-023-43681-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/08/2023] [Accepted: 11/16/2023] [Indexed: 12/17/2023] Open
Abstract
AlphaFold can predict the structures of monomeric and multimeric proteins with high accuracy but has a limit on the number of chains and residues it can fold. Here we show that a combination of AlphaFold and all-atom symmetric docking simulations enables highly accurate prediction of the structure of complex symmetrical assemblies. We present a method to predict the structure of complexes with cubic (tetrahedral, octahedral and icosahedral) symmetry from sequence. Focusing on proteins where AlphaFold can make confident predictions of the subunit structure, 27 cubic systems were assembled with a median TM-score of 0.99 and a DockQ score of 0.72. Of these, 21 had TM-scores above 0.9 and were categorized as acceptable to high quality according to DockQ. The resulting models are energetically optimized and can be used for detailed studies of intermolecular interactions in higher-order symmetrical assemblies. The results demonstrate how explicit treatment of structural symmetry can significantly expand the size and complexity of AlphaFold predictions.
Affiliation(s)
- Mads Jeppesen
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
| | - Ingemar André
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden.
|
76
|
Tsuchiya Y, Yonezawa T, Yamamori Y, Inoura H, Osawa M, Ikeda K, Tomii K. PoSSuM v.3: A Major Expansion of the PoSSuM Database for Finding Similar Binding Sites of Proteins. J Chem Inf Model 2023; 63:7578-7587. [PMID: 38016694 PMCID: PMC10716853 DOI: 10.1021/acs.jcim.3c01405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/28/2023] [Accepted: 11/01/2023] [Indexed: 11/30/2023]
Abstract
Information on structures of protein-ligand complexes, including comparisons of known and putative protein-ligand-binding pockets, is valuable for protein annotation and drug discovery and development. To facilitate biomedical and pharmaceutical research, we developed PoSSuM (https://possum.cbrc.pj.aist.go.jp/PoSSuM/), a database for identifying similar binding pockets in proteins. The current PoSSuM database includes 191 million similar pairs among almost 10 million identified pockets. PoSSuM drug search (PoSSuMds) is a resource for investigating ligand and receptor diversity among a set of pockets that can bind to an approved drug compound. The enhanced PoSSuMds covers pockets associated with both approved drugs and drug candidates in clinical trials from the latest release of ChEMBL. Additionally, we developed two new databases: PoSSuMAg for investigating antibody-antigen interactions and PoSSuMAF to simplify exploring putative pockets in AlphaFold human protein models.
Affiliation(s)
- Yuko Tsuchiya
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | - Tomoki Yonezawa
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
| | - Yu Yamamori
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | - Hiroko Inoura
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | - Masanori Osawa
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
| | - Kazuyoshi Ikeda
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy, 1-5-30 Shibakoen, Minato-ku, Tokyo 105-8512, Japan
- Medicinal Chemistry Applied AI Unit, HPC- and AI-driven Drug Development Platform Division, RIKEN Center for Computational Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Kentaro Tomii
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
|
77
|
Peng Z, Wang W, Wei H, Li X, Yang J. Improved protein structure prediction with trRosettaX2, AlphaFold2, and optimized MSAs in CASP15. Proteins 2023; 91:1704-1711. [PMID: 37565699 DOI: 10.1002/prot.26570] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/03/2023] [Revised: 07/17/2023] [Accepted: 07/31/2023] [Indexed: 08/12/2023]
Abstract
We present the monomer and multimer structure prediction results of our methods in CASP15. We first designed an elaborate pipeline that leverages complementary sequence databases and advanced database searching algorithms to generate high-quality multiple sequence alignments (MSAs). Top MSAs were then selected for the subsequent step of structure prediction. We utilized trRosettaX2 and AlphaFold2 for monomer structure prediction (group name Yang-Server), and AlphaFold-Multimer for multimer structure prediction (group name Yang-Multimer). Yang-Server ranked first for monomer structure prediction and Yang-Multimer ranked fourth for multimer structure prediction. For 94 monomers, the average TM-score of the structure models predicted by Yang-Server is 0.876, compared to 0.798 for the default AlphaFold2 (i.e., the group NBIS-AF2-standard). For 42 multimers, the average DockQ score of the structure models predicted by Yang-Multimer is 0.464, compared to 0.389 for the default AlphaFold-Multimer (i.e., the group NBIS-AF2-multimer). Detailed analysis of the results shows that several factors contribute to the improvement, including improved MSAs, iterated modeling for large targets, and interplay between monomer and multimer structure prediction for intertwined structures. However, structure prediction for orphan proteins and multimers remains challenging, and breakthroughs in this area are anticipated in the future.
Affiliation(s)
- Zhenling Peng
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Wenkai Wang
- School of Mathematical Sciences, Nankai University, Tianjin, China
| | - Hong Wei
- School of Mathematical Sciences, Nankai University, Tianjin, China
| | - Xiaoge Li
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
| | - Jianyi Yang
- MOE Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China
|
78
|
Wallner B. Improved multimer prediction using massive sampling with AlphaFold in CASP15. Proteins 2023; 91:1734-1746. [PMID: 37548092 DOI: 10.1002/prot.26562] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/14/2023] [Revised: 06/16/2023] [Accepted: 07/17/2023] [Indexed: 08/08/2023]
Abstract
AlphaFold2 has revolutionized structure prediction by achieving accuracy comparable to that of experimentally determined structures. However, there is still room for improvement, especially for challenging cases like multimers. A key to the success of AlphaFold is its ability to assess and rank its own predictions. Our basic idea for the Wallner group in CASP15 was to exploit this excellent scoring function in AlphaFold by massive sampling. To achieve this goal, we conducted AlphaFold runs using six different settings: with templates, without templates, and with an increased number of recycles, for both multimer v1 and v2 weights. In all instances, we enabled dropout layers during inference, allowing for sampling of uncertainty and enhancing the diversity of the generated models. In total, 274,289 models were generated for the 38 targets in CASP15, with a median of 4810 models per target. Of these 38 targets, 10 were high quality, 11 were medium quality, 11 were acceptable, and only 6 were incorrect. The improvement over the baseline method, NBIS-AF2-multimer, is substantial, with the mean DockQ increasing from 0.43 to 0.56 and several targets showing a DockQ score increase of +0.6 units. This is remarkable, considering that Wallner and NBIS-AF2-multimer used identical input data. The success can be attributed to the diversified sampling using dropout with different settings and, in particular, the use of multimer v1, which is much more susceptible to sampling than v2. The method is available here: http://wallnerlab.org/AFsample/.
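The four quality classes counted in the abstract above (high, medium, acceptable, incorrect) follow the usual DockQ binning. The sketch below shows that binning under stated assumptions: the 0.23 cutoff also appears elsewhere in this listing, while the 0.49 and 0.80 cutoffs are the commonly cited DockQ boundaries and are not given in this abstract. The function names and the example score list are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Commonly cited DockQ class boundaries (assumed here, not stated in the abstract).
ACCEPTABLE_CUTOFF = 0.23
MEDIUM_CUTOFF = 0.49
HIGH_CUTOFF = 0.80


def dockq_class(score: float) -> str:
    """Map a DockQ score in [0, 1] to a quality class label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("DockQ scores lie between 0 and 1")
    if score >= HIGH_CUTOFF:
        return "high"
    if score >= MEDIUM_CUTOFF:
        return "medium"
    if score >= ACCEPTABLE_CUTOFF:
        return "acceptable"
    return "incorrect"


def summarize(scores: Iterable[float]) -> Counter:
    """Count how many targets fall into each quality class."""
    return Counter(dockq_class(s) for s in scores)


if __name__ == "__main__":
    # Made-up per-target scores, for illustration only.
    example_scores = [0.85, 0.56, 0.43, 0.10, 0.49]
    print(summarize(example_scores))
```

Under these cutoffs, the two mean DockQ values quoted in the abstract, 0.43 for the baseline and 0.56 for the massive-sampling approach, fall in the acceptable and medium classes respectively.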
Affiliation(s)
- Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
|
79
|
Li J, Zhang S, Chen SJ. Advancing RNA 3D structure prediction: Exploring hierarchical and hybrid approaches in CASP15. Proteins 2023; 91:1779-1789. [PMID: 37615235 PMCID: PMC10841231 DOI: 10.1002/prot.26583] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/15/2023] [Revised: 06/19/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
In CASP15, we used an integrated hierarchical and hybrid approach to predict RNA structures. The approach involves three steps. First, with the use of physics-based methods, Vfold2D-MC and VfoldMCPX, we predict the 2D structures from the sequence. Second, we employ template-based methods, Vfold3D and VfoldLA, to build 3D scaffolds for the predicted 2D structures. Third, using the 3D scaffolds as initial structures and the predicted 2D structures as constraints, we predict the 3D structure from coarse-grained molecular dynamics simulations, IsRNA and RNAJP. Our approach was evaluated on 12 RNA targets in CASP15 and ranked second among all the 34 participating teams. The result demonstrated the reliability of our method in predicting RNA 2D structures with high accuracy and RNA 3D structures with moderate accuracy. Further improvements in RNA structure prediction for the next round of CASP may come from the incorporation of the physics-based method with machine learning techniques.
Affiliation(s)
- Jun Li
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| | - Sicheng Zhang
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry, and Institute for Data Science and Informatics, University of Missouri, Columbia, Missouri 65211, United States
|
80
|
Studer G, Tauriello G, Schwede T. Assessment of the assessment-All about complexes. Proteins 2023; 91:1850-1860. [PMID: 37858934 DOI: 10.1002/prot.26612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/26/2023] [Accepted: 09/29/2023] [Indexed: 10/21/2023]
Abstract
Predicting model quality is a fundamental component of any modeling procedure, and blind assessment of these methods constitutes a crucial aspect of the Critical Assessment of Protein Structure Prediction (CASP) experiment. Historically, the main focus was on assessing methods that predict global and per-residue accuracies in tertiary structure models. This focus shifted with the community's increased efforts in modeling complexes and assemblies. We asked the community to process the models from the CASP15 assembly category and provide estimates of the accuracy of the predicted quaternary structure, both globally and at the local interface level. Besides identifying remarkable accuracy of modeling groups in assessing their own predictions, we set up a benchmarking pipeline to highlight different aspects of quaternary structure models and introduced a simple consensus EMA method as baseline. While participating methods showed commendable performance, the baseline was difficult to surpass. It is important to point out that prediction performance varies for the individual CASP targets, highlighting potential areas of improvement and challenges ahead.
Affiliation(s)
- Gabriel Studer
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
81
|
Roy RS, Liu J, Giri N, Guo Z, Cheng J. Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15. Proteins 2023; 91:1889-1902. [PMID: 37357816 PMCID: PMC10749984 DOI: 10.1002/prot.26542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/27/2023]
Abstract
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models is proven useful for estimating the quality of protein tertiary structural models, but it has been rarely applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA.
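The evaluation quantities named in the abstract above can be illustrated with a short sketch. It assumes the usual definitions: a pairwise similarity score (PSS) for a model is its mean similarity to all other models of the same target, the per-target correlation is a Pearson coefficient between predicted and true quality scores, and the ranking loss is the true quality of the best available model minus the true quality of the model ranked first by the predictor. How MULTICOM_qa combines PSS with ICPS is not stated in the abstract and is not reproduced here; all function names and toy numbers are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pairwise_similarity_scores(similarity: Sequence[Sequence[float]]) -> List[float]:
    """PSS per model: mean similarity to every other model (diagonal excluded)."""
    n = len(similarity)
    if n < 2:
        raise ValueError("need at least two models")
    return [
        sum(similarity[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def ranking_loss(predicted: Sequence[float], true: Sequence[float]) -> float:
    """True quality of the best model minus that of the top-ranked model."""
    if len(predicted) != len(true) or not predicted:
        raise ValueError("inputs must be non-empty and equally long")
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top]


if __name__ == "__main__":
    # Toy similarity matrix and quality scores for three models of one target.
    sim = [[1.0, 0.8, 0.4], [0.8, 1.0, 0.6], [0.4, 0.6, 1.0]]
    predicted = pairwise_similarity_scores(sim)
    true_quality = [0.55, 0.70, 0.65]
    print(predicted, pearson(predicted, true_quality), ranking_loss(predicted, true_quality))
```

Averaging `pearson` and `ranking_loss` over all targets would give per-target summary figures of the kind quoted in the abstract (0.66 and 0.14), although the exact similarity and quality measures used there are not specified in this listing.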
Affiliation(s)
- Raj S. Roy
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
| | - Jian Liu
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
| | - Nabin Giri
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
| | - Zhiye Guo
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
|
82
|
Oda T. Improving protein structure prediction with extended sequence similarity searches and deep-learning-based refinement in CASP15. Proteins 2023; 91:1712-1723. [PMID: 37485822 DOI: 10.1002/prot.26551] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/14/2023] [Revised: 06/23/2023] [Accepted: 06/28/2023] [Indexed: 07/25/2023]
Abstract
The human predictor team PEZYFoldings got first place with the assessor's formulae (3rd place with Global Distance Test Total Score [GDT-TS]) in the single-domain category and 10th place in the multimer category in Critical Assessment of Structure Prediction 15. In this paper, I describe the exact method used by PEZYFoldings in the competition. As AlphaFold2 and AlphaFold-Multimer, developed by DeepMind, were state-of-the-art structure prediction tools, it was assumed that enhancing the input and output of the tools was an effective strategy to obtain the highest accuracy for structure prediction. Therefore, I used additional tools and databases to collect evolutionarily related sequences and introduced a deep-learning-based model in the refinement step. In addition to these modifications, manual interventions were performed to address various tasks. Detailed analyses were performed after the competition to identify the main contributors to performance. Comparing the number of evolutionarily related sequences I used with those of the other teams that provided AlphaFold2's baseline predictions revealed that an extensive sequence similarity search was one of the main contributors. Nonetheless, there were specific targets for which I could not identify any evolutionarily related sequences, resulting in my inability to construct accurate structures for these targets. Notably, I noticed that I had gained large Z-scores with the subunits of H1137, for which I performed manual domain parsing considering the interfaces between the subunits. This finding implies that the manual intervention contributed to my performance. The influence of the refinement model on the accuracy of structure prediction was minimal. I could have predicted structures with a similar level of accuracy without employing the refinement model. However, from the perspective of accuracy self-estimate, many structures demonstrated improvement after refinement. This improvement likely had a substantial influence on improving my position in the assessor's formulae rankings. These results highlight the opportunities for improvement in (1) multimer prediction, (2) building of larger and more diverse databases, and (3) developing tools to predict structures from primary sequences alone. In addition, transferring the manual intervention process to automation is a future concern.
|
83
|
Xia Y, Zhao K, Liu D, Zhou X, Zhang G. Multi-domain and complex protein structure prediction using inter-domain interactions from deep learning. Commun Biol 2023; 6:1221. [PMID: 38040847 PMCID: PMC10692239 DOI: 10.1038/s42003-023-05610-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/19/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023] Open
Abstract
Accurately capturing domain-domain interactions is key to understanding protein function and designing structure-based drugs. Although AlphaFold2 has achieved a breakthrough in single-domain structure prediction, modeling multi-domain proteins and complexes remains a challenge. In this study, we developed a multi-domain and complex structure assembly protocol, named DeepAssembly, based on domain segmentation and single-domain modeling algorithms. First, DeepAssembly uses a population-based evolutionary algorithm to assemble multi-domain proteins using inter-domain interactions inferred from a newly developed deep learning network. Second, protein complexes are assembled from domains rather than chains. Experimental results show that on 219 multi-domain proteins, the average inter-domain distance precision of DeepAssembly is 22.7% higher than that of AlphaFold2. Moreover, DeepAssembly improves accuracy by 13.1% for 164 multi-domain structures with low confidence deposited in the AlphaFold database. We applied DeepAssembly to the prediction of 247 heterodimers and found that it successfully predicts the interface (DockQ ≥ 0.23) for 32.4% of the dimers, suggesting a lighter way to assemble complex structures by treating domains as assembly units and using inter-domain interactions learned from monomer structures.
Affiliation(s)
- Yuhao Xia
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Kailong Zhao
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Dong Liu
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Xiaogen Zhou
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China
| | - Guijun Zhang
- College of Information Engineering, Zhejiang University of Technology, HangZhou, 310023, China.
|
84
|
Zheng W, Wuyun Q, Freddolino PL, Zhang Y. Integrating deep learning, threading alignments, and a multi-MSA strategy for high-quality protein monomer and complex structure prediction in CASP15. Proteins 2023; 91:1684-1703. [PMID: 37650367 PMCID: PMC10840719 DOI: 10.1002/prot.26585] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 04/13/2023] [Revised: 08/04/2023] [Accepted: 08/14/2023] [Indexed: 09/01/2023]
Abstract
We report the results of the "UM-TBM" and "Zheng" groups in CASP15 for protein monomer and complex structure prediction. These prediction sets were obtained using the D-I-TASSER and DMFold-Multimer algorithms, respectively. For monomer structure prediction, D-I-TASSER introduced four new features during CASP15: (i) a multiple sequence alignment (MSA) generation protocol that combines multi-source MSA searching and a structural modeling-based MSA ranker; (ii) attention-network based spatial restraints; (iii) a multi-domain module containing domain partition and arrangement for domain-level templates and spatial restraints; (iv) an optimized I-TASSER-based folding simulation system for full-length model creation guided by a combination of deep learning restraints, threading alignments, and knowledge-based potentials. For 47 free modeling targets in CASP15, the final models predicted by D-I-TASSER showed average TM-score 19% higher than the standard AlphaFold2 program. We thus showed that traditional Monte Carlo-based folding simulations, when appropriately coupled with deep learning algorithms, can generate models with improved accuracy over end-to-end deep learning methods alone. For protein complex structure prediction, DMFold-Multimer generated models by integrating a new MSA generation algorithm (DeepMSA2) with the end-to-end modeling module from AlphaFold2-Multimer. For the 38 complex targets, DMFold-Multimer generated models with an average TM-score of 0.83 and Interface Contact Score of 0.60, both significantly higher than those of competing complex prediction tools. Our analyses on complexes highlighted the critical role played by MSA generating, ranking, and pairing in protein complex structure prediction. We also discuss future room for improvement in the areas of viral protein modeling and complex model ranking.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Qiqige Wuyun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Peter L Freddolino
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
- Department of Computer Science, School of Computing, National University of Singapore, 117417 Singapore
- Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 117596, Singapore
|
85
|
Boohar RT, Vandepas LE, Traylor-Knowles N, Browne WE. Phylogenetic and Protein Structure Analyses Provide Insight into the Evolution and Diversification of the CD36 Domain "Apex" among Scavenger Receptor Class B Proteins across Eukarya. Genome Biol Evol 2023; 15:evad218. [PMID: 38035778 PMCID: PMC10715195 DOI: 10.1093/gbe/evad218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 11/07/2023] [Accepted: 11/24/2023] [Indexed: 12/02/2023] Open
Abstract
The cluster of differentiation 36 (CD36) domain defines the characteristic ectodomain associated with class B scavenger receptor (SR-B) proteins. In bilaterians, SR-Bs play critical roles in diverse biological processes, including innate immunity functions such as pathogen recognition and apoptotic cell clearance, as well as metabolic sensing associated with fatty acid uptake and cholesterol transport. Although previous studies suggest this protein family is ancient, SR-B diversity across Eukarya has not been robustly characterized. We analyzed SR-B homologs identified from the genomes and transcriptomes of 165 diverse eukaryotic species. The presence of highly conserved amino acid motifs across major eukaryotic supergroups supports the presence of an SR-B homolog in the last eukaryotic common ancestor. Our comparative analyses of SR-B protein structure identify the retention of a canonical asymmetric beta barrel tertiary structure within the CD36 ectodomain across Eukarya. We also identify multiple instances of independent lineage-specific sequence expansions in the apex region of the CD36 ectodomain, a region functionally associated with ligand sensing. We hypothesize that a combination of sequence expansion and structural variation in the CD36 apex region may reflect the evolution of SR-B ligand-sensing specificity among diverse eukaryotic clades.
Collapse
Affiliation(s)
- Reed T Boohar
- Department of Biology, University of Miami, Coral Gables, Florida, USA
| | - Lauren E Vandepas
- Department of Biology, University of Miami, Coral Gables, Florida, USA
| | - Nikki Traylor-Knowles
- Department of Marine Biology and Ecology, Rosenstiel School of Marine and Atmospheric Science, University of Miami, Miami, Florida, USA
| | - William E Browne
- Department of Biology, University of Miami, Coral Gables, Florida, USA
| |
|
86
|
Zhang X, Yin H, Ling F, Zhan J, Zhou Y. SPIN-CGNN: Improved fixed backbone protein design with contact map-based graph construction and contact graph neural network. PLoS Comput Biol 2023; 19:e1011330. [PMID: 38060617 PMCID: PMC10729952 DOI: 10.1371/journal.pcbi.1011330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 12/19/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023] Open
Abstract
Recent advances in deep learning have significantly improved the ability to infer protein sequences directly from protein structures for fixed-backbone design. The methods have evolved from the early use of multi-layer perceptrons to convolutional neural networks, transformers, and graph neural networks (GNNs). However, the conventional approach of constructing a K-nearest-neighbors (KNN) graph for a GNN has limited the utilization of edge information, which plays a critical role in network performance. Here we introduced SPIN-CGNN based on protein contact maps for nearest neighbors. Together with auxiliary edge updates and selective kernels, we found that SPIN-CGNN provided a comparable performance in refolding ability by AlphaFold2 to the current state-of-the-art techniques but a significant improvement over them in terms of sequence recovery, perplexity, deviation from amino-acid compositions of native sequences, conservation of hydrophobic positions, and low complexity regions, according to the test by unseen structures, "hallucinated" structures and diffusion models. Results suggest that low complexity regions in the sequences designed by deep learning, for generated structures in particular, remain to be improved, when compared to the native sequences.
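The difference between a KNN graph and a contact-map graph, as contrasted in the abstract above, can be illustrated with a minimal sketch. This is not the SPIN-CGNN implementation: the function names, the use of one coordinate per residue, and the 8 Å cutoff are assumptions chosen only for illustration.

```python
import numpy as np

def pairwise_distances(coords):
    """Return the (N, N) matrix of Euclidean distances between N points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_edges(coords, k):
    """Directed edges i -> j for the k nearest neighbors of each residue."""
    d = pairwise_distances(coords)
    np.fill_diagonal(d, np.inf)  # a residue is not its own neighbor
    nearest = np.argsort(d, axis=1)[:, :k]
    return {(i, int(j)) for i in range(len(coords)) for j in nearest[i]}

def contact_edges(coords, cutoff):
    """Directed edges i -> j for every residue pair closer than the cutoff."""
    d = pairwise_distances(coords)
    np.fill_diagonal(d, np.inf)
    src, dst = np.nonzero(d < cutoff)
    return {(int(i), int(j)) for i, j in zip(src, dst)}

# Toy chain (hypothetical coordinates, in angstroms): four residues spaced
# 3.8 apart and one isolated residue far from the rest.
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.8, 0.0, 0.0],
    [7.6, 0.0, 0.0],
    [11.4, 0.0, 0.0],
    [30.0, 0.0, 0.0],
])

knn = knn_edges(coords, k=2)                  # every residue gets exactly k edges
contacts = contact_edges(coords, cutoff=8.0)  # edge count follows the geometry
```

In this toy case the KNN graph forces the isolated residue to connect to residues more than 18 Å away, while the contact graph leaves it unconnected, so the number of edges per node reflects local packing rather than a fixed k.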
Affiliation(s)
- Xing Zhang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Hongmei Yin
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Fei Ling
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, People’s Republic of China
| | - Jian Zhan
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| | - Yaoqi Zhou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, People’s Republic of China
| |
|
87
|
Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XV. Proteins 2023; 91:1539-1549. [PMID: 37920879 PMCID: PMC10843301 DOI: 10.1002/prot.26617] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/05/2023] [Accepted: 10/06/2023] [Indexed: 11/04/2023]
Abstract
Computing protein structure from amino acid sequence information has been a long-standing grand challenge. Critical assessment of structure prediction (CASP) conducts community experiments aimed at advancing solutions to this and related problems. Experiments are conducted every 2 years. The 2020 experiment (CASP14) saw major progress, with the second generation of deep learning methods delivering accuracy comparable with experiment for many single proteins. There is an expectation that these methods will have much wider application in computational structural biology. Here we summarize results from the most recent experiment, CASP15, in 2022, with an emphasis on new deep learning-driven progress. Other papers in this special issue of Proteins provide more detailed analysis. For single protein structures, the AlphaFold2 deep learning method is still superior to other approaches, but there are two points of note. First, although AlphaFold2 was the core of all the most successful methods, there was a wide variety of implementation and combination with other methods. Second, using the standard AlphaFold2 protocol and default parameters only produces the highest quality result for about two thirds of the targets, and more extensive sampling is required for the others. The major advance in this CASP is the enormous increase in the accuracy of computed protein complexes, achieved by the use of deep learning methods, although overall these do not fully match the performance for single proteins. Here too, AlphaFold2-based methods perform best, and again more extensive sampling than the defaults is often required. Also of note are the encouraging early results on the use of deep learning to compute ensembles of macromolecular structures.
Critically for the usability of computed structures, for both single proteins and protein complexes, deep learning derived estimates of both local and global accuracy are of high quality; however, the estimates in interface regions are slightly less reliable. CASP15 also included computation of RNA structures for the first time. Here, the classical approaches produced better agreement with experiment than the new deep learning ones, and accuracy is limited. Also, for the first time, CASP included the computation of protein-ligand complexes, an area of special interest for drug design. Here too, classical methods were still superior to deep learning ones. Many new approaches were discussed at the CASP conference, and it is clear methods will continue to advance.
Affiliation(s)
| | - Torsten Schwede
- University of Basel, Biozentrum & SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Maya Topf
- Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
| | | | - John Moult
- Institute for Bioscience and Biotechnology Research, Rockville, MD, USA, and Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| |
|
88
|
De Salis SKF, Chen JZ, Skarratt KK, Fuller SJ, Balle T. Deep learning structural insights into heterotrimeric alternatively spliced P2X7 receptors. Purinergic Signal 2023:10.1007/s11302-023-09978-3. [PMID: 38032425 DOI: 10.1007/s11302-023-09978-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 10/31/2023] [Indexed: 12/01/2023] Open
Abstract
P2X7 receptors (P2X7Rs) are membrane-bound ATP-gated ion channels that are composed of three subunits. Different subunit structures may be expressed due to alternative splicing of the P2RX7 gene, altering the receptor's function when combined with the wild-type P2X7A subunits. In this study, the application of the deep-learning method, AlphaFold2-Multimer (AF2M), for the generation of trimeric P2X7Rs was validated by comparing an AF2M-generated rat wild-type P2X7A receptor with a structure determined by cryogenic electron microscopy (cryo-EM) (Protein Data Bank Identification: 6U9V). The results suggested that AF2M could, first, accurately predict the structures of P2X7Rs and, second, accurately identify the highest quality model through the ranking system. Subsequently, AF2M was used to generate models of heterotrimeric alternatively spliced P2X7Rs consisting of one or two wild-type P2X7A subunits in combination with one or two P2X7B, P2X7E, P2X7J, and P2X7L splice variant subunits. The top-ranking models were deemed valid based on AF2M's confidence measures, stability in molecular dynamics simulations, and consistent flexibility of the conserved regions between the models. The structures of the heterotrimeric receptors, which were missing key residues in the ATP binding sites and carboxyl terminal domains (CTDs) compared to the wild-type receptor, help to explain their observed functions. Overall, the models produced in this study (available as supplementary material) unlock the possibility of structure-based studies into the heterotrimeric P2X7Rs.
Affiliation(s)
- Sophie K F De Salis
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
| | - Jake Zheng Chen
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
| | - Kristen K Skarratt
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
| | - Stephen J Fuller
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
| | - Thomas Balle
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
| |
|
89
|
Huang GJ, Parry TK, McLaughlin WA. Assessment of the Performances of the Protein Modeling Techniques Participating in CASP15 Using a Structure-Based Functional Site Prediction Approach: ResiRole. Bioengineering (Basel) 2023; 10:1377. [PMID: 38135968 PMCID: PMC10740689 DOI: 10.3390/bioengineering10121377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 12/24/2023] Open
Abstract
BACKGROUND: Model quality assessments via computational methods which entail comparisons of the modeled structures to the experimentally determined structures are essential in the field of protein structure prediction. The assessments provide means to benchmark the accuracies of the modeling techniques and to aid with their development. We previously described the ResiRole method to gauge model quality principally based on the preservation of the structural characteristics described in SeqFEATURE functional site prediction models.
METHODS: We apply ResiRole to benchmark modeling group performances in the Critical Assessment of Structure Prediction experiment, round 15. To gauge model quality, a normalized Predicted Functional site Similarity Score (PFSS) was calculated as the average of one minus the absolute values of the differences of the functional site prediction probabilities, as found for the experimental structures versus those found at the corresponding sites in the structure models.
RESULTS: The average PFSS per modeling group (gPFSS) correlates with standard quality metrics, and can effectively be used to rank the accuracies of the groups. For the free modeling (FM) category, correlation coefficients of the Local Distance Difference Test (LDDT) and Global Distance Test-Total Score (GDT-TS) metrics with gPFSS were 0.98239 and 0.87691, respectively. An example finding for a specific group is that the gPFSS for EMBER3D was higher than expected based on the predictive relationship between gPFSS and LDDT. We infer the result is due to the use of constraints imprinted by function that are a part of the EMBER3D methodology. Also, we find functional site predictions that may guide further functional characterizations of the respective proteins.
CONCLUSION: The gPFSS metric provides an effective means to assess and rank the performances of the structure prediction techniques according to their abilities to accurately recount the structural features at predicted functional sites.
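The PFSS definition given in the abstract (the average of one minus the absolute difference between functional site prediction probabilities for the experimental structure and the model) can be written out as a short sketch. The function names and the plain-list inputs are assumptions for illustration; this is not the ResiRole code, and any site selection or normalization steps beyond what the abstract states are not reproduced here.

```python
def pfss(p_experimental, p_model):
    """Average of 1 - |p_exp - p_model| over corresponding functional sites.

    Both arguments are equal-length sequences of prediction probabilities
    in [0, 1]; the pairing of sites is assumed to be done by the caller.
    """
    if len(p_experimental) != len(p_model) or len(p_experimental) == 0:
        raise ValueError("need two non-empty, equal-length probability lists")
    total = sum(1.0 - abs(a - b) for a, b in zip(p_experimental, p_model))
    return total / len(p_experimental)

def group_pfss(per_model_scores):
    """Average PFSS over the models submitted by one group (gPFSS)."""
    if not per_model_scores:
        raise ValueError("need at least one PFSS value")
    return sum(per_model_scores) / len(per_model_scores)

# Hypothetical probabilities for two sites in one target.
score = pfss([0.8, 0.4], [0.6, 0.4])
```

A score of 1 means the model reproduces the functional site probabilities of the experimental structure exactly, and 0 means they are maximally different at every site.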
Affiliation(s)
| | | | - William A. McLaughlin
- Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA 18509, USA (T.K.P.)
| |
|
90
|
Harmalkar A, Lyskov S, Gray JJ. Reliable protein-protein docking with AlphaFold, Rosetta and replica-exchange. bioRxiv 2023:2023.07.28.551063. [PMID: 37546760 PMCID: PMC10402144 DOI: 10.1101/2023.07.28.551063] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/08/2023]
Abstract
Despite the recent breakthrough of AlphaFold (AF) in the field of protein sequence-to-structure prediction, modeling protein interfaces and predicting protein complex structures remains challenging, especially when there is a significant conformational change in one or both binding partners. Prior studies have demonstrated that AF-multimer (AFm) can predict accurate protein complexes in only up to 43% of cases. In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm. Using a curated collection of 254 available protein targets with both unbound and bound structures, we first demonstrate that AlphaFold confidence measures (pLDDT) can be repurposed for estimating protein flexibility and docking accuracy for multimers. We incorporate these metrics within our ReplicaDock 2.0 protocol to complete a robust in-silico pipeline for accurate protein complex structure prediction. AlphaRED (AlphaFold-initiated Replica Exchange Docking) successfully docks failed AF predictions including 97 failure cases in Docking Benchmark Set 5.5. AlphaRED generates CAPRI acceptable-quality or better predictions for 66% of benchmark targets. Further, on a subset of antigen-antibody targets, which is challenging for AFm (19% success rate), AlphaRED demonstrates a success rate of 51%. This new strategy demonstrates the success possible by integrating deep-learning based architectures trained on evolutionary information with physics-based enhanced sampling. The pipeline is available at github.com/Graylab/AlphaRED.
|
91
|
McBride JM, Polev K, Abdirasulov A, Reinharz V, Grzybowski BA, Tlusty T. AlphaFold2 Can Predict Single-Mutation Effects. Phys Rev Lett 2023; 131:218401. [PMID: 38072605 DOI: 10.1103/physrevlett.131.218401] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Accepted: 09/26/2023] [Indexed: 12/18/2023]
Abstract
AlphaFold2 (AF) is a promising tool, but is it accurate enough to predict single mutation effects? Here, we report that the localized structural deformation between protein pairs differing by only 1-3 mutations, as measured by the effective strain, is correlated across 3901 experimental and AF-predicted structures. Furthermore, analysis of ~11,000 proteins shows that the local structural change correlates with various phenotypic changes. These findings suggest that AF can predict the range and magnitude of single-mutation effects on average, and we propose a method to improve precision of AF predictions and to indicate when predictions are unreliable.
Affiliation(s)
- John M McBride
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
| | - Konstantin Polev
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | - Amirbek Abdirasulov
- Department of Computer Science and Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | | | - Bartosz A Grzybowski
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | - Tsvi Tlusty
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| |
|
92
|
Vassiliev P, Gusev E, Komelkova M, Kochetkov A, Dobrynina M, Sarapultsev A. Computational Analysis of CD46 Protein Interaction with SARS-CoV-2 Structural Proteins: Elucidating a Putative Viral Entry Mechanism into Human Cells. Viruses 2023; 15:2297. [PMID: 38140538 PMCID: PMC10747966 DOI: 10.3390/v15122297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 11/20/2023] [Accepted: 11/22/2023] [Indexed: 12/24/2023] Open
Abstract
This study examines an unexplored aspect of SARS-CoV-2 entry into host cells, which is widely understood to occur via the viral spike (S) protein's interaction with human ACE2-associated proteins. While vaccines and inhibitors targeting this mechanism are in use, they may not offer complete protection against reinfection. Hence, we investigate putative receptors and their cofactors. Specifically, we propose CD46, a human membrane cofactor protein, as a putative receptor and explore its role in cellular invasion, acting possibly as a cofactor with other viral structural proteins. Employing computational techniques, we created full-size 3D models of human CD46 and four key SARS-CoV-2 structural proteins: EP, MP, NP, and SP. We further developed 3D models of CD46 complexes interacting with these proteins. The primary aim is to pinpoint the likely interaction domains between CD46 and these structural proteins to facilitate the identification of molecules that can block these interactions, thus offering a foundation for novel pharmacological treatments for SARS-CoV-2 infection.
Affiliation(s)
- Pavel Vassiliev
- Laboratory for Information Technology in Pharmacology and Computer Modeling of Drugs, Research Center for Innovative Medicines, Volgograd State Medical University, 39 Novorossiyskaya Street, Volgograd 400087, Russia
| | - Evgenii Gusev
- Institute of Immunology and Physiology, Ural Branch of the Russian Academy of Science, 106 Pervomaiskaya Street, Yekaterinburg 620049, Russia
- Russian-Chinese Education and Research Center of System Pathology, South Ural State University, 76 Lenin Prospekt, Chelyabinsk 454080, Russia
| | - Maria Komelkova
- Russian-Chinese Education and Research Center of System Pathology, South Ural State University, 76 Lenin Prospekt, Chelyabinsk 454080, Russia
| | - Andrey Kochetkov
- Laboratory for Information Technology in Pharmacology and Computer Modeling of Drugs, Research Center for Innovative Medicines, Volgograd State Medical University, 39 Novorossiyskaya Street, Volgograd 400087, Russia
| | - Maria Dobrynina
- Institute of Immunology and Physiology, Ural Branch of the Russian Academy of Science, 106 Pervomaiskaya Street, Yekaterinburg 620049, Russia
| | - Alexey Sarapultsev
- Institute of Immunology and Physiology, Ural Branch of the Russian Academy of Science, 106 Pervomaiskaya Street, Yekaterinburg 620049, Russia
- Russian-Chinese Education and Research Center of System Pathology, South Ural State University, 76 Lenin Prospekt, Chelyabinsk 454080, Russia
| |
|
93
|
Peslalz P, Kraus F, Izzo F, Bleisch A, El Hamdaoui Y, Schulz I, Kany AM, Hirsch AKH, Friedland K, Plietker B. Selective Activation of a TRPC6 Ion Channel Over TRPC3 by Metalated Type-B Polycyclic Polyprenylated Acylphloroglucinols. J Med Chem 2023; 66:15061-15072. [PMID: 37922400 DOI: 10.1021/acs.jmedchem.3c01170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2023]
Abstract
Selective modulation of TRPC6 ion channels is a promising therapeutic approach for neurodegenerative diseases and depression. A significant advancement showcases the selective activation of TRPC6 through metalated type-B PPAP, termed PPAP53. This success stems from PPAP53's 1,3-diketone motif facilitating metal coordination. PPAP53 is water-soluble and as potent as hyperforin, the gold standard in this field. In contrast to type-A, type-B PPAPs offer advantages such as gram-scale synthesis, easy derivatization, and long-term stability. Our investigations reveal PPAP53 selectively binding to the C-terminus of TRPC6. Although cryoelectron microscopy has resolved the majority of the TRPC6 structure, the binding site in the C-terminus remained unresolved. To address this issue, we employed state-of-the-art artificial-intelligence-based protein structure prediction algorithms to predict the missing region. Our computational results, validated against experimental data, indicate that PPAP53 binds to the 777LLKL780-region of the C-terminus, thus providing critical insights into the binding mechanism of PPAP53.
Affiliation(s)
- Philipp Peslalz
- Chair of Organic Chemistry, Faculty of Chemistry and Food Chemistry, Technical University Dresden, Bergstr. 66, Dresden 01069, Germany
| | - Frank Kraus
- Institut für Organische Chemie, Universität Stuttgart, Pfaffenwaldring 55, Stuttgart 70569, Germany
| | - Flavia Izzo
- Institut für Organische Chemie, Universität Stuttgart, Pfaffenwaldring 55, Stuttgart 70569, Germany
| | - Anton Bleisch
- Chair of Organic Chemistry, Faculty of Chemistry and Food Chemistry, Technical University Dresden, Bergstr. 66, Dresden 01069, Germany
| | - Yamina El Hamdaoui
- Institut für Biomedizinische und Pharmazeutische Wissenschaften, Johannes Gutenberg-Universität Mainz, Mainz 55128, Germany
| | - Ina Schulz
- Institut für Biomedizinische und Pharmazeutische Wissenschaften, Johannes Gutenberg-Universität Mainz, Mainz 55128, Germany
| | - Andreas M Kany
- Helmholtz Institute for Pharm. Research Saarland (HIPS)-Helmholtz Centre for Infection Research (HZI), Saarbrücken 66123, Germany
| | - Anna K H Hirsch
- Helmholtz Institute for Pharm. Research Saarland (HIPS)-Helmholtz Centre for Infection Research (HZI), Saarbrücken 66123, Germany
- Department of Pharmacy, Saarland University, Saarbrücken 66123, Germany
| | - Kristina Friedland
- Institut für Biomedizinische und Pharmazeutische Wissenschaften, Johannes Gutenberg-Universität Mainz, Mainz 55128, Germany
| | - Bernd Plietker
- Chair of Organic Chemistry, Faculty of Chemistry and Food Chemistry, Technical University Dresden, Bergstr. 66, Dresden 01069, Germany
- Institut für Organische Chemie, Universität Stuttgart, Pfaffenwaldring 55, Stuttgart 70569, Germany
| |
|
94
|
Zhou X, Chen G, Ye J, Wang E, Zhang J, Mao C, Li Z, Hao J, Huang X, Tang J, Heng PA. ProRefiner: an entropy-based refining strategy for inverse protein folding with global graph attention. Nat Commun 2023; 14:7434. [PMID: 37973874 PMCID: PMC10654420 DOI: 10.1038/s41467-023-43166-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023] Open
Abstract
Inverse Protein Folding (IPF) is an important task of protein design, which aims to design sequences compatible with a given backbone structure. Despite the prosperous development of algorithms for this task, existing methods tend to rely on noisy predicted residues located in the local neighborhood when generating sequences. To address this limitation, we propose an entropy-based residue selection method to remove noise in the input residue context. Additionally, we introduce ProRefiner, a memory-efficient global graph attention model to fully utilize the denoised context. Our proposed method achieves state-of-the-art performance on multiple sequence design benchmarks in different design settings. Furthermore, we demonstrate the applicability of ProRefiner in redesigning Transposon-associated transposase B, where six out of the 20 variants we propose exhibit improved gene editing activity.
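One plausible reading of the entropy-based residue selection mentioned in the abstract above is to keep only positions whose predicted amino-acid distribution has low Shannon entropy. The sketch below illustrates that idea only; the function names, the 20-column probability matrix, and the threshold value are assumptions and are not taken from ProRefiner.

```python
import numpy as np

def residue_entropy(probs):
    """Shannon entropy (in nats) of each row of an (L, 20) probability matrix."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0) for zero-probability entries
    return -(p * np.log(p)).sum(axis=-1)

def select_confident_residues(probs, max_entropy):
    """Boolean mask marking positions whose prediction entropy is at most max_entropy."""
    return residue_entropy(probs) <= max_entropy

# Hypothetical predictions for two positions: one confident, one uninformative.
confident = np.zeros(20)
confident[3] = 1.0
uniform = np.full(20, 1.0 / 20)
probs = np.stack([confident, uniform])

mask = select_confident_residues(probs, max_entropy=1.0)
```

Positions that fail the mask would be treated as noisy context and hidden from the downstream model, while the confident positions are kept as input.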
Affiliation(s)
- Xinyi Zhou
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Central Ave, Hong Kong, China
| | | | - Junjie Ye
- Noah's Ark Lab, Huawei, Shenzhen, China
| | - Ercheng Wang
- Zhejiang Lab, Kechuang Avenue, Hangzhou, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
| | - Jun Zhang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, China
| | - Cong Mao
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, China
| | - Zhanwei Li
- Zhejiang Lab, Kechuang Avenue, Hangzhou, China
| | | | | | - Jin Tang
- Zhejiang Lab, Kechuang Avenue, Hangzhou, China
| | - Pheng Ann Heng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Central Ave, Hong Kong, China
- Zhejiang Lab, Kechuang Avenue, Hangzhou, China
| |
|
95
|
Park J, Joung I, Joo K, Lee J. Application of conformational space annealing to the protein structure modeling using cryo-EM maps. J Comput Chem 2023; 44:2332-2346. [PMID: 37585026 DOI: 10.1002/jcc.27200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 04/26/2023] [Accepted: 07/16/2023] [Indexed: 08/17/2023]
Abstract
Conformational space annealing (CSA), a global optimization method, has been applied to various protein structure modeling tasks. In this paper, we applied CSA to the cryo-EM structure modeling task by combining the Python subroutine of CSA (PyCSA) and the fast relax (FastRelax) protocol of PyRosetta. Refinement of initial structures generated from two methods, rigid fitting of predicted structures to the cryo-EM map and de novo protein modeling by tracing the cryo-EM map, was performed by CSA. In the refinement of the rigid-fitted structures, the final models showed that CSA can generate reliable atomic structures of proteins, even when large movements of protein domains were required. In the de novo modeling case, although the overall structural qualities of the final models were rather dependent on the initial models, the final models generated by CSA showed improved MolProbity scores and cross-correlation coefficients to the maps. These results suggest that CSA can accomplish flexible fitting and refinement together by sampling diverse conformations effectively and thus can be utilized for cryo-EM structure modeling.
Affiliation(s)
| | | | - Keehyoung Joo
- Center for Advanced Computations, Korea Institute for Advanced Study, Seoul, South Korea
| | - Jooyoung Lee
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, South Korea
| |
|
96
|
Hon-Nami K, Hijikata A, Yura K, Bessho Y. Whole genome analyses for c-type cytochromes associated with respiratory chains in the extreme thermophile, Thermus thermophilus. J Gen Appl Microbiol 2023; 69:68-78. [PMID: 37394433 DOI: 10.2323/jgam.2023.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
In thermophilic microorganisms, c-type cytochrome (cyt) proteins mainly function in the respiratory chain as electron carriers. Genome analyses at the beginning of this century revealed a variety of genes harboring the heme c motif. Here, we describe the results of surveying genes with the heme c motif, CxxCH, in a genome database comprising four strains of Thermus thermophilus, including strain HB8, and the confirmation of 19 c-type cytochromes among 27 selected genes. We analyzed the 19 genes, including the expression of four, by a bioinformatics approach to elucidate their individual attributes. One of the approaches included an analysis based on the secondary structure alignment pattern between the heme c motif and the 6th ligand. The predicted structures revealed many cyt c domains with fewer β-strands, such as mitochondrial cyt c, in addition to the β-strand unique to Thermus inserted in cyt c domains, as in T. thermophilus cyt c552 and caa3 cyt c oxidase subunit IIc. The surveyed thermophiles harbor potential proteins with a variety of cyt c folds. The gene analyses led to the development of an index for the classification of cyt c domains. Based on these results, we propose names for T. thermophilus genes harboring the cyt c fold.
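Surveying sequences for the heme c motif CxxCH, as described in the abstract above, amounts to a pattern scan in which each x matches any residue. The following is a minimal sketch of such a scan; the function name and the example sequence are hypothetical, and the authors' actual survey pipeline is not described in enough detail here to reproduce.

```python
import re

# "C..CH" is the CxxCH motif with x as any residue; the lookahead lets
# overlapping occurrences be reported as separate hits.
HEME_C_MOTIF = re.compile(r"(?=(C..CH))")

def find_heme_c_motifs(sequence):
    """Return (0-based start, matched residues) for every CxxCH occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in HEME_C_MOTIF.finditer(seq)]

# Hypothetical protein fragment containing two motif occurrences.
hits = find_heme_c_motifs("MKCAACHGGCQQCHA")
```

A motif hit only flags a candidate; as the abstract notes, 19 of the 27 selected genes were confirmed as c-type cytochromes, so pattern matches still need further validation.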
Affiliation(s)
| | - Atsushi Hijikata
- School of Life Sciences, Tokyo University of Pharmacy and Life Sciences
| | - Kei Yura
- Graduate School of Humanities and Sciences, Ochanomizu University
- Center for Interdisciplinary AI and Data Science, Ochanomizu University
- Graduate School of Advanced Science and Engineering, Waseda University
| | - Yoshitaka Bessho
- Center for Interdisciplinary AI and Data Science, Ochanomizu University
- RIKEN SPring-8 Center, Harima Institute
| |
|
97
|
Falkenberg F, Kohn S, Bott M, Bongaerts J, Siegert P. Biochemical characterisation of a novel broad pH spectrum subtilisin from Fictibacillus arsenicus DSM 15822T. FEBS Open Bio 2023; 13:2035-2046. [PMID: 37649135 PMCID: PMC10626276 DOI: 10.1002/2211-5463.13701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 08/23/2023] [Accepted: 08/29/2023] [Indexed: 09/01/2023] Open
Abstract
Subtilisins from microbial sources, especially from the Bacillaceae family, are of particular interest for biotechnological applications and serve the currently growing enzyme market as efficient and novel biocatalysts. Biotechnological applications include use in detergents, cosmetics, leather processing, wastewater treatment and pharmaceuticals. To identify a possible candidate for the enzyme market, here we cloned the gene of the subtilisin SPFA from Fictibacillus arsenicus DSM 15822T (obtained through a data mining-based search) and expressed it in Bacillus subtilis DB104. After production and purification, the protease showed a molecular mass of 27.57 kDa and a pI of 5.8. SPFA displayed hydrolytic activity at a temperature optimum of 80 °C and a very broad pH optimum between 8.5 and 11.5, with high activity up to pH 12.5. SPFA displayed no NaCl dependence but a high NaCl tolerance, with decreasing activity up to concentrations of 5 M NaCl. Stability increased with increasing NaCl concentration. Based on its substrate preference for 10 synthetic peptide 4-nitroanilide substrates with three or four amino acids and its phylogenetic classification, SPFA can be assigned to the subgroup of true subtilisins. Moreover, SPFA exhibited high tolerance to 5% (w/v) SDS and 5% (v/v) H2O2. The biochemical properties of SPFA, especially its tolerance of remarkably high pH, SDS and H2O2, suggest it has potential for biotechnological applications.
Affiliation(s)
- Fabian Falkenberg
- Institute of Nano‐ and Biotechnologies, Aachen University of Applied Sciences, Jülich, Germany
| | - Sophie Kohn
- Institute of Nano‐ and Biotechnologies, Aachen University of Applied Sciences, Jülich, Germany
| | - Michael Bott
- Institute of Bio‐ and Geosciences, IBG‐1: Biotechnology, Forschungszentrum Jülich, Germany
| | - Johannes Bongaerts
- Institute of Nano‐ and Biotechnologies, Aachen University of Applied Sciences, Jülich, Germany
| | - Petra Siegert
- Institute of Nano‐ and Biotechnologies, Aachen University of Applied Sciences, Jülich, Germany
| |
|
98
|
Bale A, Rambo R, Prior C. The SKMT Algorithm: A method for assessing and comparing underlying protein entanglement. PLoS Comput Biol 2023; 19:e1011248. [PMID: 38011290 PMCID: PMC10703313 DOI: 10.1371/journal.pcbi.1011248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 12/07/2023] [Accepted: 11/06/2023] [Indexed: 11/29/2023] Open
Abstract
We present fast and simple-to-implement measures of the entanglement of protein tertiary structures which are appropriate for highly flexible structure comparison. These are performed using the SKMT algorithm, a novel method of smoothing the Cα backbone to achieve a minimal complexity curve representation of the manner in which the protein's secondary structure elements fold to form its tertiary structure. Its subsequent complexity is characterised using measures based on the writhe and crossing number quantities heavily utilised in DNA topology studies, and which have recently shown promising results when applied to proteins. The SKMT smoothing is used to derive empirical bounds on a protein's entanglement relative to its number of secondary structure elements. We show that large-scale helical geometries dominantly account for the maximum growth in entanglement of protein monomers, and further that this large-scale helical geometry is present in a large array of proteins, consistent across a number of different protein structure types and sequences. We also show how these bounds can be used to constrain the search space of protein structure prediction from small-angle X-ray scattering experiments, a method highly suited to determining the likely structure of proteins in solution where crystal structure or machine learning-based predictions often fail to match experimental data. Finally, we develop a structural comparison metric based on the SKMT smoothing which is used in one specific case to demonstrate significant structural similarity between Rossmann fold and TIM Barrel proteins, a link which is potentially significant as attempts to engineer the latter have in the past produced the former. We provide the SWRITHE interactive Python notebook to calculate these metrics.
Affiliation(s)
- Arron Bale
- Department of Mathematical Sciences, Durham University, Durham, United Kingdom
| | - Robert Rambo
- Diamond Light Source, Harwell Science and Innovation Campus, Didcot, United Kingdom
| | - Christopher Prior
- Department of Mathematical Sciences, Durham University, Durham, United Kingdom
| |
|
99
|
Madaloz TZ, Dos Santos K, Zacchi FL, Bainy ACD, Razzera G. Nuclear receptor superfamily structural diversity in pacific oyster: In silico identification of estradiol binding candidates. Chemosphere 2023; 340:139877. [PMID: 37619748 DOI: 10.1016/j.chemosphere.2023.139877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 07/21/2023] [Accepted: 08/17/2023] [Indexed: 08/26/2023]
Abstract
The increasing presence of anthropogenic contaminants in aquatic environments poses challenges for species inhabiting contaminated sites. Due to their structural binding characteristics to ligands that inhibit or activate gene transcription, these xenobiotic compounds frequently target the nuclear receptor superfamily. The present work aims to understand the potential interaction between the hormone 17-β-estradiol, an environmental contaminant, and the nuclear receptors of Crassostrea gigas, the Pacific oyster. This filter-feeding, sessile oyster species is subject to environmental changes and exposure to contaminants. In the Pacific oyster, the estrogen-binding nuclear receptor is not able to bind this hormone as it does in vertebrates. However, another receptor may exhibit responsiveness to estrogen-like molecules and derivatives. We employed high-performance in silico methodologies, including three-dimensional modeling, molecular docking and atomistic molecular dynamics, to identify likely binding candidates with the target molecule. Our approach revealed that among the C. gigas nuclear receptor superfamily, candidates with the most favorable interaction with the molecule of interest belonged to the NR1D, NR1H, NR1P, NR2E, NHR42, and NR0B groups. Interestingly, NR1H and NR0B were associated with planktonic/larval life cycle stages, while NR1P, NR2E, and NR0B were associated with sessile/adult life stages. The application of this computational methodological strategy demonstrated high performance in the virtual screening of candidates for binding with the target xenobiotic molecule and can be employed in other studies in the field of ecotoxicology in non-model organisms.
Affiliation(s)
- Tâmela Zamboni Madaloz
- Programa de Pós-Graduação Em Bioquímica, Departamento de Bioquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil; Laboratório de Biomarcadores de Contaminação Aquática e Imunoquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil
| | - Karin Dos Santos
- Programa de Pós-Graduação Em Bioquímica, Departamento de Bioquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil; Laboratório de Biomarcadores de Contaminação Aquática e Imunoquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil
| | - Flávia Lucena Zacchi
- Laboratório de Moluscos Marinhos, Universidade Federal de Santa Catarina, Florianópolis, SC, 88061-600, Brazil
| | - Afonso Celso Dias Bainy
- Programa de Pós-Graduação Em Bioquímica, Departamento de Bioquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil; Laboratório de Biomarcadores de Contaminação Aquática e Imunoquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil
| | - Guilherme Razzera
- Programa de Pós-Graduação Em Bioquímica, Departamento de Bioquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil; Laboratório de Biomarcadores de Contaminação Aquática e Imunoquímica, Universidade Federal de Santa Catarina, Florianópolis, SC, 88040-900, Brazil.
| |
|
100
|
Malbranke C, Rostain W, Depardieu F, Cocco S, Monasson R, Bikard D. Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment. PLoS Comput Biol 2023; 19:e1011621. [PMID: 37976326 PMCID: PMC10729993 DOI: 10.1371/journal.pcbi.1011621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 12/19/2023] [Accepted: 10/19/2023] [Indexed: 11/19/2023] Open
Abstract
We present here an approach to protein design that combines (i) scarce functional information such as experimental data, (ii) evolutionary information learned from natural sequence variants, and (iii) physics-grounded modeling. Using a Restricted Boltzmann Machine (RBM), we learn a sequence model of a protein family. We use semi-supervision to leverage available functional information during the RBM training. We then propose a strategy to explore the protein representation space that can be informed by external models such as an empirical force-field method (FoldX). Our approach is applied to a domain of the Cas9 protein responsible for recognition of a short DNA motif. We experimentally assess the functionality of 71 variants generated to explore a range of RBM and FoldX energies. Sequences with as many as 50 differences (20% of the protein domain) to the wild-type retained functionality. Overall, 21/71 sequences designed with our method were functional. Interestingly, 6/71 sequences showed an improved activity in comparison with the original wild-type protein sequence. These results demonstrate the interest in further exploring the synergies between machine learning of protein sequence representations and physics-grounded modeling strategies informed by structural information.
Affiliation(s)
- Cyril Malbranke
- Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Paris, France
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, Paris, France
| | - William Rostain
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, Paris, France
| | - Florence Depardieu
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, Paris, France
| | - Simona Cocco
- Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Paris, France
| | - Rémi Monasson
- Laboratory of Physics of the Ecole Normale Superieure, PSL Research, CNRS UMR 8023, Sorbonne Université, Paris, France
| | - David Bikard
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Synthetic Biology, Paris, France
| |
|