1
Mirabello C, Wallner B, Nystedt B, Azinas S, Carroni M. Unmasking AlphaFold to integrate experiments and predictions in multimeric complexes. Nat Commun 2024; 15:8724. [PMID: 39379372 PMCID: PMC11461844 DOI: 10.1038/s41467-024-52951-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 09/26/2024] [Indexed: 10/10/2024] Open
Abstract
Since the release of AlphaFold, researchers have actively refined its predictions and attempted to integrate it into existing pipelines for determining protein structures. These efforts have introduced a number of functionalities and optimisations at the latest Critical Assessment of protein Structure Prediction edition (CASP15), resulting in a marked improvement in the prediction of multimeric protein structures. However, AlphaFold's capability of predicting large protein complexes is still limited, and integrating experimental data in the prediction pipeline is not straightforward. In this study, we introduce AF_unmasked to overcome these limitations. Our results demonstrate that AF_unmasked can integrate experimental information to build larger or hard-to-predict protein assemblies with high confidence. The resulting predictions can help interpret and augment experimental data. This approach generates high-quality (DockQ score > 0.8) structures even when little to no evolutionary information is available and imperfect experimental structures are used as a starting point. AF_unmasked is developed and optimised to fill incomplete experimental structures (structural inpainting), which may provide insights into protein dynamics. In summary, AF_unmasked provides an easy-to-use method that efficiently integrates experiments to predict large protein complexes more confidently.
Affiliation(s)
- Claudio Mirabello
  - Dept of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Linköping University, 581 83, Linköping, Sweden.
- Björn Wallner
  - Dept of Physics, Chemistry and Biology, Linköping University, 581 83, Linköping, Sweden
- Björn Nystedt
  - Dept of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Husargatan 3, SE-752 37, Uppsala, Sweden
- Stavros Azinas
  - Dept of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Marta Carroni
  - Dept of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
2
Benavides TL, Montelione GT. Integrative Modeling of Protein-Polypeptide Complexes by Bayesian Model Selection using AlphaFold and NMR Chemical Shift Perturbation Data. bioRxiv 2024:2024.09.19.613999. [PMID: 39345459 PMCID: PMC11430059 DOI: 10.1101/2024.09.19.613999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Protein-polypeptide interactions, including those involving intrinsically disordered peptides and intrinsically disordered regions of protein binding partners, are crucial for many biological functions. However, experimental structure determination of protein-peptide complexes can be challenging. Computational methods, while promising, generally require experimental data for validation and refinement. Here we present CSP_Rank, an integrated modeling approach to determine the structures of protein-peptide complexes. This method combines AlphaFold2 (AF2) enhanced sampling methods with a Bayesian conformational selection process based on experimental Nuclear Magnetic Resonance (NMR) Chemical Shift Perturbation (CSP) data and AF2 confidence metrics. Using a curated dataset of 108 protein-peptide complexes from the Biological Magnetic Resonance Data Bank (BMRB), we observe that while AF2 typically yields models with excellent consistency with experimental CSP data, applying enhanced sampling followed by data-guided conformational selection routinely results in ensembles of structures with improved agreement with NMR observables. For two systems, we cross-validate the CSP-selected models using independently acquired nuclear Overhauser effect (NOE) NMR data and demonstrate how CSP and NOE data can be combined using our Bayesian framework for model selection. CSP_Rank is a novel method for integrative modeling of protein-peptide complexes and has broad implications for studies of protein-peptide interactions and for understanding their biological functions.
Affiliation(s)
- Tiburon L. Benavides
  - Department of Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
- Gaetano T. Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
3
Kovalevskiy O, Mateos-Garcia J, Tunyasuvunakool K. AlphaFold two years on: Validation and impact. Proc Natl Acad Sci U S A 2024; 121:e2315002121. [PMID: 39133843 PMCID: PMC11348012 DOI: 10.1073/pnas.2315002121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Two years on from the initial release of AlphaFold, we have seen its widespread adoption as a structure prediction tool. Here, we discuss some of the latest work based on AlphaFold, with a particular focus on its use within the structural biology community. This encompasses use cases like speeding up structure determination itself, enabling new computational studies, and building new tools and workflows. We also look at the ongoing validation of AlphaFold, as its predictions continue to be compared against large numbers of experimental structures to further delineate the model's capabilities and limitations.
4
Shor B, Schneidman-Duhovny D. Integrative modeling meets deep learning: Recent advances in modeling protein assemblies. Curr Opin Struct Biol 2024; 87:102841. [PMID: 38795564 DOI: 10.1016/j.sbi.2024.102841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/24/2024] [Accepted: 04/27/2024] [Indexed: 05/28/2024]
Abstract
Recent progress in protein structure prediction based on deep learning revolutionized the field of Structural Biology. Beyond single proteins, it also enabled high-throughput prediction of structures of protein-protein interactions. Despite the success in predicting complex structures, large macromolecular assemblies still require specialized approaches. Here we describe recent advances in modeling macromolecular assemblies using integrative and hierarchical approaches. We highlight applications that predict protein-protein interactions and challenges in modeling complexes based on the interaction networks, including the prediction of complex stoichiometry and heterogeneity.
Affiliation(s)
- Ben Shor
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel. https://twitter.com/ben_shor
- Dina Schneidman-Duhovny
  - The Rachel and Selim Benin School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
5
Swapna GVT, Dube N, Roth MJ, Montelione GT. Modeling Alternative Conformational States of Pseudo-Symmetric Solute Carrier Transporters using Methods from Machine Learning. bioRxiv 2024:2024.07.15.603529. [PMID: 39071413 PMCID: PMC11275918 DOI: 10.1101/2024.07.15.603529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
The Solute Carrier (SLC) superfamily of integral membrane proteins functions to transport a wide array of solutes across the plasma and organelle membranes. SLC proteins also function as important drug transporters and as viral receptors. Despite being classified as a single superfamily, SLC proteins do not share a single common fold classification; however, most belong to multi-pass transmembrane helical protein fold families. SLC proteins populate different conformational states during the solute transport process, including outward open, intermediate (occluded), and inward open conformational states. For some SLC fold families this structural "flipping" corresponds to swapping between conformations of their N-terminal and C-terminal symmetry-related sub-structures. Conventional AlphaFold2 or Evolutionary Scale Modeling methods typically generate models for only one of these multiple conformational states of SLC proteins. Here we describe a fast and simple approach for modeling multiple conformational states of SLC proteins using a combined ESM - AF2 process. The resulting multi-state models are validated by comparison with sequence-based evolutionary co-variance data (ECs) that encode information about contacts present in the various conformational states adopted by the protein. We also explored the impact of mutations on conformational distributions of SLC proteins modeled by AlphaFold2 using both conventional and enhanced sampling methods. This approach for modeling conformational landscapes of pseudo-symmetric SLC proteins is demonstrated for several integral membrane protein transporters, including SLC35F2, the receptor of a feline leukemia virus envelope protein required for viral entry into eukaryotic cells.
Affiliation(s)
- G V T Swapna
  - Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
  - Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway NJ 08854 USA
- Namita Dube
  - Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Monica J Roth
  - Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway NJ 08854 USA
- Gaetano T Montelione
  - Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
6
Wallerstein J, Han X, Levkovets M, Lesovoy D, Malmodin D, Mirabello C, Wallner B, Sun R, Sandalova T, Agback P, Karlsson G, Achour A, Agback T, Orekhov V. Insights into mechanisms of MALT1 allostery from NMR and AlphaFold dynamic analyses. Commun Biol 2024; 7:868. [PMID: 39014105 PMCID: PMC11252132 DOI: 10.1038/s42003-024-06558-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/05/2024] [Indexed: 07/18/2024] Open
Abstract
Mucosa-associated lymphoid tissue lymphoma-translocation protein 1 (MALT1) is an attractive target for the development of modulatory compounds in the treatment of lymphoma and other cancers. While the three-dimensional structure of MALT1 has been previously determined through X-ray analysis, its dynamic behaviour in solution has remained unexplored. We present here dynamic analyses of the apo MALT1 form along with the E549A mutation. This investigation used NMR 15N relaxation and NOE measurements between side-chain methyl groups. Our findings confirm that MALT1 exists as a monomer in solution, and demonstrate that the domains display semi-independent movements in relation to each other. Our dynamic study, covering multiple time scales, along with the assessment of conformational populations by Molecular Dynamics simulations, AlphaFold modelling and PCA analysis, put the side chain of residue W580 in an inward position, shedding light on potential mechanisms underlying the allosteric regulation of this enzyme.
Affiliation(s)
- Johan Wallerstein
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden
- Xiao Han
  - Science for Life Laboratory, Department of Medicine, Solna, Karolinska Institute, SE-17165, Solna, Sweden
  - Division of Infectious Diseases, Karolinska University Hospital, SE‑171 76, Stockholm, Sweden
- Maria Levkovets
  - Swedish NMR Centre, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden
- Dmitry Lesovoy
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, 117997, Moscow, Russia
- Daniel Malmodin
  - Swedish NMR Centre, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden
- Claudio Mirabello
  - Dept of Physics, Chemistry and Biology, Linköping University, 581 83, Linköping, Sweden
  - National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Solna, Sweden
- Björn Wallner
  - National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Solna, Sweden
- Renhua Sun
  - Science for Life Laboratory, Department of Medicine, Solna, Karolinska Institute, SE-17165, Solna, Sweden
  - Division of Infectious Diseases, Karolinska University Hospital, SE‑171 76, Stockholm, Sweden
- Tatyana Sandalova
  - Science for Life Laboratory, Department of Medicine, Solna, Karolinska Institute, SE-17165, Solna, Sweden
  - Division of Infectious Diseases, Karolinska University Hospital, SE‑171 76, Stockholm, Sweden
- Peter Agback
  - Department of Molecular Sciences, Swedish University of Agricultural Sciences, PO Box 7015, SE-750 07, Uppsala, Sweden
- Göran Karlsson
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden
  - Swedish NMR Centre, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden
- Adnane Achour
  - Science for Life Laboratory, Department of Medicine, Solna, Karolinska Institute, SE-17165, Solna, Sweden
  - Division of Infectious Diseases, Karolinska University Hospital, SE‑171 76, Stockholm, Sweden
- Tatiana Agback
  - Department of Molecular Sciences, Swedish University of Agricultural Sciences, PO Box 7015, SE-750 07, Uppsala, Sweden.
- Vladislav Orekhov
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden.
  - Swedish NMR Centre, University of Gothenburg, Box 465, SE-40530, Gothenburg, Sweden.
7
Lupo U, Sgarbossa D, Bitbol AF. Pairing interacting protein sequences using masked language modeling. Proc Natl Acad Sci U S A 2024; 121:e2311887121. [PMID: 38913900 PMCID: PMC11228504 DOI: 10.1073/pnas.2311887121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2023] [Accepted: 12/18/2023] [Indexed: 06/26/2024] Open
Abstract
Predicting which proteins interact together from amino acid sequences is an important task. We develop a method to pair interacting protein sequences which leverages the power of protein language models trained on multiple sequence alignments (MSAs), such as MSA Transformer and the EvoFormer module of AlphaFold. We formulate the problem of pairing interacting partners among the paralogs of two protein families in a differentiable way. We introduce a method called Differentiable Pairing using Alignment-based Language Models (DiffPALM) that solves it by exploiting the ability of MSA Transformer to fill in masked amino acids in multiple sequence alignments using the surrounding context. MSA Transformer encodes coevolution between functionally or structurally coupled amino acids within protein chains. It also captures inter-chain coevolution, despite being trained on single-chain data. Relying on MSA Transformer without fine-tuning, DiffPALM outperforms existing coevolution-based pairing methods on difficult benchmarks of shallow multiple sequence alignments extracted from ubiquitous prokaryotic protein datasets. It also outperforms an alternative method based on a state-of-the-art protein language model trained on single sequences. Paired alignments of interacting protein sequences are a crucial ingredient of supervised deep learning methods to predict the three-dimensional structure of protein complexes. Starting from sequences paired by DiffPALM substantially improves the structure prediction of some eukaryotic protein complexes by AlphaFold-Multimer. It also achieves performance competitive with orthology-based pairing.
Affiliation(s)
- Umberto Lupo
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
- Damiano Sgarbossa
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
- Anne-Florence Bitbol
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland
8
McLean TC. LazyAF, a pipeline for accessible medium-scale in silico prediction of protein-protein interactions. Microbiology (Reading) 2024; 170:001473. [PMID: 38967642 PMCID: PMC11316561 DOI: 10.1099/mic.0.001473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 06/14/2024] [Indexed: 07/06/2024]
Abstract
Artificial intelligence has revolutionized the field of protein structure prediction. However, with more powerful and complex software being developed, it is accessibility and ease of use rather than capability that is quickly becoming a limiting factor to end users. LazyAF is a Google Colaboratory-based pipeline which integrates the existing ColabFold BATCH software to streamline the process of medium-scale protein-protein interaction prediction. LazyAF was used to predict the interactome of the 76 proteins encoded on the broad-host-range multi-drug resistance plasmid RK2, demonstrating the ease and accessibility the pipeline provides.
Affiliation(s)
- Thomas C. McLean
  - Department of Molecular Microbiology, John Innes Centre, Norwich, UK
9
Huang YJ, Montelione GT. Hidden Structural States of Proteins Revealed by Conformer Selection with AlphaFold-NMR. bioRxiv 2024:2024.06.26.600902. [PMID: 38979209 PMCID: PMC11230435 DOI: 10.1101/2024.06.26.600902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Recent advances in molecular modeling using deep learning can revolutionize our understanding of dynamic protein structures. NMR is particularly well-suited for determining dynamic features of biomolecular structures. The conventional process for determining biomolecular structures from experimental NMR data involves its representation as conformation-dependent restraints, followed by generation of structural models guided by these spatial restraints. Here we describe an alternative approach: generating a distribution of realistic protein conformational models using artificial intelligence (AI)-based methods and then selecting the sets of conformers that best explain the experimental data. We applied this conformational selection approach to redetermine the solution NMR structure of the enzyme Gaussia luciferase (Gluc). First, we generated a diverse set of conformer models using AlphaFold2 (AF2) with an enhanced sampling protocol. The models that best fit NOESY and chemical shift data were then selected with a Bayesian scoring metric. The resulting models include features of both the published NMR structure and the standard AF2 model generated without enhanced sampling. This "AlphaFold-NMR" protocol also generated an alternative "open" conformational state that fits nearly as well to the overall NMR data but accounts for some NOESY data that are not consistent with the first "closed" conformational state, while other NOESY data consistent with this second state are not consistent with the first conformational state. The structure of this "open" structural state differs from that of the "closed" state primarily by the position of a thumb-shaped loop between α-helices H5 and H6, revealing a cryptic surface pocket. These alternative conformational states of Gluc are supported by "double recall" analysis of NOESY data and AF2 models.
Additional structural states are also indicated by backbone chemical shift data indicating partially-disordered conformations for the C-terminal segment. Considered as a multistate ensemble, these multiple states of Gluc together fit the NOESY and chemical shift data better than the "restraint-based" NMR structure and provide novel insights into its structure-dynamic-function relationships. This study demonstrates the potential of AI-based modeling with enhanced sampling to generate conformational ensembles followed by conformer selection with experimental data as an alternative to conventional restraint satisfaction protocols for protein NMR structure determination.
Affiliation(s)
- Yuanpeng J. Huang
  - Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Gaetano T. Montelione
  - Dept of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
10
El Salamouni NS, Cater JH, Spenkelink LM, Yu H. Nanobody engineering: computational modelling and design for biomedical and therapeutic applications. FEBS Open Bio 2024. [PMID: 38898362 DOI: 10.1002/2211-5463.13850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 05/25/2024] [Accepted: 06/10/2024] [Indexed: 06/21/2024] Open
Abstract
Nanobodies, the smallest functional antibody fragment derived from camelid heavy-chain-only antibodies, have emerged as powerful tools for diverse biomedical applications. In this comprehensive review, we discuss the structural characteristics, functional properties, and computational approaches driving the design and optimisation of synthetic nanobodies. We explore their unique antigen-binding domains, highlighting the critical role of complementarity-determining regions in target recognition and specificity. This review further underscores the advantages of nanobodies over conventional antibodies from a biosynthesis perspective, including their small size, stability, and solubility, which make them ideal candidates for economical antigen capture in diagnostics, therapeutics, and biosensing. We discuss the recent advancements in computational methods for nanobody modelling, epitope prediction, and affinity maturation, shedding light on their intricate antigen-binding mechanisms and conformational dynamics. Finally, we examine a direct example of how computational design strategies were implemented for improving a nanobody-based immunosensor, known as a Quenchbody. Through combining experimental findings and computational insights, this review elucidates the transformative impact of nanobodies in biotechnology and biomedical research, offering a roadmap for future advancements and applications in healthcare and diagnostics.
Affiliation(s)
- Nehad S El Salamouni
  - Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Jordan H Cater
  - Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Lisanne M Spenkelink
  - Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
- Haibo Yu
  - Molecular Horizons and School of Chemistry and Molecular Bioscience, University of Wollongong, Australia
  - ARC Centre of Excellence in Quantum Biotechnology, University of Wollongong, Australia
11
Urvas L, Chiesa L, Bret G, Jacquemard C, Kellenberger E. Benchmarking AlphaFold-Generated Structures of Chemokine-Chemokine Receptor Complexes. J Chem Inf Model 2024; 64:4587-4600. [PMID: 38809680 DOI: 10.1021/acs.jcim.3c01835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
AlphaFold and AlphaFold-Multimer have become two essential tools for the modeling of unknown structures of proteins and protein complexes. In this work, we extensively benchmarked the quality of chemokine-chemokine receptor structures generated by AlphaFold-Multimer against experimentally determined structures. Our analysis considered both the global quality of the model, as well as key structural features for chemokine recognition. To study the effects of template and multiple sequence alignment parameters on the results, a new prediction pipeline called LIT-AlphaFold (https://github.com/LIT-CCM-lab/LIT-AlphaFold) was developed, allowing extensive input customization. AlphaFold-Multimer correctly predicted differences in chemokine binding orientation and accurately reproduced the unique binding orientation of the CXCL12-ACKR3 complex. Further, the predictions of the full receptor N-terminus provided insights into a putative chemokine recognition site 0.5. The accuracy of chemokine N-terminus binding mode prediction varied between complexes, but the confidence score permitted the distinguishing of residues that were very likely well positioned. Finally, we generated a high-confidence model of the unsolved CXCL12-CXCR4 complex, which agreed with experimental mutagenesis and cross-linking data.
Affiliation(s)
- Lauri Urvas
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Luca Chiesa
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Guillaume Bret
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Célien Jacquemard
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
- Esther Kellenberger
  - Laboratoire d'Innovation Thérapeutique, UMR 7200 CNRS, Université de Strasbourg, 67400 Illkirch, France
12
Hu C, Li X, Zhang M, Jing C, Hai M, Shen J, Xu Q, Dang X, Shi Y, Liu E, Jiang J. Identifying the Quantitative Trait Locus and Candidate Genes of Traits Related to Milling Quality in Rice via a Genome-Wide Association Study. Plants (Basel) 2024; 13:1324. [PMID: 38794395 PMCID: PMC11124788 DOI: 10.3390/plants13101324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 05/08/2024] [Accepted: 05/09/2024] [Indexed: 05/26/2024]
Abstract
Milling quality, which is closely related to the brown rice recovery (BRR), the milled rice recovery (MRR) and the head milled rice recovery (HMRR), directly affects production efficiency in rice. The present study investigated these three traits in 173 germplasms in two environments, finding abundant phenotypic variation. Three QTLs for BRR, two for MRR, and three for HMRR were identified in a genome-wide association study; five of these were located in previously reported QTLs and three were newly identified. By combining the linkage disequilibrium (LD) analyses, the candidate gene LOC_Os05g08350 was identified. It had two haplotypes with significant differences, and Hap 2 increased the BRR by 4.40%. The results of the qRT-PCR showed that the expression of LOC_Os05g08350 in small-BRR accessions was significantly higher than that in large-BRR accessions at Stages 4-5 of young panicle development, reaching the maximum value at Stage 5. The increase in thickness of the spikelet hulls of the accession carrying LOC_Os05g08350TT occurred due to an increase in the cell width and the cell numbers in cross-sections of spikelet hulls. These results help to further clarify the molecular genetic mechanism of milling-quality-related traits and provide genetic germplasm materials for high-quality breeding in rice.
Affiliation(s)
- Changmin Hu
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Xinru Li
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Mengyuan Zhang
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Chunyu Jing
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Mei Hai
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Jiaming Shen
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Qing Xu
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Xiaojing Dang
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
- Yingyao Shi
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
- Erbao Liu
  - College of Agronomy, Anhui Agricultural University, Hefei 230036, China
- Jianhua Jiang
  - Institute of Rice Research, Anhui Academy of Agricultural Sciences, Hefei 230031, China
13
Sonmez C, Toia B, Eickhoff P, Matei AM, El Beyrouthy M, Wallner B, Douglas ME, de Lange T, Lottersberger F. DNA-PK controls Apollo's access to leading-end telomeres. Nucleic Acids Res 2024; 52:4313-4327. [PMID: 38407308 PMCID: PMC11077071 DOI: 10.1093/nar/gkae105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 01/23/2024] [Accepted: 02/01/2024] [Indexed: 02/27/2024] Open
Abstract
The complex formed by Ku70/80 and DNA-PKcs (DNA-PK) promotes the synapsis and the joining of double strand breaks (DSBs) during canonical non-homologous end joining (c-NHEJ). In c-NHEJ during V(D)J recombination, DNA-PK promotes the processing of the ends and the opening of the DNA hairpins by recruiting and/or activating the nuclease Artemis/DCLRE1C/SNM1C. Paradoxically, DNA-PK is also required to prevent the fusions of newly replicated leading-end telomeres. Here, we describe the role for DNA-PK in controlling Apollo/DCLRE1B/SNM1B, the nuclease that resects leading-end telomeres. We show that the telomeric function of Apollo requires DNA-PKcs's kinase activity and the binding of Apollo to DNA-PK. Furthermore, AlphaFold-Multimer predicts that Apollo's nuclease domain has extensive additional interactions with DNA-PKcs, and comparison to the cryo-EM structure of Artemis bound to DNA-PK phosphorylated on the ABCDE/Thr2609 cluster suggests that DNA-PK can similarly grant Apollo access to the DNA end. In agreement, the telomeric function of DNA-PK requires the ABCDE/Thr2609 cluster. These data reveal that resection of leading-end telomeres is regulated by DNA-PK through its binding to Apollo and its (auto)phosphorylation-dependent positioning of Apollo at the DNA end, analogous but not identical to DNA-PK dependent regulation of Artemis at hairpins.
Affiliation(s)
- Ceylan Sonmez
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Beatrice Toia
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Patrik Eickhoff
- Chester Beatty Laboratories, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
- Andreea Medeea Matei
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Michael El Beyrouthy
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden
- Björn Wallner
- Department of Physics, Chemistry and Biology, Linköping University, Linköping 58 183, Sweden
- Max E Douglas
- Chester Beatty Laboratories, The Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK
- Titia de Lange
- Laboratory for Cell Biology and Genetics, The Rockefeller University, 1230 York Avenue, NY, NY 10021, USA
- Francisca Lottersberger
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping 58 183, Sweden

14
Affiliation(s)
- Sriram Subramaniam
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia, Canada.
- Gandeeva Therapeutics Inc., Burnaby, British Columbia, Canada.

15
Yin R, Pierce BG. Evaluation of AlphaFold antibody-antigen modeling with implications for improving predictive accuracy. Protein Sci 2024; 33:e4865. [PMID: 38073135 PMCID: PMC10751731 DOI: 10.1002/pro.4865] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/28/2023] [Revised: 12/01/2023] [Accepted: 12/07/2023] [Indexed: 12/26/2023]
Abstract
High resolution antibody-antigen structures provide critical insights into immune recognition and can inform therapeutic design. The challenges of experimental structural determination and the diversity of the immune repertoire underscore the necessity of accurate computational tools for modeling antibody-antigen complexes. Initial benchmarking showed that despite overall success in modeling protein-protein complexes, AlphaFold and AlphaFold-Multimer have limited success in modeling antibody-antigen interactions. In this study, we performed a thorough analysis of AlphaFold's antibody-antigen modeling performance on 427 nonredundant antibody-antigen complex structures, identifying useful confidence metrics for predicting model quality, and features of complexes associated with improved modeling success. Notably, we found that the latest version of AlphaFold improves near-native modeling success to over 30%, versus approximately 20% for a previous version, while increased AlphaFold sampling gives approximately 50% success. With this improved success, AlphaFold can generate accurate antibody-antigen models in many cases, while additional training or other optimization may further improve performance.
Affiliation(s)
- Rui Yin
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Brian G. Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, Maryland, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA

16
Jeppesen M, André I. Accurate prediction of protein assembly structure by combining AlphaFold and symmetrical docking. Nat Commun 2023; 14:8283. [PMID: 38092742 PMCID: PMC10719378 DOI: 10.1038/s41467-023-43681-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/08/2023] [Accepted: 11/16/2023] [Indexed: 12/17/2023] Open
Abstract
AlphaFold can predict the structures of monomeric and multimeric proteins with high accuracy but has a limit on the number of chains and residues it can fold. Here we show that a combination of AlphaFold and all-atom symmetric docking simulations enables highly accurate prediction of the structure of complex symmetrical assemblies. We present a method to predict the structure of complexes with cubic - tetrahedral, octahedral and icosahedral - symmetry from sequence. Focusing on proteins where AlphaFold can make confident predictions on the subunit structure, 27 cubic systems were assembled with a median TM-score of 0.99 and a DockQ score of 0.72. 21 had TM-scores of above 0.9 and were categorized as acceptable- to high-quality according to DockQ. The resulting models are energetically optimized and can be used for detailed studies of intermolecular interactions in higher-order symmetrical assemblies. The results demonstrate how explicit treatment of structural symmetry can significantly expand the size and complexity of AlphaFold predictions.
Affiliation(s)
- Mads Jeppesen
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden
- Ingemar André
- Department of Biochemistry and Structural Biology, Lund University, Lund, Sweden

17
Olechnovič K, Valančauskas L, Dapkūnas J, Venclovas Č. Prediction of protein assemblies by structure sampling followed by interface-focused scoring. Proteins 2023; 91:1724-1733. [PMID: 37578163 DOI: 10.1002/prot.26569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 07/12/2023] [Accepted: 07/31/2023] [Indexed: 08/15/2023]
Abstract
Proteins often function as part of permanent or transient multimeric complexes, and understanding function of these assemblies requires knowledge of their three-dimensional structures. While the ability of AlphaFold to predict structures of individual proteins with unprecedented accuracy has revolutionized structural biology, modeling structures of protein assemblies remains challenging. To address this challenge, we developed a protocol for predicting structures of protein complexes involving model sampling followed by scoring focused on the subunit-subunit interaction interface. In this protocol, we diversified AlphaFold models by varying construction and pairing of multiple sequence alignments as well as increasing the number of recycles. In cases when AlphaFold failed to assemble a full protein complex or produced unreliable results, additional diverse models were constructed by docking of monomers or subcomplexes. All the models were then scored using a newly developed method, VoroIF-jury, which relies only on structural information. Notably, VoroIF-jury is independent of AlphaFold self-assessment scores and therefore can be used to rank models originating from different structure prediction methods. We tested our protocol in CASP15 and obtained top results, significantly outperforming the standard AlphaFold-Multimer pipeline. Analysis of our results showed that the accuracy of our assembly models was capped mainly by structure sampling rather than model scoring. This observation suggests that better sampling, especially for the antibody-antigen complexes, may lead to further improvement. Our protocol is expected to be useful for modeling and/or scoring protein assemblies.
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lukas Valančauskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania

18
Leemann M, Sagasta A, Eberhardt J, Schwede T, Robin X, Durairaj J. Automated benchmarking of combined protein structure and ligand conformation prediction. Proteins 2023; 91:1912-1924. [PMID: 37885318 DOI: 10.1002/prot.26605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 09/15/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023]
Abstract
The prediction of protein-ligand complexes (PLC), using both experimental and predicted structures, is an active and important area of research, underscored by the inclusion of the Protein-Ligand Interaction category in the latest round of the Critical Assessment of Protein Structure Prediction experiment CASP15. The prediction task in CASP15 consisted of predicting both the three-dimensional structure of the receptor protein as well as the position and conformation of the ligand. This paper addresses the challenges and proposed solutions for devising automated benchmarking techniques for PLC prediction. The reliability of experimentally solved PLC as ground truth reference structures is assessed using various validation criteria. Similarity of PLC to previously released complexes are employed to judge PLC diversity and the difficulty of a PLC as a prediction target. We show that the commonly used PDBBind time-split test-set is inappropriate for comprehensive PLC evaluation, with state-of-the-art tools showing conflicting results on a more representative and high quality dataset constructed for benchmarking purposes. We also show that redocking on crystal structures is a much simpler task than docking into predicted protein models, demonstrated by the two PLC-prediction-specific scoring metrics created. Finally, we introduce a fully automated pipeline that predicts PLC and evaluates the accuracy of the protein structure, ligand pose, and protein-ligand interactions.
Affiliation(s)
- Michèle Leemann
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Ander Sagasta
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Jerome Eberhardt
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Xavier Robin
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland

19
Wallner B. AFsample: improving multimer prediction with AlphaFold using massive sampling. Bioinformatics 2023; 39:btad573. [PMID: 37713472 PMCID: PMC10534052 DOI: 10.1093/bioinformatics/btad573] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 03/13/2023] [Revised: 05/29/2023] [Accepted: 09/14/2023] [Indexed: 09/17/2023] Open
Abstract
SUMMARY: The AlphaFold2 neural network model has revolutionized structural biology with unprecedented performance. We demonstrate that by stochastically perturbing the neural network by enabling dropout at inference combined with massive sampling, it is possible to improve the quality of the generated models. We generated ∼6000 models per target compared with 25 default for AlphaFold-Multimer, with v1 and v2 multimer network models, with and without templates, and increased the number of recycles within the network. The method was benchmarked in CASP15, and compared with AlphaFold-Multimer v2 it improved the average DockQ from 0.41 to 0.55 using identical input and was ranked at the very top in the protein assembly category when compared with all other groups participating in CASP15. The simplicity of the method should facilitate the adaptation by the field, and the method should be useful for anyone interested in modeling multimeric structures, alternate conformations, or flexible structures.
AVAILABILITY AND IMPLEMENTATION: AFsample is available online at http://wallnerlab.org/AFsample.
Affiliation(s)
- Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, SE-581 83 Linköping, Sweden