1
Zhao S, Cui Z, Zhang G, Gong Y, Su L. MGPPI: multiscale graph neural networks for explainable protein-protein interaction prediction. Front Genet 2024; 15:1440448. [PMID: 39076171 PMCID: PMC11284081 DOI: 10.3389/fgene.2024.1440448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Accepted: 06/24/2024] [Indexed: 07/31/2024] Open
Abstract
Protein-protein interactions (PPIs) are involved in various biological processes and are of significant importance in cancer diagnosis and drug development. Computational PPI prediction methods are often preferred because of their low cost and high accuracy. However, existing protein-structure-based methods extract protein structural information insufficiently. Furthermore, most methods are poorly interpretable, which hinders their practical application in the biomedical field. In this paper, we propose MGPPI, a multiscale graph convolutional neural network model for PPI prediction. By incorporating a multiscale module into the graph neural network (GNN) and stacking multiple convolutional layers, MGPPI can effectively capture both local and global protein structure information. For model interpretability, we introduce a novel visual explanation method named Gradient Weighted interaction Activation Mapping (Grad-WAM), which can highlight key binding residue sites. We evaluate the performance of MGPPI by comparing it with state-of-the-art methods on various datasets. The results show that MGPPI significantly outperforms other methods and exhibits strong generalization capabilities on the multi-species dataset. As a practical case study, we predicted the binding affinity between the spike (S) protein of SARS-CoV-2 and the human ACE2 receptor protein, and successfully identified key binding sites with known binding functions. Mutations at key binding sites in PPIs can affect cancer patient survival status. Therefore, we further verified whether the residue sites highlighted by Grad-WAM can separate patient survival groups in datasets from several cancer types. According to our results, some of the highlighted residues can be used as biomarkers for predicting patient survival probability. Together, these results demonstrate the high accuracy and practical application value of MGPPI. Our method not only addresses the limitations of existing approaches but can also assist researchers in identifying crucial drug targets and help guide personalized cancer treatment.
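The abstract does not spell out how Grad-WAM computes its residue highlights. As a rough illustration of the general idea behind gradient-weighted activation maps, the sketch below implements the generic Grad-CAM weighting over residues, which is an assumption and not the authors' Grad-WAM. The function name `grad_cam_residue_scores` and the toy activation and gradient values are hypothetical.

```python
def grad_cam_residue_scores(activations, gradients):
    """Generic Grad-CAM weighting over residues (NOT the paper's Grad-WAM).

    activations[i][c]: feature c of residue i from the last graph conv layer.
    gradients[i][c]:   d(score)/d(activations[i][c]) for the predicted
                       interaction score.
    Returns one score per residue, scaled to [0, 1].
    """
    n_res = len(activations)
    n_ch = len(activations[0])
    # Channel weight = gradient averaged over all residues.
    weights = [
        sum(gradients[i][c] for i in range(n_res)) / n_res
        for c in range(n_ch)
    ]
    # Residue score = ReLU of the weighted sum of its channel activations.
    raw = [
        max(0.0, sum(weights[c] * activations[i][c] for c in range(n_ch)))
        for i in range(n_res)
    ]
    peak = max(raw)
    return [r / peak if peak > 0 else 0.0 for r in raw]


# Toy inputs (hypothetical values): 3 residues, 2 feature channels.
acts = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
grads = [[0.2, -0.1], [0.4, -0.3], [0.6, -0.2]]
scores = grad_cam_residue_scores(acts, grads)
print(scores)
```

In this toy case the first residue receives the highest score because it is dominated by the channel with a positive average gradient; residues with high scores would be the candidates to inspect as possible binding sites.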
Affiliation(s)
- Lingtao Su
  - College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
2
Xiao F, Luo L, Liu X, Ljubetič A, Jin N, Jerala R, Hu G. Comparative Simulative Analysis and Design of Single-Chain Self-Assembled Protein Cages. J Phys Chem B 2024; 128:6272-6282. [PMID: 38904939 DOI: 10.1021/acs.jpcb.4c01957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/22/2024]
Abstract
Coiled-coil protein origami (CCPO) is a modular strategy for the de novo design of polypeptide nanostructures. It is based on pairwise-interacting coiled-coil (CC) units, with a single-chain protein programmed to fold into a polyhedral cage. However, the mechanisms underlying the self-assembly of the protein tetrahedron are still not fully understood. In the present study, 18 CCPO cages with three different topologies were modeled in silico. Then, molecular dynamics simulations were performed and CC parameters were calculated to characterize the dynamic properties of protein tetrahedral cages at both the local and global levels. Furthermore, a deformed CC unit was redesigned, and the stability of the new cage was significantly improved.
Affiliation(s)
- Fei Xiao
  - MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, Department of Bioinformatics, Center for Systems Biology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215213, China
  - Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
- Longfei Luo
  - MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, Department of Bioinformatics, Center for Systems Biology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215213, China
- Xin Liu
  - Institute of Blood and Marrow Transplantation, Medical College of Soochow University, Jiangsu Institute of Hematology, The First Affiliated Hospital of Soochow University, Collaborative Innovation Center of Hematology, National Clinical Research Center for Hematologic Diseases, Soochow University, Suzhou 215123, China
- Ajasja Ljubetič
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
  - EN-FIST Centre of Excellence, SI-1000 Ljubljana, Slovenia
- Nengzhi Jin
  - Key Laboratory of Advanced Computing of Gansu Province, Gansu Computing Center, Lanzhou 730030, China
- Roman Jerala
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
- Guang Hu
  - MOE Key Laboratory of Geriatric Diseases and Immunology, Suzhou Key Laboratory of Pathogen Bioscience and Anti-infective Medicine, Department of Bioinformatics, Center for Systems Biology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou 215213, China
  - Jiangsu Province Engineering Research Center of Precision Diagnostics and Therapeutics Development, Soochow University, Suzhou 215123, China
3
Plaper T, Rihtar E, Železnik Ramuta T, Forstnerič V, Jazbec V, Ivanovski F, Benčina M, Jerala R. The art of designed coiled-coils for the regulation of mammalian cells. Cell Chem Biol 2024:S2451-9456(24)00220-4. [PMID: 38971158 DOI: 10.1016/j.chembiol.2024.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/04/2024] [Accepted: 06/11/2024] [Indexed: 07/08/2024]
Abstract
Synthetic biology aims to engineer complex biological systems using modular elements, with coiled-coil (CC) dimer-forming modules emerging as highly useful building blocks in the regulation of protein assemblies and biological processes. These small modules facilitate highly specific and orthogonal protein-protein interactions, offering versatility for the regulation of diverse biological functions. Additionally, their design rules enable precise control and tunability over these interactions, which are crucial for specific applications. Recent advancements showcase their potential for use in innovative therapeutic interventions and biomedical applications. In this review, we discuss the potential of CCs, exploring their diverse applications in mammalian cells, such as synthetic biological circuit design, transcriptional and allosteric regulation, cellular assemblies, chimeric antigen receptor (CAR) T cell regulation, and genome editing, and their role in advancing the understanding and regulation of cellular processes.
Affiliation(s)
- Tjaša Plaper
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Erik Rihtar
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Taja Železnik Ramuta
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Vida Forstnerič
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Vid Jazbec
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Filip Ivanovski
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Mojca Benčina
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia; Centre for Technologies of Gene and Cell Therapy, Hajdrihova 19, 1000 Ljubljana, Slovenia
- Roman Jerala
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, 1000 Ljubljana, Slovenia; Centre for Technologies of Gene and Cell Therapy, Hajdrihova 19, 1000 Ljubljana, Slovenia
4
Frenkel M, Raman S. Discovering mechanisms of human genetic variation and controlling cell states at scale. Trends Genet 2024; 40:587-600. [PMID: 38658256 DOI: 10.1016/j.tig.2024.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 03/29/2024] [Accepted: 03/29/2024] [Indexed: 04/26/2024]
Abstract
Population-scale sequencing efforts have catalogued substantial genetic variation in humans such that variant discovery dramatically outpaces interpretation. We discuss how single-cell sequencing is poised to reveal genetic mechanisms at a rate that may soon approach that of variant discovery. The functional genomics toolkit is sufficiently modular to systematically profile almost any type of variation within increasingly diverse contexts and with molecularly comprehensive and unbiased readouts. As a result, we can construct deep phenotypic atlases of variant effects that span the entire regulatory cascade. The same conceptual approach to interpreting genetic variation should be applied to engineering therapeutic cell states. In this way, variant mechanism discovery and cell state engineering will become reciprocating and iterative processes towards genomic medicine.
Affiliation(s)
- Max Frenkel
  - Cellular and Molecular Biology Graduate Program, University of Wisconsin, Madison, WI, USA; Medical Scientist Training Program, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA; Department of Biochemistry, University of Wisconsin, Madison, WI, USA
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin, Madison, WI, USA; Department of Bacteriology, University of Wisconsin, Madison, WI, USA; Department of Chemical and Biological Engineering, University of Wisconsin, Madison, WI, USA
5
Snoj J, Lapenta F, Jerala R. Preorganized cyclic modules facilitate the self-assembly of protein nanostructures. Chem Sci 2024; 15:3673-3686. [PMID: 38455016 PMCID: PMC10915844 DOI: 10.1039/d3sc06658d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/15/2024] [Indexed: 03/09/2024] Open
Abstract
The rational design of supramolecular assemblies aims to generate complex systems based on the simple information encoded in the chemical structure. Programmable molecules such as nucleic acids and polypeptides are particularly suitable for designing diverse assemblies and shapes not found in nature. Here, we describe a strategy for assembling modular architectures based on structurally and covalently preorganized subunits. Cyclization through spontaneous self-splicing of split intein and coiled-coil dimer-based interactions of polypeptide chains provide structural constraints, facilitating the desired assembly. We demonstrate the implementation of a strategy based on the preorganization of the subunits by designing a two-chain coiled-coil protein origami (CCPO) assembly that adopts a tetrahedral topology only when one or both subunit chains are covalently cyclized. Employing this strategy, we further design a 109 kDa trimeric CCPO assembly comprising 24 CC-forming segments. In this case, intein cyclization was crucial for the assembly of a concave octahedral scaffold, a newly designed protein fold. The study highlights the importance of preorganization of building modules to facilitate the self-assembly of higher-order supramolecular structures.
Affiliation(s)
- Jaka Snoj
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
  - Interdisciplinary Doctoral Program in Biomedicine, University of Ljubljana, Kongresni trg 12, SI-1000 Ljubljana, Slovenia
- Fabio Lapenta
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
- Roman Jerala
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Hajdrihova 19, SI-1000 Ljubljana, Slovenia
  - EN-FIST Centre of Excellence, Trg OF 13, SI-1000 Ljubljana, Slovenia
6
Mock M, Langmead CJ, Grandsard P, Edavettal S, Russell A. Recent advances in generative biology for biotherapeutic discovery. Trends Pharmacol Sci 2024; 45:255-267. [PMID: 38378385 DOI: 10.1016/j.tips.2024.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 12/22/2023] [Accepted: 01/05/2024] [Indexed: 02/22/2024]
Abstract
Generative biology combines artificial intelligence (AI), advanced life sciences technologies, and automation to revolutionize the process of designing novel biomolecules with prescribed properties, giving drug discoverers the ability to escape the limitations of biology during the design of next-generation protein therapeutics. Significant hurdles remain, namely: (i) the inherently complex nature of drug discovery, (ii) the bewildering number of promising computational and experimental techniques that have emerged in the past several years, and (iii) the limited availability of relevant protein sequence-function data for drug-like molecules. There is a need to focus on computational methods that will be most practically effective for protein drug discovery and on building experimental platforms to generate the data most appropriate for these methods. Here, we discuss recent advances in computational and experimental life sciences that are most crucial for impacting the pace and success of protein drug discovery.
Affiliation(s)
- Marissa Mock
  - Amgen Research, Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA
- Peter Grandsard
  - Amgen Research, Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA
- Suzanne Edavettal
  - Amgen Research, Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA
- Alan Russell
  - Amgen Research, Amgen Inc., One Amgen Center Drive, Thousand Oaks, CA 91320, USA
7
Versini R, Sritharan S, Aykac Fas B, Tubiana T, Aimeur SZ, Henri J, Erard M, Nüsse O, Andreani J, Baaden M, Fuchs P, Galochkina T, Chatzigoulas A, Cournia Z, Santuz H, Sacquin-Mora S, Taly A. A Perspective on the Prospective Use of AI in Protein Structure Prediction. J Chem Inf Model 2024; 64:26-41. [PMID: 38124369 DOI: 10.1021/acs.jcim.3c01361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
AlphaFold2 (AF2) and RoseTTAFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain, in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. Additionally, a meta-tool for ranking and suggesting composite models is suggested, driving future advancements in this rapidly evolving field.
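To make the remark about pLDDT as a disorder predictor concrete, here is a minimal sketch of how per-residue pLDDT could be read from an AF2 model and thresholded. It relies on the fact that AlphaFold PDB output stores pLDDT in the B-factor column; the single-chain input, the cutoff of 50, the helper names `plddt_per_residue`, `low_confidence_residues` and `_ca_line`, and the toy scores are assumptions made for illustration, not part of the Perspective.

```python
def plddt_per_residue(pdb_text):
    """Map residue number -> pLDDT, read from the B-factor field of CA atoms.

    Assumes a single-chain AlphaFold2 model in PDB format, where the
    B-factor columns (61-66) hold the per-residue pLDDT.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def low_confidence_residues(scores, threshold=50.0):
    """Residues whose pLDDT falls below the (assumed) disorder cutoff."""
    return sorted(res for res, s in scores.items() if s < threshold)


def _ca_line(res_seq, plddt):
    # Build a column-accurate PDB ATOM record for a CA atom (toy data only).
    return "ATOM  {:>5d}  CA  GLY A{:>4d}    {:>8.3f}{:>8.3f}{:>8.3f}{:>6.2f}{:>6.2f}".format(
        res_seq, res_seq, 0.0, 0.0, 0.0, 1.0, plddt
    )


# Hypothetical four-residue model with made-up pLDDT values.
demo_pdb = "\n".join(
    _ca_line(i + 1, p) for i, p in enumerate([35.2, 48.9, 72.5, 91.0])
)
demo_scores = plddt_per_residue(demo_pdb)
flagged = low_confidence_residues(demo_scores)
print(flagged)
```

A low pLDDT only suggests disorder; as the abstract notes, such a readout is best combined with MD simulations or other analyses before drawing conclusions about dynamics.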
Affiliation(s)
- Raphaelle Versini
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sujith Sritharan
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Burcu Aykac Fas
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Thibault Tubiana
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Sana Zineb Aimeur
  - Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Julien Henri
  - Sorbonne Université, CNRS, Laboratoire de Biologie Computationnelle et Quantitative, UMR 7238, Institut de Biologie Paris-Seine, 4 Place Jussieu, F-75005 Paris, France
- Marie Erard
  - Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Oliver Nüsse
  - Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Jessica Andreani
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Marc Baaden
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Patrick Fuchs
  - Sorbonne Université, École Normale Supérieure, PSL University, CNRS, Laboratoire des Biomolécules, LBM, 75005 Paris, France
  - Université de Paris, UFR Sciences du Vivant, 75013 Paris, France
- Tatiana Galochkina
  - Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75014 Paris, France
- Alexios Chatzigoulas
  - Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Zoe Cournia
  - Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
  - Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Hubert Santuz
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sophie Sacquin-Mora
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Antoine Taly
  - Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
8
Parres-Gold J, Levine M, Emert B, Stuart A, Elowitz MB. Principles of Computation by Competitive Protein Dimerization Networks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.30.564854. [PMID: 37961250 PMCID: PMC10634983 DOI: 10.1101/2023.10.30.564854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Many biological signaling pathways employ proteins that competitively dimerize in diverse combinations. These dimerization networks can perform biochemical computations, in which the concentrations of monomers (inputs) determine the concentrations of dimers (outputs). Despite their prevalence, little is known about the range of input-output computations that dimerization networks can perform (their "expressivity") and how it depends on network size and connectivity. Using a systematic computational approach, we demonstrate that even small dimerization networks (3-6 monomers) are expressive, performing diverse multi-input computations. Further, dimerization networks are versatile, performing different computations when their protein components are expressed at different levels, such as in different cell types. Remarkably, individual networks with random interaction affinities, when large enough (≥8 proteins), can perform nearly all (~90%) potential one-input network computations merely by tuning their monomer expression levels. Thus, even the simple process of competitive dimerization provides a powerful architecture for multi-input, cell-type-specific signal processing.
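The input-output picture described in this abstract, in which total monomer concentrations set equilibrium dimer concentrations, can be sketched with a simple mass-action model. The code below is an illustrative assumption and not the authors' implementation: it assumes reversible dimerization with association constants `K[i][j]`, solves the mass-balance equations by damped fixed-point iteration (which may not converge for every network), and uses the hypothetical name `equilibrium_dimers` and toy parameter values.

```python
def equilibrium_dimers(totals, K, damping=0.5, tol=1e-12, max_iter=100000):
    """Equilibrium free-monomer and dimer concentrations (mass-action sketch).

    totals[i]: total concentration of monomer i (the network "input").
    K[i][j]:   assumed association constant for dimer i-j (symmetric matrix).
    Mass balance per monomer:
        totals[i] = m_i * (1 + sum_j K[i][j] * m_j * (2 if j == i else 1))
    """
    n = len(totals)
    free = list(totals)
    for _ in range(max_iter):
        new = []
        for i in range(n):
            denom = 1.0 + sum(
                K[i][j] * free[j] * (2.0 if j == i else 1.0) for j in range(n)
            )
            # Damped update to reduce oscillation of the plain iteration.
            new.append(damping * free[i] + (1.0 - damping) * totals[i] / denom)
        change = max(abs(a - b) for a, b in zip(new, free))
        free = new
        if change < tol:
            break
    dimers = {
        (i, j): K[i][j] * free[i] * free[j]
        for i in range(n)
        for j in range(i, n)
    }
    return free, dimers


# Toy network: two monomers forming only a heterodimer, K = 1, totals = 2.
# Analytically, m + m**2 = 2 gives free monomer m = 1 and dimer = 1.
free_monomers, dimer_conc = equilibrium_dimers(
    [2.0, 2.0], [[0.0, 1.0], [1.0, 0.0]]
)
print(free_monomers, dimer_conc[(0, 1)])
```

Sweeping one entry of `totals` while holding the others fixed and recording a chosen dimer concentration gives a one-input response curve, which is the kind of computation whose range the abstract refers to as expressivity.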
Affiliation(s)
- Jacob Parres-Gold
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Matthew Levine
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Benjamin Emert
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Andrew Stuart
  - Division of Engineering and Applied Sciences, California Institute of Technology, Pasadena, CA 91125, USA
- Michael B. Elowitz
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA