1
Kinshuk S, Li L, Meckes B, Chan CTY. Sequence-Based Protein Design: A Review of Using Statistical Models to Characterize Coevolutionary Traits for Developing Hybrid Proteins as Genetic Sensors. Int J Mol Sci 2024; 25:8320. [PMID: 39125888 PMCID: PMC11312098 DOI: 10.3390/ijms25158320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 07/23/2024] [Accepted: 07/26/2024] [Indexed: 08/12/2024] Open
Abstract
Statistical analyses of homologous protein sequences can identify amino acid residue positions that co-evolve to generate family members with different properties. Based on the hypothesis that the coevolution of residue positions is necessary for maintaining protein structure, coevolutionary traits revealed by statistical models provide insight into residue-residue interactions that are important for understanding protein mechanisms at the molecular level. With the rapid expansion of genome sequencing databases that facilitate statistical analyses, this sequence-based approach has been used to study a broad range of protein families. An emerging application of this approach is to design hybrid transcriptional regulators as modular genetic sensors for novel wiring between input signals and genetic elements to control outputs. Among many allosterically regulated regulator families, the members contain structurally conserved and functionally independent protein domains, including a DNA-binding module (DBM) for interacting with a specific genetic element and a ligand-binding module (LBM) for sensing an input signal. By hybridizing a DBM and an LBM from two different family members, a hybrid regulator can be created with a new combination of signal-detection and DNA-recognition properties not present in natural systems. In this review, we present recent advances in the development of hybrid regulators and their applications in cellular engineering, especially focusing on the use of statistical analyses for characterizing DBM-LBM interactions and hybrid regulator design. Based on these studies, we then discuss the current limitations and potential directions for enhancing the impact of this sequence-based design approach.
Affiliation(s)
- Sahaj Kinshuk
  - Department of Biomedical Engineering, College of Engineering, University of North Texas, 3940 N Elm Street, Denton, TX 76207, USA
- Lin Li
  - Department of Biomedical Engineering, College of Engineering, University of North Texas, 3940 N Elm Street, Denton, TX 76207, USA
- Brian Meckes
  - Department of Biomedical Engineering, College of Engineering, University of North Texas, 3940 N Elm Street, Denton, TX 76207, USA
  - BioDiscovery Institute, University of North Texas, 1155 Union Circle #305220, Denton, TX 76203, USA
- Clement T. Y. Chan
  - Department of Biomedical Engineering, College of Engineering, University of North Texas, 3940 N Elm Street, Denton, TX 76207, USA
  - BioDiscovery Institute, University of North Texas, 1155 Union Circle #305220, Denton, TX 76203, USA
2
Rocha REO, Mariano DCB, Almeida TS, CorrêaCosta LS, Fischer PHC, Santos LH, Caffarena ER, da Silveira CH, Lamp LM, Fernandez-Quintero ML, Liedl KR, de Melo-Minardi RC, de Lima LHF. Thermostabilizing mechanisms of canonical single amino acid substitutions at a GH1 β-glucosidase probed by multiple MD and computational approaches. Proteins 2023; 91:218-236. [PMID: 36114781 DOI: 10.1002/prot.26424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 09/01/2022] [Accepted: 09/06/2022] [Indexed: 01/07/2023]
Abstract
β-glucosidases play a pivotal role in second-generation biofuel (2G-biofuel) production. For this application, thermostable enzymes are essential due to the denaturing conditions in the bioreactors. Random amino acid substitutions have yielded new thermostable β-glucosidases, but without a clear understanding of their molecular mechanisms. Here, using different molecular dynamics simulation approaches with distinct force fields and submitting the results to various computational analyses, we probe the molecular bases of the thermostabilization of the Paenibacillus polymyxa GH1 β-glucosidase by two point mutations, E96K (TR1) and M416I (TR2). Equilibrium molecular dynamics simulations (eMD) at different temperatures, principal component analysis (PCA), virtual docking, metadynamics (MetaDy), accelerated molecular dynamics (aMD), Poisson-Boltzmann surface analysis, grid inhomogeneous solvation theory, and colony method estimation of conformational entropy converge on the idea that the stabilization conferred by both substitutions depends on different contributions of three classic mechanisms: (i) electrostatic surface stabilization; (ii) efficient isolation of the hydrophobic core from the solvent, with energetic advantages at the solvation cap; (iii) higher distribution of the protein dynamics at the mobile active site loops than at the protein core, with functional and entropic advantages. Mechanisms i and ii predominate for TR1, while in TR2, mechanism iii is dominant. Loop A integrity and the dynamics of loops A, C, D, and E play critical roles in such mechanisms. Comparison of the dynamic and topological changes observed between the thermostable mutants and the wild-type protein with amino acid co-evolutive networks and thermostabilizing hotspots from the literature allows inferring that the mechanisms recovered here can be related to the thermostability obtained by different substitutions along the whole GH1 family. We hope the results and insights discussed here can be helpful for future rational approaches to the engineering of optimized β-glucosidases for 2G-biofuel production for industry, biotechnology, and science.
Affiliation(s)
- Rafael Eduardo Oliveira Rocha
  - Laboratory of Molecular Modelling and Bioinformatics (LAMMB), Department of Physical and Biological Sciences, Campus Sete Lagoas, Universidade Federal de São João Del Rei, Sete Lagoas, Brazil
  - Laboratory of Bioinformatics and Systems (LBS), Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Laboratory of Molecular Modeling and Drug Design, Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Diego César Batista Mariano
  - Laboratory of Bioinformatics and Systems (LBS), Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Tiago Silva Almeida
  - Laboratory of Molecular Modelling and Bioinformatics (LAMMB), Department of Physical and Biological Sciences, Campus Sete Lagoas, Universidade Federal de São João Del Rei, Sete Lagoas, Brazil
- Leon Sulfierry CorrêaCosta
  - Laboratory of Molecular Modelling and Bioinformatics (LAMMB), Department of Physical and Biological Sciences, Campus Sete Lagoas, Universidade Federal de São João Del Rei, Sete Lagoas, Brazil
  - Computational Modeling Coordination (COMOD), Laboratório Nacional de Computação Científica (LNCC), Petrópolis, Brazil
- Pedro Henrique Camargo Fischer
  - Laboratory of Molecular Modelling and Bioinformatics (LAMMB), Department of Physical and Biological Sciences, Campus Sete Lagoas, Universidade Federal de São João Del Rei, Sete Lagoas, Brazil
- Lucianna Helene Santos
  - Laboratory of Bioinformatics and Systems (LBS), Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
  - Laboratory of Molecular Modeling and Drug Design, Department of Biochemistry and Immunology, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Leonida M Lamp
  - Institute of General, Inorganic and Theoretical Chemistry, and Center for Chemistry and Biomedicine Innsbruck (CCB), University of Innsbruck, Innsbruck, Austria
- Monica Lisa Fernandez-Quintero
  - Institute of General, Inorganic and Theoretical Chemistry, and Center for Chemistry and Biomedicine Innsbruck (CCB), University of Innsbruck, Innsbruck, Austria
- Klaus Roman Liedl
  - Institute of General, Inorganic and Theoretical Chemistry, and Center for Chemistry and Biomedicine Innsbruck (CCB), University of Innsbruck, Innsbruck, Austria
- Raquel Cardoso de Melo-Minardi
  - Laboratory of Bioinformatics and Systems (LBS), Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Leonardo Henrique França de Lima
  - Laboratory of Molecular Modelling and Bioinformatics (LAMMB), Department of Physical and Biological Sciences, Campus Sete Lagoas, Universidade Federal de São João Del Rei, Sete Lagoas, Brazil
  - Institute of General, Inorganic and Theoretical Chemistry, and Center for Chemistry and Biomedicine Innsbruck (CCB), University of Innsbruck, Innsbruck, Austria
3
Ravishankar K, Jiang X, Leddin EM, Morcos F, Cisneros GA. Computational compensatory mutation discovery approach: Predicting a PARP1 variant rescue mutation. Biophys J 2022; 121:3663-3673. [PMID: 35642254 PMCID: PMC9617126 DOI: 10.1016/j.bpj.2022.05.036] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/20/2021] [Revised: 05/20/2022] [Accepted: 05/23/2022] [Indexed: 11/02/2022] Open
Abstract
The prediction of protein mutations that affect function may be exploited for multiple uses. In the context of disease variants, the prediction of compensatory mutations that reestablish functional phenotypes could aid in the development of genetic therapies. In this work, we present an integrated approach that combines coevolutionary analysis and molecular dynamics (MD) simulations to discover functional compensatory mutations. This approach is employed to investigate possible rescue mutations of a poly(ADP-ribose) polymerase 1 (PARP1) variant, PARP1 V762A, associated with lung cancer and follicular lymphoma. MD simulations show PARP1 V762A exhibits noticeable changes in structural and dynamical behavior compared with wild-type (WT) PARP1. Our integrated approach predicts A755E as a possible compensatory mutation based on coevolutionary information, and molecular simulations indicate that the PARP1 A755E/V762A double mutant exhibits similar structural and dynamical behavior to WT PARP1. Our methodology can be broadly applied to a large number of systems where single-nucleotide polymorphisms have been identified as connected to disease and can shed light on the biophysical effects of such changes as well as provide a way to discover potential mutants that could restore WT-like functionality. This can, in turn, be further utilized in the design of molecular therapeutics that aim to mimic such compensatory effect.
Affiliation(s)
- Xianli Jiang
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas
- Emmett M Leddin
  - Department of Chemistry, University of North Texas, Denton, Texas
- Faruck Morcos
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
  - Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
  - Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas
- G Andrés Cisneros
  - Department of Chemistry, University of North Texas, Denton, Texas
  - Department of Physics, The University of Texas at Dallas, Richardson, Texas
  - Department of Chemistry, The University of Texas at Dallas, Richardson, Texas
4
Petrotchenko EV, Borchers CH. Protein Chemistry Combined with Mass Spectrometry for Protein Structure Determination. Chem Rev 2021; 122:7488-7499. [PMID: 34968047 DOI: 10.1021/acs.chemrev.1c00302] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/30/2022]
Abstract
The advent of soft-ionization mass spectrometry for biomolecules has opened up new possibilities for the structural analysis of proteins. Combining protein chemistry methods with modern mass spectrometry has led to the emergence of the distinct field of structural proteomics. Multiple protein chemistry approaches, such as surface modification, limited proteolysis, hydrogen-deuterium exchange, and cross-linking, provide diverse and often orthogonal structural information on the protein systems studied. Combining experimental data from these various structural proteomics techniques provides a more comprehensive examination of the protein structure and increases confidence in the ultimate findings. Here, we review various types of experimental data from structural proteomics approaches with an emphasis on the use of multiple complementary mass spectrometric approaches to provide experimental constraints for the solving of protein structures.
Affiliation(s)
- Evgeniy V Petrotchenko
  - Segal Cancer Proteomics Centre, Lady Davis Institute, Jewish General Hospital, McGill University, Montreal, Quebec H3T 1E2, Canada
  - Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
- Christoph H Borchers
  - Segal Cancer Proteomics Centre, Lady Davis Institute, Jewish General Hospital, McGill University, Montreal, Quebec H3T 1E2, Canada
  - Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Moscow 121205, Russia
  - Gerald Bronfman Department of Oncology, Jewish General Hospital, McGill University, Montreal, Quebec H3T 1E2, Canada
5
Abstract
Knowledge of protein structure is crucial to our understanding of biological function and is routinely used in drug discovery. High-resolution techniques to determine the three-dimensional atomic coordinates of proteins are available. However, such methods are frequently limited by experimental challenges such as sample quantity, target size, and efficiency. Structural mass spectrometry (MS) is a technique in which structural features of proteins are elucidated quickly and relatively easily. Computational techniques that convert sparse MS data into protein models that demonstrate agreement with the data are needed. This review features cutting-edge computational methods that predict protein structure from MS data such as chemical cross-linking, hydrogen-deuterium exchange, hydroxyl radical protein footprinting, limited proteolysis, ion mobility, and surface-induced dissociation. Additionally, we address future directions for protein structure prediction with sparse MS data. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 73 is April 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Sarah E Biehn
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA
- Steffen Lindert
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA
6
Bottino GF, Ferrari AJR, Gozzo FC, Martínez L. Structural discrimination analysis for constraint selection in protein modeling. Bioinformatics 2021; 37:3766-3773. [PMID: 34086840 DOI: 10.1093/bioinformatics/btab425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2021] [Revised: 05/07/2021] [Accepted: 06/03/2021] [Indexed: 11/12/2022] Open
Abstract
Motivation
Protein structure modeling can be improved by the use of distance constraints between amino acid residues, provided such data reflect, at least partially, the native tertiary structure of the target system. In fact, only a small subset of the native contact map is necessary to successfully drive the model conformational search, so one important goal is to obtain the set of constraints with the highest true-positive rate, lowest redundancy, and greatest amount of information. In this work, we introduce a constraint evaluation and selection method based on the point-biserial correlation coefficient, which utilizes structural information from an ensemble of models to indirectly measure the power of each constraint in biasing the conformational search towards consensus structures.
Results
Residue contact maps obtained by direct coupling analysis are systematically improved by means of discriminant analysis, reaching in some cases accuracies often seen only in modern deep-learning based approaches. When combined with an iterative modeling workflow, the proposed constraint classification optimizes the selection of the constraint set and maximizes the probability of obtaining successful models. The use of discriminant analysis for the valorization of the information of constraint data sets is a general concept with possible applications to other constraint types and modeling problems.
Availability and implementation
Scripts and procedures to implement the methodology presented herein are available at https://github.com/m3g/2021_Bottino_Biserial.
Supplementary information
Supplementary data are available at Bioinformatics online.
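As a rough illustration of the statistic named in this abstract, the sketch below computes a point-biserial correlation between a binary variable (whether a constraint is satisfied in each model of an ensemble) and a continuous per-model score. It is not the authors' implementation: the function name `point_biserial`, the choice of a generic per-model "score", and the convention of returning 0.0 when the coefficient is undefined are all assumptions made here for illustration.

```python
import math


def point_biserial(satisfied, scores):
    """Point-biserial correlation between a binary and a continuous variable.

    satisfied: sequence of 0/1 flags (hypothetical: constraint met in model i)
    scores:    sequence of floats (hypothetical: any per-model score)
    """
    if len(satisfied) != len(scores):
        raise ValueError("satisfied and scores must have the same length")
    n = len(scores)
    group1 = [s for flag, s in zip(satisfied, scores) if flag]
    group0 = [s for flag, s in zip(satisfied, scores) if not flag]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        # Coefficient is undefined when one group is empty; this sketch
        # treats such a constraint as uninformative (an assumption).
        return 0.0
    mean_all = sum(scores) / n
    variance = sum((s - mean_all) ** 2 for s in scores) / n  # population variance
    if variance == 0.0:
        return 0.0  # all scores identical, again treated as uninformative
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    # r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2)
    return (mean1 - mean0) / math.sqrt(variance) * math.sqrt(n1 * n0 / n ** 2)


if __name__ == "__main__":
    # Toy ensemble of four models: the constraint is met in the two
    # higher-scoring models, so the correlation is strongly positive.
    print(round(point_biserial([1, 1, 0, 0], [0.8, 0.6, 0.4, 0.2]), 4))
```

The coefficient is numerically identical to a Pearson correlation with one variable coded as 0/1, so a constraint whose satisfaction tracks the score yields a value near +1, while one unrelated to it stays near 0.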
Affiliation(s)
- Guilherme F Bottino
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
  - Center for Computational Engineering & Science, University of Campinas, Campinas, SP, Brazil
- Allan J R Ferrari
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
  - Center for Computational Engineering & Science, University of Campinas, Campinas, SP, Brazil
- Fabio C Gozzo
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
- Leandro Martínez
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
  - Center for Computational Engineering & Science, University of Campinas, Campinas, SP, Brazil
7
Dokholyan NV. Experimentally-driven protein structure modeling. J Proteomics 2020; 220:103777. [PMID: 32268219 PMCID: PMC7214187 DOI: 10.1016/j.jprot.2020.103777] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/18/2019] [Revised: 03/17/2020] [Accepted: 04/02/2020] [Indexed: 11/25/2022]
Abstract
Revolutions in the natural and exact sciences that started at the dawn of the last century have led to an explosion of theoretical, experimental, and computational approaches to determine the structures of molecules and complexes, as well as their rich conformational dynamics. Since different experimental methods produce information that is attributed to specific time and length scales, corresponding computational methods have to be tailored to these scales and experiments. These methods can then be combined and integrated across scales, hence producing a fuller picture of molecular structure and motion from the "puzzle pieces" offered by various experiments. Here, we describe a number of computational approaches that utilize experimental data to glance into the structure of proteins and understand their dynamics. We will also discuss the limitations and the resolution of the constraints-based modeling approaches. SIGNIFICANCE: Experimentally-driven computational structure modeling and determination is a rapidly evolving alternative to traditional approaches for molecular structure determination. These new hybrid experimental-computational approaches are proving to be a powerful microscope to glance into the structural features of intrinsically or partially disordered proteins and the dynamics of molecules and complexes. In this review, we describe various approaches in the field of experimentally-driven computational structure modeling.
Affiliation(s)
- Nikolay V Dokholyan
  - Department of Pharmacology, Penn State University College of Medicine, Hershey, PA 17033, USA
  - Department of Biochemistry & Molecular Biology, Penn State College of Medicine, Hershey, PA 17033, USA
  - Department of Chemistry, Pennsylvania State University, University Park, PA 16802, USA
  - Department of Biomedical Engineering, Pennsylvania State University, University Park, PA 16802, USA
8
Dimas RP, Jiang XL, Alberto de la Paz J, Morcos F, Chan CTY. Engineering repressors with coevolutionary cues facilitates toggle switches with a master reset. Nucleic Acids Res 2019; 47:5449-5463. [PMID: 31162606 PMCID: PMC6547410 DOI: 10.1093/nar/gkz280] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 04/02/2019] [Accepted: 04/08/2019] [Indexed: 12/17/2022] Open
Abstract
Engineering allosteric transcriptional repressors containing an environmental sensing module (ESM) and a DNA recognition module (DRM) has the potential to unlock a combinatorial set of rationally designed biological responses. We demonstrated that constructing hybrid repressors by fusing distinct ESMs and DRMs provides a means to flexibly rewire genetic networks for complex signal processing. We have used coevolutionary traits among LacI homologs to develop a model for predicting compatibility between ESMs and DRMs. Our predictions accurately agree with the performance of 40 engineered repressors. We have harnessed this framework to develop a system of multiple toggle switches with a master OFF signal that produces a unique behavior: each engineered biological activity is switched to a stable ON state by different chemicals and returned to OFF in response to a common signal. One promising application of this design is to develop living diagnostics for monitoring multiple parameters in complex physiological environments and it represents one of many circuit topologies that can be explored with modular repressors designed with coevolutionary information.
Affiliation(s)
- Rey P Dimas
  - Department of Biology, The University of Texas at Tyler, Tyler, TX 75799, USA
- Xian-Li Jiang
  - Department of Biological Sciences, The University of Texas at Dallas, Dallas, TX 75080, USA
- Jose Alberto de la Paz
  - Department of Biological Sciences, The University of Texas at Dallas, Dallas, TX 75080, USA
- Faruck Morcos
  - Department of Biological Sciences, The University of Texas at Dallas, Dallas, TX 75080, USA
  - Department of Bioengineering, The University of Texas at Dallas, Dallas, TX 75080, USA
  - Center for Systems Biology, The University of Texas at Dallas, Dallas, TX 75080, USA
- Clement T Y Chan
  - Department of Biology, The University of Texas at Tyler, Tyler, TX 75799, USA
  - Department of Chemistry and Biochemistry, The University of Texas at Tyler, Tyler, TX 75799, USA
9
Dos Santos RN, Bottino GF, Gozzo FC, Morcos F, Martínez L. Structural complementarity of distance constraints obtained from chemical cross-linking and amino acid coevolution. Proteins 2019; 88:625-632. [PMID: 31693206 DOI: 10.1002/prot.25843] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/04/2019] [Revised: 10/07/2019] [Accepted: 11/03/2019] [Indexed: 12/11/2022]
Abstract
The analysis of amino acid coevolution has emerged as a practical method for protein structural modeling by providing structural contact information from alignments of amino acid sequences. In parallel, chemical cross-linking/mass spectrometry (XLMS) has gained attention as a universally applicable method for obtaining low-resolution distance constraints to model the quaternary arrangements of proteins, and more recently even protein tertiary structures. Here, we show that the structural information obtained by XLMS and coevolutionary analysis are effectively complementary: the distance constraints obtained by each method are almost exclusively associated with non-coincident pairs of residues, and modeling results obtained by the combination of both sets are improved relative to considering the same total number of constraints of a single type. The structural rationale behind the complementarity of the distance constraints is discussed and illustrated for a representative set of proteins with different sizes and folds.
Affiliation(s)
- Ricardo N Dos Santos
  - Institute of Chemistry, University of Campinas, Campinas, São Paulo, Brazil
  - Center for Computing in Engineering & Sciences, University of Campinas, Campinas, São Paulo, Brazil
- Guilherme F Bottino
  - Institute of Chemistry, University of Campinas, Campinas, São Paulo, Brazil
  - Center for Computing in Engineering & Sciences, University of Campinas, Campinas, São Paulo, Brazil
- Fábio C Gozzo
  - Institute of Chemistry, University of Campinas, Campinas, São Paulo, Brazil
- Faruck Morcos
  - Department of Biological Sciences, University of Texas at Dallas, Richardson, Texas
  - Department of Bioengineering, University of Texas at Dallas, Richardson, Texas
- Leandro Martínez
  - Institute of Chemistry, University of Campinas, Campinas, São Paulo, Brazil
  - Center for Computing in Engineering & Sciences, University of Campinas, Campinas, São Paulo, Brazil
10
Li Y, De la Paz JA, Jiang X, Liu R, Pokkulandra AP, Bleris L, Morcos F. Coevolutionary Couplings Unravel PAM-Proximal Constraints of CRISPR-SpCas9. Biophys J 2019; 117:1684-1691. [PMID: 31648792 DOI: 10.1016/j.bpj.2019.09.040] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/05/2019] [Revised: 09/25/2019] [Accepted: 09/30/2019] [Indexed: 01/07/2023] Open
Abstract
The clustered regularly interspaced short palindromic repeats (CRISPR) system, an immune system analog found in prokaryotes, allows a single-guide RNA to direct a CRISPR-associated protein (Cas) with combined helicase and nuclease activity to DNA. The presence of a specific protospacer adjacent motif (PAM) next to the DNA target site plays a crucial role in determining both efficacy and specificity of gene editing. Herein, we introduce a coevolutionary framework to computationally unveil nonobvious molecular interactions in CRISPR systems and experimentally probe their functional role. Specifically, we use direct coupling analysis, a statistical inference framework used to infer direct coevolutionary couplings, in the context of protein/nucleic acid interactions. Applied to Streptococcus pyogenes Cas9, a Hamiltonian metric obtained from coevolutionary relationships reveals, to our knowledge, novel PAM-proximal nucleotide preferences at the seventh position of S. pyogenes Cas9 PAM (5'-NGRNNNT-3'), which was experimentally confirmed by in vitro and functional assays in human cells. We show that coevolved and conserved interactions point to specific clues toward rationally engineering new generations of Cas9 systems and may eventually help decipher the diversity of this family of proteins.
Affiliation(s)
- Yi Li
  - Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
  - Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas
- José A De la Paz
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Xianli Jiang
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Richard Liu
  - Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
- Adarsha P Pokkulandra
  - School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas
- Leonidas Bleris
  - Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
  - Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
- Faruck Morcos
  - Department of Bioengineering, The University of Texas at Dallas, Richardson, Texas
  - Center for Systems Biology, The University of Texas at Dallas, Richardson, Texas
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, Texas
11
Ferrari AJR, Clasen MA, Kurt L, Carvalho PC, Gozzo FC, Martínez L. TopoLink: evaluation of structural models using chemical crosslinking distance constraints. Bioinformatics 2019; 35:3169-3170. [DOI: 10.1093/bioinformatics/btz014] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/27/2018] [Revised: 12/05/2018] [Accepted: 01/04/2019] [Indexed: 01/28/2023] Open
Abstract
Summary
Software was developed to evaluate structural models using chemical crosslinking experiments. The user provides the types of linkers used and their reactivity, and the observed crosslinks and dead-ends. The software computes the minimum length of a physically inspired linker that connects the reactive atoms of interest, and reports the consistency of each distance with the experimental observation. Statistics on model consistency with the links are provided. Tools to evaluate the correlation of crosslinks in ensembles of models were developed. TopoLink was used to evaluate the potential crosslinks of all structures of the CATH database. The number of crosslinks expected as a function of protein size and linker length can be used as a guide for experimental design.
Availability and implementation
TopoLink is available as free software at http://m3g.iqm.unicamp.br/topolink, and distributed as source code with a user-friendly graphical interface for Windows. A web server is also provided.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Allan J R Ferrari
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
- Fabio C Gozzo
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
- Leandro Martínez
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
  - Center for Computing in Engineering & Sciences, University of Campinas, Campinas, SP, Brazil
12
Ferrari AJR, Gozzo FC, Martínez L. Statistical force-field for structural modeling using chemical cross-linking/mass spectrometry distance constraints. Bioinformatics 2019; 35:3005-3012. [DOI: 10.1093/bioinformatics/btz013] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/18/2018] [Revised: 12/03/2018] [Accepted: 01/04/2019] [Indexed: 12/22/2022] Open
Abstract
Motivation
Chemical cross-linking/mass spectrometry (XLMS) is an experimental method to obtain distance constraints between amino acid residues which can be applied to structural modeling of tertiary and quaternary biomolecular structures. These constraints provide, in principle, only upper limits to the distance between amino acid residues along the surface of the biomolecule. In practice, attempts to use XLMS constraints for tertiary protein structure determination have not been widely successful. This indicates the need for specifically designed strategies for the representation of these constraints within modeling algorithms.
Results
A force-field designed to represent XLMS-derived constraints is proposed. The potential energy functions are obtained by computing, in the database of known protein structures, the probability of satisfaction of a topological cross-linking distance as a function of the Euclidean distance between amino acid residues. First, the strategy suggests that XL constraints should be set to shorter distances than usually assumed. Second, the complete statistical force-field improves the models obtained and can be easily incorporated into current modeling methods and software. The force-field was implemented and is distributed to be used within the Rosetta ab initio relax protocol.
Availability and implementation
Force-field parameters and usage instructions are freely available online (http://m3g.iqm.unicamp.br/topolink/xlff).
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Allan J R Ferrari
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
- Fabio C Gozzo
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
- Leandro Martínez
  - Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
  - Center for Computing in Engineering & Sciences, University of Campinas, Campinas, SP, Brazil
13
Cross-linking mass spectrometry: methods and applications in structural, molecular and systems biology. Nat Struct Mol Biol 2018; 25:1000-1008. [PMID: 30374081 DOI: 10.1038/s41594-018-0147-0] [Citation(s) in RCA: 201] [Impact Index Per Article: 33.5] [Received: 06/04/2018] [Accepted: 09/19/2018] [Indexed: 01/11/2023]
Abstract
Over the past decade, cross-linking mass spectrometry (CLMS) has developed into a robust and flexible tool that provides medium-resolution structural information. CLMS data provide a measure of the proximity of amino acid residues and thus offer information on the folds of proteins and the topology of their complexes. Here, we highlight notable successes of this technique as well as common pipelines. Novel CLMS applications, such as in-cell cross-linking, probing conformational changes and tertiary-structure determination, are now beginning to make contributions to molecular biology and the emerging fields of structural systems biology and interactomics.