1. Mitra R, Li J, Sagendorf JM, Jiang Y, Chiu TP, Rohs R. DeepPBS: Geometric deep learning for interpretable prediction of protein-DNA binding specificity. bioRxiv 2023:2023.12.15.571942. [PMID: 38293168 PMCID: PMC10827229 DOI: 10.1101/2023.12.15.571942]
Abstract
Predicting specificity in protein-DNA interactions is a challenging yet essential task for understanding gene regulation. Here, we present Deep Predictor of Binding Specificity (DeepPBS), a geometric deep-learning model designed to predict binding specificity across protein families based on protein-DNA structures. The DeepPBS architecture allows investigation of different family-specific recognition patterns. DeepPBS can be applied to predicted structures, and can aid in the modeling of protein-DNA complexes. DeepPBS is interpretable and can be used to calculate protein heavy atom-level importance scores, demonstrated in a case study on the p53-DNA interface. When aggregated at the protein residue level, these scores conform well with alanine scanning mutagenesis experimental data. The inference time for DeepPBS is sufficiently fast for analyzing simulation trajectories, as demonstrated on a molecular-dynamics simulation of a Drosophila Hox-DNA tertiary complex with its cofactor. DeepPBS and its corresponding data resources offer a foundation for machine-aided protein-DNA interaction studies, guiding experimental choices and complex design, as well as advancing our understanding of molecular interactions.
2. Glasscock CJ, Pecoraro R, McHugh R, Doyle LA, Chen W, Boivin O, Lonnquist B, Na E, Politanska Y, Haddox HK, Cox D, Norn C, Coventry B, Goreshnik I, Vafeados D, Lee GR, Gordan R, Stoddard BL, DiMaio F, Baker D. Computational design of sequence-specific DNA-binding proteins. bioRxiv 2023:2023.09.20.558720. [PMID: 37790440 PMCID: PMC10542524 DOI: 10.1101/2023.09.20.558720]
Abstract
Sequence-specific DNA-binding proteins (DBPs) play critical roles in biology and biotechnology, and there has been considerable interest in the engineering of DBPs with new or altered specificities for genome editing and other applications. While there has been some success in reprogramming naturally occurring DBPs using selection methods, the computational design of new DBPs that recognize arbitrary target sites remains an outstanding challenge. We describe a computational method for the design of small DBPs that recognize specific target sequences through interactions with bases in the major groove, and employ this method in conjunction with experimental screening to generate binders for 5 distinct DNA targets. These binders exhibit specificity closely matching the computational models for the target DNA sequences at as many as 6 base positions and affinities as low as 30-100 nM. The crystal structure of a designed DBP-target site complex is in close agreement with the design model, highlighting the accuracy of the design method. The designed DBPs function in both Escherichia coli and mammalian cells to repress and activate transcription of neighboring genes. Our method is a substantial step towards a general route to small and hence readily deliverable sequence-specific DBPs for gene regulation and editing.
Affiliation(s)
- Cameron J. Glasscock
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Robert Pecoraro
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Physics, University of Washington, Seattle, WA, USA
| | - Ryan McHugh
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lindsey A. Doyle
- Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Wei Chen
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Olivier Boivin
- Program in Genetics and Genomics, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
| | - Beau Lonnquist
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Bioengineering, University of Washington, Seattle, WA, USA
| | - Emily Na
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Yuliya Politanska
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Hugh K. Haddox
- Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - David Cox
- Department of Biochemistry, Stanford University School of Medicine, Palo Alto, CA, USA
- Department of Medicine, Division of Hematology, Stanford University, Stanford, CA, USA
| | - Christoffer Norn
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- BioInnovation Institute, DK2200 Copenhagen N, Denmark
| | - Brian Coventry
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Inna Goreshnik
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Dionne Vafeados
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Gyu Rie Lee
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Raluca Gordan
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Department of Computer Science, Department of Molecular Genetics and Microbiology, Duke University, Durham, NC, USA
| | - Barry L. Stoddard
- Division of Basic Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- BioInnovation Institute, DK2200 Copenhagen N, Denmark
3. Pavlovicz RE, Park H, DiMaio F. Efficient consideration of coordinated water molecules improves computational protein-protein and protein-ligand docking discrimination. PLoS Comput Biol 2020; 16:e1008103. [PMID: 32956350 PMCID: PMC7529342 DOI: 10.1371/journal.pcbi.1008103] [Received: 11/07/2019] [Revised: 10/01/2020] [Accepted: 06/29/2020]
Abstract
Highly coordinated water molecules are frequently an integral part of protein-protein and protein-ligand interfaces. We introduce an updated energy model that efficiently captures the energetic effects of these ordered water molecules on the surfaces of proteins. A two-stage method is developed in which polar groups arranged in geometries suitable for water placement are first identified, then a modified Monte Carlo simulation allows highly coordinated waters to be placed on the surface of a protein while simultaneously sampling amino acid side chain orientations. This “semi-explicit” water model is implemented in Rosetta and is suitable for both structure prediction and protein design. We show that our new approach and energy model yield significant improvements in native structure recovery of protein-protein and protein-ligand docking discrimination tests.
Well-coordinated water molecules, those forming multiple hydrogen bonds with nearby polar groups, play an important role in the structure of biomolecular systems, yet the effect of these waters is often not considered in molecular energy computations. In this paper, we describe a method to efficiently consider these water molecules both implicitly and explicitly at the interfaces formed by two polar molecules. In computations related to determining how a protein interacts with binding partners, we show that the use of this new method significantly improves results. Future application of this approach may improve the design of new protein and small molecule drugs.
Affiliation(s)
- Ryan E. Pavlovicz
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
| | - Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
- Institute for Protein Design, University of Washington, Seattle, Washington, United States of America
4. Leman JK, Weitzner BD, Lewis SM, Adolf-Bryfogle J, Alam N, Alford RF, Aprahamian M, Baker D, Barlow KA, Barth P, Basanta B, Bender BJ, Blacklock K, Bonet J, Boyken SE, Bradley P, Bystroff C, Conway P, Cooper S, Correia BE, Coventry B, Das R, De Jong RM, DiMaio F, Dsilva L, Dunbrack R, Ford AS, Frenz B, Fu DY, Geniesse C, Goldschmidt L, Gowthaman R, Gray JJ, Gront D, Guffy S, Horowitz S, Huang PS, Huber T, Jacobs TM, Jeliazkov JR, Johnson DK, Kappel K, Karanicolas J, Khakzad H, Khar KR, Khare SD, Khatib F, Khramushin A, King IC, Kleffner R, Koepnick B, Kortemme T, Kuenze G, Kuhlman B, Kuroda D, Labonte JW, Lai JK, Lapidoth G, Leaver-Fay A, Lindert S, Linsky T, London N, Lubin JH, Lyskov S, Maguire J, Malmström L, Marcos E, Marcu O, Marze NA, Meiler J, Moretti R, Mulligan VK, Nerli S, Norn C, Ó'Conchúir S, Ollikainen N, Ovchinnikov S, Pacella MS, Pan X, Park H, Pavlovicz RE, Pethe M, Pierce BG, Pilla KB, Raveh B, Renfrew PD, Burman SSR, Rubenstein A, Sauer MF, Scheck A, Schief W, Schueler-Furman O, Sedan Y, Sevy AM, Sgourakis NG, Shi L, Siegel JB, Silva DA, Smith S, Song Y, Stein A, Szegedy M, Teets FD, Thyme SB, Wang RYR, Watkins A, Zimmerman L, Bonneau R. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat Methods 2020; 17:665-680. [PMID: 32483333 PMCID: PMC7603796 DOI: 10.1038/s41592-020-0848-2] [Received: 04/29/2019] [Accepted: 04/22/2020]
Abstract
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
Affiliation(s)
- Julia Koehler Leman
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA.
- Department of Biology, New York University, New York, New York, USA.
| | - Brian D Weitzner
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Lyell Immunopharma Inc., Seattle, WA, USA
| | - Steven M Lewis
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Biochemistry, Duke University, Durham, NC, USA
- Cyrus Biotechnology, Seattle, WA, USA
| | - Jared Adolf-Bryfogle
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
| | - Nawsad Alam
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Rebecca F Alford
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Melanie Aprahamian
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Kyle A Barlow
- Graduate Program in Bioinformatics, University of California San Francisco, San Francisco, CA, USA
| | - Patrick Barth
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Baylor College of Medicine, Department of Pharmacology, Houston, TX, USA
| | - Benjamin Basanta
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Biological Physics Structure and Design PhD Program, University of Washington, Seattle, WA, USA
| | - Brian J Bender
- Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
| | - Kristin Blacklock
- Institute of Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Jaume Bonet
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Scott E Boyken
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Lyell Immunopharma Inc., Seattle, WA, USA
| | - Phil Bradley
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Chris Bystroff
- Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, USA
| | - Patrick Conway
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Seth Cooper
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Bruno E Correia
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Brian Coventry
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Rhiju Das
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Lorna Dsilva
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Roland Dunbrack
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, PA, USA
| | - Alexander S Ford
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Brandon Frenz
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Cyrus Biotechnology, Seattle, WA, USA
| | - Darwin Y Fu
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
| | - Caleb Geniesse
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
| | | | - Ragul Gowthaman
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| | - Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, USA
| | - Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
| | - Sharon Guffy
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Scott Horowitz
- Department of Chemistry & Biochemistry, University of Denver, Denver, CO, USA
- The Knoebel Institute for Healthy Aging, University of Denver, Denver, CO, USA
| | - Po-Ssu Huang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Thomas Huber
- Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Tim M Jacobs
- Program in Bioinformatics and Computational Biology, Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - David K Johnson
- Center for Computational Biology, University of Kansas, Lawrence, KS, USA
| | - Kalli Kappel
- Biophysics Program, Stanford University, Stanford, CA, USA
| | - John Karanicolas
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, PA, USA
| | - Hamed Khakzad
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute for Computational Science, University of Zurich, Zurich, Switzerland
- S3IT, University of Zurich, Zurich, Switzerland
| | - Karen R Khar
- Cyrus Biotechnology, Seattle, WA, USA
- Center for Computational Biology, University of Kansas, Lawrence, KS, USA
| | - Sagar D Khare
- Institute of Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, The State University of New Jersey, Piscataway, NJ, USA
- Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Computational Biology and Molecular Biophysics Program, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Firas Khatib
- Department of Computer and Information Science, University of Massachusetts Dartmouth, Dartmouth, MA, USA
| | - Alisa Khramushin
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Indigo C King
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Cyrus Biotechnology, Seattle, WA, USA
| | - Robert Kleffner
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
| | - Brian Koepnick
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Georg Kuenze
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Daisuke Kuroda
- Medical Device Development and Regulation Research Center, School of Engineering, University of Tokyo, Tokyo, Japan
- Department of Bioengineering, School of Engineering, University of Tokyo, Tokyo, Japan
| | - Jason W Labonte
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Chemistry, Franklin & Marshall College, Lancaster, PA, USA
| | - Jason K Lai
- Baylor College of Medicine, Department of Pharmacology, Houston, TX, USA
| | - Gideon Lapidoth
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Andrew Leaver-Fay
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, USA
| | - Thomas Linsky
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Nir London
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Joseph H Lubin
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Sergey Lyskov
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Jack Maguire
- Program in Bioinformatics and Computational Biology, Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Lars Malmström
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute for Computational Science, University of Zurich, Zurich, Switzerland
- S3IT, University of Zurich, Zurich, Switzerland
- Division of Infection Medicine, Department of Clinical Sciences Lund, Faculty of Medicine, Lund University, Lund, Sweden
| | - Enrique Marcos
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Research in Biomedicine Barcelona, The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Orly Marcu
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Nicholas A Marze
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Jens Meiler
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Departments of Chemistry, Pharmacology and Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
- Institute for Chemical Biology, Vanderbilt University, Nashville, TN, USA
| | - Rocco Moretti
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
| | - Vikram Khipple Mulligan
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Santrupti Nerli
- Department of Computer Science, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Christoffer Norn
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Shane Ó'Conchúir
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Noah Ollikainen
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Sergey Ovchinnikov
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
| | - Michael S Pacella
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Xingjie Pan
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Ryan E Pavlovicz
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Cyrus Biotechnology, Seattle, WA, USA
| | - Manasi Pethe
- Department of Chemistry and Chemical Biology, The State University of New Jersey, Piscataway, NJ, USA
- Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| | - Kala Bharath Pilla
- Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Barak Raveh
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - P Douglas Renfrew
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Shourya S Roy Burman
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Aliza Rubenstein
- Institute of Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Computational Biology and Molecular Biophysics Program, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Marion F Sauer
- Chemical and Physical Biology Program, Vanderbilt Vaccine Center, Vanderbilt University, Nashville, TN, USA
| | - Andreas Scheck
- Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - William Schief
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA, USA
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Yuval Sedan
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Alexander M Sevy
- Chemical and Physical Biology Program, Vanderbilt Vaccine Center, Vanderbilt University, Nashville, TN, USA
| | - Nikolaos G Sgourakis
- Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Lei Shi
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Justin B Siegel
- Department of Chemistry, University of California, Davis, Davis, CA, USA
- Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, California, USA
- Genome Center, University of California, Davis, Davis, CA, USA
| | | | - Shannon Smith
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
| | - Yifan Song
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Cyrus Biotechnology, Seattle, WA, USA
| | - Amelie Stein
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
| | - Maria Szegedy
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Frank D Teets
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Summer B Thyme
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Ray Yu-Ruei Wang
- Department of Biochemistry, University of Washington, Seattle, WA, USA
| | - Andrew Watkins
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA
| | - Lior Zimmerman
- Department of Microbiology and Molecular Genetics, IMRIC, Ein Kerem Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Richard Bonneau
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA.
- Department of Biology, New York University, New York, New York, USA.
- Department of Computer Science, New York University, New York, NY, USA.
- Center for Data Science, New York University, New York, NY, USA.
| |
5. Gapsys V, de Groot BL. Alchemical Free Energy Calculations for Nucleotide Mutations in Protein–DNA Complexes. J Chem Theory Comput 2017; 13:6275-6289. [DOI: 10.1021/acs.jctc.7b00849]
Affiliation(s)
- Vytautas Gapsys
- Computational Biomolecular Dynamics Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
| | - Bert L. de Groot
- Computational Biomolecular Dynamics Group, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
6. Alford RF, Leaver-Fay A, Jeliazkov JR, O’Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K, Labonte JW, Pacella MS, Bonneau R, Bradley P, Dunbrack RL, Das R, Baker D, Kuhlman B, Kortemme T, Gray JJ. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design. J Chem Theory Comput 2017; 13:3031-3048. [PMID: 28430426 PMCID: PMC5717763 DOI: 10.1021/acs.jctc.7b00125]
Abstract
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta's success is the energy function: a model parametrized from small-molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, called the Rosetta Energy Function 2015 (REF15). Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend its capabilities from soluble proteins to also include membrane proteins, peptides containing noncanonical amino acids, small molecules, carbohydrates, nucleic acids, and other macromolecules.
Affiliation(s)
- Rebecca F. Alford
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, United States
| | - Andrew Leaver-Fay
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, 120 Mason Farm Road, Chapel Hill, North Carolina 27599, United States
| | - Jeliazko R. Jeliazkov
- Program in Molecular Biophysics, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, United States
| | - Matthew J. O’Meara
- Department of Pharmaceutical Chemistry, University of California at San Francisco, 1700 Fourth Street, San Francisco, California 94158, United States
| | - Frank P. DiMaio
- Department of Biochemistry, University of Washington, J-Wing Health Sciences Building, Box 357350, Seattle, Washington 98195, United States
| | - Hahnbeom Park
- Department of Biochemistry, University of Washington, Molecular Engineering and Sciences, Box 357350, 4000 15th Ave NE, Seattle, Washington 98195, United States
| | - Maxim V. Shapovalov
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111, United States
| | - P. Douglas Renfrew
- Department of Biology, Center for Genomics and Systems Biology, New York University, 100 Washington Square East, New York, New York 10003
- Center for Computational Biology, Flatiron Institute, Simons Foundation, 162 5 Avenue, New York, New York 10010, United States
| | - Vikram K. Mulligan
- Department of Biochemistry, University of Washington, Molecular Engineering and Sciences, Box 357350, 4000 15 Ave NE, Seattle, Washington 98195, United States
| | - Kalli Kappel
- Biophysics Program, Stanford University, 450 Serra Mall, Stanford, California 94305, United States
| | - Jason W. Labonte
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, United States
| | - Michael S. Pacella
- Department of Biomedical Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, United States
| | - Richard Bonneau
- Department of Biology, Center for Genomics and Systems Biology, New York University, 100 Washington Square East, New York, New York 10003
- Center for Computational Biology, Flatiron Institute, Simons Foundation, 162 5 Avenue, New York, New York 10010, United States
| | - Philip Bradley
- Computational Biology Program, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, Washington 98109, United States
| | - Roland L. Dunbrack
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111, United States
| | - Rhiju Das
- Biophysics Program, Stanford University, 450 Serra Mall, Stanford, California 94305, United States
| | - David Baker
- Department of Biochemistry, University of Washington, Molecular Engineering and Sciences, Box 357350, 4000 15 Ave NE, Seattle, Washington 98195, United States
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, 120 Mason Farm Road, Chapel Hill, North Carolina 27599, United States
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California at San Francisco, San Francisco, California 94158, United States
| | - Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218, United States
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, 120 Mason Farm Road, Chapel Hill, North Carolina 27599, United States
| |
Collapse
|
7
|
Park H, Bradley P, Greisen P, Liu Y, Mulligan VK, Kim DE, Baker D, DiMaio F. Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules. J Chem Theory Comput 2016; 12:6201-6212. [PMID: 27766851 PMCID: PMC5515585 DOI: 10.1021/acs.jctc.6b00819] [Citation(s) in RCA: 275] [Impact Index Per Article: 34.4] [Indexed: 01/29/2023]
Abstract
Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking have been parametrized using existing macromolecular structural data; this contrasts molecular mechanics force fields which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein-protein and protein-ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties.
Affiliation(s)
- Hahnbeom Park
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Philip Bradley
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
  - Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N., Seattle, Washington 98019, USA
- Per Greisen
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Yuan Liu
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Vikram Khipple Mulligan
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- David E. Kim
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, Washington 98195, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, Washington 98195, USA
- Frank DiMaio
  - Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
|
8
|
Topham CM, Barbe S, André I. An Atomistic Statistically Effective Energy Function for Computational Protein Design. J Chem Theory Comput 2016; 12:4146-68. [PMID: 27341125 DOI: 10.1021/acs.jctc.6b00090] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022]
Abstract
Shortcomings in the definition of effective free-energy surfaces of proteins are recognized to be a major contributory factor responsible for the low success rates of existing automated methods for computational protein design (CPD). The formulation of an atomistic statistically effective energy function (SEEF) suitable for a wide range of CPD applications and its derivation from structural data extracted from protein domains and protein-ligand complexes are described here. The proposed energy function comprises nonlocal atom-based and local residue-based SEEFs, which are coupled using a novel atom connectivity number factor to scale short-range, pairwise, nonbonded atomic interaction energies and a surface-area-dependent cavity energy term. This energy function was used to derive additional SEEFs describing the unfolded-state ensemble of any given residue sequence based on computed average energies for partially or fully solvent-exposed fragments in regions of irregular structure in native proteins. Relative thermal stabilities of 97 T4 bacteriophage lysozyme mutants were predicted from calculated energy differences for folded and unfolded states with an average unsigned error (AUE) of 0.84 kcal mol⁻¹ when compared to experiment. To demonstrate the utility of the energy function for CPD, further validation was carried out in tests of its capacity to recover cognate protein sequences and to discriminate native and near-native protein folds, loop conformers, and small-molecule ligand binding poses from non-native benchmark decoys. Experimental ligand binding free energies for a diverse set of 80 protein complexes could be predicted with an AUE of 2.4 kcal mol⁻¹ using an additional energy term to account for the loss in ligand configurational entropy upon binding. The atomistic SEEF is expected to improve the accuracy of residue-based coarse-grained SEEFs currently used in CPD and to extend the range of applications of extant atom-based protein statistical potentials.
Affiliation(s)
- Christopher M Topham
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
  - CNRS, UMR5504, F-31400 Toulouse, France
  - INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
- Sophie Barbe
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
  - CNRS, UMR5504, F-31400 Toulouse, France
  - INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
- Isabelle André
  - Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
  - CNRS, UMR5504, F-31400 Toulouse, France
  - INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
|
9
|
Recognition Code of ZNF191(243-368) and Its Interaction with DNA. Bioinorg Chem Appl 2015; 2015:416751. [PMID: 26457075 PMCID: PMC4592708 DOI: 10.1155/2015/416751] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/29/2015] [Accepted: 09/02/2015] [Indexed: 02/05/2023] Open
Abstract
ZNF191(243-368) is the C-terminal region of ZNF191 which contains a putative DNA-binding domain of four Cys2His2 zinc finger motifs. In this study, an expression vector of a fusion protein of ZNF191(243-368) with glutathione-S-transferase (GST) was constructed and transformed into Escherichia coli BL21. The fusion protein GST-ZNF191(243-368) was expressed using this vector to investigate the protein-DNA binding reaction through an affinity selection strategy on the basis of the binding quality of the zinc finger domain. Results showed that ZNF191(243-368) can selectively bind with sequences and react with genes which contain an AGGG core. However, the recognition mechanism of Cys2His2 zinc finger proteins to DNA warrants further investigation.
|
10
|
Garton M, Najafabadi HS, Schmitges FW, Radovani E, Hughes TR, Kim PM. A structural approach reveals how neighbouring C2H2 zinc fingers influence DNA binding specificity. Nucleic Acids Res 2015; 43:9147-57. [PMID: 26384429 PMCID: PMC4627083 DOI: 10.1093/nar/gkv919] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 04/28/2015] [Accepted: 09/05/2015] [Indexed: 12/28/2022] Open
Abstract
Development of an accurate protein–DNA recognition code that can predict DNA specificity from protein sequence is a central problem in biology. C2H2 zinc fingers constitute by far the largest family of DNA binding domains and their binding specificity has been studied intensively. However, despite decades of research, accurate prediction of DNA specificity remains elusive. A major obstacle is thought to be the inability of current methods to account for the influence of neighbouring domains. Here we show that this problem can be addressed using a structural approach: we build structural models for all C2H2-ZF–DNA complexes with known binding motifs and find six distinct binding modes. Each mode changes the orientation of specificity residues with respect to the DNA, thereby modulating base preference. Most importantly, the structural analysis shows that residues at the domain interface strongly and predictably influence the binding mode, and hence specificity. Accounting for predicted binding mode significantly improves prediction accuracy of predicted motifs. This new insight into the fundamental behaviour of C2H2-ZFs has implications for both improving the prediction of natural zinc finger-binding sites, and for prioritizing further experiments to complete the code. It also provides a new design feature for zinc finger engineering.
Affiliation(s)
- Michael Garton
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Hamed S Najafabadi
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Frank W Schmitges
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
- Ernest Radovani
  - Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
- Timothy R Hughes
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
- Philip M Kim
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto M5S 3E1, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto M5S 1A8, Canada
  - Department of Computer Science, University of Toronto, Toronto M5S 2E4, Canada
|
11
|
Persikov AV, Wetzel JL, Rowland EF, Oakes BL, Xu DJ, Singh M, Noyes MB. A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res 2015; 43:1965-84. [PMID: 25593323 PMCID: PMC4330361 DOI: 10.1093/nar/gku1395] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Indexed: 11/13/2022] Open
Abstract
Cys2His2 zinc fingers (C2H2-ZFs) comprise the largest class of metazoan DNA-binding domains. Despite this domain's well-defined DNA-recognition interface, and its successful use in the design of chimeric proteins capable of targeting genomic regions of interest, much remains unknown about its DNA-binding landscape. To help bridge this gap in fundamental knowledge and to provide a resource for design-oriented applications, we screened large synthetic protein libraries to select binding C2H2-ZF domains for each possible three base pair target. The resulting data consist of >160 000 unique domain-DNA interactions and comprise the most comprehensive investigation of C2H2-ZF DNA-binding interactions to date. An integrated analysis of these independent screens yielded DNA-binding profiles for tens of thousands of domains and led to the successful design and prediction of C2H2-ZF DNA-binding specificities. Computational analyses uncovered important aspects of C2H2-ZF domain-DNA interactions, including the roles of within-finger context and domain position on base recognition. We observed the existence of numerous distinct binding strategies for each possible three base pair target and an apparent balance between affinity and specificity of binding. In sum, our comprehensive data help elucidate the complex binding landscape of C2H2-ZF domains and provide a foundation for efforts to determine, predict and engineer their DNA-binding specificities.
Affiliation(s)
- Anton V Persikov
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Joshua L Wetzel
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Elizabeth F Rowland
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Benjamin L Oakes
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Denise J Xu
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Mona Singh
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Marcus B Noyes
  - The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
12
|
Joyce AP, Zhang C, Bradley P, Havranek JJ. Structure-based modeling of protein: DNA specificity. Brief Funct Genomics 2014; 14:39-49. [PMID: 25414269 DOI: 10.1093/bfgp/elu044] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
Protein:DNA interactions are essential to a range of processes that maintain and express the information encoded in the genome. Structural modeling is an approach that aims to understand these interactions at the physicochemical level. It has been proposed that structural modeling can lead to deeper understanding of the mechanisms of protein:DNA interactions, and that progress in this field can not only help to rationalize the observed specificities of DNA-binding proteins but also to allow researchers to engineer novel DNA site specificities. In this review we discuss recent developments in the structural description of protein:DNA interactions and specificity, as well as the challenges facing the field in the future.
|
13
|
Ashworth J, Plaisier CL, Lo FY, Reiss DJ, Baliga NS. Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction. PLoS One 2014; 9:e107863. [PMID: 25255272 PMCID: PMC4177876 DOI: 10.1371/journal.pone.0107863] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/15/2014] [Accepted: 08/16/2014] [Indexed: 11/18/2022] Open
Abstract
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.
Affiliation(s)
- Justin Ashworth
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - * E-mail: (JA); (NB)
- Fang Yin Lo
  - Institute for Systems Biology, Seattle, Washington, United States of America
- David J. Reiss
  - Institute for Systems Biology, Seattle, Washington, United States of America
- Nitin S. Baliga
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Department of Microbiology, University of Washington, Seattle, Washington, United States of America
  - * E-mail: (JA); (NB)
|
14
|
Abstract
The rapid development of programmable site-specific endonucleases has led to a dramatic increase in genome engineering activities for research and therapeutic purposes. Specific loci of interest in the genomes of a wide range of organisms including mammals can now be modified using zinc-finger nucleases, transcription activator-like effector nucleases, and CRISPR-associated Cas9 endonucleases in a site-specific manner, in some cases requiring relatively modest effort for endonuclease design, construction, and application. While these technologies have made genome engineering widely accessible, the ability of programmable nucleases to cleave off-target sequences can limit their applicability and raise concerns about therapeutic safety. In this chapter, we review methods to evaluate and improve the DNA cleavage activity of programmable site-specific endonucleases and describe a procedure for a comprehensive off-target profiling method based on the in vitro selection of very large (~10¹²-membered) libraries of potential nuclease substrates.
|
15
|
Abstract
Building protein tools that can selectively bind or cleave specific DNA sequences requires efficient technologies for modifying protein-DNA interactions. Computational design is one method for accomplishing this goal. In this chapter, we present the current state of protein-DNA interface design with the Rosetta macromolecular modeling program. The LAGLIDADG endonuclease family of DNA-cleaving enzymes, under study as potential gene therapy reagents, has been the main testing ground for these in silico protocols. At this time, the computational methods are most useful for designing endonuclease variants that can accommodate small numbers of target site substitutions. Attempts to engineer for more extensive interface changes will likely benefit from an approach that uses the computational design results in conjunction with a high-throughput directed evolution or screening procedure. The family of enzymes presents an engineering challenge because their interfaces are highly integrated and there is significant coordination between the binding and catalysis events. Future developments in the computational algorithms depend on experimental feedback to improve understanding and modeling of these complex enzymatic features. This chapter presents both the basic method of design that has been successfully used to modulate specificity and more advanced procedures that incorporate DNA flexibility and other properties that are likely necessary for reliable modeling of more extensive target site changes.
Affiliation(s)
- Summer Thyme
  - Department of Biological Sciences, University of Washington, Seattle, WA, USA
|
16
|
Li HL, Nakano T, Hotta A. Genetic correction using engineered nucleases for gene therapy applications. Dev Growth Differ 2013; 56:63-77. [PMID: 24329887 DOI: 10.1111/dgd.12107] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 09/20/2013] [Revised: 10/20/2013] [Accepted: 10/20/2013] [Indexed: 12/24/2022]
Abstract
Genetic mutations in humans are associated with congenital disorders and phenotypic traits. Gene therapy holds the promise to cure such genetic disorders, although it has suffered from several technical limitations for decades. Recent progress in gene editing technology using tailor-made nucleases, such as meganucleases (MNs), zinc finger nucleases (ZFNs), TAL effector nucleases (TALENs) and, more recently, CRISPR/Cas9, has significantly broadened our ability to precisely modify target sites in the human genome. In this review, we summarize recent progress in gene correction approaches of the human genome, with a particular emphasis on the clinical applications of gene therapy.
Affiliation(s)
- Hongmei Lisa Li
  - Department of Reprogramming Science, Center for iPS cell Research and Applications (CiRA), Kyoto University, Kyoto, Japan; Japan Society for the Promotion of Science (JSPS), Tokyo, Japan
|
17
|
Thyme SB, Boissel SJS, Arshiya Quadri S, Nolan T, Baker DA, Park RU, Kusak L, Ashworth J, Baker D. Reprogramming homing endonuclease specificity through computational design and directed evolution. Nucleic Acids Res 2013; 42:2564-76. [PMID: 24270794 PMCID: PMC3936771 DOI: 10.1093/nar/gkt1212] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 12/19/2022] Open
Abstract
Homing endonucleases (HEs) can be used to induce targeted genome modification to reduce the fitness of pathogen vectors such as the malaria-transmitting Anopheles gambiae and to correct deleterious mutations in genetic diseases. We describe the creation of an extensive set of HE variants with novel DNA cleavage specificities using an integrated experimental and computational approach. Using computational modeling and an improved selection strategy, which optimizes specificity in addition to activity, we engineered an endonuclease to cleave in a gene associated with Anopheles sterility and another to cleave near a mutation that causes pyruvate kinase deficiency. In the course of this work we observed unanticipated context-dependence between bases which will need to be mechanistically understood for reprogramming of specificity to succeed more generally.
Affiliation(s)
- Summer B Thyme
  - Department of Biochemistry, University of Washington, UW Box 357350, 1705 NE Pacific St., Seattle, WA 98195, USA, Graduate Program in Biomolecular Structure and Design, University of Washington, UW Box 357350, 1705 NE Pacific St., Seattle, WA 98195, USA, Graduate Program in Molecular and Cellular Biology, University of Washington, UW Box 357275, 1959 NE Pacific St., Seattle, WA 98195, USA, Department of Life Sciences, Sir Alexander Fleming Building, Imperial College London, Imperial College Road, London SW7 2AZ, UK, Department of Genetics, University of Cambridge, Downing Street, Cambridge CB1 3QA, UK, Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA 98109, USA and Howard Hughes Medical Institute, University of Washington, UW Box 357350, 1705 NE Pacific St., Seattle, WA 98195, USA
|
18
|
Siggers T, Gordân R. Protein-DNA binding: complexities and multi-protein codes. Nucleic Acids Res 2013; 42:2099-111. [PMID: 24243859 PMCID: PMC3936734 DOI: 10.1093/nar/gkt1112] [Citation(s) in RCA: 153] [Impact Index Per Article: 13.9] [Indexed: 01/02/2023] Open
Abstract
Binding of proteins to particular DNA sites across the genome is a primary determinant of specificity in genome maintenance and gene regulation. DNA-binding specificity is encoded at multiple levels, from the detailed biophysical interactions between proteins and DNA, to the assembly of multi-protein complexes. At each level, variation in the mechanisms used to achieve specificity has led to difficulties in constructing and applying simple models of DNA binding. We review the complexities in protein–DNA binding found at multiple levels and discuss how they confound the idea of simple recognition codes. We discuss the impact of new high-throughput technologies for the characterization of protein–DNA binding, and how these technologies are uncovering new complexities in protein–DNA recognition. Finally, we review the concept of multi-protein recognition codes in which new DNA-binding specificities are achieved by the assembly of multi-protein complexes.
Affiliation(s)
- Trevor Siggers
  - Department of Biology, Boston University, Boston, MA 02215, USA, Departments of Biostatistics and Bioinformatics, Computer Science, and Molecular Genetics and Microbiology, Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
|
19
|
Esperón P, Scazzocchio C, Paulino M. In vitro and in silico analysis of the Aspergillus nidulans DNA–CreA repressor interactions. J Biomol Struct Dyn 2013; 32:2033-41. [DOI: 10.1080/07391102.2013.843474] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023]
|
20
|
Persikov AV, Singh M. De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins. Nucleic Acids Res 2013; 42:97-108. [PMID: 24097433 PMCID: PMC3874201 DOI: 10.1093/nar/gkt890] [Citation(s) in RCA: 134] [Impact Index Per Article: 12.2] [Indexed: 12/13/2022] Open
Abstract
Proteins with sequence-specific DNA binding function are important for a wide range of biological activities. De novo prediction of their DNA-binding specificities from sequence alone would be a great aid in inferring cellular networks. Here we introduce a method for predicting DNA-binding specificities for Cys2His2 zinc fingers (C2H2-ZFs), the largest family of DNA-binding proteins in metazoans. We develop a general approach, based on empirical calculations of pairwise amino acid–nucleotide interaction energies, for predicting position weight matrices (PWMs) representing DNA-binding specificities for C2H2-ZF proteins. We predict DNA-binding specificities on a per-finger basis and merge predictions for C2H2-ZF domains that are arrayed within sequences. We test our approach on a diverse set of natural C2H2-ZF proteins with known binding specificities and demonstrate that for >85% of the proteins, their predicted PWMs are accurate in 50% of their nucleotide positions. For proteins with several zinc finger isoforms, we show via case studies that this level of accuracy enables us to match isoforms with their known DNA-binding specificities. A web server for predicting a PWM given a protein containing C2H2-ZF domains is available online at http://zf.princeton.edu and can be used to aid in protein engineering applications and in genome-wide searches for transcription factor targets.
Affiliation(s)
- Anton V Persikov
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ 08544, USA and Department of Computer Science, Princeton University, Princeton NJ 08544, USA
|
21
|
Yusa K. Seamless genome editing in human pluripotent stem cells using custom endonuclease-based gene targeting and the piggyBac transposon. Nat Protoc 2013. [PMID: 24071911 DOI: 10.1038/nprot.2013.126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
|
22
|
Yusa K. Seamless genome editing in human pluripotent stem cells using custom endonuclease-based gene targeting and the piggyBac transposon. Nat Protoc 2013; 8:2061-78. [PMID: 24071911 DOI: 10.1038/nprot.2013.126] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
I report here a detailed protocol for seamless genome editing using the piggyBac transposon in human pluripotent stem cells (hPSCs). Recent advances in custom endonucleases have enabled us to routinely perform genome editing in hPSCs. Conventional approaches use the Cre/loxP system that leaves behind residual sequences in the targeted genome. I used the piggyBac transposon to seamlessly remove a drug selection cassette and demonstrated safe genetic correction of a mutation causing α-1 antitrypsin deficiency in patient-derived hPSCs. An alternative approach to using the piggyBac transposon to correct mutations involves using single-stranded oligonucleotides, which is a faster process to complete. However, this experimental procedure is rather complicated and it may be hard to achieve homozygous modifications. In contrast, using the piggyBac transposon with drug selection-based enrichment of genetic modifications, as described here, is simple and can yield multiple correctly targeted clones, including homozygotes. Although two rounds of genetic manipulation are required to achieve homozygote modifications, the entire process takes ∼3 months to complete.
Affiliation(s)
- Kosuke Yusa
- Wellcome Trust Sanger Institute, Cambridge, UK.
|
23
|
Li S, Bradley P. Probing the role of interfacial waters in protein-DNA recognition using a hybrid implicit/explicit solvation model. Proteins 2013; 81:1318-29. [PMID: 23444044 DOI: 10.1002/prot.24272] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2012] [Accepted: 02/06/2013] [Indexed: 01/30/2023]
Abstract
When proteins bind to their DNA target sites, ordered water molecules are often present at the protein-DNA interface bridging protein and DNA through hydrogen bonds. What is the role of these ordered interfacial waters? Are they important determinants of the specificity of DNA sequence recognition, or do they act in binding in a primarily nonspecific manner, by improving packing of the interface, shielding unfavorable electrostatic interactions, and solvating unsatisfied polar groups that are inaccessible to bulk solvent? When modeling details of structure and binding preferences, can fully implicit solvent models be fruitfully applied to protein-DNA interfaces, or must the individualistic properties of these interfacial waters be accounted for? To address these questions, we have developed a hybrid implicit/explicit solvation model that specifically accounts for the locations and orientations of small numbers of DNA-bound water molecules, while treating the majority of the solvent implicitly. Comparing the performance of this model with that of its fully implicit counterpart, we find that explicit treatment of interfacial waters results in a modest but significant improvement in protein side-chain placement and DNA sequence recovery. Base-by-base comparison of the performance of the two models highlights DNA sequence positions whose recognition may be dependent on interfacial water. Our study offers large-scale statistical evidence for the role of ordered water for protein-DNA recognition, together with detailed examination of several well-characterized systems. In addition, our approach provides a template for modeling explicit water molecules at interfaces that should be extensible to other systems.
Affiliation(s)
- Shen Li
- Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
|
24
|
Engineered Zinc Finger Nucleases for Targeted Genome Editing. SITE-DIRECTED INSERTION OF TRANSGENES 2013. [DOI: 10.1007/978-94-007-4531-5_5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
|
25
|
Leaver-Fay A, O'Meara MJ, Tyka M, Jacak R, Song Y, Kellogg EH, Thompson J, Davis IW, Pache RA, Lyskov S, Gray JJ, Kortemme T, Richardson JS, Havranek JJ, Snoeyink J, Baker D, Kuhlman B. Scientific benchmarks for guiding macromolecular energy function improvement. Methods Enzymol 2013; 523:109-43. [PMID: 23422428 DOI: 10.1016/b978-0-12-394292-0.00006-0] [Citation(s) in RCA: 159] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
Abstract
Accurate energy functions are critical to macromolecular modeling and design. We describe new tools for identifying inaccuracies in energy functions and guiding their improvement, and illustrate the application of these tools to the improvement of the Rosetta energy function. The feature analysis tool identifies discrepancies between structures deposited in the PDB and low-energy structures generated by Rosetta; these likely arise from inaccuracies in the energy function. The optE tool optimizes the weights on the different components of the energy function by maximizing the recapitulation of a wide range of experimental observations. We use the tools to examine three proposed modifications to the Rosetta energy function: improving the unfolded state energy model (reference energies), using bicubic spline interpolation to generate knowledge-based torsional potentials, and incorporating the recently developed Dunbrack 2010 rotamer library (Shapovalov & Dunbrack, 2011).
Affiliation(s)
- Andrew Leaver-Fay
- Department of Biochemistry, University of North Carolina, Chapel Hill, North Carolina, USA.
|
26
|
Liu LA, Bradley P. Atomistic modeling of protein-DNA interaction specificity: progress and applications. Curr Opin Struct Biol 2012; 22:397-405. [PMID: 22796087 DOI: 10.1016/j.sbi.2012.06.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2012] [Accepted: 06/20/2012] [Indexed: 12/22/2022]
Abstract
An accurate, predictive understanding of protein-DNA binding specificity is crucial for the successful design and engineering of novel protein-DNA binding complexes. In this review, we summarize recent studies that use atomistic representations of interfaces to predict protein-DNA binding specificity computationally. Although methods with limited structural flexibility have proven successful at recapitulating consensus binding sequences from wild-type complex structures, conformational flexibility is likely important for design and template-based modeling, where non-native conformations need to be sampled and accurately scored. A successful application of such computational modeling techniques in the construction of the TAL-DNA complex structure is discussed. With continued improvements in energy functions, solvation models, and conformational sampling, we are optimistic that reliable and large-scale protein-DNA binding prediction and engineering is a goal within reach.
|
27
|
London N, Gullá S, Keating AE, Schueler-Furman O. In silico and in vitro elucidation of BH3 binding specificity toward Bcl-2. Biochemistry 2012; 51:5841-50. [PMID: 22702834 DOI: 10.1021/bi3003567] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Interactions between Bcl-2-like proteins and BH3 domains play a key role in the regulation of apoptosis. Despite the overall structural similarity of their interaction with helical BH3 domains, Bcl-2-like proteins exhibit an intricate spectrum of binding specificities whose underlying basis is not well understood. Here, we characterize these interactions using Rosetta FlexPepBind, a protocol for the prediction of peptide binding specificity that evaluates the binding potential of different peptides based on structural models of the corresponding peptide-receptor complexes. For two prominent players, Bcl-xL and Mcl-1, we obtain good agreement with a large set of experimental SPOT array measurements and recapitulate the binding specificity of peptides derived by yeast display in a previous study. We extend our approach to a third member of this family, Bcl-2: we test our blind prediction of the binding of 180 BIM-derived peptides with a corresponding experimental SPOT array. Both prediction and experiment reveal a Bcl-2 binding specificity pattern that resembles that of Bcl-xL. Finally, we extend this application to accurately predict the specificity pattern of additional human BH3-only derived peptides. This study characterizes the distinct patterns of binding specificity of BH3-only derived peptides for the Bcl-2 like proteins Bcl-xL, Mcl-1, and Bcl-2 and provides insight into the structural basis of determinants of specificity.
Affiliation(s)
- Nir London
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Hadassah Medical School, The Hebrew University, POB 12272, Jerusalem 91120, Israel
|
28
|
Chu SW, Noyes MB, Christensen RG, Pierce BG, Zhu LJ, Weng Z, Stormo GD, Wolfe SA. Exploring the DNA-recognition potential of homeodomains. Genome Res 2012; 22:1889-98. [PMID: 22539651 PMCID: PMC3460184 DOI: 10.1101/gr.139014.112] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
The recognition potential of most families of DNA-binding domains (DBDs) remains relatively unexplored. Homeodomains (HDs), like many other families of DBDs, display limited diversity in their preferred recognition sequences. To explore the recognition potential of HDs, we utilized a bacterial selection system to isolate HD variants, from a randomized library, that are compatible with each of the 64 possible 3' triplet sites (i.e., TAANNN). The majority of these selections yielded sets of HDs with overrepresented residues at specific recognition positions, implying the selection of specific binders. The DNA-binding specificity of 151 representative HD variants was subsequently characterized, identifying HDs that preferentially recognize 44 of these target sites. Many of these variants contain novel combinations of specificity determinants that are uncommon or absent in extant HDs. These novel determinants, when grafted into different HD backbones, produce a corresponding alteration in specificity. This information was used to create more explicit HD recognition models, which can inform the prediction of transcriptional regulatory networks for extant HDs or the engineering of HDs with novel DNA-recognition potential. The diversity of recovered HD recognition sequences raises important questions about the fitness barrier that restricts the evolution of alternate recognition modalities in natural systems.
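The "64 possible 3' triplet sites (i.e., TAANNN)" mentioned in this abstract follow from four bases at each of three variable positions. The short sketch below enumerates them and computes the fraction covered by the 44 sites reported as preferentially recognized; the function name and the fixed `TAA` prefix argument are illustrative choices, not taken from the paper.

```python
from itertools import product

BASES = "ACGT"

def triplet_sites(prefix="TAA"):
    """Enumerate every site of the form prefix + NNN, where N is any base."""
    return [prefix + "".join(t) for t in product(BASES, repeat=3)]

sites = triplet_sites()
print(len(sites))   # 4**3 = 64 candidate sites
print(sites[:4])    # first few sites in A, C, G, T order

# Fraction of candidate sites with a preferential binder, per the abstract.
recognized = 44
coverage = recognized / len(sites)
print(f"{coverage:.1%}")
```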
Affiliation(s)
- Stephanie W Chu
- Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
|
29
|
Kiełbowicz-Matuk A. Involvement of plant C2H2-type zinc finger transcription factors in stress responses. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2012; 185-186:78-85. [PMID: 22325868 DOI: 10.1016/j.plantsci.2011.11.015] [Citation(s) in RCA: 128] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/02/2011] [Revised: 11/20/2011] [Accepted: 11/22/2011] [Indexed: 05/18/2023]
Abstract
Abiotic and biotic stresses frequently impose constraints on plant distribution and affect agricultural productivity. Various aspects of the multiplicity and the complexity of stress-responsive gene networks have been previously studied. Many individual transcription factors in plants and their family classes that regulate the expression of several genes in response to environmental stresses have been identified. One such class of transcription regulators is the C2H2 class of zinc finger proteins. Numerous members of the C2H2-type zinc finger family have been shown to play diverse roles in the plant stress response and in hormone signal transduction. Transcription profiling analyses have demonstrated that the transcript level of many C2H2-type zinc finger proteins is elevated under different abiotic stress conditions such as low temperature, salt, drought, osmotic stress and oxidative stress. Some C2H2-type proteins are additionally involved in the biotic stress signaling pathway. Moreover, it has been reported that overexpression of some C2H2-type zinc finger protein genes resulted in both the activation of some stress-related genes and enhanced tolerance to various stresses. Current genetic studies have focused on possible interactions between different zinc finger transcription factors during stresses to regulate transcription. This review highlights the role of the C2H2 class of zinc finger proteins in regulating abiotic and biotic stress tolerance in plants.
|
30
|
Thyme SB, Baker D, Bradley P. Improved modeling of side-chain--base interactions and plasticity in protein--DNA interface design. J Mol Biol 2012; 419:255-74. [PMID: 22426128 DOI: 10.1016/j.jmb.2012.03.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2011] [Revised: 02/09/2012] [Accepted: 03/09/2012] [Indexed: 12/30/2022]
Abstract
Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed "motifs") was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein-DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent.
Affiliation(s)
- Summer B Thyme
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
|
31
|
Bradley P. Structural modeling of TAL effector-DNA interactions. Protein Sci 2012; 21:471-4. [PMID: 22334576 DOI: 10.1002/pro.2034] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2011] [Revised: 01/23/2012] [Accepted: 01/23/2012] [Indexed: 11/07/2022]
Abstract
TAL (transcriptional activator-like) effectors are DNA-binding repeat proteins recently shown to recognize their target sites by an unprecedented, 1:1 mapping between repeat residues and DNA bases. The structural basis for this recognition is not known; in particular, it is not clear whether such 1:1 recognition can be accommodated by standard major-groove readout of B-form DNA. Here we describe a structure prediction protocol tailored to the TAL-DNA system, and report simulation results that shed light on observed repeat-base associations and overall TAL structure. Our models demonstrate that TAL-DNA interactions can be explained by a model in which the TAL repeat domain forms a superhelical repeat structure that wraps around undistorted B-form DNA, paralleling the geometry of the major groove, with contacts between position 13 of each repeat and its associated base pair on the sense strand determining the specificity of DNA recognition.
Affiliation(s)
- Philip Bradley
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, M1-B514, Seattle, Washington 98109, USA.
|
32
|
Ashworth J, Wurtmann EJ, Baliga NS. Reverse engineering systems models of regulation: discovery, prediction and mechanisms. Curr Opin Biotechnol 2011; 23:598-603. [PMID: 22209016 DOI: 10.1016/j.copbio.2011.12.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2011] [Accepted: 12/08/2011] [Indexed: 10/14/2022]
Abstract
Biological systems can now be understood in comprehensive and quantitative detail using systems biology approaches. Putative genome-scale models can be built rapidly based upon biological inventories and strategic system-wide molecular measurements. Current models combine statistical associations, causative abstractions, and known molecular mechanisms to explain and predict quantitative and complex phenotypes. This top-down 'reverse engineering' approach generates useful organism-scale models despite noise and incompleteness in data and knowledge. Here we review and discuss the reverse engineering of biological systems using top-down data-driven approaches, in order to improve discovery, hypothesis generation, and the inference of biological properties.
|
33
|
Pattanayak V, Ramirez CL, Joung JK, Liu DR. Revealing off-target cleavage specificities of zinc-finger nucleases by in vitro selection. Nat Methods 2011; 8:765-70. [PMID: 21822273 PMCID: PMC3164905 DOI: 10.1038/nmeth.1670] [Citation(s) in RCA: 406] [Impact Index Per Article: 31.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2011] [Accepted: 07/20/2011] [Indexed: 12/21/2022]
Abstract
Engineered zinc-finger nucleases (ZFNs) are promising tools for genome manipulation, and determining off-target cleavage sites of these enzymes is of great interest. We developed an in vitro selection method that interrogates 10^11 DNA sequences for cleavage by active, dimeric ZFNs. The method revealed hundreds of thousands of DNA sequences, some present in the human genome, that can be cleaved in vitro by two ZFNs: CCR5-224 and VF2468, which target the endogenous human CCR5 and VEGFA genes, respectively. Analysis of identified sites in one cultured human cell line revealed CCR5-224-induced changes at nine off-target loci, though this remains to be tested in other relevant cell types. Similarly, we observed 31 off-target sites cleaved by VF2468 in cultured human cells. Our findings establish an energy compensation model of ZFN specificity in which excess binding energy contributes to off-target ZFN cleavage and suggest strategies for the improvement of future ZFN design.
Affiliation(s)
- Vikram Pattanayak
- Department of Chemistry & Chemical Biology and Howard Hughes Medical Institute Harvard University, 12 Oxford St, Cambridge, MA 02138 USA
- Cherie L. Ramirez
- Molecular Pathology Unit, Center for Cancer Research, and Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA 02129 USA
- Department of Pathology & Biological and Biomedical Sciences Program, Harvard Medical School, Boston, MA 02115 USA
- J. Keith Joung
- Molecular Pathology Unit, Center for Cancer Research, and Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, MA 02129 USA
- Department of Pathology & Biological and Biomedical Sciences Program, Harvard Medical School, Boston, MA 02115 USA
- David R. Liu
- Department of Chemistry & Chemical Biology and Howard Hughes Medical Institute Harvard University, 12 Oxford St, Cambridge, MA 02138 USA
|
34
|
Seeliger D, Buelens FP, Goette M, de Groot BL, Grubmüller H. Towards computational specificity screening of DNA-binding proteins. Nucleic Acids Res 2011; 39:8281-90. [PMID: 21737424 PMCID: PMC3201868 DOI: 10.1093/nar/gkr531] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
DNA-binding proteins are key players in the regulation of gene expression and, hence, are essential for cell function. Chimeric proteins composed of DNA-binding domains and DNA modifying domains allow for precise genome manipulation. A key prerequisite is the specific recognition of a particular nucleotide sequence. Here, we quantitatively assess the binding affinity of DNA-binding proteins by molecular dynamics-based alchemical free energy simulations. A computational framework was developed to automatically set up in silico screening assays and estimate free energy differences using two independent procedures, based on equilibrium and non-equilibrium transformation pathways. The influence of simulation times on the accuracy of both procedures is presented. The binding specificity of a zinc-finger transcription factor to several sequences is calculated, and agreement with experimental data is shown. Finally, we propose an in silico screening strategy aiming at the derivation of full specificity profiles for DNA-binding proteins.
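To make the alchemical free energy idea in this abstract more concrete, the sketch below estimates a relative binding free energy from non-equilibrium work values using a thermodynamic cycle. The abstract does not name its estimators; the Jarzynski equality is used here only as one well-known option. The function names are hypothetical and the work values are invented toy numbers, not results from the paper.

```python
import math

KT_KJ_MOL = 2.494  # approximate kT at 300 K in kJ/mol

def jarzynski(work_values, kT=KT_KJ_MOL):
    """Free energy estimate from non-equilibrium work values:
    dG = -kT * ln( mean( exp(-W / kT) ) ).
    The minimum work is factored out for numerical stability."""
    m = min(work_values)
    s = sum(math.exp(-(w - m) / kT) for w in work_values)
    return m - kT * math.log(s / len(work_values))

def relative_binding_free_energy(work_complex, work_free, kT=KT_KJ_MOL):
    """Thermodynamic cycle: mutate the DNA base in the protein-DNA complex
    and in free DNA, then take the difference of the two free energies."""
    return jarzynski(work_complex, kT) - jarzynski(work_free, kT)

# Invented toy work values in kJ/mol, for illustration only.
w_complex = [12.1, 10.8, 11.5, 13.0]
w_free = [9.9, 10.4, 9.1, 10.0]

ddg = relative_binding_free_energy(w_complex, w_free)
print(f"ddG_bind estimate: {ddg:.2f} kJ/mol")
```

A positive value in this convention would mean the mutated sequence binds less favorably than the original one. With few work samples the exponential average is dominated by the smallest work values, which is one reason simulation length affects accuracy.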
Affiliation(s)
- Daniel Seeliger
- Computational Biomolecular Dynamics Group, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
|