1
|
Kumar R, Sinha NR, Mohan RR. Corneal gene therapy: Structural and mechanistic understanding. Ocul Surf 2023; 29:279-297. [PMID: 37244594 DOI: 10.1016/j.jtos.2023.05.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/31/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 05/29/2023]
Abstract
The cornea, the dome-shaped, transparent front part of the eye, provides two-thirds of the eye's refraction as well as barrier functions. Globally, corneal diseases are the leading cause of vision impairment. Loss of corneal function, including opacification, involves complex crosstalk and perturbation among a variety of cytokines, chemokines, and growth factors generated by corneal keratocytes, epithelial cells, lacrimal tissues, nerves, and immune cells. Conventional small-molecule drugs can treat mild-to-moderate traumatic corneal pathology but require frequent application and often fail to treat severe pathologies. Corneal transplant surgery is a standard of care to restore vision in patients. However, the declining availability of and rising demand for donor corneas are major concerns for maintaining ophthalmic care. Thus, the development of efficient and safe nonsurgical methods to cure corneal disorders and restore vision in vivo is highly desired. Gene-based therapy has huge potential to cure corneal blindness. To achieve a nonimmunogenic, safe, and sustained therapeutic response, the selection of relevant genes, gene editing methods, and suitable delivery vectors is vital. This article describes corneal structural and functional features, the mechanistic understanding of gene therapy vectors, gene editing methods, gene delivery tools, and the status of gene therapy for treating corneal disorders, diseases, and genetic dystrophies.
Affiliation(s)
- Rajnish Kumar
- Harry S. Truman Memorial Veterans' Hospital, Columbia, MO, 65201, USA; One-health One-medicine Vision Research Program, Departments of Veterinary Medicine and Surgery & Biomedical Sciences, College of Veterinary Medicine, University of Missouri, Columbia, MO, 65211, USA; Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow campus, UP, 226028, India
- Nishant R Sinha
- Harry S. Truman Memorial Veterans' Hospital, Columbia, MO, 65201, USA; One-health One-medicine Vision Research Program, Departments of Veterinary Medicine and Surgery & Biomedical Sciences, College of Veterinary Medicine, University of Missouri, Columbia, MO, 65211, USA
- Rajiv R Mohan
- Harry S. Truman Memorial Veterans' Hospital, Columbia, MO, 65201, USA; One-health One-medicine Vision Research Program, Departments of Veterinary Medicine and Surgery & Biomedical Sciences, College of Veterinary Medicine, University of Missouri, Columbia, MO, 65211, USA; Mason Eye Institute, School of Medicine, University of Missouri, Columbia, MO, 65212, USA.
|
2
|
Meseguer A, Årman F, Fornes O, Molina-Fernández R, Bonet J, Fernandez-Fuentes N, Oliva B. On the prediction of DNA-binding preferences of C2H2-ZF domains using structural models: application on human CTCF. NAR Genom Bioinform 2021; 2:lqaa046. [PMID: 33575598 PMCID: PMC7671317 DOI: 10.1093/nargab/lqaa046] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/13/2020] [Revised: 05/07/2020] [Accepted: 06/10/2020] [Indexed: 12/25/2022]
Abstract
Cis2-His2 zinc finger (C2H2-ZF) proteins are the largest family of transcription factors in humans and higher metazoans. To date, the DNA-binding preferences of many members of this family remain unknown. We have developed a computational method to predict their DNA-binding preferences. We have computed theoretical position weight matrices (PWMs) of proteins composed of C2H2-ZF domains, with the only requirement being an input structure. We have predicted more than two-thirds of a single zinc-finger domain binding site for about 70% of the variants of Zif268, a classical member of this family. We have successfully matched between 60 and 90% of the binding-site motif for examples of proteins composed of three C2H2-ZF domains in JASPAR, a standard database of PWMs. These tests serve as proof of the capacity to scan a DNA fragment and find the potential binding sites of transcription factors formed by C2H2-ZF domains. As an example, we have tested the approach to predict the DNA-binding preferences of the human chromatin binding factor CTCF. We offer a server to model the structure of a zinc-finger protein and predict its PWM.
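The scanning step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method or server code: the function names, the uniform 0.25 background, and the toy three-column PWM are all assumptions made for illustration, and the log-odds window score is simply a common way of scoring a sequence against a PWM.

```python
import math

def log_odds_scores(pwm, sequence, background=0.25):
    """Score every window of `sequence` against `pwm`.

    `pwm` is a list of per-position dicts mapping base -> probability.
    Probabilities must be non-zero (use pseudocounts), otherwise log2 fails.
    The uniform background of 0.25 is an assumption of this sketch.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(
            math.log2(pwm[i][base] / background)
            for i, base in enumerate(window)
        )
        hits.append((start, window, score))
    return hits

def best_site(pwm, sequence):
    """Return the (start, window, score) tuple with the highest score."""
    return max(log_odds_scores(pwm, sequence), key=lambda hit: hit[2])

# Hypothetical PWM that favours the triplet "GCG"; not taken from any database.
toy_pwm = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]

start, window, score = best_site(toy_pwm, "ATTGCGTA")
print(start, window, round(score, 3))
```

In practice one would usually also scan the reverse complement and apply a score threshold rather than keep only the single best window.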
Affiliation(s)
- Alberto Meseguer
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona, Catalonia 08005, Spain
- Filip Årman
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona, Catalonia 08005, Spain
- Oriol Fornes
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4, Canada
- Ruben Molina-Fernández
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona, Catalonia 08005, Spain
- Jaume Bonet
- Laboratory of Protein Design & Immunoengineering, School of Engineering, Ecole Polytechnique Federale de Lausanne, Lausanne 1015, Vaud, Switzerland
- Narcis Fernandez-Fuentes
- Department of Biosciences, U Science Tech, Universitat de Vic-Universitat Central de Catalunya, Vic, Catalonia 08500, Spain
- Baldo Oliva
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona, Catalonia 08005, Spain
|
3
|
Nasal and otic placode specific regulation of Sox2 involves both activation by Sox-Sall4 synergism and multiple repression mechanisms. Dev Biol 2018; 433:61-74. [DOI: 10.1016/j.ydbio.2017.11.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/23/2016] [Revised: 11/02/2017] [Accepted: 11/10/2017] [Indexed: 01/21/2023]
|
4
|
Persikov AV, Wetzel JL, Rowland EF, Oakes BL, Xu DJ, Singh M, Noyes MB. A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res 2015; 43:1965-84. [PMID: 25593323 PMCID: PMC4330361 DOI: 10.1093/nar/gku1395] [Citation(s) in RCA: 78] [Impact Index Per Article: 8.7] [Indexed: 11/13/2022]
Abstract
Cys2His2 zinc fingers (C2H2-ZFs) comprise the largest class of metazoan DNA-binding domains. Despite this domain's well-defined DNA-recognition interface, and its successful use in the design of chimeric proteins capable of targeting genomic regions of interest, much remains unknown about its DNA-binding landscape. To help bridge this gap in fundamental knowledge and to provide a resource for design-oriented applications, we screened large synthetic protein libraries to select binding C2H2-ZF domains for each possible three base pair target. The resulting data consist of >160 000 unique domain-DNA interactions and comprise the most comprehensive investigation of C2H2-ZF DNA-binding interactions to date. An integrated analysis of these independent screens yielded DNA-binding profiles for tens of thousands of domains and led to the successful design and prediction of C2H2-ZF DNA-binding specificities. Computational analyses uncovered important aspects of C2H2-ZF domain-DNA interactions, including the roles of within-finger context and domain position on base recognition. We observed the existence of numerous distinct binding strategies for each possible three base pair target and an apparent balance between affinity and specificity of binding. In sum, our comprehensive data help elucidate the complex binding landscape of C2H2-ZF domains and provide a foundation for efforts to determine, predict and engineer their DNA-binding specificities.
Affiliation(s)
- Anton V Persikov
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Joshua L Wetzel
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA; Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Elizabeth F Rowland
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Benjamin L Oakes
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Denise J Xu
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Mona Singh
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA; Department of Computer Science, Princeton University, Princeton, NJ 08544, USA
- Marcus B Noyes
- The Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA; Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
|
5
|
Dutta S, Sundar D. Designing Zinc Finger Proteins for Applications in Synthetic Biology. Systems and Synthetic Biology 2015. [DOI: 10.1007/978-94-017-9514-2_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
6
|
Gupta A, Christensen RG, Bell HA, Goodwin M, Patel RY, Pandey M, Enuameh MS, Rayla AL, Zhu C, Thibodeau-Beganny S, Brodsky MH, Joung JK, Wolfe SA, Stormo GD. An improved predictive recognition model for Cys(2)-His(2) zinc finger proteins. Nucleic Acids Res 2014; 42:4800-12. [PMID: 24523353 PMCID: PMC4005693 DOI: 10.1093/nar/gku132] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 12/05/2013] [Revised: 01/21/2014] [Accepted: 01/22/2014] [Indexed: 11/17/2022]
Abstract
Cys(2)-His(2) zinc finger proteins (ZFPs) are the largest family of transcription factors in higher metazoans. They also represent the most diverse family with regard to the composition of their recognition sequences. Although there are a number of ZFPs with characterized DNA-binding preferences, the specificity of the vast majority of ZFPs is unknown and cannot be directly inferred by homology due to the diversity of recognition residues present within individual fingers. Given the large number of unique zinc fingers and assemblies present across eukaryotes, a comprehensive predictive recognition model that could accurately estimate the DNA-binding specificity of any ZFP based on its amino acid sequence would have great utility. Toward this goal, we have used the DNA-binding specificities of 678 two-finger modules from both natural and artificial sources to construct a random forest-based predictive model for ZFP recognition. We find that our recognition model outperforms previously described determinant-based recognition models for ZFPs and can successfully estimate the specificity of naturally occurring ZFPs with previously defined specificities.
Affiliation(s)
- Ankit Gupta
- Ryan G. Christensen
- Heather A. Bell
- Mathew Goodwin
- Ronak Y. Patel
- Manishi Pandey
- Metewo Selase Enuameh
- Amy L. Rayla
- Cong Zhu
- Stacey Thibodeau-Beganny
- Michael H. Brodsky
- J. Keith Joung
- Scot A. Wolfe
- Gary D. Stormo
- All authors are listed with the same affiliations: Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, MA 01605, USA, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA, Department of Genetics, Washington University School of Medicine, St Louis, MO 63108, USA, Department of Biochemistry and Biology and Biotechnology, Worcester Polytechnic Institute, Worcester, MA 01609, USA, Molecular Pathology Unit, Center for Computational and Integrative Biology, and Center for Cancer Research, Massachusetts General Hospital, Charlestown, MA 02129, USA, Department of Molecular Medicine, University of Massachusetts Medical School, Worcester, MA 01605, USA and Department of Pathology, Harvard Medical School, Boston, MA 02115, USA
|
7
|
Yang P, Wu M, Guo J, Kwoh CK, Przytycka TM, Zheng J. LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms. BMC Bioinformatics 2014; 15:48. [PMID: 24533858 PMCID: PMC3936957 DOI: 10.1186/1471-2105-15-48] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/11/2012] [Accepted: 01/27/2014] [Indexed: 11/10/2022]
Abstract
Background
As a fundamental genomic element, the meiotic recombination hotspot plays important roles in the life sciences. Uncovering its regulatory mechanisms therefore has a broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators of meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignores the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs.
Results
Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference in population recombination rates at a hotspot between the two alleles of a candidate SNP. Here we present an open source software tool implementing LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit to SNPs inside an established 7-mer motif bound by PRDM9, we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots, each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif-finding algorithm MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified.
Conclusions
LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot-associated SNPs. It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
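The association measure described in the Results, the difference in recombination rate at a hotspot between the two alleles of a candidate SNP, can be sketched as follows. This is an illustrative outline only, not the LDsplit software: the function names are hypothetical, and `estimate_rate` stands in for whatever population recombination rate estimator is actually used. The toy estimator in the usage lines is a placeholder with no population-genetic meaning.

```python
def split_by_allele(haplotypes, snp_index):
    """Group haplotype strings by the allele they carry at `snp_index`."""
    groups = {}
    for hap in haplotypes:
        groups.setdefault(hap[snp_index], []).append(hap)
    return groups

def allele_rate_difference(haplotypes, snp_index, estimate_rate):
    """Return (allele_a, allele_b, rate_a - rate_b) for a biallelic SNP.

    `estimate_rate` is a caller-supplied function (an assumption of this
    sketch) that maps a list of haplotypes to a rate at the hotspot.
    """
    groups = split_by_allele(haplotypes, snp_index)
    if len(groups) != 2:
        raise ValueError("expected a biallelic SNP")
    (allele_a, haps_a), (allele_b, haps_b) = sorted(groups.items())
    return allele_a, allele_b, estimate_rate(haps_a) - estimate_rate(haps_b)

# Placeholder estimator: fraction of distinct haplotypes in the group.
# It only makes the example runnable; it is not a real rate estimator.
def toy_rate(haps):
    return len(set(haps)) / len(haps)

toy_haplotypes = ["ACGT", "ACGA", "ACTT", "GCGT", "GCGT", "GCGT"]
allele_a, allele_b, diff = allele_rate_difference(toy_haplotypes, 0, toy_rate)
print(allele_a, allele_b, round(diff, 3))
```

A real analysis would also need a significance assessment for the difference, which this sketch leaves out.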
Affiliation(s)
- Jie Zheng
- Bioinformatics Research Centre (BIRC), School of Computer Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore.
|
8
|
Persikov AV, Singh M. De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins. Nucleic Acids Res 2013; 42:97-108. [PMID: 24097433 PMCID: PMC3874201 DOI: 10.1093/nar/gkt890] [Citation(s) in RCA: 139] [Impact Index Per Article: 12.6] [Indexed: 12/13/2022]
Abstract
Proteins with sequence-specific DNA binding function are important for a wide range of biological activities. De novo prediction of their DNA-binding specificities from sequence alone would be a great aid in inferring cellular networks. Here we introduce a method for predicting DNA-binding specificities for Cys2His2 zinc fingers (C2H2-ZFs), the largest family of DNA-binding proteins in metazoans. We develop a general approach, based on empirical calculations of pairwise amino acid–nucleotide interaction energies, for predicting position weight matrices (PWMs) representing DNA-binding specificities for C2H2-ZF proteins. We predict DNA-binding specificities on a per-finger basis and merge predictions for C2H2-ZF domains that are arrayed within sequences. We test our approach on a diverse set of natural C2H2-ZF proteins with known binding specificities and demonstrate that for >85% of the proteins, their predicted PWMs are accurate in 50% of their nucleotide positions. For proteins with several zinc finger isoforms, we show via case studies that this level of accuracy enables us to match isoforms with their known DNA-binding specificities. A web server for predicting a PWM given a protein containing C2H2-ZF domains is available online at http://zf.princeton.edu and can be used to aid in protein engineering applications and in genome-wide searches for transcription factor targets.
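To make the idea of deriving a PWM from interaction energies concrete, here is a minimal sketch. It is not the authors' model: the Boltzmann-style conversion, the arbitrary energy units, the `temperature` parameter, and the simple concatenation of per-finger columns are all assumptions chosen for illustration, and the energy values are invented.

```python
import math

BASES = "ACGT"

def energies_to_column(energies, temperature=1.0):
    """Convert per-base energies into one PWM column.

    Lower energy means a more favourable contact, so it gets a higher
    probability: p(b) is proportional to exp(-E_b / temperature).
    """
    weights = {b: math.exp(-energies[b] / temperature) for b in BASES}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}

def finger_pwm(position_energies):
    """Build the PWM columns for one finger from a list of energy dicts."""
    return [energies_to_column(e) for e in position_energies]

def merge_fingers(finger_pwms):
    """Concatenate per-finger PWMs into one site PWM.

    The caller is assumed to pass fingers in the order their triplets
    appear along the DNA; overlap between adjacent fingers is ignored here.
    """
    merged = []
    for pwm in finger_pwms:
        merged.extend(pwm)
    return merged

# Invented energies for a finger that prefers the triplet "GCG".
finger = finger_pwm([
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
    {"A": 2.0, "C": 0.0, "G": 2.0, "T": 2.0},
    {"A": 2.0, "C": 2.0, "G": 0.0, "T": 2.0},
])
site_pwm = merge_fingers([finger, finger])
print(len(site_pwm), max(site_pwm[0], key=site_pwm[0].get))
```

Each column sums to one by construction, so the merged matrix can be passed directly to a PWM scanner.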
Affiliation(s)
- Anton V Persikov
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton NJ 08544, USA and Department of Computer Science, Princeton University, Princeton NJ 08544, USA
|
9
|
Sarkar A, Kumar S, Punetha A, Grover A, Sundar D. Analysis and Prediction of DNA-Recognition by Zinc Finger Proteins. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Zinc fingers are the most abundant class of DNA-binding proteins encoded in eukaryotic genomes. Custom-designed zinc finger proteins attached to various DNA-modifying domains can be used to achieve highly specific genome modification, which has tremendous applications in molecular therapeutics. Analysis of the sequence and structure of zinc finger proteins provides clues for understanding protein-DNA interactions and aids in the custom design of zinc finger proteins with tailor-made specificity. Computational methods for predicting recognition helices for C2H2 zinc fingers that bind to specific target DNA sites could provide valuable insights for researchers interested in designing specific zinc finger proteins for biological and biomedical applications. In this chapter, we describe zinc finger protein-DNA interaction patterns, challenges in engineering the recognition specificity of zinc finger proteins, computational methods for predicting proteins that recognize specific target DNA sequences, and their applications in molecular therapeutics.
|
10
|
Functional site plasticity in domain superfamilies. Biochimica et Biophysica Acta - Proteins and Proteomics 2013; 1834:874-89. [PMID: 23499848 PMCID: PMC3787744 DOI: 10.1016/j.bbapap.2013.02.042] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 12/04/2012] [Revised: 02/20/2013] [Accepted: 02/28/2013] [Indexed: 11/21/2022]
Abstract
We present, to our knowledge, the first quantitative analysis of functional site diversity in homologous domain superfamilies. Different types of functional sites are considered separately. Our results show that most diverse superfamilies are very plastic in terms of the spatial location of their functional sites. This is especially true for protein–protein interfaces. In contrast, we confirm that catalytic sites typically occupy only a very small number of topological locations. Small-ligand binding sites are more diverse than expected, although in a more limited manner than protein–protein interfaces. In spite of the observed diversity, our results also confirm the previously reported preferential location of functional sites. We identify a subset of homologous domain superfamilies where diversity is particularly extreme, and discuss possible reasons for such plasticity, i.e. structural diversity. Our results do not contradict previous reports of preferential co-location of sites among homologues, but rather point at the importance of not ignoring other sites, especially in large and diverse superfamilies. Data on sites exploited by different relatives, within each well-annotated domain superfamily, have been made accessible from the CATH website in order to highlight versatile superfamilies or superfamilies with highly preferential sites. This information is valuable for systems biology, and knowledge of any constraints on protein interactions could help in understanding the dynamic control of networks in which these proteins participate.
The novelty of our work lies in the comprehensive nature of the analysis – we have used a significantly larger dataset than previous studies – and the fact that in many superfamilies we show that different parts of the domain surface are exploited by different relatives for ligand/protein interactions, particularly in superfamilies which are diverse in sequence and structure, an observation not previously reported on such a large scale. This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly.
Highlights
- Most diverse domain superfamilies have very diverse functional site locations.
- Catalytic sites are found in a small, restricted number of topological positions.
- Location of small-ligand binding sites is more diverse than expected.
- Protein–protein interfaces display the most flexibility in functional site locations.
|
11
|
Bitar M, Drummond MG, Costa MGS, Lobo FP, Calzavara-Silva CE, Bisch PM, Machado CR, Macedo AM, Pierce RJ, Franco GR. Modeling the zinc finger protein SmZF1 from Schistosoma mansoni: Insights into DNA binding and gene regulation. J Mol Graph Model 2012; 39:29-38. [PMID: 23220279 DOI: 10.1016/j.jmgm.2012.10.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/13/2012] [Revised: 10/09/2012] [Accepted: 10/13/2012] [Indexed: 10/27/2022]
Abstract
Zinc finger proteins are widely found in eukaryotes, representing an important class of DNA-binding proteins frequently involved in transcriptional regulation. Zinc finger motifs are composed of two antiparallel β-strands and one α-helix, stabilized by a zinc ion coordinated by conserved histidine and cysteine residues. In Schistosoma mansoni, these regulatory proteins are known to modulate morphological and physiological changes and have crucial roles in parasite development. A previously described C2H2 zinc finger protein, SmZF1, was shown to be present in cell nuclei of different life stages of S. mansoni and to activate gene transcription in a heterologous system. A high-quality three-dimensional structure of SmZF1 was generated using comparative modeling. Molecular dynamics simulations of the obtained structure revealed stability of the zinc finger motifs and high flexibility at the termini, comparable to the profile observed in the template X-ray structure based on thermal B-factors. Based on the protein's three-dimensional features and amino acid composition, we were able to characterize four C2H2 zinc finger motifs, the first involved in protein-protein interactions and the other three involved in DNA binding. We defined a consensus DNA-binding sequence using three distinct algorithms and further carried out docking calculations, which revealed the interaction of fingers 2-4 with the predicted DNA. A search for S. mansoni genes presenting putative SmZF1 binding sites revealed 415 genes hypothetically under SmZF1 control. Using an automatic annotation and GO assignment approach, we found that the majority of those genes code for proteins involved in developmental processes. Taken together, these results provide a consistent basis for the structural and functional characterization of SmZF1.
Affiliation(s)
- Mainá Bitar
- Laboratório de Física Biológica, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
|
12
|
Roy S, Dutta S, Khanna K, Singla S, Sundar D. Prediction of DNA-binding specificity in zinc finger proteins. J Biosci 2012; 37:483-91. [DOI: 10.1007/s12038-012-9213-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/27/2022]
|
13
|
Dey B, Thukral S, Krishnan S, Chakrobarty M, Gupta S, Manghani C, Rani V. DNA-protein interactions: methods for detection and analysis. Mol Cell Biochem 2012; 365:279-99. [PMID: 22399265 DOI: 10.1007/s11010-012-1269-z] [Citation(s) in RCA: 96] [Impact Index Per Article: 8.0] [Received: 09/24/2011] [Accepted: 02/16/2012] [Indexed: 12/18/2022]
Abstract
DNA-binding proteins control various cellular processes such as recombination, replication and transcription. This review summarizes some of the most commonly used techniques for determining DNA-protein interactions. In vitro techniques such as footprinting assays, electrophoretic mobility shift assay, southwestern blotting, yeast one-hybrid assay, phage display and proximity ligation assay are discussed. The highly versatile in vivo techniques such as chromatin immunoprecipitation and its variants, DNA adenine methyltransferase identification, and 3C and ChIP-loop assays are also summarized. In addition, some in silico tools are reviewed to provide a computational basis for determining DNA-protein interactions. Biophysical techniques such as fluorescence resonance energy transfer (FRET), FRET-FLIM, circular dichroism, atomic force microscopy, nuclear magnetic resonance and surface plasmon resonance are also highlighted.
Affiliation(s)
- Bipasha Dey
- Department of Biotechnology, Jaypee Institute of Information Technology, A-10 Sector-62, Noida 201307, Uttar Pradesh, India
|
14
|
Zhang W, Wan H, Jiang H, Zhao Y, Zhang X, Hu S, Wang Q. A transcriptome analysis of mitten crab testes (Eriocheir sinensis). Genet Mol Biol 2011; 34:136-41. [PMID: 21637557 PMCID: PMC3085360 DOI: 10.1590/s1415-47572010005000099] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 04/16/2010] [Accepted: 06/15/2010] [Indexed: 12/05/2022]
Abstract
The identification of expressed genes involved in sexual precocity of the mitten crab (Eriocheir sinensis) is critical for a better understanding of its reproductive development. To this end, we constructed a cDNA library from the rapid developmental stage of testis of E. sinensis and sequenced 3,388 randomly picked clones. After processing, 2,990 high-quality expressed sequence tags (ESTs) were clustered into 2,415 unigenes including 307 contigs and 2,108 singlets, which were then compared to the NCBI non-redundant (nr) protein and nucleotide (nt) database for annotation with Blastx and Blastn, respectively. After further analysis, 922 unigenes were obtained with concrete annotations and 30 unigenes were found to have functions possibly related to the process of reproduction in male crabs – six transcripts relevant to spermatogenesis (especially Cyclin K and RecA homolog DMC1), two transcripts involved in nuclear protein transformation, two heat-shock protein genes, eleven transcription factor genes (a series of zinc-finger proteins), and nine cytoskeleton protein-related genes. Our results, besides providing valuable information related to crustacean reproduction, can also serve as a base for future studies of reproductive and developmental biology.
Affiliation(s)
- Wei Zhang
- School of Life Science, East China Normal University, Shanghai, China
|
15
|
Yanover C, Bradley P. Extensive protein and DNA backbone sampling improves structure-based specificity prediction for C2H2 zinc fingers. Nucleic Acids Res 2011; 39:4564-76. [PMID: 21343182 PMCID: PMC3113574 DOI: 10.1093/nar/gkr048] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 12/29/2022]
Abstract
Sequence-specific DNA recognition by gene regulatory proteins is critical for proper cellular functioning. The ability to predict the DNA binding preferences of these regulatory proteins from their amino acid sequence would greatly aid in reconstruction of their regulatory interactions. Structural modeling provides one route to such predictions: by building accurate molecular models of regulatory proteins in complex with candidate binding sites, and estimating their relative binding affinities for these sites using a suitable potential function, it should be possible to construct DNA binding profiles. Here, we present a novel molecular modeling protocol for protein-DNA interfaces that borrows conformational sampling techniques from de novo protein structure prediction to generate a diverse ensemble of structural models from small fragments of related and unrelated protein-DNA complexes. The extensive conformational sampling is coupled with sequence space exploration so that binding preferences for the target protein can be inferred from the resulting optimized DNA sequences. We apply the algorithm to predict binding profiles for a benchmark set of eleven C2H2 zinc finger transcription factors, five of known and six of unknown structure. The predicted profiles are in good agreement with experimental binding data; furthermore, examination of the modeled structures gives insight into observed binding preferences.
Affiliation(s)
- Chen Yanover
- Program in Computational Biology, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA
|
16
|
Re-programming DNA-binding specificity in zinc finger proteins for targeting unique address in a genome. Syst Synth Biol 2011; 4:323-9. [PMID: 22132059 DOI: 10.1007/s11693-011-9077-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/02/2010] [Accepted: 02/03/2011] [Indexed: 12/26/2022]
Abstract
Recent studies provide a glimpse of future potential therapeutic applications of custom-designed zinc finger proteins in achieving highly specific genomic manipulation. Custom-design of zinc finger proteins with tailor-made specificity is currently limited by the availability of information on recognition helices for all possible DNA targets. However, recent advances suggest that a combination of design and selection method is best suited to identify custom zinc finger DNA-binding proteins for known genome target sites. Design of functionally self-contained zinc finger proteins can be achieved by (a) modular protein engineering and (b) computational prediction. Here, we explore the novel functionality obtained by engineered zinc finger proteins and the computational approaches for prediction of recognition helices of zinc finger proteins that can raise our ability to re-program zinc finger proteins with desired novel DNA-binding specificities.
|
17
|
Reyon D, Kirkpatrick JR, Sander JD, Zhang F, Voytas DF, Joung JK, Dobbs D, Coffman CR. ZFNGenome: a comprehensive resource for locating zinc finger nuclease target sites in model organisms. BMC Genomics 2011; 12:83. [PMID: 21276248 PMCID: PMC3042413 DOI: 10.1186/1471-2164-12-83] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Received: 10/08/2010] [Accepted: 01/28/2011] [Indexed: 02/04/2023]
Abstract
Background Zinc Finger Nucleases (ZFNs) have tremendous potential as tools to facilitate genomic modifications, such as precise gene knockouts or gene replacements by homologous recombination. ZFNs can be used to advance both basic research and clinical applications, including gene therapy. Recently, the ability to engineer ZFNs that target any desired genomic DNA sequence with high fidelity has improved significantly with the introduction of rapid, robust, and publicly available techniques for ZFN design such as the Oligomerized Pool ENgineering (OPEN) method. The motivation for this study is to make resources for genome modifications using OPEN-generated ZFNs more accessible to researchers by creating a user-friendly interface that identifies and provides quality scores for all potential ZFN target sites in the complete genomes of several model organisms. Description ZFNGenome is a GBrowse-based tool for identifying and visualizing potential target sites for OPEN-generated ZFNs. ZFNGenome currently includes a total of more than 11.6 million potential ZFN target sites, mapped within the fully sequenced genomes of seven model organisms; S. cerevisiae, C. reinhardtii, A. thaliana, D. melanogaster, D. rerio, C. elegans, and H. sapiens and can be visualized within the flexible GBrowse environment. Additional model organisms will be included in future updates. ZFNGenome provides information about each potential ZFN target site, including its chromosomal location and position relative to transcription initiation site(s). Users can query ZFNGenome using several different criteria (e.g., gene ID, transcript ID, target site sequence). Tracks in ZFNGenome also provide "uniqueness" and ZiFOpT (Zinc Finger OPEN Targeter) "confidence" scores that estimate the likelihood that a chosen ZFN target site will function in vivo. 
ZFNGenome is dynamically linked to ZiFDB, allowing users access to all available information about zinc finger reagents, such as the effectiveness of a given ZFN in creating double-stranded breaks. Conclusions ZFNGenome provides a user-friendly interface that allows researchers to access resources and information regarding genomic target sites for engineered ZFNs in seven model organisms. This genome-wide database of potential ZFN target sites should greatly facilitate the utilization of ZFNs in both basic and clinical research. ZFNGenome is freely available at: http://bindr.gdcb.iastate.edu/ZFNGenome or at the Zinc Finger Consortium website: http://www.zincfingers.org/.
Affiliation(s)
- Deepak Reyon
- Department of Genetics, Iowa State University, Ames, IA 50011, USA.
|
18
|
Levy R, Edelman M, Sobolev V. Prediction of 3D metal binding sites from translated gene sequences based on remote-homology templates. Proteins 2010; 76:365-74. [PMID: 19173310 DOI: 10.1002/prot.22352] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 11/05/2022]
Abstract
Database-scale analysis was performed to determine whether structural models, based on remote homologues, are effective in predicting 3D transition metal binding sites in proteins directly from translated gene sequences. The extent by which side chain modeling alone reduces sensitivity and selectivity is shown to be <10%. Surprisingly, selectivity was not dependent on the level of sequence homology between template and target, or on the presence of a metal ion in the structural template. Applying a modification of the CHED algorithm (Babor et al., Proteins 2008;70:208-217) and machine learning filters, a selectivity of approximately 90% was achieved for protein sequences using unrelated structural templates over a sequence identity range of 18-100%. Below approximately 18% identity, the number of analyzable target-template pairs and predictability of metal binding sites falls off sharply. A full third of structural templates were found to have target partners only in the remote homology range of 18-30%. In this range, nonmetal-binding templates are calculated to be the majority and serve to predict with 50% sensitivity at the geometric level. Overall, sensitivity at the geometric level for targets having templates in the 18-30% sequence identity range is 73%, with an average of one false positive site per true site. Protein sequences described as "unknown" in the UniProt database and composed largely of unidentified genome project sequences were studied and metal binding sites predicted. A web server for prediction of metal binding sites from protein sequence is provided.
Affiliation(s)
- Ronen Levy
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel
|
19
|
Zykovich A, Korf I, Segal DJ. Bind-n-Seq: high-throughput analysis of in vitro protein-DNA interactions using massively parallel sequencing. Nucleic Acids Res 2010; 37:e151. [PMID: 19843614 PMCID: PMC2794170 DOI: 10.1093/nar/gkp802] [Citation(s) in RCA: 115] [Impact Index Per Article: 8.2] [Indexed: 12/04/2022]
Abstract
Transcription factor–DNA interactions are some of the most important processes in biology because they directly control hereditary information. The targets of most transcription factors are unknown. In this report, we introduce Bind-n-Seq, a new high-throughput method for analyzing protein–DNA interactions in vitro, with several advantages over current methods. The procedure has three steps: (i) binding proteins to randomized oligonucleotide DNA targets, (ii) sequencing the bound oligonucleotides with massively parallel technology and (iii) finding motifs among the sequences. De novo binding motifs determined by this method for the DNA-binding domains of two well-characterized zinc-finger proteins were similar to those described previously. Furthermore, calculations of the relative affinity of the proteins for specific DNA sequences correlated significantly with previous studies (R² = 0.9). These results present Bind-n-Seq as a highly rapid and parallel method for determining in vitro binding sites and relative affinities.
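Step (iii) of the procedure, finding motifs among the bound sequences, can be illustrated with a toy k-mer enrichment count. This is only a minimal sketch under stated assumptions: it is not the analysis pipeline used in the paper, the function names `kmer_counts` and `kmer_enrichment` are hypothetical, and the reads are invented for illustration.

```python
from collections import Counter


def kmer_counts(seqs, k):
    """Count every k-mer (overlapping windows) across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def kmer_enrichment(bound, background, k, pseudocount=1.0):
    """Ratio of each k-mer's smoothed frequency in bound vs. background reads."""
    b = kmer_counts(bound, k)
    g = kmer_counts(background, k)
    b_total = sum(b.values())
    g_total = sum(g.values())
    n_kmers = 4 ** k  # number of possible DNA k-mers, used for smoothing
    scores = {}
    for kmer in b:
        f_bound = (b[kmer] + pseudocount) / (b_total + pseudocount * n_kmers)
        f_back = (g[kmer] + pseudocount) / (g_total + pseudocount * n_kmers)
        scores[kmer] = f_bound / f_back
    return scores


# Invented toy reads: the "bound" pool shares the core GCGTGG.
bound_reads = ["ATGCGTGGGA", "CCGCGTGGTT", "TAGCGTGGAC"]
background_reads = ["ATCGATCGAT", "GGCATTACGA", "TTAGCCATGC"]

scores = kmer_enrichment(bound_reads, background_reads, k=6)
top_kmer = max(scores, key=scores.get)
print(top_kmer)
```

In this toy data the shared core is the only 6-mer seen more than once in the bound pool, so it ranks first. Real data would need sequencing-error handling and a proper statistical model rather than a raw ratio.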
Affiliation(s)
- Artem Zykovich
- Genome Center, University of California, Davis, CA 95616, USA
|
20
|
Abstract
Half of all human transcription factors are zinc finger proteins and yet very little is known concerning the biological role of the majority of these factors. In particular, very few genome-wide studies of the in vivo binding of zinc finger factors have been performed. Based on in vitro studies and other methods that allow selection of high affinity-binding sites in artificial conditions, a zinc finger code has been developed that can be used to compose a putative recognition motif for a particular zinc finger factor (ZNF). Theoretically, a simple bioinformatics analysis could then predict the genomic locations of all the binding sites for that ZNF. However, it is unlikely that all of the sequences in the human genome having a good match to a predicted motif are in fact occupied in vivo (due to negative influences from repressive chromatin, nucleosomal positioning, overlap of binding sites with other factors, etc). A powerful method to identify in vivo binding sites for transcription factors on a genome-wide scale is the chromatin immunoprecipitation (ChIP) assay, followed by hybridization of the precipitated DNA to microarrays (ChIP-chip) or by high throughput DNA sequencing of the sample (ChIP-seq). Such comprehensive in vivo binding studies would not only identify target genes of a particular zinc finger factor, but also provide binding motif data that could be used to test the validity of the zinc finger code. This chapter describes in detail the steps needed to prepare ChIP samples and libraries for high throughput sequencing using the Illumina GA2 platform and includes descriptions of quality control steps necessary to ensure a successful ChIP-seq experiment.
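The "simple bioinformatics analysis" mentioned above, locating every genomic match to a predicted recognition motif, might look like the following sketch. It assumes the motif is given as an IUPAC consensus string; the motif, the test sequence, and the function names are hypothetical, and, as the abstract notes, a sequence match does not imply in vivo occupancy.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string, IUPAC ambiguity codes included."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif_sites(sequence, motif):
    """Return sorted (start, strand) pairs for motif matches on either strand.

    Start positions are 0-based coordinates on the forward strand.
    """
    sites = set()
    for strand, m in (("+", motif), ("-", reverse_complement(motif))):
        # A lookahead lets overlapping matches be reported as well.
        pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in m) + "))")
        for hit in pattern.finditer(sequence):
            sites.add((hit.start(), strand))
    return sorted(sites)


# Hypothetical 9-nt motif and toy sequence, for illustration only.
motif = "GCGNGGGCG"
sequence = "TTGCGTGGGCGAACGCCCACGCTT"
sites = find_motif_sites(sequence, motif)
print(sites)
```

Comparing such predicted sites with ChIP-seq peaks is one way to test how well a predicted motif reflects actual binding.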
|
21
|
Frietze S, Lan X, Jin VX, Farnham PJ. Genomic targets of the KRAB and SCAN domain-containing zinc finger protein 263. J Biol Chem 2009; 285:1393-403. [PMID: 19887448 DOI: 10.1074/jbc.m109.063032] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.3] [Indexed: 01/30/2023]
Abstract
Half of all human transcription factors use C2H2 zinc finger domains to specify site-specific DNA binding and yet very little is known about their role in gene regulation. Based on in vitro studies, a zinc finger code has been developed that predicts a binding motif for a particular zinc finger factor (ZNF). However, very few studies have performed genome-wide analyses of ZNF binding patterns, and thus, it is not clear if the binding code developed in vitro will be useful for identifying target genes of a particular ZNF. We performed genome-wide ChIP-seq for ZNF263, a C2H2 ZNF that contains 9 finger domains, a KRAB repression domain, and a SCAN domain and identified more than 5000 binding sites in K562 cells. Our results suggest that ZNF263 binds to a 24-nt site that differs from the motif predicted by the zinc finger code in several positions. Interestingly, many of the ZNF263 binding sites are located within the transcribed region of the target gene. Although ZNFs containing a KRAB domain are thought to function mainly as transcriptional repressors, many of the ZNF263 target genes are expressed at high levels. To address the biological role of ZNF263, we identified genes whose expression was altered by treatment of cells with ZNF263-specific small interfering RNAs. Our results suggest that ZNF263 can have both positive and negative effects on transcriptional regulation of its target genes.
Affiliation(s)
- Seth Frietze
- Department of Pharmacology and the Genome Center, University of California, Davis, California 95616, USA
|
22
|
Jayakanthan M, Muthukumaran J, Chandrasekar S, Chawla K, Punetha A, Sundar D. ZifBASE: a database of zinc finger proteins and associated resources. BMC Genomics 2009; 10:421. [PMID: 19737425 PMCID: PMC2746237 DOI: 10.1186/1471-2164-10-421] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 01/28/2009] [Accepted: 09/09/2009] [Indexed: 11/24/2022]
Abstract
Background Information on the occurrence of zinc finger protein motifs in genomes is crucial to the developing field of molecular genome engineering. The knowledge of their target DNA-binding sequences is vital to develop chimeric proteins for targeted genome engineering and site-specific gene correction. There is a need to develop a computational resource of zinc finger proteins (ZFP) to identify the potential binding sites and its location, which reduce the time of in vivo task, and overcome the difficulties in selecting the specific type of zinc finger protein and the target site in the DNA sequence. Description ZifBASE provides an extensive collection of various natural and engineered ZFP. It uses standard names and a genetic and structural classification scheme to present data retrieved from UniProtKB, GenBank, Protein Data Bank, ModBase, Protein Model Portal and the literature. It also incorporates specialized features of ZFP including finger sequences and positions, number of fingers, physiochemical properties, classes, framework, PubMed citations with links to experimental structures (PDB, if available) and modeled structures of natural zinc finger proteins. ZifBASE provides information on zinc finger proteins (both natural and engineered ones), the number of finger units in each of the zinc finger proteins (with multiple fingers), the synergy between the adjacent fingers and their positions. Additionally, it gives the individual finger sequence and their target DNA site to which it binds for better and clear understanding on the interactions of adjacent fingers. The current version of ZifBASE contains 139 entries of which 89 are engineered ZFPs, containing 3-7F totaling to 296 fingers. There are 50 natural zinc finger protein entries ranging from 2-13F, totaling to 307 fingers. It has sequences and structures from literature, Protein Data Bank, ModBase and Protein Model Portal. 
The interface is cross linked to other public databases like UniprotKB, PDB, ModBase and Protein Model Portal and PubMed for making it more informative. Conclusion A database is established to maintain the information of the sequence features, including the class, framework, number of fingers, residues, position, recognition site and physio-chemical properties (molecular weight, isoelectric point) of both natural and engineered zinc finger proteins and dissociation constant of few. ZifBASE can provide more effective and efficient way of accessing the zinc finger protein sequences and their target binding sites with the links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface .
Affiliation(s)
- Mannu Jayakanthan
- Centre of Excellence in Bioinformatics, School of Life Sciences, Pondicherry University, Pondicherry 605014, India.
|
23
|
Jiang H, Yin Y, Zhang X, Hu S, Wang Q. Chasing relationships between nutrition and reproduction: A comparative transcriptome analysis of hepatopancreas and testis from Eriocheir sinensis. Comp Biochem Physiol Part D Genomics Proteomics 2009; 4:227-34. [PMID: 20403758 DOI: 10.1016/j.cbd.2009.05.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 04/15/2009] [Revised: 05/19/2009] [Accepted: 05/19/2009] [Indexed: 10/20/2022]
Abstract
There is a delicate relationship between nutrition and reproduction in the mitten crab (Eriocheir sinensis). The crabs store significant amounts of energy in the hepatopancreas in preparation for the substantial energy expenditure of reproduction, but the underlying molecular mechanism remains unknown. Here we present the first analysis of the relationship between the hepatopancreas and testis of E. sinensis. We acquired 6287 high-quality expressed sequence tags (ESTs), representing 3829 unigenes in total, from healthy male mitten crabs of first grade. We investigated the Gene Ontology and the main metabolic processes of hepatopancreas and testis from E. sinensis. Genes were most likely to be expressed more frequently and localized in the hepatopancreas, while the testis yielded abundant genes with multiple functions. Many genes important for the regulation of nutrition are in the EST resource, including arginine kinase, leptin receptor-like protein, seminal plasma glycoprotein 120, and many kinds of zinc finger proteins. The EST data also revealed genes such as heat shock protein 70, testis enhanced gene transcript (TEGT) and Cyclin K that are predicted to play important roles in the regulation of reproduction. Among these genes, alignment of leptin receptor-like protein and vasa-like protein from E. sinensis and other species provided further genomic information on E. sinensis. We identified seventeen genes relevant to the control of nutrition mechanisms and eleven genes involved in the regulation of reproduction. This study provides insights into the genetic and molecular mechanisms of nutrition and reproduction in the crab. Such information would facilitate the optimization of breeding in the aquaculture of mitten crabs.
Affiliation(s)
- Hui Jiang
- Department of Biology, East China Normal University, Shanghai, China
|
24
|
Temiz NA, Camacho CJ. Experimentally based contact energies decode interactions responsible for protein-DNA affinity and the role of molecular waters at the binding interface. Nucleic Acids Res 2009; 37:4076-88. [PMID: 19429892 PMCID: PMC2709573 DOI: 10.1093/nar/gkp289] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022]
Abstract
A major obstacle towards understanding the molecular basis of transcriptional regulation is the lack of a recognition code for protein–DNA interactions. Using high-quality crystal structures and binding data on the promiscuous family of C2H2 zinc fingers (ZF), we decode 10 fundamental specific interactions responsible for protein–DNA recognition. The interactions include five hydrogen bond types, three atomic desolvation penalties, a favorable non-polar energy, and a novel water accessibility factor. We apply this code to three large datasets containing a total of 89 C2H2 transcription factor (TF) mutants on the three ZFs of EGR. Guided by molecular dynamics simulations of individual ZFs, we map the interactions into homology models that embody all feasible intra- and intermolecular bonds, selecting for each sequence the structure with the lowest free energy. These interactions reproduce the change in affinity of 35 mutants of finger I (R² = 0.998), 23 mutants of finger II (R² = 0.96) and 31 finger III human domains (R² = 0.94). Our findings reveal recognition rules that depend on DNA sequence/structure, molecular water at the interface and induced fit of the C2H2 TFs. Collectively, our method provides the first robust framework to decode the molecular basis of TFs binding to DNA.
Affiliation(s)
- N Alpay Temiz
- Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
|
25
|
Zheng G, Qian Z, Yang Q, Wei C, Xie L, Zhu Y, Li Y. The combination approach of SVM and ECOC for powerful identification and classification of transcription factor. BMC Bioinformatics 2008; 9:282. [PMID: 18554421 PMCID: PMC2440765 DOI: 10.1186/1471-2105-9-282] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 01/09/2008] [Accepted: 06/16/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Transcription factors (TFs) are core functional proteins which play important roles in gene expression control, and they are key factors for gene regulation network construction. Traditionally, they were identified and classified through experimental approaches. In order to save time and reduce costs, many computational methods have been developed to identify TFs from new proteins and to classify the resulted TFs. Though these methods have facilitated screening of TFs to some extent, low accuracy is still a common problem. With the fast growing number of new proteins, more precise algorithms for identifying TFs from new proteins and classifying the consequent TFs are in a high demand. RESULTS The support vector machine (SVM) algorithm was utilized to construct an automatic detector for TF identification, where protein domains and functional sites were employed as feature vectors. Error-correcting output coding (ECOC) algorithm, which was originated from information and communication engineering fields, was introduced to combine with support vector machine (SVM) methodology for TF classification. The overall success rates of identification and classification achieved 88.22% and 97.83% respectively. Finally, a web site was constructed to let users access our tools (see Availability and requirements section for URL). CONCLUSION The SVM method was a valid and stable means for TFs identification with protein domains and functional sites as feature vectors. Error-correcting output coding (ECOC) algorithm is a powerful method for multi-class classification problem. When combined with SVM method, it can remarkably increase the accuracy of TF classification using protein domains and functional sites as feature vectors. In addition, our work implied that ECOC algorithm may succeed in a broad range of applications in biological data mining.
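The ECOC idea described above can be sketched as follows: each class is assigned a binary codeword, one binary classifier (an SVM in the paper) is trained per codeword bit, and a new protein is assigned to the class whose codeword is nearest in Hamming distance to the predicted bits. The sketch below covers only the decoding step and is an illustration under assumptions: the class labels and the 7-bit exhaustive codebook are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical codebook: an exhaustive 7-bit code for four classes.
CODEBOOK = {
    "basic_domain":      (1, 1, 1, 1, 1, 1, 1),
    "zinc_coordinating": (0, 0, 0, 0, 1, 1, 1),
    "helix_turn_helix":  (0, 0, 1, 1, 0, 0, 1),
    "beta_scaffold":     (0, 1, 0, 1, 0, 1, 0),
}


def min_code_distance(codebook):
    """Smallest pairwise Hamming distance between codewords."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def ecoc_decode(bits, codebook):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codebook, key=lambda label: hamming(bits, codebook[label]))


# Suppose the seven binary classifiers output these bits, with bit 5 wrong
# relative to the "helix_turn_helix" codeword.
predicted_bits = (0, 0, 1, 1, 0, 1, 1)
label = ecoc_decode(predicted_bits, CODEBOOK)
d_min = min_code_distance(CODEBOOK)
print(label, d_min)
```

With a minimum codeword distance of d, nearest-codeword decoding tolerates up to (d - 1) // 2 wrong binary classifiers, which is the error-correcting property that can improve multi-class accuracy.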
Affiliation(s)
- Guangyong Zheng
- School of Life Sciences, Fudan University, 220 Handan Road, Shanghai 200433, PR China
- Department of Computing and Information Technology, Fudan University, 220 Handan Road, Shanghai 200433, PR China
- Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, PR China
- Ziliang Qian
- Bioinformatics Center, Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, PR China
- Graduate School of the Chinese Academy of Sciences, 19 Yuquan Road, Beijing 100039, PR China
- Qing Yang
- Department of Computing and Information Technology, Fudan University, 220 Handan Road, Shanghai 200433, PR China
- Chaochun Wei
- College of Life Sciences and Technology, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai 200240, PR China
- Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, PR China
- Lu Xie
- Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, PR China
- Yangyong Zhu
- Department of Computing and Information Technology, Fudan University, 220 Handan Road, Shanghai 200433, PR China
- Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, PR China
- Yixue Li
- College of Life Sciences and Technology, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai 200240, PR China
- Shanghai Center for Bioinformation Technology, 100 Qinzhou Road, Shanghai 200235, PR China
|