1
Iwamuro T, Itohara K, Furukawa Y. Stability of N-type inactivation and the coupling between N-type and C-type inactivation in the Aplysia Kv1 channel. Pflugers Arch 2024; 476:1493-1516. [PMID: 39008084 DOI: 10.1007/s00424-024-02982-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 05/28/2024] [Accepted: 06/19/2024] [Indexed: 07/16/2024]
Abstract
The voltage-dependent potassium channels (Kv channels) show several different types of inactivation. N-type inactivation is a fast inactivating mechanism, which is essentially an open pore blockade by the amino-terminal structure of the channel itself or the auxiliary subunit. There are several functionally distinguishable types of slow inactivation (C-type, P-type, U-type), the mechanisms of which are thought to include rearrangement of the pore region. In some Kv1 channels, the actual inactivation is brought about by coupling of N-type and C-type inactivation (N-C coupling). In the present study, we focused on the N-C coupling of the Aplysia Kv1 channel (AKv1). AKv1 shows a robust N-type inactivation, but its recovery is almost entirely from the C-type inactivated state owing to the efficient N-C coupling. In the I8Q mutant of AKv1, we found that the inactivation as well as its recovery showed two kinetic components that apparently correspond to N-type and C-type inactivation. Also, the cumulative inactivation, which depends on the N-type mechanism in AKv1, was hindered in I8Q, suggesting that N-type inactivation of I8Q is less stable. We also found that Zn2+ specifically accelerates C-type inactivation of AKv1 and that H382 in the pore turret is involved in the Zn2+ binding. Because the region around Ile8 (I8) in AKv1 has been suggested to be involved in the pre-block binding of the amino-terminal structure, our results strengthen a hypothesis that the stability of the pre-block state is important for stable N-type inactivation as well as the N-C coupling in the Kv1 channel inactivation.
Affiliation(s)
- Tokunari Iwamuro
  - Laboratory of Neurobiology, Graduate School of Integrated Sciences of Life, Hiroshima University, Kagamiyama 1-7-1, 739-8521, Higashi-Hiroshima, Japan
- Kazuki Itohara
  - Laboratory of Neurobiology, Graduate School of Integrated Sciences of Life, Hiroshima University, Kagamiyama 1-7-1, 739-8521, Higashi-Hiroshima, Japan
- Yasuo Furukawa
  - Laboratory of Neurobiology, Graduate School of Integrated Sciences of Life, Hiroshima University, Kagamiyama 1-7-1, 739-8521, Higashi-Hiroshima, Japan

2
Cankara F, Senyuz S, Sayin AZ, Gursoy A, Keskin O. DiPPI: A Curated Data Set for Drug-like Molecules in Protein-Protein Interfaces. J Chem Inf Model 2024; 64:5041-5051. [PMID: 38907989 DOI: 10.1021/acs.jcim.3c01905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
Proteins interact through their interfaces, and dysfunction of protein-protein interactions (PPIs) has been associated with various diseases. Therefore, investigating the properties of the drug-modulated PPIs and interface-targeting drugs is critical. Here, we present a curated large data set for drug-like molecules in protein interfaces. We further introduce DiPPI (Drugs in Protein-Protein Interfaces), a two-module web site to facilitate the search for such molecules and their properties by exploiting our data set in drug repurposing studies. In the interface module of the web site, we present several properties of interfaces, such as amino acid properties, hotspots, evolutionary conservation of drug-binding amino acids, and post-translational modifications of these residues. On the drug-like molecule side, we list drug-like small molecules and FDA-approved drugs from various databases and highlight those that bind to the interfaces. We further clustered the drugs based on their molecular fingerprints to confine the search for an alternative drug to a smaller space. Drug properties, including Lipinski's rules and various molecular descriptors, are also calculated and made available on the web site to guide the selection of drug molecules. Our data set contains 534,203 interfaces for 98,632 protein structures, of which 55,135 are detected to bind to a drug-like molecule. 2214 drug-like molecules are deposited on our web site, among which 335 are FDA-approved. DiPPI provides users with an easy-to-follow scheme for drug repurposing studies through its well-curated and clustered interface and drug data and is freely available at http://interactome.ku.edu.tr:8501.
Affiliation(s)
- Fatma Cankara
  - Graduate School of Sciences and Engineering, Koç University, İstanbul 34450, Turkey
- Simge Senyuz
  - Graduate School of Sciences and Engineering, Koç University, İstanbul 34450, Turkey
- Ahenk Zeynep Sayin
  - Department of Chemical and Biological Engineering, Koç University, İstanbul 34450, Turkey
- Attila Gursoy
  - Department of Computer Engineering, Koç University, İstanbul 34450, Turkey
- Ozlem Keskin
  - Department of Chemical and Biological Engineering, Koç University, İstanbul 34450, Turkey

3
Salcedo MV, Gravel N, Keshavarzi A, Huang LC, Kochut KJ, Kannan N. Predicting protein and pathway associations for understudied dark kinases using pattern-constrained knowledge graph embedding. PeerJ 2023; 11:e15815. [PMID: 37868056 PMCID: PMC10590106 DOI: 10.7717/peerj.15815] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/24/2023] [Accepted: 07/10/2023] [Indexed: 10/24/2023] [Open Access]
Abstract
The 534 protein kinases encoded in the human genome constitute a large druggable class of proteins that include both well-studied and understudied "dark" members. Accurate prediction of dark kinase functions is a major bioinformatics challenge. Here, we employ a graph mining approach that uses the evolutionary and functional context encoded in knowledge graphs (KGs) to predict protein and pathway associations for understudied kinases. We propose a new scalable graph embedding approach, RegPattern2Vec, which employs regular pattern constrained random walks to sample diverse aspects of node context within a KG flexibly. RegPattern2Vec learns functional representations of kinases, interacting partners, post-translational modifications, pathways, cellular localization, and chemical interactions from a kinase-centric KG that integrates and conceptualizes data from curated heterogeneous data resources. By contextualizing information relevant to prediction, RegPattern2Vec improves accuracy and efficiency in comparison to other random walk-based graph embedding approaches. We show that the predictions produced by our model overlap with pathway enrichment data produced using experimentally validated Protein-Protein Interaction (PPI) data from both publicly available databases and experimental datasets not used in training. Our model also has the advantage of using the collected random walks as biological context to interpret the predicted protein-pathway associations. We provide high-confidence pathway predictions for 34 dark kinases and present three case studies in which analysis of meta-paths associated with the prediction enables biological interpretation. Overall, RegPattern2Vec efficiently samples multiple node types for link prediction on biological knowledge graphs and the predicted associations between understudied kinases, pseudokinases, and known pathways serve as a conceptual starting point for hypothesis generation and testing.
Affiliation(s)
- Mariah V. Salcedo
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States of America
- Nathan Gravel
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America
- Abbas Keshavarzi
  - School of Computing, University of Georgia, Athens, GA, United States of America
- Liang-Chin Huang
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America
- Krzysztof J. Kochut
  - School of Computing, University of Georgia, Athens, GA, United States of America
- Natarajan Kannan
  - Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, United States of America
  - Institute of Bioinformatics, University of Georgia, Athens, GA, United States of America

4
Mohanty M, Mohanty PS. Molecular docking in organic, inorganic, and hybrid systems: a tutorial review. Monatshefte für Chemie 2023; 154:1-25. [PMID: 37361694 PMCID: PMC10243279 DOI: 10.1007/s00706-023-03076-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/27/2022] [Accepted: 05/08/2023] [Indexed: 06/28/2023]
Abstract
Molecular docking simulation is a very popular and well-established computational approach and has been extensively used to understand molecular interactions between a natural organic molecule (ideally taken as a receptor) such as an enzyme, protein, DNA, RNA and a natural or synthetic organic/inorganic molecule (considered as a ligand). But the implementation of docking ideas to synthetic organic, inorganic, or hybrid systems is very limited with respect to their use as a receptor despite their huge popularity in different experimental systems. In this context, molecular docking can be an efficient computational tool for understanding the role of intermolecular interactions in hybrid systems that can help in designing materials on mesoscale for different applications. The current review focuses on the implementation of the docking method in organic, inorganic, and hybrid systems along with examples from different case studies. We describe different resources, including databases and tools required in the docking study and applications. The concept of docking techniques, types of docking models, and the role of different intermolecular interactions involved in the docking process to understand the binding mechanisms are explained. Finally, the challenges and limitations of dockings are also discussed in this review.
Affiliation(s)
- Madhuchhanda Mohanty
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
- Priti S. Mohanty
  - School of Biotechnology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India
  - School of Chemical Technology, Kalinga Institute of Industrial Technology (KIIT), Deemed to be University, Bhubaneswar, 751024 India

5
In Silico Binding of 2-Aminocyclobutanones to SARS-CoV-2 Nsp13 Helicase and Demonstration of Antiviral Activity. Int J Mol Sci 2023; 24:ijms24065120. [PMID: 36982188 PMCID: PMC10049026 DOI: 10.3390/ijms24065120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/23/2023] [Revised: 02/22/2023] [Accepted: 03/02/2023] [Indexed: 03/10/2023] [Open Access]
Abstract
The landscape of viral strains and lineages of SARS-CoV-2 keeps changing and is currently dominated by Delta and Omicron variants. Members of the latest Omicron variants, including BA.1, are showing a high level of immune evasion, and Omicron has become a prominent variant circulating globally. In our search for versatile medicinal chemistry scaffolds, we prepared a library of substituted α-aminocyclobutanones from an α-aminocyclobutanone synthon (11). We performed an in silico screen of this actual chemical library as well as other virtual 2-aminocyclobutanone analogs against seven SARS-CoV-2 nonstructural proteins to identify potential drug leads against SARS-CoV-2, and more broadly against coronavirus antiviral targets. Several of these analogs were initially identified as in silico hits against SARS-CoV-2 nonstructural protein 13 (Nsp13) helicase through molecular docking and dynamics simulations. Antiviral activity of the original hits as well as α-aminocyclobutanone analogs that were predicted to bind more tightly to SARS-CoV-2 Nsp13 helicase is reported. We now report cyclobutanone derivatives that exhibit anti-SARS-CoV-2 activity. Furthermore, the Nsp13 helicase enzyme has been the target of relatively few target-based drug discovery efforts, in part due to a very late release of a high-resolution structure accompanied by a limited understanding of its protein biochemistry. In general, antiviral agents initially efficacious against wild-type SARS-CoV-2 strains have lower activities against variants due to heavy viral loads and greater turnover rates, but the inhibitors we are reporting have higher activities against the later variants than the wild-type (10–20×). We speculate this could be due to Nsp13 helicase being a critical bottleneck in faster replication rates of the new variants, so targeting this enzyme affects these variants to an even greater extent.
This work calls attention to cyclobutanones as a useful medicinal chemistry scaffold, and the need for additional focus on the discovery of Nsp13 helicase inhibitors to combat the aggressive and immune-evading variants of concern (VOCs).

6
Hasenahuer MA, Sanchis-Juan A, Laskowski RA, Baker JA, Stephenson JD, Orengo CA, Raymond FL, Thornton JM. Mapping the Constrained Coding Regions in the Human Genome to Their Corresponding Proteins. J Mol Biol 2023; 435:167892. [PMID: 36410474 PMCID: PMC9875310 DOI: 10.1016/j.jmb.2022.167892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Revised: 11/08/2022] [Accepted: 11/14/2022] [Indexed: 11/23/2022]
Abstract
Constrained Coding Regions (CCRs) in the human genome have been derived from DNA sequencing data of large cohorts of healthy control populations, available in the Genome Aggregation Database (gnomAD) [1]. They identify regions depleted of protein-changing variants and thus identify segments of the genome that have been constrained during human evolution. By mapping these DNA-defined regions from genomic coordinates onto the corresponding protein positions and combining this information with protein annotations, we have explored the distribution of CCRs and compared their co-occurrence with different protein functional features, previously annotated at the amino acid level in public databases. As expected, our results reveal that functional amino acids involved in interactions with DNA/RNA, protein-protein contacts and catalytic sites are the protein features most likely to be highly constrained for variation in the control population. More surprisingly, we also found that linear motifs, linear interacting peptides (LIPs), disorder-order transitions upon binding with other protein partners and liquid-liquid phase separating (LLPS) regions are also strongly associated with high constraint for variability. We also compared intra-species constraints in the human CCRs with inter-species conservation and functional residues to explore how such CCRs may contribute to the analysis of protein variants. As has been previously observed, CCRs are only weakly correlated with conservation, suggesting that intraspecies constraints complement interspecies conservation and can provide more information to interpret variant effects.
Affiliation(s)
- Marcia A. Hasenahuer
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
  - Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
  - Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
  - Corresponding author at: European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK. @MarHasenahuer
- Alba Sanchis-Juan
  - Department of Haematology, NHS Blood and Transplant Centre, University of Cambridge, Cambridge CB2 0XY, UK
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- Roman A. Laskowski
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- James A. Baker
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- James D. Stephenson
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Christine A. Orengo
  - Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- F. Lucy Raymond
  - Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 0XY, UK
  - NIHR BioResource, Cambridge University Hospitals NHS Foundation Trust, Cambridge Biomedical Campus, Cambridge CB2 0QQ, UK
- Janet M. Thornton
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK

7
Choudhari JK, Eberhardt M, Chatterjee T, Hohberger B, Vera J. Glaucoma-TrEl: A web-based interactive database to build evidence-based hypotheses on the role of trace elements in glaucoma. BMC Res Notes 2022; 15:348. [PMID: 36401306 PMCID: PMC9673420 DOI: 10.1186/s13104-022-06210-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 09/12/2022] [Accepted: 09/19/2022] [Indexed: 11/19/2022] [Open Access]
Abstract
Objective: Glaucoma is a chronic neurological disease that is associated with high intraocular pressure (IOP), causes gradual damage to retinal ganglion cells, and often culminates in vision loss. Recent research suggests that glaucoma is a complex multifactorial disease in which multiple interlinked genes and pathways play a role during onset and development. Also, differential availability of trace elements seems to play a role in glaucoma pathophysiology, although their mechanism of action is unknown. The aim of this work is to disseminate a web-based repository on interactions between trace elements and protein-coding genes linked to glaucoma pathophysiology.
Results: In this study, we present Glaucoma-TrEl, a web database containing information about interactions between trace elements and protein-coding genes that are linked to glaucoma. In the database, we include interactions between 437 unique genes and eight trace elements. Our analysis found a large number of interactions between trace elements and protein-coding genes mutated or linked to the pathophysiology of glaucoma. We associated genes interacting with multiple trace elements to pathways known to play a role in glaucoma. The web-based platform provides an easy-to-use and interactive tool, which serves as an information hub facilitating future research work on trace elements in glaucoma.

8
Evolutionary inference across eukaryotes identifies universal features shaping organelle gene retention. Cell Syst 2022; 13:874-884.e5. [PMID: 36115336 DOI: 10.1016/j.cels.2022.08.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/01/2022] [Revised: 06/24/2022] [Accepted: 08/22/2022] [Indexed: 01/26/2023]
Abstract
Mitochondria and plastids power complex life. Why some genes and not others are retained in their organelle DNA (oDNA) genomes remains a debated question. Here, we attempt to identify the properties of genes and associated underlying mechanisms that determine oDNA retention. We harness over 15k oDNA sequences and over 300 whole genome sequences across eukaryotes with tools from structural biology, bioinformatics, machine learning, and Bayesian model selection. Previously hypothesized features, including the hydrophobicity of a protein product, and less well-known features, including binding energy centrality within a protein complex, predict oDNA retention across eukaryotes, with additional influences of nucleic acid and amino acid biochemistry. Notably, the same features predict retention in both organelles, and retention models learned from one organelle type quantitatively predict retention in the other, supporting the universality of these features, which also distinguish gene profiles in more recent, independent endosymbiotic relationships. A record of this paper's transparent peer review process is included in the supplemental information.

9
Bharadwaj A, Jakobi AJ. Electron scattering properties of biological macromolecules and their use for cryo-EM map sharpening. Faraday Discuss 2022; 240:168-183. [PMID: 35938593 PMCID: PMC9642005 DOI: 10.1039/d2fd00078d] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/09/2023]
Abstract
Resolution-dependent loss of contrast in cryo-EM maps may obscure features at high resolution that are critical for map interpretation. Post-processing of cryo-EM maps can improve the interpretability by adjusting the resolution-dependence of structure factor amplitudes through map sharpening. Traditionally this has been done by rescaling the relative contribution of low and high-resolution frequencies globally. More recently, the realisation that molecular motion and heterogeneity cause non-uniformity of resolution throughout the map has inspired the development of techniques that optimise sharpening locally. We previously developed LocScale, a method that utilises the radial structure factor from a refined atomic model as a restraint for local map sharpening. While this method has proved beneficial for the interpretation of cryo-EM maps, the dependence on the availability of (partial) model information limits its general applicability. Here, we review the basic assumptions of resolution-dependent contrast loss in cryo-EM maps and propose a route towards a robust alternative for local map sharpening that utilises information on expected scattering properties of biological macromolecules, but requires no detailed knowledge of the underlying molecular structure. We examine remaining challenges for implementation and discuss possible applications.
Affiliation(s)
- Alok Bharadwaj
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, The Netherlands
- Arjen J. Jakobi
  - Department of Bionanoscience, Kavli Institute of Nanoscience, Delft University of Technology, The Netherlands

10
Anand V, Prabhakaran HS, Gogoi P, Kanaujia SP, Kumar M. Structural and functional characterization of Cas2 of CRISPR-Cas subtype I-C lacking the CRISPR component. Front Mol Biosci 2022; 9:988569. [PMID: 36172044 PMCID: PMC9510766 DOI: 10.3389/fmolb.2022.988569] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/07/2022] [Accepted: 08/08/2022] [Indexed: 11/29/2022] [Open Access]
Abstract
The genomes of pathogenic Leptospira interrogans serovars (Copenhageni and Lai) are predicted to have CRISPR-Cas of subtypes I-B and I-C. Cas2, one of the core Cas proteins, has a crucial role in adaptive defense against foreign nucleic acids. However, subtype I-C lacks the CRISPR element at its loci essential for RNA-mediated adaptive immunity against foreign nucleic acids. The reason for sustaining the expense of cas genes in the absence of a CRISPR array is unknown. Thus, Cas2C was chosen as a representative Cas protein from two well-studied serovars of Leptospira to address whether it is functional. In this study, the recombinant Cas2C of Leptospira serovars Copenhageni (rLinCas2C, 12 kDa) and Lai (rLinCas2C_Lai, 8.6 kDa) were overexpressed and purified. Due to a natural frameshift mutation in the cas2c gene of serovar Lai, rLinCas2C_Lai was overexpressed and purified as a partially translated protein. Nevertheless, the recombinant Cas2C from each serovar exhibited metal-dependent DNase and metal-independent RNase activities. The crystal structure of rLinCas2C obtained at the resolution of 2.60 Å revealed the protein is in apo-state conformation and contains N- (1–71 amino acids) and C-terminal (72–90 amino acids) regions, with the former possessing a ferredoxin fold. Substitution of the conserved residues (Tyr7, Asp8, Arg33, and Phe39) with alanine and deletion of Loop L2 resulted in compromised DNase activity. On the other hand, a moderate reduction in RNase activity was evident only in selective rLinCas2C mutants. Overall, in the absence of an array, the observed catalytic activity of Cas2C may be required for biological processes distinct from the CRISPR-Cas-associated function.
Affiliation(s)
- Manish Kumar
  - Correspondence: Shankar Prasad Kanaujia; Manish Kumar

11
Gong Y, Behera G, Erber L, Luo A, Chen Y. HypDB: A functionally annotated web-based database of the proline hydroxylation proteome. PLoS Biol 2022; 20:e3001757. [PMID: 36026437 PMCID: PMC9455854 DOI: 10.1371/journal.pbio.3001757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/02/2022] [Revised: 09/08/2022] [Accepted: 07/13/2022] [Indexed: 01/16/2023] [Open Access]
Abstract
Proline hydroxylation (Hyp) regulates protein structure, stability, and protein-protein interaction. It is widely involved in diverse metabolic and physiological pathways in cells and diseases. To reveal functional features of the Hyp proteome, we integrated various data sources for deep proteome profiling of the Hyp proteome in humans and developed HypDB (https://www.HypDB.site), an annotated database and web server for Hyp proteome. HypDB provides site-specific evidence of modification based on extensive LC-MS analysis and literature mining with 14,413 nonredundant Hyp sites on 5,165 human proteins including 3,383 Class I and 4,335 Class II sites. Annotation analysis revealed significant enrichment of Hyp on key functional domains and tissue-specific distribution of Hyp abundance across 26 types of human organs and fluids and 6 cell lines. The network connectivity analysis further revealed a critical role of Hyp in mediating protein-protein interactions. Moreover, the spectral library generated by HypDB enabled data-independent analysis (DIA) of clinical tissues and the identification of novel Hyp biomarkers in lung cancer and kidney cancer. Taken together, our integrated analysis of human proteome with publicly accessible HypDB revealed functional diversity of Hyp substrates and provides a quantitative data source to characterize Hyp in pathways and diseases.
Affiliation(s)
- Yao Gong
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
  - Bioinformatics and Computational Biology Program, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
- Gaurav Behera
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
- Luke Erber
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
- Ang Luo
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
- Yue Chen
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America
  - Bioinformatics and Computational Biology Program, University of Minnesota at Twin Cities, Minneapolis, Minnesota, United States of America

12
Riziotis IG, Thornton JM. Capturing the geometry, function, and evolution of enzymes with 3D templates. Protein Sci 2022; 31:e4363. [PMID: 35762726 PMCID: PMC9207746 DOI: 10.1002/pro.4363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 05/06/2022] [Accepted: 05/14/2022] [Indexed: 11/05/2022]
Abstract
Structural templates are 3D signatures representing protein functional sites, such as ligand binding cavities, metal coordination motifs, or catalytic sites. Here we explore methods to generate template libraries and algorithms to query structures for conserved 3D motifs. Applications of templates are discussed, as well as some exemplar cases for examining evolutionary links in enzymes. We also introduce the concept of using more than one template per structure to represent flexible sites, as an approach to better understand catalysis through snapshots captured in enzyme structures. Functional annotation from structure is an important topic that has recently resurfaced due to the new more accurate methods of protein structure prediction. Therefore, we anticipate that template-based functional site detection will be a powerful tool in the task of characterizing a vast number of new protein models.

13
Abstract
Acetylcholine is a central biological signal molecule present in all kingdoms of life. In humans, acetylcholine is the primary neurotransmitter of the peripheral nervous system; it mediates signal transmission at neuromuscular junctions. Here, we show that the opportunistic human pathogen Pseudomonas aeruginosa exhibits chemoattraction toward acetylcholine over a concentration range of 1 μM to 100 mM. The maximal magnitude of the response was superior to that of many other P. aeruginosa chemoeffectors. We demonstrate that this chemoattraction is mediated by the PctD (PA4633) chemoreceptor. Using microcalorimetry, we show that the PctD ligand-binding domain (LBD) binds acetylcholine with an equilibrium dissociation constant (KD) of 23 μM. It also binds choline and, with lower affinity, betaine. Highly sensitive responses to acetylcholine and choline, and less sensitive responses to betaine and l-carnitine, were observed in Escherichia coli expressing a chimeric receptor comprising the PctD-LBD fused to the Tar chemoreceptor signaling domain. We also identified the PacA (ECA_RS10935) chemoreceptor of the phytopathogen Pectobacterium atrosepticum, which binds choline and betaine but fails to recognize acetylcholine. To identify the molecular determinants for acetylcholine recognition, we report high-resolution structures of PctD-LBD (with bound acetylcholine and choline) and PacA-LBD (with bound betaine). We identified an amino acid motif in PctD-LBD that interacts with the acetylcholine tail. This motif is absent in PacA-LBD. Significant acetylcholine chemotaxis was also detected in the plant pathogens Agrobacterium tumefaciens and Dickeya solani. To the best of our knowledge, this is the first report of acetylcholine chemotaxis and extends the range of host signals perceived by bacterial chemoreceptors.
|
14
|
Martin W, Sheynkman G, Lightstone FC, Nussinov R, Cheng F. Interpretable artificial intelligence and exascale molecular dynamics simulations to reveal kinetics: Applications to Alzheimer's disease. Curr Opin Struct Biol 2022; 72:103-113. [PMID: 34628220 PMCID: PMC8860862 DOI: 10.1016/j.sbi.2021.09.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/14/2021] [Revised: 08/30/2021] [Accepted: 09/01/2021] [Indexed: 02/03/2023]
Abstract
The rapid increase in computing power, especially with the integration of graphics processing units, has dramatically increased the capabilities of molecular dynamics simulations. To date, these capabilities extend from running very long simulations (tens to hundreds of microseconds) to thousands of short simulations. However, the expansive data generated in these simulations must be made interpretable not only to the investigator who performs them but also to others. Here, we demonstrate how integrating learning techniques, such as artificial intelligence, machine learning, and neural networks, into analysis pipelines can reveal the kinetics of Alzheimer's disease (AD) protein aggregation. We review select AD targets, describe current simulation methods, and introduce learning concepts and their application in AD, highlighting limitations and potential solutions.
Affiliation(s)
- William Martin
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
| | - Gloria Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, 22903, USA
| | - Felice C Lightstone
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Lab, Livermore, CA, 94550, USA
| | - Ruth Nussinov
- Computational Structural Biology Section, Frederick National Laboratory for Cancer Research in the Laboratory of Cancer Immunometabolism, National Cancer Institute, Frederick, MD, 21702, USA; Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH, 44195, USA; Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA.
|
15
|
'All That Glitters Is Not Gold': High-Resolution Crystal Structures of Ligand-Protein Complexes Need Not Always Represent Confident Binding Poses. Int J Mol Sci 2021; 22:ijms22136830. [PMID: 34202053 PMCID: PMC8268033 DOI: 10.3390/ijms22136830] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/19/2021] [Revised: 05/24/2021] [Accepted: 05/24/2021] [Indexed: 01/09/2023] Open
Abstract
Our understanding of the structure–function relationships of biomolecules, and our ability to apply it in drug discovery programs, depends substantially on the availability of structural information on ligand–protein complexes. However, the correct interpretation of the electron density of a small molecule bound to a crystal structure of a macromolecule is not trivial. Our analysis involving quality assessment of ~0.28 million small molecule–protein binding site pairs derived from crystal structures corresponding to ~66,000 PDB entries indicates that the majority (65%) of the pairs might need little (54%) or no (11%) attention. Of the remaining 35% of pairs that need attention, 11% of the pairs (including structures with high/moderate resolution) pose serious concerns. Unfortunately, most users of crystal structures lack the training to evaluate the quality of a crystal structure against its experimental data and, in general, rely on the resolution as a ‘gold standard’ quality metric. Our work aims to make non-crystallographers aware that resolution, which is a global quality metric, need not be an accurate indicator of local structural quality. In this article, we demonstrate the use of several freely available tools that quantify local structural quality and are easy to use from a non-crystallographer’s perspective. We further propose a few solutions for consideration by the scientific community to promote quality research in structural biology and applied areas.
|
16
|
Srivastava A, Giangiobbe S, Skopelitou D, Miao B, Paramasivam N, Diquigiovanni C, Bonora E, Hemminki K, Försti A, Bandapalli OR. Whole Genome Sequencing Prioritizes CHEK2, EWSR1, and TIAM1 as Possible Predisposition Genes for Familial Non-Medullary Thyroid Cancer. Front Endocrinol (Lausanne) 2021; 12:600682. [PMID: 33692755 PMCID: PMC7937922 DOI: 10.3389/fendo.2021.600682] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 08/30/2020] [Accepted: 01/04/2021] [Indexed: 01/08/2023] Open
Abstract
Familial inheritance in non-medullary thyroid cancer (NMTC) is an area that has yet to be adequately explored. Despite evidence suggesting strong familial clustering of non-syndromic NMTC, known variants still account for a very small percentage of the genetic burden. In a recent whole genome sequencing (WGS) study of five families with several NMTCs, we shortlisted promising variants with the help of our in-house developed Familial Cancer Variant Prioritization Pipeline (FCVPPv2). Here, we report potentially disease-causing variants in checkpoint kinase 2 (CHEK2), Ewing sarcoma breakpoint region 1 (EWSR1) and T-lymphoma invasion and metastasis-inducing protein 1 (TIAM1) in one family. Performing WGS on three cases, one probable case and one healthy individual in a family with familial NMTC left us with 112254 variants with a minor allele frequency of less than 0.1%, which was reduced by pedigree-based filtering to 6368. Application of the pipeline led to the prioritization of seven coding and nine non-coding variants from this family. The variant identified in CHEK2, a known tumor suppressor gene involved in DNA damage-induced DNA repair, cell cycle arrest, and apoptosis, has been previously identified as a germline variant in breast and prostate cancer and has been functionally validated by Roeb et al. in a yeast-based assay to have an intermediate effect on protein function. We thus hypothesized that this family may harbor additional disease-causing variants in other functionally related genes. We evaluated two further variants in EWSR1 and TIAM1 with promising in silico results and reported interaction in the DNA-damage repair pathway. Hence, we propose a polygenic mode of inheritance in this family. As familial NMTC is considered to be more aggressive than its sporadic counterpart, it is important to identify such susceptibility genes and their associated pathways. In this way, the advancement of personalized medicine in NMTC patients can be fostered. 
We also wish to reopen the discussion on monogenic vs polygenic inheritance in NMTC and instigate further development in this area of research.
Affiliation(s)
- Aayushi Srivastava
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Preclinical Pediatric Oncology, Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
- Medical Faculty, Heidelberg University, Heidelberg, Germany
| | - Sara Giangiobbe
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Medical Faculty, Heidelberg University, Heidelberg, Germany
| | - Diamanto Skopelitou
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Preclinical Pediatric Oncology, Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
- Medical Faculty, Heidelberg University, Heidelberg, Germany
| | - Beiping Miao
- Preclinical Pediatric Oncology, Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Nagarajan Paramasivam
- Computational Oncology, National Center for Tumor Diseases (NCT), Molecular Diagnostics Program, Heidelberg, Germany
| | - Chiara Diquigiovanni
- Unit of Medical Genetics, Department of Medical and Surgical Sciences, S. Orsola-Malpighi Hospital, University of Bologna, Bologna, Italy
| | - Elena Bonora
- Unit of Medical Genetics, Department of Medical and Surgical Sciences, S. Orsola-Malpighi Hospital, University of Bologna, Bologna, Italy
| | - Kari Hemminki
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Faculty of Medicine and Biomedical Center in Pilsen, Charles University in Prague, Pilsen, Czechia
| | - Asta Försti
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Preclinical Pediatric Oncology, Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Obul Reddy Bandapalli
- Division of Molecular Genetic Epidemiology, German Cancer Research Center, Heidelberg, Germany
- Preclinical Pediatric Oncology, Hopp Children’s Cancer Center (KiTZ), Heidelberg, Germany
- Division of Pediatric Neurooncology, German Cancer Research Center (DKFZ), German Cancer Consortium (DKTK), Heidelberg, Germany
- Medical Faculty, Heidelberg University, Heidelberg, Germany
- *Correspondence: Obul Reddy Bandapalli,
|
17
|
Abstract
The geminivirus capsid architecture is unique and built from twinned pseudo T=1 icosahedrons with 110 copies of the coat protein (CP). The CP is multifunctional. It performs various functions during the infection of a wide range of agriculturally important plant hosts. The CP multimerizes via pentameric intermediates during assembly and encapsulates the ssDNA genome to generate the unique capsid morphology. The virus capsid protects and transports the genome in the insect vector and plant host en route to the plant nucleus for replication and the production of progeny. This review further explores CP:CP and CP:DNA interactions, and the environmental conditions that govern the assembly of the geminivirus capsid. This analysis was facilitated by new data available for the family, including three-dimensional structures and molecular biology data for several members. In addition, current and promising new control strategies of plant crop infection, which can lead to starvation for subsistence farmers, are discussed.
Affiliation(s)
- Antonette Bennett
- Department of Biochemistry and Molecular Biology, College of Medicine, Center for Structural Biology, McKnight Brain Institute, University of Florida, Gainesville, FL, United States
| | - Mavis Agbandje-McKenna
- Department of Biochemistry and Molecular Biology, College of Medicine, Center for Structural Biology, McKnight Brain Institute, University of Florida, Gainesville, FL, United States.
|
18
|
Newport TD, Sansom MS, Stansfeld PJ. The MemProtMD database: a resource for membrane-embedded protein structures and their lipid interactions. Nucleic Acids Res 2019; 47:D390-D397. [PMID: 30418645 PMCID: PMC6324062 DOI: 10.1093/nar/gky1047] [Citation(s) in RCA: 119] [Impact Index Per Article: 23.8] [Received: 09/09/2018] [Revised: 10/05/2018] [Accepted: 10/16/2018] [Indexed: 12/19/2022] Open
Abstract
Integral membrane proteins fulfil important roles in many crucial biological processes, including cell signalling, molecular transport and bioenergetic processes. Advancements in experimental techniques are revealing high resolution structures for an increasing number of membrane proteins. Yet, these structures are rarely resolved in complex with membrane lipids. In 2015, the MemProtMD pipeline was developed to allow the automated lipid bilayer assembly around new membrane protein structures, released from the Protein Data Bank (PDB). To make these data available to the scientific community, a web database (http://memprotmd.bioch.ox.ac.uk) has been developed. Simulations and the results of subsequent analysis can be viewed using a web browser, including interactive 3D visualizations of the assembled bilayer and 2D visualizations of lipid contact data and membrane protein topology. In addition, ensemble analyses are performed to detail conserved lipid interaction information across proteins, families and for the entire database of 3506 PDB entries. Proteins may be searched using keywords, PDB or Uniprot identifier, or browsed using classification systems, such as Pfam, Gene Ontology annotation, mpstruc or the Transporter Classification Database. All files required to run further molecular simulations of proteins in the database are provided.
Affiliation(s)
- Thomas D Newport
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
| | - Mark S P Sansom
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
| | - Phillip J Stansfeld
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
|
19
|
Romão CV, Matias PM, Sousa CM, Pinho FG, Pinto AF, Teixeira M, Bandeiras TM. Insights into the Structures of Superoxide Reductases from the Symbionts Ignicoccus hospitalis and Nanoarchaeum equitans. Biochemistry 2018; 57:5271-5281. [PMID: 29939726 DOI: 10.1021/acs.biochem.8b00334] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Superoxide reductases (SORs) are enzymes that detoxify the superoxide anion through its reduction to hydrogen peroxide and exist in both prokaryotes and eukaryotes. The substrate is transformed at an iron catalytic center, pentacoordinated in the ferrous state by four histidines and one cysteine. SORs have a highly conserved motif, (E)(K)HxP-, in which the glutamate is associated with a redox-driven structural change, completing the octahedral coordination of the iron in the ferric state, whereas the lysine may be responsible for stabilization and donation of a proton to catalytic intermediates. We aimed to understand at the structural level the role of these two residues, by determining the X-ray structures of the SORs from the hyperthermophilic archaea Ignicoccus hospitalis and Nanoarchaeum equitans that lack the quasi-conserved lysine and glutamate, respectively, but have catalytic rate constants similar to those of the canonical enzymes, as we previously demonstrated. Furthermore, we have determined the crystal structure of the E23A mutant of I. hospitalis SOR, which mimics several enzymes that lack both residues. The structures revealed distinct structural arrangements of the catalytic center that simulate several catalytic cycle intermediates, namely, the reduced and the oxidized forms, and the glutamate-free and deprotonated ferric forms. Moreover, the structure of the I. hospitalis SOR provides evidence for the presence of an alternative lysine close to the iron center in the reduced state that may be a functional substitute for the "canonical" lysine.
Affiliation(s)
- Célia V Romão
- ITQB NOVA, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal
| | - Pedro M Matias
- ITQB NOVA, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal; iBET, Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901 Oeiras, Portugal
| | - Cristiana M Sousa
- iBET, Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901 Oeiras, Portugal
| | - Filipa G Pinho
- iBET, Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901 Oeiras, Portugal
| | - Ana F Pinto
- ITQB NOVA, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal
| | - Miguel Teixeira
- ITQB NOVA, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal
| | - Tiago M Bandeiras
- ITQB NOVA, Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, 2780-157 Oeiras, Portugal; iBET, Instituto de Biologia Experimental e Tecnológica, Apartado 12, 2781-901 Oeiras, Portugal
|
20
|
Grötzinger SW, Karan R, Strillinger E, Bader S, Frank A, Al Rowaihi IS, Akal A, Wackerow W, Archer JA, Rueping M, Weuster-Botz D, Groll M, Eppinger J, Arold ST. Identification and Experimental Characterization of an Extremophilic Brine Pool Alcohol Dehydrogenase from Single Amplified Genomes. ACS Chem Biol 2018; 13:161-170. [PMID: 29188989 DOI: 10.1021/acschembio.7b00792] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
Because only 0.01% of prokaryotic genospecies can be cultured and in situ observations are often impracticable, culture-independent methods are required to understand microbial life and harness potential applications of microbes. Here, we report a methodology for the production of proteins with desired functions based on single amplified genomes (SAGs) from unculturable species. We use this method to resurrect an alcohol dehydrogenase (ADH/D1) from an uncharacterized halo-thermophilic archaeon collected from a brine pool at the bottom of the Red Sea. Our crystal structure of 5,6-dihydroxy NADPH-bound ADH/D1, combined with biochemical analyses, reveals the molecular features of its halo-thermophily, its unique habitat adaptations, and its possible reaction mechanism for atypical oxygen activation. Our strategy offers a general guide for using SAGs as a source for scientific and industrial investigations of "microbial dark matter."
Affiliation(s)
- Stefan W. Grötzinger
- King Abdullah University of Science and Technology, Biological and Environmental Science and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
- Technical University of Munich, Department of Mechanical Engineering, Institute of Biochemical Engineering, Garching, Germany
| | - Ram Karan
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Eva Strillinger
- Technical University of Munich, Department of Mechanical Engineering, Institute of Biochemical Engineering, Garching, Germany
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Stefan Bader
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Annika Frank
- Technical University of Munich, Center for Integrated Protein Science Munich in the Department Chemistry, Garching, Germany
| | - Israa S. Al Rowaihi
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Anastassja Akal
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
- Technical University of Munich, Center for Integrated Protein Science Munich in the Department Chemistry, Garching, Germany
| | - Wiebke Wackerow
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - John A. Archer
- King Abdullah University of Science and Technology, Biological and Environmental Science and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
| | - Magnus Rueping
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Dirk Weuster-Botz
- Technical University of Munich, Department of Mechanical Engineering, Institute of Biochemical Engineering, Garching, Germany
| | - Michael Groll
- Technical University of Munich, Center for Integrated Protein Science Munich in the Department Chemistry, Garching, Germany
| | - Jörg Eppinger
- King Abdullah University of Science and Technology, Physical Science and Engineering Division, KAUST Catalysis Center, Thuwal, Kingdom of Saudi Arabia
| | - Stefan T. Arold
- King Abdullah University of Science and Technology, Biological and Environmental Science and Engineering Division, Computational Bioscience Research Center, Thuwal, Kingdom of Saudi Arabia
|
21
|
Burschowsky D, Thorbjørnsrud HV, Heim JB, Fahrig-Kamarauskaitė JR, Würth-Roderer K, Kast P, Krengel U. Inter-Enzyme Allosteric Regulation of Chorismate Mutase in Corynebacterium glutamicum: Structural Basis of Feedback Activation by Trp. Biochemistry 2017; 57:557-573. [PMID: 29178787 DOI: 10.1021/acs.biochem.7b01018] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 11/29/2022]
Abstract
Corynebacterium glutamicum is widely used for the industrial production of amino acids, nucleotides, and vitamins. The shikimate pathway enzymes DAHP synthase (CgDS, Cg2391) and chorismate mutase (CgCM, Cgl0853) play a key role in the biosynthesis of aromatic compounds. Here we show that CgCM requires the formation of a complex with CgDS to achieve full activity, and that both CgCM and CgDS are feedback regulated by aromatic amino acids binding to CgDS. Kinetic analysis showed that Phe and Tyr inhibit CgCM activity by inter-enzyme allostery, whereas binding of Trp to CgDS strongly activates CgCM. Mechanistic insights were gained from crystal structures of the CgCM homodimer, tetrameric CgDS, and the heterooctameric CgCM-CgDS complex, refined to 1.1, 2.5, and 2.2 Å resolution, respectively. Structural details from the allosteric binding sites reveal that DAHP synthase is recruited as the dominant regulatory platform to control the shikimate pathway, similar to the corresponding enzyme complex from Mycobacterium tuberculosis.
Affiliation(s)
| | | | - Joel B Heim
- Department of Chemistry, University of Oslo , NO-0315 Oslo, Norway
| | | | | | - Peter Kast
- Laboratory of Organic Chemistry, ETH Zurich , CH-8093 Zurich, Switzerland
| | - Ute Krengel
- Department of Chemistry, University of Oslo , NO-0315 Oslo, Norway
|
22
|
Britto-Borges T, Barton GJ. A study of the structural properties of sites modified by the O-linked β-N-acetylglucosamine transferase. PLoS One 2017; 12:e0184405. [PMID: 28886091 PMCID: PMC5590929 DOI: 10.1371/journal.pone.0184405] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/21/2017] [Accepted: 08/23/2017] [Indexed: 01/17/2023] Open
Abstract
Protein O-GlcNAcylation (O-GlcNAc) is an essential post-translational modification (PTM) in higher eukaryotes. The O-linked β-N-acetylglucosamine transferase (OGT) targets specific serines and threonines (S/T) in intracellular proteins. However, unlike phosphorylation, fewer than 25% of known O-GlcNAc sites match a clear sequence pattern. Accordingly, the three-dimensional structures of O-GlcNAc sites were characterised to investigate the role of structure in molecular recognition. From 1,584 O-GlcNAc sites in 620 proteins, 143 were mapped to protein structures determined by X-ray crystallography. The modified S/T were 1.7 times more likely to be annotated in the REM465 field, which defines missing residues in a protein structure, while 7 O-GlcNAc sites were solvent inaccessible and unlikely to be targeted by OGT. A total of 132 sites with complete backbone atoms clustered into 10 groups, but these were indistinguishable from clusters from unmodified S/T. This suggests there is no prevalent three-dimensional motif for OGT recognition. Predicted features from the 620 proteins were compared to unmodified S/T in O-GlcNAcylated proteins and globular proteins. The Jpred4 predicted secondary structure shows that modified S/T were more likely to be coils. Five of six methods to predict intrinsic disorder indicated O-GlcNAcylated S/T to be significantly more disordered than unmodified S/T. Although the analysis did not find a pattern in the site three-dimensional structure, it revealed that the residues around the modification site are likely to be disordered and suggests a potential role of secondary structure elements in OGT site recognition.
Affiliation(s)
- Thiago Britto-Borges
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
| | - Geoffrey J. Barton
- Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
|
23
|
Dimitriou PS, Denesyuk A, Takahashi S, Yamashita S, Johnson MS, Nakayama T, Denessiouk K. Alpha/beta-hydrolases: A unique structural motif coordinates catalytic acid residue in 40 protein fold families. Proteins 2017. [DOI: 10.1002/prot.25338] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 11/10/2022]
Affiliation(s)
- Polytimi S. Dimitriou
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering; Åbo Akademi University; Turku 20520 Finland
| | - Alexander Denesyuk
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering; Åbo Akademi University; Turku 20520 Finland
- Institute for Biological Instrumentation of the Russian Academy of Sciences; Pushchino 142290 Russia
| | - Seiji Takahashi
- Department of Biomolecular Engineering, Graduate School of Engineering; Tohoku University; Sendai Miyagi 980-8579 Japan
| | - Satoshi Yamashita
- Division of Material Chemistry, Graduate School of Natural Science and Technology; Kanazawa University; Kanazawa Ishikawa 920-1192 Japan
| | - Mark S. Johnson
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering; Åbo Akademi University; Turku 20520 Finland
| | - Toru Nakayama
- Department of Biomolecular Engineering, Graduate School of Engineering; Tohoku University; Sendai Miyagi 980-8579 Japan
| | - Konstantin Denessiouk
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering; Åbo Akademi University; Turku 20520 Finland
|
24
|
Lobo SAL, Videira MAM, Pacheco I, Wass MN, Warren MJ, Teixeira M, Matias PM, Romão CV, Saraiva LM. Desulfovibrio vulgaris CbiKP cobaltochelatase: evolution of a haem binding protein orchestrated by the incorporation of two histidine residues. Environ Microbiol 2016; 19:106-118. [PMID: 27486032 DOI: 10.1111/1462-2920.13479] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/24/2016] [Accepted: 07/27/2016] [Indexed: 11/26/2022]
Abstract
The sulfate-reducing bacteria of the Desulfovibrio genus make three distinct modified tetrapyrroles, haem, sirohaem and adenosylcobamide, where sirohydrochlorin acts as the last common biosynthetic intermediate along the branched tetrapyrrole pathway. Intriguingly, D. vulgaris encodes two sirohydrochlorin chelatases, CbiKP and CbiKC, that insert cobalt/iron into the tetrapyrrole macrocycle but are thought to be distinctly located in the periplasm and cytoplasm, respectively. Fusing GFP onto the C-terminus of CbiKP confirmed that the protein is transported to the periplasm. The structure-function relationship of CbiKP was studied by constructing eleven site-directed mutants and determining their chelatase activities, oligomeric status and haem binding abilities. Residues His154 and His216 were identified as essential for metal-chelation of sirohydrochlorin. The tetrameric form of the protein is stabilized by Arg54 and Glu76, which form hydrogen bonds between two subunits. His96 is responsible for the binding of two haem groups within the main central cavity of the tetramer. Unexpectedly, CbiKP is shown to bind two additional haem groups through interaction with His103. Thus, although CbiKP still retains cobaltochelatase activity, the presence of His96 and His103, which are absent from all other known bacterial cobaltochelatases, has given it a new function as a haem binding protein, permitting it to act as a potential haem chaperone or transporter.
Affiliation(s)
- Susana A L Lobo
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
| | - Marco A M Videira
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
| | - Isabel Pacheco
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
| | - Mark N Wass
- School of Biosciences, University of Kent, Giles Lane, Canterbury, Kent, CT2 7NJ, UK
| | - Martin J Warren
- School of Biosciences, University of Kent, Giles Lane, Canterbury, Kent, CT2 7NJ, UK
| | - Miguel Teixeira
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
| | - Pedro M Matias
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal; iBET, Instituto de Biologia Experimental e Tecnológica, Apartado 12, Oeiras, 2781-901, Portugal
| | - Célia V Romão
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
| | - Lígia M Saraiva
- Instituto de Tecnologia Química e Biológica NOVA, Avenida da República (EAN), Oeiras, 2780-157, Portugal
|
25
|
Céol A, Verhoef LGGC, Wade M, Muller H. Genome and network visualization facilitates the analyses of the effects of drugs and mutations on protein-protein and drug-protein networks. BMC Bioinformatics 2016; 17 Suppl 4:54. [PMID: 26961139 PMCID: PMC4896239 DOI: 10.1186/s12859-016-0908-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND: Biologists generally interrogate genomics data using web-based genome browsers that have limited analytical potential. New generation genome browsers such as the Integrated Genome Browser (IGB) have largely overcome this limitation and permit customized analyses to be implemented using plugins. We illustrate the use of a plugin for IGB that exploits advanced visualization techniques to integrate the analysis of genomics data with network and structural approaches. RESULTS: We show how visualization technologies that combine both genomics and network biology can facilitate the selection of the key amino acid contacts from protein-protein and protein-drug interactions. Starting from the MDM2-P53 interaction, which is a high-value target for cancer therapy, and Nutlin, the parent small molecule of an MDM2 antagonist that is currently in clinical trials, we show that this method can be generalized to analyze how drugs and mutations can interfere with both protein-protein and drug-protein networks. We illustrate this point by two additional use-cases exploring the molecular basis of tamoxifen side effects and of drug resistance in chronic myeloid leukemia patients. CONCLUSIONS: Combined network and structure biology approaches provide key insights into both the genetic and the edgetic roles of variants in diseases. 3D interactomes facilitate the identification of disease-relevant interactions that can then be specifically targeted by drugs. Recent advances in molecular interaction and structure visualization tools have greatly simplified the mapping of mutated residues to molecular interaction interfaces. Such approaches can now also be integrated with genome visualization tools to enable comparative analyses of interaction contacts.
Affiliation(s)
- Arnaud Céol
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), Via Adamello 16, Milan, I-20139, Italy.
- Lisette G G C Verhoef
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), Via Adamello 16, Milan, I-20139, Italy.
- Mark Wade
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), Via Adamello 16, Milan, I-20139, Italy.
- Heiko Muller
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), Via Adamello 16, Milan, I-20139, Italy.
|
26
|
Abstract
Summary: Prioritization of candidate genes emanating from large-scale screens requires integrated analyses at the genomics, molecular, network and structural biology levels. We have extended the Integrated Genome Browser (IGB) to facilitate these tasks. The graphical user interface greatly simplifies building disease networks and zooming in at atomic resolution to identify variations in molecular complexes that may affect molecular interactions in the context of genomic data. All results are summarized in genome tracks and can be visualized and analyzed at the transcript level. Availability and implementation: The MI Bundle is a plugin for the IGB. The plugin, help, video and tutorial are available at http://cru.genomics.iit.it/igbmibundle/ and https://github.com/CRUiit/igb-mi-bundle/wiki. The source code is released under the Apache License, Version 2. Contact: arnaud.ceol@iit.it Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Arnaud Céol
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), 20139 Milan, Italy
- Heiko Müller
- Center for Genomic Science of IIT@SEMM, Fondazione Istituto Italiano di Tecnologia (IIT), 20139 Milan, Italy
|
27
|
Taberman H, Andberg M, Parkkinen T, Jänis J, Penttilä M, Hakulinen N, Koivula A, Rouvinen J. Structure and function of a decarboxylating Agrobacterium tumefaciens keto-deoxy-d-galactarate dehydratase. Biochemistry 2014; 53:8052-60. [PMID: 25454257 DOI: 10.1021/bi501290k] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 02/05/2023]
Abstract
Agrobacterium tumefaciens (At) strain C58 contains an oxidative enzyme pathway that can function on both d-glucuronic and d-galacturonic acid. The corresponding gene coding for At keto-deoxy-d-galactarate (KDG) dehydratase is located in the same gene cluster as those coding for uronate dehydrogenase (At Udh) and galactarolactone cycloisomerase (At Gci) which we have previously characterized. Here, we present the kinetic characterization and crystal structure of At KDG dehydratase, which catalyzes the next step, the decarboxylating hydrolyase reaction of KDG to produce α-ketoglutaric semialdehyde (α-KGSA) and carbon dioxide. The crystal structures of At KDG dehydratase and its complexes with pyruvate and 2-oxoadipic acid, two substrate analogues, were determined to 1.7 Å, 1.5 Å, and 2.1 Å resolution, respectively. Furthermore, mass spectrometry was used to confirm reaction end-products. The results lead us to propose a structure-based mechanism for At KDG dehydratase, suggesting that while the enzyme belongs to the Class I aldolase protein family, it does not follow a typical retro-aldol condensation mechanism.
Affiliation(s)
- Helena Taberman
- Department of Chemistry, University of Eastern Finland, FI-80101 Joensuu, Finland
|
28
|
Valentini E, Kikhney AG, Previtali G, Jeffries CM, Svergun DI. SASBDB, a repository for biological small-angle scattering data. Nucleic Acids Res 2014; 43:D357-63. [PMID: 25352555 PMCID: PMC4383894 DOI: 10.1093/nar/gku1047] [Citation(s) in RCA: 236] [Impact Index Per Article: 23.6] [Indexed: 12/31/2022] Open
Abstract
Small-angle X-ray and neutron scattering (SAXS and SANS) are fundamental tools used to study the global shapes of proteins, nucleic acids, macromolecular complexes and assemblies in solution. Due to recent advances in instrumentation and computational methods, the quantity of experimental scattering data and subsequent publications is increasing dramatically. The need for a global repository allowing investigators to locate and access experimental scattering data and associated models was recently emphasized by the wwPDB small-angle scattering task force (SAStf). The small-angle scattering biological data bank (SASBDB) www.sasbdb.org has been designed in accordance with the plans of the SAStf as part of a future federated system of databases for biological SAXS and SANS. SASBDB is a comprehensive repository of freely accessible and fully searchable SAS experimental data and models that are deposited together with the relevant experimental conditions, sample details and instrument characteristics. At present the quality of deposited experimental data and the accuracy of models are manually curated, with future plans to integrate automated systems as the database expands.
Affiliation(s)
- Erica Valentini
- European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany
- Alexey G Kikhney
- European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany
- Gianpietro Previtali
- European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany
- Cy M Jeffries
- European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany
- Dmitri I Svergun
- European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, Geb. 25a, 22603 Hamburg, Germany
|
29
|
Borges PT, Frazão C, Miranda CS, Carrondo MA, Romão CV. Structure of the monofunctional heme catalase DR1998 from Deinococcus radiodurans. FEBS J 2014; 281:4138-50. [PMID: 24975828 DOI: 10.1111/febs.12895] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/01/2014] [Revised: 06/12/2014] [Accepted: 06/24/2014] [Indexed: 11/30/2022]
Abstract
Deinococcus radiodurans is an aerobic organism with the ability to survive under conditions of high radiation doses or desiccation. As part of its protection system against oxidative stress, this bacterium encodes three monofunctional catalases. The DR1998 catalase belongs to clade 1, and is present at high levels under normal growth conditions. The crystals of DR1998 diffracted very weakly, and the merged diffraction data showed an Rsym of 0.308. Its crystal structure was determined and refined to 2.6 Å. The four molecules present in the asymmetric unit form, by crystallographic symmetry, two homotetramers with 222 point-group symmetry. The overall structure of DR1998 is similar to that of other monofunctional catalases, showing higher structural homology with the catalase structures of clade 1. Each monomer shows the typical catalase fold, and contains one heme b in the active site. The heme is coordinated by the proximal ligand Tyr369, and on the heme distal side the essential His81 and Asn159 are hydrogen-bonded to a water molecule. A 25-Å-long channel is the main channel connecting the active site to the external surface. This channel starts with a hydrophobic region from the catalytic heme site, which is followed by a hydrophilic region that begins on Asp139 and expands up to the protein surface. Apart from this channel, an alternative channel, also near the heme active site, is presented and discussed. DATABASE: Coordinates and structure factors have been deposited in the Protein Data Bank in Europe under accession code 4CAB.
Affiliation(s)
- Patrícia T Borges
- Instituto de Tecnologia Química e Biológica, Universidade Nova de Lisboa, Oeiras, Portugal
|
30
|
Jung S, Main D. Genomics and bioinformatics resources for translational science in Rosaceae. Plant Biotechnology Reports 2014; 8:49-64. [PMID: 24634697 PMCID: PMC3951882 DOI: 10.1007/s11816-013-0282-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/11/2013] [Accepted: 04/22/2013] [Indexed: 05/22/2023]
Abstract
Recent technological advances in biology promise unprecedented opportunities for rapid and sustainable advancement of crop quality. Following this trend, the Rosaceae research community continues to generate large amounts of genomic, genetic and breeding data. These include annotated whole genome sequences, transcriptome and expression data, proteomic and metabolomic data, genotypic and phenotypic data, and genetic and physical maps. Analysis, storage, integration and dissemination of these data using bioinformatics tools and databases are essential to provide utility of the data for basic, translational and applied research. This review discusses the currently available genomics and bioinformatics resources for the Rosaceae family.
Affiliation(s)
- Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA 99164 USA
- Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA 99164 USA
|
31
|
Gutmanas A, Alhroub Y, Battle GM, Berrisford JM, Bochet E, Conroy MJ, Dana JM, Fernandez Montecelo MA, van Ginkel G, Gore SP, Haslam P, Hatherley R, Hendrickx PMS, Hirshberg M, Lagerstedt I, Mir S, Mukhopadhyay A, Oldfield TJ, Patwardhan A, Rinaldi L, Sahni G, Sanz-García E, Sen S, Slowley RA, Velankar S, Wainwright ME, Kleywegt GJ. PDBe: Protein Data Bank in Europe. Nucleic Acids Res 2013; 42:D285-91. [PMID: 24288376 PMCID: PMC3965016 DOI: 10.1093/nar/gkt1180] [Citation(s) in RCA: 105] [Impact Index Per Article: 9.5] [Indexed: 12/14/2022] Open
Abstract
The Protein Data Bank in Europe (pdbe.org) is a founding member of the Worldwide PDB consortium (wwPDB; wwpdb.org) and as such is actively engaged in the deposition, annotation, remediation and dissemination of macromolecular structure data through the single global archive for such data, the PDB. Similarly, PDBe is a member of the EMDataBank organisation (emdatabank.org), which manages the EMDB archive for electron microscopy data. PDBe also develops tools that help the biomedical science community to make effective use of the data in the PDB and EMDB for their research. Here we describe new or improved services, including updated SIFTS mappings to other bioinformatics resources, a new browser for the PDB archive based on Gene Ontology (GO) annotation, updates to the analysis of Nuclear Magnetic Resonance-derived structures, redesigned search and browse interfaces, and new or updated visualisation and validation tools for EMDB entries.
Affiliation(s)
- Aleksandras Gutmanas
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
32
|
Analyzing methods for path mining with applications in metabolomics. Gene 2013; 534:125-38. [PMID: 24230973 DOI: 10.1016/j.gene.2013.10.056] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 08/20/2013] [Revised: 10/23/2013] [Accepted: 10/25/2013] [Indexed: 11/22/2022]
Abstract
Metabolomics is one of the key approaches of systems biology that consists of studying biochemical networks having a set of metabolites, enzymes, reactions and their interactions. As biological networks are very complex in nature, proper techniques and models need to be chosen for their better understanding and interpretation. One of the useful strategies in this regard is using path mining strategies and graph-theoretical approaches that help in building hypothetical models and perform quantitative analysis. Furthermore, they also contribute to analyzing topological parameters in metabolome networks. Path mining techniques can be based on grammars, keys, patterns and indexing. Moreover, they can also be used for modeling metabolome networks, finding structural similarities between metabolites, in-silico metabolic engineering, shortest path estimation and for various graph-based analysis. In this manuscript, we have highlighted some core and applied areas of path-mining for modeling and analysis of metabolic networks.
|
33
|
Lagerstedt I, Moore WJ, Patwardhan A, Sanz-García E, Best C, Swedlow JR, Kleywegt GJ. Web-based visualisation and analysis of 3D electron-microscopy data from EMDB and PDB. J Struct Biol 2013; 184:173-81. [PMID: 24113529 PMCID: PMC3898923 DOI: 10.1016/j.jsb.2013.09.021] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 07/18/2013] [Revised: 09/24/2013] [Accepted: 09/25/2013] [Indexed: 11/25/2022]
Abstract
The Protein Data Bank in Europe (PDBe) has developed web-based tools for the visualisation and analysis of 3D electron microscopy (3DEM) structures in the Electron Microscopy Data Bank (EMDB) and Protein Data Bank (PDB). The tools include: (1) a volume viewer for 3D visualisation of maps, tomograms and models, (2) a slice viewer for inspecting 2D slices of tomographic reconstructions, and (3) visual analysis pages to facilitate analysis and validation of maps, tomograms and models. These tools were designed to help non-experts and experts alike to get some insight into the content and assess the quality of 3DEM structures in EMDB and PDB without the need to install specialised software or to download large amounts of data from these archives. The technical challenges encountered in developing these tools, as well as the more general considerations when making archived data available to the user community through a web interface, are discussed.
Affiliation(s)
- Ingvar Lagerstedt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, United Kingdom
- William J. Moore
- Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, United Kingdom
- Ardan Patwardhan
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, United Kingdom
- Eduardo Sanz-García
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, United Kingdom
- Christoph Best
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, United Kingdom
- Jason R. Swedlow
- Centre for Gene Regulation and Expression, College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, United Kingdom
- Gerard J. Kleywegt
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SD, United Kingdom
|
34
|
Gutmanas A, Oldfield TJ, Patwardhan A, Sen S, Velankar S, Kleywegt GJ. The role of structural bioinformatics resources in the era of integrative structural biology. Acta Crystallographica. Section D, Biological Crystallography 2013; 69:710-21. [PMID: 23633580 PMCID: PMC3640467 DOI: 10.1107/s0907444913001157] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/06/2012] [Accepted: 01/11/2013] [Indexed: 11/10/2022]
Abstract
The history and the current state of the PDB and EMDB archives are briefly described, as well as some of the challenges that they face. It seems natural that the role of structural biology archives will change from being a pure repository of historic data into becoming an indispensable resource for the wider biomedical community. As part of this transformation, it will be necessary to validate the biomacromolecular structure data and ensure the highest possible quality for the archive holdings, to combine structural data from different spatial scales into a unified resource and to integrate structural data with functional, genetic and taxonomic data as well as other information available in bioinformatics resources. Some recent developments and plans to address these challenges at PDBe are presented.
Affiliation(s)
- Aleksandras Gutmanas
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Thomas J. Oldfield
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Ardan Patwardhan
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Sanchayita Sen
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Sameer Velankar
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Gerard J. Kleywegt
- Protein Data Bank in Europe, EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
|
35
|
Thirup S, Gupta V, Quistgaard EM. Up, down, and around: identifying recurrent interactions within and between super-secondary structures in β-propellers. Methods Mol Biol 2013; 932:35-50. [PMID: 22987345 DOI: 10.1007/978-1-62703-065-6_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 02/13/2023]
Abstract
The examination and analysis of super-secondary structures or other specific structural patterns associated with particular functions, sequence motifs, or structural contexts require that the set of structures examined shares the same feature. This seems obvious but in practice this may often present problems such as identifying complete sets of data avoiding false positives. In the case of the β-propeller structures the symmetry of the propeller creates problems for many structure similarity search programs. Here we present a procedure that will identify propeller structures in PDB and assign them to the different N-propeller types. In addition we outline methods to examine similarities and differences within and between the four stranded up-and-down blades of the β-propeller.
Affiliation(s)
- Søren Thirup
- Department of Molecular Biology, MIND Centre, CARB Centre, Aarhus University, Aarhus, Denmark.
|
36
|
Velankar S, Dana JM, Jacobsen J, van Ginkel G, Gane PJ, Luo J, Oldfield TJ, O'Donovan C, Martin MJ, Kleywegt GJ. SIFTS: Structure Integration with Function, Taxonomy and Sequences resource. Nucleic Acids Res 2012. [PMID: 23203869 PMCID: PMC3531078 DOI: 10.1093/nar/gks1258] [Citation(s) in RCA: 174] [Impact Index Per Article: 14.5] [Indexed: 12/04/2022] Open
Abstract
The Structure Integration with Function, Taxonomy and Sequences resource (SIFTS; http://pdbe.org/sifts) is a close collaboration between the Protein Data Bank in Europe (PDBe) and UniProt. The two teams have developed a semi-automated process for maintaining up-to-date cross-reference information to UniProt entries, for all protein chains in the PDB entries present in the UniProt database. This process is carried out for every weekly PDB release and the information is stored in the SIFTS database. The SIFTS process includes cross-references to other biological resources such as Pfam, SCOP, CATH, GO, InterPro and the NCBI taxonomy database. The information is exported in XML format, one file for each PDB entry, and is made available by FTP. Many bioinformatics resources use SIFTS data to obtain cross-references between the PDB and other biological databases so as to provide their users with up-to-date information.
Affiliation(s)
- Sameer Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
|
37
|
Fang H, Gough J. DcGO: database of domain-centric ontologies on functions, phenotypes, diseases and more. Nucleic Acids Res 2012; 41:D536-44. [PMID: 23161684 PMCID: PMC3531119 DOI: 10.1093/nar/gks1080] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Indexed: 11/26/2022] Open
Abstract
We present ‘dcGO’ (http://supfam.org/SUPERFAMILY/dcGO), a comprehensive ontology database for protein domains. Domains are often the functional units of proteins, thus instead of associating ontological terms only with full-length proteins, it sometimes makes more sense to associate terms with individual domains. Domain-centric GO, ‘dcGO’, provides associations between ontological terms and protein domains at the superfamily and family levels. Some functional units consist of more than one domain acting together or acting at an interface between domains; therefore, ontological terms associated with pairs of domains, triplets and longer supra-domains are also provided. At the time of writing the ontologies in dcGO include the Gene Ontology (GO); Enzyme Commission (EC) numbers; pathways from UniPathway; human phenotype ontology and phenotype ontologies from five model organisms, including plants; anatomy ontologies from three organisms; human disease ontology and drugs from DrugBank. All ontological terms have probabilistic scores for their associations. In addition to associations to domains and supra-domains, the ontological terms have been transferred to proteins, through homology, providing annotations of >80 million sequences covering 2414 complete genomes, hundreds of meta-genomes, thousands of viruses and so forth. The dcGO database is updated fortnightly, and its website provides downloads, search, browse, phylogenetic context and other data-mining facilities.
Affiliation(s)
- Hai Fang
- Department of Computer Science, University of Bristol, The Merchant Venturers Building, Bristol BS8 1UB, UK.
|
38
|
Sousa da Silva AW, Vranken WF. ACPYPE - AnteChamber PYthon Parser interfacE. BMC Res Notes 2012; 5:367. [PMID: 22824207 PMCID: PMC3461484 DOI: 10.1186/1756-0500-5-367] [Citation(s) in RCA: 1690] [Impact Index Per Article: 140.8] [Received: 03/01/2012] [Accepted: 06/26/2012] [Indexed: 11/25/2022] Open
Abstract
Background ACPYPE (or AnteChamber PYthon Parser interfacE) is a wrapper script around the ANTECHAMBER software that simplifies the generation of small molecule topologies and parameters for a variety of molecular dynamics programmes like GROMACS, CHARMM and CNS. It is written in the Python programming language and was developed as a tool for interfacing with other Python based applications such as the CCPN software suite (for NMR data analysis) and ARIA (for structure calculations from NMR data). ACPYPE is open source code, under GNU GPL v3, and is available as a stand-alone application at http://www.ccpn.ac.uk/acpype and as a web portal application at http://webapps.ccpn.ac.uk/acpype. Findings We verified the topologies generated by ACPYPE in three ways: by comparing with default AMBER topologies for standard amino acids; by generating and verifying topologies for a large set of ligands from the PDB; and by recalculating the structures for 5 protein–ligand complexes from the PDB. Conclusions ACPYPE is a tool that simplifies the automatic generation of topology and parameters in different formats for different molecular mechanics programmes, including calculation of partial charges, while being object oriented for integration with other applications.
Affiliation(s)
- Alan W Sousa da Silva
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK.
|
39
|
Julfayev ES, McLaughlin RJ, Tao YP, McLaughlin WA. KB-Rank: efficient protein structure and functional annotation identification via text query. Journal of Structural and Functional Genomics 2012; 13:101-10. [PMID: 22270457 PMCID: PMC3375009 DOI: 10.1007/s10969-012-9125-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/28/2011] [Accepted: 01/07/2012] [Indexed: 12/12/2022]
Abstract
The KB-Rank tool was developed to help determine the functions of proteins. A user provides a text query, and protein structures are retrieved together with their functional annotation categories. Structures and annotation categories are ranked according to their estimated relevance to the queried text. The algorithm for ranking first retrieves matches between the query text and the text fields associated with the structures. The structures are next ordered by their relative content of annotations that are found to be prevalent across all the structures retrieved. An interactive web interface was implemented to navigate and interpret the relevance of the structures and annotation categories retrieved by a given search. The aim of the KB-Rank tool is to provide a means to quickly identify protein structures of interest and the annotations most relevant to the queries posed by a user. Informational and navigational searches regarding disease topics are described to illustrate the tool's utilities. The tool is available at the URL http://protein.tcmedc.org/KB-Rank.
Affiliation(s)
- Elchin S. Julfayev
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Ryan J. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Yi-Ping Tao
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854-8087 USA
- William A. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
|
40
|
Gore S, Velankar S, Kleywegt GJ. Implementing an X-ray validation pipeline for the Protein Data Bank. Acta Crystallographica. Section D, Biological Crystallography 2012; 68:478-83. [PMID: 22505268 PMCID: PMC3322607 DOI: 10.1107/s0907444911050359] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.8] [Received: 09/06/2011] [Accepted: 11/23/2011] [Indexed: 11/10/2022]
Abstract
There is an increasing realisation that the quality of the biomacromolecular structures deposited in the Protein Data Bank (PDB) archive needs to be assessed critically using established and powerful validation methods. The Worldwide Protein Data Bank (wwPDB) organization has convened several Validation Task Forces (VTFs) to advise on the methods and standards that should be used to validate all of the entries already in the PDB as well as all structures that will be deposited in the future. The recommendations of the X-ray VTF are currently being implemented in a software pipeline. Here, ongoing work on this pipeline is briefly described as well as ways in which validation-related information could be presented to users of structural data.
Affiliation(s)
- Swanand Gore
- Protein Data Bank in Europe (PDBe), EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Sameer Velankar
- Protein Data Bank in Europe (PDBe), EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
- Gerard J. Kleywegt
- Protein Data Bank in Europe (PDBe), EMBL–EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, England
|
41
|
Koestler T, von Haeseler A, Ebersberger I. REvolver: modeling sequence evolution under domain constraints. Mol Biol Evol 2012; 29:2133-45. [PMID: 22383532 DOI: 10.1093/molbev/mss078] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022] Open
Abstract
Simulating the change of protein sequences over time in a biologically realistic way is fundamental for a broad range of studies with a focus on evolution. It is, thus, problematic that typically simulators evolve individual sites of a sequence identically and independently. More realistic simulations are possible; however, they are often prohibited by limited knowledge concerning site-specific evolutionary constraints or functional dependencies between amino acids. As a consequence, a protein's functional and structural characteristics are rapidly lost in the course of simulated evolution. Here, we present REvolver (www.cibiv.at/software/revolver), a program that simulates protein sequence alteration such that evolutionarily stable sequence characteristics, like functional domains, are maintained. For this purpose, REvolver recruits profile hidden Markov models (pHMMs) for parameterizing site-specific models of sequence evolution in an automated fashion. pHMMs derived from alignments of homologous proteins or protein domains capture information regarding which sequence sites remained conserved over time and where in a sequence insertions or deletions are more likely to occur. Thus, they describe constraints on the evolutionary process acting on these sequences. To demonstrate the performance of REvolver as well as its applicability in large-scale simulation studies, we evolved the entire human proteome up to 1.5 expected substitutions per site. Simultaneously, we analyzed the preservation of Pfam and SMART domains in the simulated sequences over time. REvolver preserved 92% of the Pfam domains originally present in the human sequences. This value drops to 15% when traditional models of amino acid sequence evolution are used. Thus, REvolver represents a significant advance toward a realistic simulation of protein sequence evolution on a proteome-wide scale. Further, REvolver facilitates the simulation of a protein family with a user-defined domain architecture at the root.
|
42
|
Doreleijers JF, Vranken WF, Schulte C, Markley JL, Ulrich EL, Vriend G, Vuister GW. NRG-CING: integrated validation reports of remediated experimental biomolecular NMR data and coordinates in wwPDB. Nucleic Acids Res 2011; 40:D519-24. [PMID: 22139937 PMCID: PMC3245154 DOI: 10.1093/nar/gkr1134] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 01/25/2023] Open
Abstract
For many macromolecular NMR ensembles from the Protein Data Bank (PDB) the experiment-based restraint lists are available, while other experimental data, mainly chemical shift values, are often available from the BioMagResBank. The accuracy and precision of the coordinates in these macromolecular NMR ensembles can be improved by recalculation using the available experimental data and present-day software. Such efforts, however, generally fail on half of all NMR ensembles due to the syntactic and semantic heterogeneity of the underlying data and the wide variety of formats used for their deposition. We have combined the remediated restraint information from our NMR Restraints Grid (NRG) database with available chemical shifts from the BioMagResBank and the Common Interface for NMR structure Generation (CING) structure validation reports into the weekly updated NRG-CING database (http://nmr.cmbi.ru.nl/NRG-CING). Eleven programs have been included in the NRG-CING production pipeline to arrive at validation reports that list for each entry the potential inconsistencies between the coordinates and the available experimental NMR data. The longitudinal validation of these data in a publicly available relational database yields a set of indicators that can be used to judge the quality of every macromolecular structure solved with NMR. The remediated NMR experimental data sets and validation reports are freely available online.
Affiliation(s)
- Jurgen F Doreleijers
- IMM, Radboud University Nijmegen, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands.
|
43
|
Velankar S, Alhroub Y, Best C, Caboche S, Conroy MJ, Dana JM, Fernandez Montecelo MA, van Ginkel G, Golovin A, Gore SP, Gutmanas A, Haslam P, Hendrickx PMS, Heuson E, Hirshberg M, John M, Lagerstedt I, Mir S, Newman LE, Oldfield TJ, Patwardhan A, Rinaldi L, Sahni G, Sanz-García E, Sen S, Slowley R, Suarez-Uruena A, Swaminathan GJ, Symmons MF, Vranken WF, Wainwright M, Kleywegt GJ. PDBe: Protein Data Bank in Europe. Nucleic Acids Res 2011; 40:D445-52. [PMID: 22110033 PMCID: PMC3245096 DOI: 10.1093/nar/gkr998] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Indexed: 12/25/2022] Open
Abstract
The Protein Data Bank in Europe (PDBe; pdbe.org) is a partner in the Worldwide PDB organization (wwPDB; wwpdb.org) and as such actively involved in managing the single global archive of biomacromolecular structure data, the PDB. In addition, PDBe develops tools, services and resources to make structure-related data more accessible to the biomedical community. Here we describe recently developed, extended or improved services, including an animated structure-presentation widget (PDBportfolio), a widget to graphically display the coverage of any UniProt sequence in the PDB (UniPDB), chemistry- and taxonomy-based PDB-archive browsers (PDBeXplore), and a tool for interactive visualization of NMR structures, corresponding experimental data as well as validation and analysis results (Vivaldi).
Affiliation(s)
- S Velankar
- Protein Data Bank in Europe, EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
44
|
Halling-Brown MD, Bulusu KC, Patel M, Tym JE, Al-Lazikani B. canSAR: an integrated cancer public translational research and drug discovery resource. Nucleic Acids Res 2011; 40:D947-56. [PMID: 22013161 PMCID: PMC3245005 DOI: 10.1093/nar/gkr881] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 01/22/2023] Open
Abstract
canSAR is a fully integrated cancer research and drug discovery resource developed to utilize the growing publicly available biological annotation, chemical screening, RNA interference screening, expression, amplification and 3D structural data. Scientists can, in a single place, rapidly identify biological annotation of a target, its structural characterization, expression levels and protein interaction data, as well as suitable cell lines for experiments, potential tool compounds and similarity to known drug targets. canSAR has, from the outset, been completely use-case driven, which has dramatically influenced the design of the back-end and the functionality provided through the interfaces. The Web interface at http://cansar.icr.ac.uk provides flexible, multipoint entry into canSAR. This allows easy access to the multidisciplinary data within, including target and compound synopses, bioactivity views and expert tools for chemogenomic, expression and protein interaction network data.
Affiliation(s)
- Mark D Halling-Brown
- Cancer Research UK Cancer Therapeutics Unit, Institute of Cancer Research, Haddow Laboratories, Belmont, Surrey SM2 5NG, UK
|
45
|
Favia AD, Bottegoni G, Nobeli I, Bisignano P, Cavalli A. SERAPhiC: A Benchmark for in Silico Fragment-Based Drug Design. J Chem Inf Model 2011; 51:2882-96. [DOI: 10.1021/ci2003363] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 01/07/2023]
Affiliation(s)
- Angelo D. Favia
- Department of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
- Giovanni Bottegoni
- Department of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
- Irene Nobeli
- Department of Biological Sciences, Institute of Structural and Molecular Biology, Birkbeck, University of London, Malet Street, WC1E 7HX London, United Kingdom
- Paola Bisignano
- Department of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
- Andrea Cavalli
- Department of Drug Discovery and Development, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
- Dipartimento di Scienze Farmaceutiche, Università di Bologna, via Belmeloro 6, 40126 Bologna, Italy
|
46
|
Kinjo AR, Suzuki H, Yamashita R, Ikegawa Y, Kudou T, Igarashi R, Kengaku Y, Cho H, Standley DM, Nakagawa A, Nakamura H. Protein Data Bank Japan (PDBj): maintaining a structural data archive and resource description framework format. Nucleic Acids Res 2011; 40:D453-60. [PMID: 21976737 PMCID: PMC3245181 DOI: 10.1093/nar/gkr811] [Citation(s) in RCA: 97] [Impact Index Per Article: 7.5] [Indexed: 11/29/2022] Open
Abstract
The Protein Data Bank Japan (PDBj, http://pdbj.org) is a member of the worldwide Protein Data Bank (wwPDB) and accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins, which are summarized in this article. To enhance the interoperability of the PDB data, we have recently developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, along with its ontology in the Web Ontology Language (OWL) based on the PDB mmCIF Exchange Dictionary. Being in the standard format for the Semantic Web, the PDB/RDF data provide a means to integrate the PDB with other biological information resources.
Affiliation(s)
- Akira R Kinjo
- Institute for Protein Research and Immunology Frontier Research Center, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
|
47
|
|
48
|
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 2011; 40:D1100-7. [PMID: 21948594 PMCID: PMC3245175 DOI: 10.1093/nar/gkr777] [Citation(s) in RCA: 2450] [Impact Index Per Article: 188.5] [Indexed: 12/21/2022] Open
Abstract
ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.
Affiliation(s)
- Anna Gaulton
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
49
|
Verschueren E, Vanhee P, van der Sloot AM, Serrano L, Rousseau F, Schymkowitz J. Protein design with fragment databases. Curr Opin Struct Biol 2011; 21:452-9. [PMID: 21684149 DOI: 10.1016/j.sbi.2011.05.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 04/04/2011] [Accepted: 05/25/2011] [Indexed: 11/25/2022]
Abstract
Structure-based computational methods are popular tools for designing proteins and interactions between proteins because they provide the necessary insight and details required for rational engineering. Here, we first argue that large-scale databases of fragments contain a discrete but complete set of building blocks that can be used to design structures. We show that these structural alphabets can be saturated to provide conformational ensembles that sample the native structure space around energetic minima. Second, we show that catalogs of interaction patterns hold the key to overcome the lack of scaffolds when computationally designing protein interactions. Finally, we illustrate the power of database-driven computational protein design methods by recent successful applications and discuss what challenges remain to push this field forward.
Affiliation(s)
- Erik Verschueren
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
|
50
|
Sowmya G, Anita S, Kangueane P. Insights from the structural analysis of protein heterodimer interfaces. Bioinformation 2011; 6:137-43. [PMID: 21572879 PMCID: PMC3092946 DOI: 10.6026/97320630006137] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/15/2011] [Accepted: 05/07/2011] [Indexed: 11/23/2022] Open
Abstract
Protein heterodimer complexes are often involved in catalysis, regulation, assembly, immunity and inhibition. This involves the formation of stable interfaces between the interacting partners. Hence, it is of interest to describe heterodimer interfaces using known structural complexes. We use a non-redundant dataset of 192 heterodimer complex structures from the Protein Data Bank (PDB) to identify interface residues and describe their interfaces using amino acid residue property preferences. Analysis of the dataset shows that heterodimer interfaces are often abundant in polar residues. The analysis also shows the presence of two classes of interfaces in heterodimer complexes. The first class of interfaces (class A), with more polar residues than the core but fewer than the surface, is already known. These interfaces are more hydrophobic than surfaces, where protein-protein binding is largely hydrophobic. The second class of interfaces (class B), with more polar residues than both the core and the surface, is shown here. These interfaces are more polar than surfaces, where binding is mainly polar. Thus, these findings provide insight into protein-protein interactions.
Affiliation(s)
- Gopichandran Sowmya
- Department of Biotechnology, Faculty of Applied Science, AIMST University, 08100 Semeling, Malaysia
- Biomedical Informatics, Pondicherry 607402, India
- Pandjassarame Kangueane
- Department of Biotechnology, Faculty of Applied Science, AIMST University, 08100 Semeling, Malaysia
- Biomedical Informatics, Pondicherry 607402, India
|