1
Zheng Z, Goncearenco A, Berezovsky IN. Back in time to the Gly-rich prototype of the phosphate binding elementary function. Curr Res Struct Biol 2024; 7:100142. [PMID: 38655428 PMCID: PMC11035071 DOI: 10.1016/j.crstbi.2024.100142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 03/31/2024] [Accepted: 04/03/2024] [Indexed: 04/26/2024] Open
Abstract
Binding of nucleotides and their derivatives is one of the most ancient elementary functions, dating back to the Origin of Life. We review here the works considering one of the key elements in the binding of (di)nucleotide-containing ligands: phosphate binding. We start from a brief discussion of the major participants, conditions, and events in prebiotic evolution that resulted in the Origin of Life. Tracing back to the basic functions, including metal and phosphate binding and, potentially, the formation of primitive protein-protein interactions, we focus here on phosphate binding. Critically assessing works on the structural, functional, and evolutionary aspects of phosphate binding, we perform a simple computational experiment reconstructing its most ancient and generic sequence prototype. The profiles of the phosphate binding signatures have been derived in the form of position-specific scoring matrices (PSSMs), their peculiarities depending on the type of ligand have been analyzed, and the evolutionary connections between them have been delineated. Then, the apparent prototype that gave rise to all relevant phosphate-binding signatures has also been reconstructed. We show that the two major signatures of phosphate binding that discriminate between the binding of dinucleotide- and nucleotide-containing ligands are GxGxxG and GxxGxG, respectively. It appears that the signature archetypal for dinucleotide-containing ligands is more generic, and it can frequently bind phosphate groups in nucleotide-containing ligands as well. The reconstructed prototype's key signature, GxGGxG, underscores the role of glycine residues in providing the flexibility and interactions necessary for binding the phosphate groups. The prototype also contains other ancient amino acids, valine and alanine, showing versatility towards evolutionary design and functional diversification.
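The three signatures named in this abstract (GxGxxG, GxxGxG, and the prototype GxGGxG) can be illustrated with a plain pattern scan. The following is a minimal sketch, not the authors' PSSM-based method: it treats each `x` as "any residue" and uses regular expressions, and the function names `signature_to_regex` and `scan` as well as the example sequences are hypothetical.

```python
import re

# Signatures quoted in the abstract; 'x' stands for any residue.
# The labels are shorthand for this sketch only.
SIGNATURES = {
    "GxGxxG": "dinucleotide-type",
    "GxxGxG": "nucleotide-type",
    "GxGGxG": "reconstructed prototype",
}

def signature_to_regex(signature):
    """Turn a signature such as 'GxGxxG' into a compiled regex.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    body = "".join("." if ch == "x" else ch for ch in signature)
    return re.compile("(?=(" + body + "))")

def scan(sequence):
    """Return sorted (1-based position, signature, matched segment) tuples."""
    hits = []
    seq = sequence.upper()
    for signature in SIGNATURES:
        for match in signature_to_regex(signature).finditer(seq):
            hits.append((match.start() + 1, signature, match.group(1)))
    return sorted(hits)

if __name__ == "__main__":
    # Hypothetical sequences, chosen only to exercise the patterns.
    for seq in ("MKVAVIGAGSWGTALA", "GAGGAG"):
        print(seq, scan(seq))
```

Note that a segment such as `GAGGAG` satisfies all three patterns at once, which is consistent with the abstract's description of GxGGxG as a prototype for both signatures. A literal scan like this ignores the position-specific residue preferences that a PSSM would capture.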
Affiliation(s)
- Zejun Zheng
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
- Igor N. Berezovsky
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
  - Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, 117579, Singapore
2
Schaeffer RD, Zhang J, Medvedev KE, Kinch LN, Cong Q, Grishin NV. ECOD domain classification of 48 whole proteomes from AlphaFold Structure Database using DPAM2. PLoS Comput Biol 2024; 20:e1011586. [PMID: 38416793 PMCID: PMC10927120 DOI: 10.1371/journal.pcbi.1011586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024] Open
Abstract
Protein structure prediction has now been deployed widely across several different large protein sets. Large-scale domain annotation of these predictions can aid in the development of biological insights. Using our Evolutionary Classification of Protein Domains (ECOD) from experimental structures as a basis for classification, we describe the detection and cataloging of domains from 48 whole proteomes deposited in the AlphaFold Database. On average, we can provide positive classification (either of domains or other identifiable non-domain regions) for 90% of residues in all proteomes. We classified 746,349 domains from 536,808 proteins comprised of over 226,424,000 amino acid residues. We examine the varying populations of homologous groups in both eukaryotes and bacteria. In addition to containing a higher fraction of disordered regions and unassigned domains, eukaryotes show a higher proportion of repeated proteins, both globular and small repeats. We enumerate those highly populated domains that are shared in both eukaryotes and bacteria, such as the Rossmann domains, TIM barrels, and P-loop domains. Additionally, we compare the sampling of homologous groups from this whole proteome set against our stable ECOD reference and discuss groups that have been enriched by structure predictions. Finally, we discuss the implication of these results for protein target selection for future classification strategies for very large protein sets.
Affiliation(s)
- R. Dustin Schaeffer
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Jing Zhang
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Kirill E. Medvedev
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Lisa N. Kinch
  - Department of Molecular Biology, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Qian Cong
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
  - Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Nick V. Grishin
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
  - Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
3
Manriquez-Sandoval E, Fried SD. DomainMapper: Accurate domain structure annotation including those with non-contiguous topologies. Protein Sci 2022; 31:e4465. [PMID: 36208126 PMCID: PMC9601794 DOI: 10.1002/pro.4465] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/12/2022] [Revised: 09/30/2022] [Accepted: 10/03/2022] [Indexed: 11/11/2022]
Abstract
Automated domain annotation is an important tool for structural informatics. These pipelines typically involve searching query sequences against hidden Markov model (HMM) profiles, yielding matches to profiles for various domains. However, domain annotation can be ambiguous or inaccurate when proteins contain domains with non-contiguous residue ranges, and especially when insertional domains are hosted within them. Here, we present DomainMapper, an algorithm that accurately assigns a unique domain structure annotation to a query sequence, including those with complex topologies. We validate our domain assignments using the AlphaFold database and confirm that non-contiguity is pervasive (10.74% of all domains in yeast and 4.52% in human). Using this resource, we find that certain folds have strong propensities to be non-contiguous or insertional across the Tree of Life. DomainMapper is freely available and can be run as a single command-line function.
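To make the terms "non-contiguous" and "insertional" concrete, the sketch below represents a domain as a list of inclusive residue ranges. It is an illustration only and does not reproduce DomainMapper's algorithm or its output format; the function names `is_contiguous` and `hosts`, the domain labels, and the residue numbers are all hypothetical, and the ranges are assumed to be non-overlapping and already merged.

```python
def is_contiguous(ranges):
    """A domain is contiguous if it occupies a single residue range.

    Assumes abutting ranges have already been merged by the caller.
    """
    return len(ranges) == 1

def hosts(outer, inner):
    """Return True if `inner` sits entirely inside a gap of `outer`.

    Both arguments are lists of inclusive (start, end) residue ranges.
    A host domain must be non-contiguous, i.e. have at least one gap.
    """
    outer = sorted(outer)
    inner = sorted(inner)
    if len(outer) < 2:
        return False
    inner_start, inner_end = inner[0][0], inner[-1][1]
    # Walk over each gap between consecutive segments of the outer domain.
    for (_, left_end), (right_start, _) in zip(outer, outer[1:]):
        if left_end < inner_start and inner_end < right_start:
            return True
    return False

if __name__ == "__main__":
    # Hypothetical annotation: domain A is split by an inserted domain B.
    domains = {"A": [(1, 80), (151, 220)], "B": [(81, 150)]}
    for name, ranges in domains.items():
        print(name, "contiguous" if is_contiguous(ranges) else "non-contiguous")
    print("A hosts B:", hosts(domains["A"], domains["B"]))
```

In this toy case domain A is non-contiguous and domain B is insertional with respect to A, which is the kind of topology the abstract says a simple one-range-per-domain annotation would misreport.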
Affiliation(s)
- Stephen D. Fried
  - T. C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
  - Department of Chemistry, Johns Hopkins University, Baltimore, MD, USA
4
Huang L, Chen W, Wei L, Su Y, Liang J, Lian H, Wang H, Long F, Yang F, Gao S, Tan Z, Xu J, Zhao J, Liu Q. Lonafarnib Inhibits Farnesyltransferase via Suppressing ERK Signaling Pathway to Prevent Osteoclastogenesis in Titanium Particle-Induced Osteolysis. Front Pharmacol 2022; 13:848152. [PMID: 35300293 PMCID: PMC8921770 DOI: 10.3389/fphar.2022.848152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 02/10/2022] [Indexed: 11/30/2022] Open
Abstract
Wear debris after total joint arthroplasty can recruit macrophages, which release pro-inflammatory substances that trigger the activation of osteoclasts, thereby leading to periprosthetic osteolysis (PPOL) and aseptic loosening. However, the development of pharmacological strategies targeting osteoclasts to prevent periprosthetic osteolysis has not been fruitful. In this study, we investigated the effects and mechanisms of the farnesyltransferase (FTase) inhibitor Lonafarnib (Lon) on receptor activator of nuclear factor κB (NF-κB) ligand (RANKL)-induced osteoclastogenesis and bone resorption, as well as the impact of Lon on titanium particle-induced osteolysis. To investigate the impact of Lon on bone resorption and osteoclastogenesis in vitro, bone marrow macrophages were incubated and stimulated with RANKL and macrophage colony-stimulating factor (M-CSF). The influence of Lon on osteolysis prevention in vivo was examined using a titanium particle-induced mouse calvarial osteolysis model. The expression of osteoclast-relevant genes was measured by real-time quantitative PCR. Immunofluorescence was used to detect the intracellular localization of nuclear factor of activated T cells 1 (NFATc1). An siRNA silencing assay was applied to examine the influence of FTase on osteoclast activation. Related signaling pathways, including NFATc1 signaling, NF-κB, and mitogen-activated protein kinase pathways, were examined by western blot assay. Lon was shown to suppress bone resorptive function and osteoclastogenesis in vitro, and it also reduced the production of pro-inflammatory substances and prevented titanium particle-induced osteolysis in vivo. Lon decreased the expression of osteoclast-relevant genes and suppressed NFATc1 nuclear translocation and auto-amplification. Mechanistically, Lon dampened FTase, and inhibition of FTase reduced osteoclast formation by suppressing ERK signaling. Lon is a promising treatment option for osteoclast-related osteolysis diseases, including periprosthetic osteolysis, by targeted inhibition of FTase through suppressing ERK signaling.
Affiliation(s)
- Linke Huang
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Department of Orthopaedics, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Weiwei Chen
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Linhua Wei
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
  - The Affiliated Nanning Infectious Disease Hospital of Guangxi Medical University, The Fourth People's Hospital of Nanning, Nanning, China
- Yuangang Su
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Jiamin Liang
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Haoyu Lian
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Hui Wang
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Feng Long
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Fan Yang
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Shiyao Gao
  - Department of Orthopaedics, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China
- Zhen Tan
  - Department of Orthopaedics, The Second Affiliated Hospital of Guangxi Medical University, Nanning, China
- Jiake Xu
  - School of Biomedical Sciences, University of Western Australia, Perth, WA, Australia
- Jinmin Zhao
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
  - Guangxi Key Laboratory of Regenerative Medicine, Guangxi Medical University, Nanning, China
- Qian Liu
  - Research Centre for Regenerative Medicine, Orthopaedic Department, The First Affiliated Hospital of Guangxi Medical University, Nanning, China
5
Gazi AD, Kokkinidis M, Fadouloglou VE. α-Helices in the Type III Secretion Effectors: A Prevalent Feature with Versatile Roles. Int J Mol Sci 2021; 22:ijms22115412. [PMID: 34063760 PMCID: PMC8196651 DOI: 10.3390/ijms22115412] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/20/2021] [Revised: 05/14/2021] [Accepted: 05/17/2021] [Indexed: 12/16/2022] Open
Abstract
Type III Secretion Systems (T3SSs) are multicomponent nanomachines located at the cell envelope of Gram-negative bacteria. Their main function is to transport bacterial proteins either extracellularly or directly into the eukaryotic host cell cytoplasm. Type III Secretion effectors (T3SEs), the last T3S substrates to be secreted, are destined to act in the eukaryotic host cell cytoplasm and occasionally in the nucleus, hijacking cellular processes by mimicking eukaryotic proteins. A broad range of functions is attributed to T3SEs, from the manipulation of the host cell's metabolism for the benefit of the bacterium to bypassing the host's defense mechanisms. To perform this broad range of manipulations, T3SEs have evolved numerous novel folds that are compatible with some basic requirements: they should be able to unfold easily, pass through the narrow T3SS channel, and refold to an active form on the other side. In this review, the various folds of T3SEs are presented, with emphasis placed on the functional and structural importance of α-helices and helical domains.
Affiliation(s)
- Anastasia D. Gazi
  - Unit of Technology & Service Ultrastructural Bio-Imaging (UTechS UBI), Institut Pasteur, 75015 Paris, France
  - Correspondence: (A.D.G.); (V.E.F.)
- Michael Kokkinidis
  - Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Nikolaou Plastira 100, Heraklion, 70013 Crete, Greece
  - Department of Biology, Voutes University Campus, University of Crete, Heraklion, 70013 Crete, Greece
- Vasiliki E. Fadouloglou
  - Department of Molecular Biology & Genetics, Democritus University of Thrace, 68100 Alexandroupolis, Greece
  - Correspondence: (A.D.G.); (V.E.F.)
6
Kolodny R, Nepomnyachiy S, Tawfik DS, Ben-Tal N. Bridging Themes: Short Protein Segments Found in Different Architectures. Mol Biol Evol 2021; 38:2191-2208. [PMID: 33502503 PMCID: PMC8136508 DOI: 10.1093/molbev/msab017] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 12/20/2022] Open
Abstract
The vast majority of theoretically possible polypeptide chains do not fold, let alone confer function. Hence, protein evolution from preexisting building blocks has clear potential advantages over ab initio emergence from random sequences. In support of this view, sequence similarity between different proteins is generally indicative of common ancestry, and we collectively refer to such homologous sequences as "themes." At the domain level, sequence homology is routinely detected. However, short themes that are segments, or fragments, of intact domains are particularly interesting because they may provide hints about the emergence of domains, as opposed to the divergence of preexisting domains or their mixing-and-matching to form multi-domain proteins. Here we identified 525 representative short themes, comprising 20-80 residues, that are unexpectedly shared between domains considered to have emerged independently. Among these "bridging themes" are ones shared between the most ancient domains, for example, Rossmann, P-loop NTPase, TIM-barrel, flavodoxin, and ferredoxin-like. We elaborate on several particularly interesting cases where the bridging themes mediate ligand binding. Ligand binding may have contributed to the stability and the plasticity of these building blocks, and to their ability to invade preexisting domains or serve as starting points for completely new domains.
Affiliation(s)
- Rachel Kolodny
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Dan S Tawfik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Nir Ben-Tal
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
7
Shahnazari M, Alemzadeh A, Zakipour Z, Razi H. Evolution and classification of Na/K ATPase α-subunit in Arthropoda and Nematoda. Gene Reports 2021. [DOI: 10.1016/j.genrep.2020.101015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
8
Bhat AS, Kinch LN, Grishin NV. β-Strand-mediated interactions of protein domains. Proteins 2020; 88:1513-1527. [PMID: 32543729 PMCID: PMC8018532 DOI: 10.1002/prot.25970] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/22/2019] [Revised: 03/10/2020] [Accepted: 06/06/2020] [Indexed: 01/14/2023]
Abstract
Protein domains exist by themselves or in combination with other domains to form complex multidomain proteins. Defining domain boundaries in proteins is essential for understanding their evolution and function but is not trivial. More specifically, partitioning domains that interact by forming a single β-sheet is known to be particularly troublesome for automatic structure-based domain decomposition pipelines. Here, we study edge-to-edge β-strand interactions between domains in a protein chain, to help define the boundaries for some more difficult cases where a single β-sheet spanning over two domains gives an appearance of one. We give a number of examples where β-strands belonging to a single β-sheet do not belong to a single domain and highlight the difficulties of automatic domain parsers on these examples. This work can be used as a baseline for defining domain boundaries in homologous proteins or proteins with similar domain interactions in the future.
Affiliation(s)
- Archana S. Bhat
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050
- Lisa N. Kinch
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050
- Nick V. Grishin
  - Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9050
9
Mylemans B, Noguchi H, Deridder E, Lescrinier E, Tame JRH, Voet ARD. Influence of circular permutations on the structure and stability of a six-fold circular symmetric designer protein. Protein Sci 2020; 29:2375-2386. [PMID: 33006397 DOI: 10.1002/pro.3961] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/05/2020] [Revised: 09/23/2020] [Accepted: 09/26/2020] [Indexed: 11/09/2022]
Abstract
The β-propeller fold is adopted by a sequentially diverse family of repeat proteins with apparent rotational symmetry. While the structure is mostly stabilized by hydrophobic interactions, additional stabilization is provided by hydrogen bonds between the N- and C-termini, which are almost invariably part of the same β-sheet. This feature is often referred to as the "Velcro" closure. The positioning of the termini within a blade is variable and depends on the protein family. In order to investigate the influence of this location on protein structure, folding, and stability, we created different circular permutants, and a circularized version, of the designer propeller protein named Pizza. This protein is perfectly symmetrical, possessing six identical repeats. While all mutants adopt the same structure, the proteins lacking the "Velcro" closure were found to be significantly less resistant to thermal and chemical denaturation. This could explain why such proteins are rarely observed in nature. Interestingly, the most common "Velcro" configuration for this protein family was not the most stable among the Pizza variants tested. The circularized version shows dramatically improved stability, which could have implications for future applications.
Affiliation(s)
- Els Deridder
  - Department of Chemistry, KU Leuven, Leuven, Belgium
10
Northover DE, Shank SD, Liberles DA. Characterizing lineage-specific evolution and the processes driving genomic diversification in chordates. BMC Evol Biol 2020; 20:24. [PMID: 32046633 PMCID: PMC7011509 DOI: 10.1186/s12862-020-1585-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2019] [Accepted: 01/16/2020] [Indexed: 11/21/2022] Open
Abstract
Background: Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. By examining genome evolution through the guise of lineage-specific evolution, it is possible to make inferences about the evolutionary events that have given rise to species-specific diversification. Here we characterize the evolutionary trends found in chordate species using The Adaptive Evolution Database (TAED). TAED is a database of phylogenetically indexed gene families designed to detect episodes of directional or diversifying selection across chordates. Gene families within the database have been assessed for lineage-specific estimates of dN/dS and have been reconciled to the chordate species to identify retained duplicates. Gene families have also been mapped to the functional pathways, and amino acid changes which occurred on high dN/dS lineages have been mapped to protein structures. Results: An analysis of this exhaustive database has enabled a characterization of the processes of lineage-specific diversification in chordates. A pathway level enrichment analysis of TAED determined that pathways most commonly found to have elevated rates of evolution included those involved in metabolism, immunity, and cell signaling. An analysis of protein fold presence on proteins, after normalizing for frequency in the database, found common folds such as Rossmann folds, Jelly Roll folds, and TIM barrels were overrepresented on proteins most likely to undergo directional selection. A set of gene families which experience increased numbers of duplications within short evolutionary times are associated with pathways involved in metabolism, olfactory reception, and signaling. An analysis of protein secondary structure indicated more relaxed constraint in β-sheets and stronger constraint on α-helices, amidst a general preference for substitutions at exposed sites. Lastly, a detailed analysis of the ornithine decarboxylase gene family, a key enzyme in the pathway for polyamine synthesis, revealed lineage-specific evolution along the lineage leading to Cetacea through rapid sequence evolution in a duplicate gene with amino acid substitutions causing active site rearrangement. Conclusion: Episodes of lineage-specific evolution are frequent throughout chordate species. Both duplication and directional selection have played large roles in the evolution of the phylum. TAED is a powerful tool for facilitating this understanding of lineage-specific evolution.
Affiliation(s)
- David E Northover
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- Stephen D Shank
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
- David A Liberles
  - Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, PA, 19122, USA
  - Department of Molecular Biology, University of Wyoming, Laramie, WY, 82071, USA
11
Liao Y, Schaeffer RD, Pei J, Grishin NV. A sequence family database built on ECOD structural domains. Bioinformatics 2019; 34:2997-3003. [PMID: 29659718 DOI: 10.1093/bioinformatics/bty214] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/20/2018] [Accepted: 04/03/2018] [Indexed: 11/12/2022] Open
Abstract
Motivation: The ECOD database classifies protein domains based on their evolutionary relationships, considering both remote and close homology. The family group in ECOD provides classification of domains that are closely related to each other based on sequence similarity. Due to different perspectives on domain definition, direct application of existing sequence domain databases, such as Pfam, to ECOD struggles with several shortcomings. Results: We created multiple sequence alignments and profiles from ECOD domains with the help of structural information in alignment building and boundary delineation. We validated the alignment quality by scoring structure superposition to demonstrate that they are comparable to curated seed alignments in Pfam. Comparison to Pfam and CDD reveals that 27 and 16% of ECOD families are new, but they are also dominated by small families, likely because of the sampling bias from the PDB database. There are 35 and 48% of families whose boundaries are modified compared with their counterparts in Pfam and CDD, respectively. Availability and implementation: The new families are now integrated in the ECOD website. The aggregate HMMER profile library and alignment are available for download on the ECOD website (http://prodata.swmed.edu/ecod). Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuxing Liao
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, USA
- R Dustin Schaeffer
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Jimin Pei
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Nick V Grishin
  - Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
12
Berezovsky IN. Towards descriptor of elementary functions for protein design. Curr Opin Struct Biol 2019; 58:159-165. [PMID: 31352188 DOI: 10.1016/j.sbi.2019.06.010] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/19/2019] [Accepted: 06/18/2019] [Indexed: 11/18/2022]
Abstract
We review studies of protein evolution that help to formulate rules for protein design. Acknowledging the fundamental importance of Dayhoff's provision on the emergence of functional proteins from short peptides, we discuss multiple lines of evidence for the omnipresent partitioning of protein globules into structural/functional units, the use of which greatly facilitates engineering and design efforts. Closed loops and elementary functional loops, which are descendants of the ancient ring-like peptides that formed the first protein domains in agreement with Dayhoff's hypothesis, can be considered as basic units of protein structure and function. We argue that future developments in protein design approaches should consider descriptors of the elementary functions, which will help to complement designed scaffolds with the functional signatures and flexibility necessary for their functions.
Affiliation(s)
- Igor N Berezovsky
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix 138671, Singapore
  - Department of Biological Sciences (DBS), National University of Singapore (NUS), 8 Medical Drive, 117579, Singapore
13
Schaeffer RD, Kinch L, Medvedev KE, Pei J, Cheng H, Grishin N. ECOD: identification of distant homology among multidomain and transmembrane domain proteins. BMC Mol Cell Biol 2019; 20:18. [PMID: 31226926 PMCID: PMC6588880 DOI: 10.1186/s12860-019-0204-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/23/2019] [Accepted: 06/02/2019] [Indexed: 12/03/2022] Open
Abstract
The manual classification of protein domains is approaching its 20th anniversary. ECOD is our mixed manual-automatic domain classification. Over time, the types of proteins which require manual curation have changed. Depositions with complex multidomain and multichain arrangements are commonplace. Transmembrane domains are regularly classified. Repeatedly, domains which are initially believed to be novel are found to have homologous links to existing classified domains. Here we present a brief summary of recent manual curation efforts in ECOD, combined with specific case studies of transmembrane and multidomain proteins wherein manual curation was useful for discovering new homologous relationships. We present a new taxonomy for the classification of ABC transporter transmembrane domains. We examine alternate topologies of the leucine-specific (LS) domain of Leucine tRNA-synthetase. Finally, we elaborate on distant homologous links between two helical dimerization domains.
Affiliation(s)
- R Dustin Schaeffer
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA.
- Lisa Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
- Kirill E Medvedev
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
- Jimin Pei
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
- Hua Cheng
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
- Nick Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX, 75390-9050, USA
|
14
|
Klochkov SG, Neganova ME, Yarla NS, Parvathaneni M, Sharma B, Tarasov VV, Barreto G, Bachurin SO, Ashraf GM, Aliev G. Implications of farnesyltransferase and its inhibitors as a promising strategy for cancer therapy. Semin Cancer Biol 2019; 56:128-134. [DOI: 10.1016/j.semcancer.2017.10.010] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 07/24/2017] [Revised: 10/14/2017] [Accepted: 10/30/2017] [Indexed: 12/20/2022]
|
15
|
MacCarthy E, Perry D, Kc DB. Advances in Protein Super-Secondary Structure Prediction and Application to Protein Structure Prediction. Methods Mol Biol 2019; 1958:15-45. [PMID: 30945212 DOI: 10.1007/978-1-4939-9161-7_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/09/2023]
Abstract
Owing to advances in various sequencing technologies, the gap between the number of protein sequences and the number of experimental protein structures is ever increasing. Community-wide initiatives like CASP have resulted in considerable efforts in the development of computational methods to accurately model protein structures from sequences. Sequence-based prediction of super-secondary structure has direct application in protein structure prediction, and there have been significant efforts in the prediction of super-secondary structure in the last decade. In this chapter, we first introduce the protein structure prediction problem and highlight some of the important progress in the field of protein structure prediction. Next, we discuss recent methods for the prediction of super-secondary structures. We then discuss applications of super-secondary structure prediction in structure prediction/analysis of proteins. We also discuss prediction of protein structures that are composed of simple super-secondary structure repeats and protein structures that are composed of complex super-secondary structure repeats. Finally, we discuss recent trends in the field.
Affiliation(s)
- Elijah MacCarthy
- Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
- Derrick Perry
- Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA
- Dukka B Kc
- Department of Computational Science and Engineering, North Carolina A&T State University, Greensboro, NC, USA.
|
16
|
Khan T, Panday SK, Ghosh I. ProLego: tool for extracting and visualizing topological modules in protein structures. BMC Bioinformatics 2018; 19:167. [PMID: 29728050 PMCID: PMC5935970 DOI: 10.1186/s12859-018-2171-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/10/2018] [Accepted: 04/30/2018] [Indexed: 11/10/2022] Open
Abstract
Background: In protein design, correct use of topology is among the initial and most critical features. Meticulous selection of backbone topology aids in drastically reducing the structure search space. With ProLego, we present a server application to explore the component aspect of protein structures and provide an intuitive and efficient way to scan the protein topology space. Result: We have implemented an in-house developed “topological representation” in an automated pipeline to extract protein topology from a given protein structure. Using the topology string, ProLego compares topology against an extensive non-redundant topology database (ProLegoDB) and extracts constituent topological modules. The platform offers interactive topology visualization graphs. Conclusion: ProLego provides an alternative but comprehensive way to scan and visualize protein topology, along with an extensive database of protein topology. ProLego can be found at http://www.proteinlego.com. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2171-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Taushif Khan
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India.
- Shailesh Kumar Panday
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
- Indira Ghosh
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India
|
17
|
Lupas AN, Alva V. Ribosomal proteins as documents of the transition from unstructured (poly)peptides to folded proteins. J Struct Biol 2017; 198:74-81. [DOI: 10.1016/j.jsb.2017.04.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 02/02/2017] [Revised: 04/23/2017] [Accepted: 04/24/2017] [Indexed: 11/16/2022]
|
18
|
Schaeffer RD, Liao Y, Cheng H, Grishin NV. ECOD: new developments in the evolutionary classification of domains. Nucleic Acids Res 2016; 45:D296-D302. [PMID: 27899594 PMCID: PMC5210594 DOI: 10.1093/nar/gkw1137] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 09/15/2016] [Revised: 10/25/2016] [Accepted: 11/16/2016] [Indexed: 12/21/2022] Open
Abstract
Evolutionary Classification Of protein Domains (ECOD) (http://prodata.swmed.edu/ecod) comprehensively classifies proteins with known spatial structures maintained by the Protein Data Bank (PDB) into evolutionary groups of protein domains. ECOD relies on a combination of automatic and manual weekly updates to achieve its high accuracy and coverage with a short update cycle. ECOD classifies the approximately 120 000 depositions of the PDB into more than 500 000 domains in ∼3400 homologous groups. We show the performance of the weekly update pipeline since the release of ECOD, describe improvements to the ECOD website and available search options, and discuss novel structures and homologous groups that have been classified in the recent updates. Finally, we discuss the future directions of ECOD and further improvements planned for the hierarchy and update process.
Affiliation(s)
- R Dustin Schaeffer
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
- Yuxing Liao
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
- Hua Cheng
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
- Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
|