1
Wei W, Valerio M, Ma N, Kang H, Nguyen LXT, Marcucci G, Vaidehi N. Disordered C-Terminus Plays a Critical Role in the Activity of the Small GTPase Ran. Biochemistry 2025. [PMID: 39999282 DOI: 10.1021/acs.biochem.4c00484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2025]
Abstract
Ran is a small GTPase of the Ras superfamily that governs nucleocytoplasmic transport, including that of miR-126, a microRNA that supports the homeostasis and expansion of leukemia stem cells (LSCs). Ran binds to Exportin 5 to facilitate the transport of precursor (pre)-miR-126 across the nuclear membrane for its maturation. Our goal is to inhibit Ran to prevent transport of pre-miR-126 to the cytoplasm. Like other Ras family proteins, targeting Ran with small molecules is challenging due to its relatively flat surface and lack of binding cavities. Ran's activity is regulated by a long and disordered C-terminus that provides opportunities for identifying cryptic binding pockets to target. We used a combination of molecular dynamics simulations and experiments and uncovered the critical role of the ensemble of the C-terminal conformations that enable the transition of Ran from the GTP-bound "on state" to its GDP-bound "off-state". We also showed that the Ran C-terminus allosterically modulates the conformations of residues in the nucleotide binding site and in the functionally relevant Switch 1 and 2 regions. Through computational deep mutational scans and experiments, we identified four residue hotspots L182, Y197, D200, and L201 at the core-C-terminus interface and four residue mutations V27A, E70D, N122A, and N122Y that mediate the allosteric communication between the core and switch regions. This information paves the way for our next step in the design of novel allosteric modulators for Ran.
Affiliation(s)
- Wenyuan Wei
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, California 91010, United States
  - Irell and Manella Graduate School of Biosciences, City of Hope, Duarte, California 91010, United States
- Melissa Valerio
  - Irell and Manella Graduate School of Biosciences, City of Hope, Duarte, California 91010, United States
  - Department of Hematology and Hematopoietic Cell Transplantation, City of Hope Medical Center, Duarte, California 91010, United States
- Ning Ma
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, California 91010, United States
- Hyunjun Kang
  - Department of Hematology and Hematopoietic Cell Transplantation, City of Hope Medical Center, Duarte, California 91010, United States
- Le Xuan Truong Nguyen
  - Department of Hematology and Hematopoietic Cell Transplantation, City of Hope Medical Center, Duarte, California 91010, United States
  - Cancer & Cell Biology Division, Translational Genomics Research Institute, Phoenix, Arizona 85004, United States
- Guido Marcucci
  - Department of Hematology and Hematopoietic Cell Transplantation, City of Hope Medical Center, Duarte, California 91010, United States
  - Gehr Family Center for Leukemia Research, Hematology Malignancies and Stem Cell Transplantation Institute, City of Hope Medical Center, Duarte, California 91010, United States
- Nagarajan Vaidehi
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, California 91010, United States
  - Irell and Manella Graduate School of Biosciences, City of Hope, Duarte, California 91010, United States
2
Mathy CJP, Mishra P, Flynn JM, Perica T, Mavor D, Bolon DNA, Kortemme T. A complete allosteric map of a GTPase switch in its native cellular network. Cell Syst 2023; 14:237-246.e7. [PMID: 36801015 PMCID: PMC10173951 DOI: 10.1016/j.cels.2023.01.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/11/2022] [Revised: 11/08/2022] [Accepted: 01/06/2023] [Indexed: 02/19/2023]
Abstract
Allosteric regulation is central to protein function in cellular networks. A fundamental open question is whether cellular regulation of allosteric proteins occurs only at a few defined positions or at many sites distributed throughout the structure. Here, we probe the regulation of GTPases (protein switches that control signaling through regulated conformational cycling) at residue-level resolution by deep mutagenesis in the native biological network. For the GTPase Gsp1/Ran, we find that 28% of the 4,315 assayed mutations show pronounced gain-of-function responses. Twenty of the sixty positions enriched for gain-of-function mutations are outside the canonical GTPase active site switch regions. Kinetic analysis shows that these distal sites are allosterically coupled to the active site. We conclude that the GTPase switch mechanism is broadly sensitive to cellular allosteric regulation. Our systematic discovery of new regulatory sites provides a functional map to interrogate and target GTPases controlling many essential biological processes.
Affiliation(s)
- Christopher J P Mathy
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA; The UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, San Francisco, CA 94158, USA
- Parul Mishra
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, MA 01605, USA; School of Life Sciences, University of Hyderabad, Hyderabad, Telangana, India
- Julia M Flynn
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Tina Perica
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- David Mavor
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Daniel N A Bolon
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Tanja Kortemme
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA; The UC Berkeley-UCSF Graduate Program in Bioengineering, University of California, San Francisco, San Francisco, CA 94158, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
3
Yan H, Wu J, Li Y, Liu JS. Bayesian bi-clustering methods with applications in computational biology. Ann Appl Stat 2022. [DOI: 10.1214/22-aoas1622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Han Yan
  - Department of Statistics, Harvard University
- Jun S. Liu
  - Department of Statistics, Harvard University
4
Neuwald AF, Lanczycki CJ, Hodges TK, Marchler-Bauer A. Obtaining extremely large and accurate protein multiple sequence alignments from curated hierarchical alignments. Database: The Journal of Biological Databases and Curation 2021; 2020:5850901. [PMID: 32500917 PMCID: PMC7297217 DOI: 10.1093/database/baaa042] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/26/2019] [Revised: 04/01/2020] [Accepted: 05/06/2020] [Indexed: 11/12/2022]
Abstract
For optimal performance, machine learning methods for protein sequence/structural analysis typically require as input a large multiple sequence alignment (MSA), which is often created using query-based iterative programs, such as PSI-BLAST or JackHMMER. However, because these programs align database sequences using a query sequence as a template, they may fail to detect or may tend to misalign sequences distantly related to the query. More generally, automated MSA programs often fail to align sequences correctly due to the unpredictable nature of protein evolution. Addressing this problem typically requires manual curation in the light of structural data. However, curated MSAs tend to contain too few sequences to serve as input for statistically based methods. We address these shortcomings by making publicly available a set of 252 curated hierarchical MSAs (hiMSAs), containing a total of 26 212 066 sequences, along with programs for generating extremely large MSAs from these hiMSAs. Each hiMSA consists of a set of hierarchically arranged MSAs representing individual subgroups within a superfamily along with template MSAs specifying how to align each subgroup MSA against MSAs higher up the hierarchy. Central to this approach is the MAPGAPS search program, which uses a hiMSA as a query to align (potentially vast numbers of) matching database sequences with accuracy comparable to that of the curated hiMSA. We illustrate this process for the exonuclease–endonuclease–phosphatase superfamily and for pleckstrin homology domains. A set of extremely large MSAs generated from the hiMSAs in this way is available as input for deep learning and big data analyses. MAPGAPS, auxiliary programs CDD2MGS, AddPhylum, PurgeMSA and ConvertMSA and links to National Center for Biotechnology Information data files are available at https://www.igs.umaryland.edu/labs/neuwald/software/mapgaps/.
Affiliation(s)
- Andrew F Neuwald
  - Institute for Genome Sciences, Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, 670 W. Baltimore Street, Baltimore, MD 21201, USA
- Christopher J Lanczycki
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38 A, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Aron Marchler-Bauer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38 A, 8600 Rockville Pike, Bethesda, MD 20894, USA
5
Toshchakov VY, Neuwald AF. A survey of TIR domain sequence and structure divergence. Immunogenetics 2020; 72:181-203. [PMID: 32002590 PMCID: PMC7075850 DOI: 10.1007/s00251-020-01157-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/23/2019] [Accepted: 01/20/2020] [Indexed: 12/31/2022]
Abstract
Toll-interleukin-1R resistance (TIR) domains are ubiquitously present in all forms of cellular life. They are most commonly found in signaling proteins, as units responsible for signal-dependent formation of protein complexes that enable amplification and spatial propagation of the signal. A less common function of TIR domains is their ability to catalyze nicotinamide adenine dinucleotide degradation. This survey analyzes 26,414 TIR domains, automatically classified based on group-specific sequence patterns presumably determining biological function, using a statistical approach termed Bayesian partitioning with pattern selection (BPPS). We examine these groups and patterns in the light of available structures and biochemical analyses. Proteins within each of thirteen eukaryotic groups (10 metazoans and 3 plants) typically appear to perform similar functions, whereas proteins within each prokaryotic group typically exhibit diverse domain architectures, suggesting divergent functions. Groups are often uniquely characterized by structural fold variations associated with group-specific sequence patterns and by herein identified sequence motifs defining TIR domain functional divergence. For example, BPPS identifies, in helices C and D of TIRAP and MyD88 orthologs, conserved surface-exposed residues apparently responsible for specificity of TIR domain interactions. In addition, BPPS clarifies the functional significance of the previously described Box 2 and Box 3 motifs, each of which is a part of a larger, group-specific block of conserved, intramolecularly interacting residues.
Affiliation(s)
- Vladimir Y Toshchakov
  - Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Andrew F Neuwald
  - Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
6
Agnew C, Liu L, Liu S, Xu W, You L, Yeung W, Kannan N, Jablons D, Jura N. The crystal structure of the protein kinase HIPK2 reveals a unique architecture of its CMGC-insert region. J Biol Chem 2019; 294:13545-13559. [PMID: 31341017 PMCID: PMC6746438 DOI: 10.1074/jbc.ra119.009725] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 06/10/2019] [Revised: 07/11/2019] [Indexed: 01/07/2023]
Abstract
The homeodomain-interacting protein kinase (HIPK) family is comprised of four nuclear protein kinases, HIPK1-4. HIPK proteins phosphorylate a diverse range of transcription factors involved in cell proliferation, differentiation, and apoptosis. HIPK2, thus far the best-characterized member of this largely understudied family of protein kinases, plays a role in the activation of p53 in response to DNA damage. Despite this tumor-suppressor function, HIPK2 is also found overexpressed in several cancers, and its hyperactivation causes chronic fibrosis. There are currently no structures of HIPK2 or of any other HIPK kinase. Here, we report the crystal structure of HIPK2's kinase domain bound to CX-4945, a casein kinase 2α (CK2α) inhibitor currently in clinical trials against several cancers. The structure, determined at 2.2 Å resolution, revealed that CX-4945 engages the HIPK2 active site in a hybrid binding mode between that seen in structures of CK2α and Pim1 kinases. The HIPK2 kinase domain crystallized in the active conformation, which was stabilized by phosphorylation of the activation loop. We noted that the overall kinase domain fold of HIPK2 closely resembles that of evolutionarily related dual-specificity tyrosine-regulated kinases (DYRKs). Most significant structural differences between HIPK2 and DYRKs included an absence of the regulatory N-terminal domain and a unique conformation of the CMGC-insert region and of a newly defined insert segment in the αC-β4 loop. This first crystal structure of HIPK2 paves the way for characterizing the understudied members of the HIPK family and for developing HIPK2-directed therapies for managing cancer and fibrosis.
Affiliation(s)
- Christopher Agnew
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, California 94158
- Lijun Liu
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, California 94158
- Shu Liu
  - Thoracic Oncology Laboratory, Department of Surgery, Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94115
- Wei Xu
  - Thoracic Oncology Laboratory, Department of Surgery, Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94115
- Liang You
  - Thoracic Oncology Laboratory, Department of Surgery, Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94115
- Wayland Yeung
  - Institute of Bioinformatics and Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602
- Natarajan Kannan
  - Institute of Bioinformatics and Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602
- David Jablons
  - Thoracic Oncology Laboratory, Department of Surgery, Comprehensive Cancer Center, University of California San Francisco, San Francisco, California 94115. Supported by the Kazan McClain Partners' Foundation and the H. N. and Frances C. Berger Foundation. To whom correspondence may be addressed: 1600 Divisadero St., A745, San Francisco, CA 94115. Tel.: 415-353-7502.
- Natalia Jura
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, California 94158; Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, California 94158. To whom correspondence may be addressed: 555 Mission Bay Blvd. S., Rm. 452W, San Francisco, CA 94158. Tel.: 415-514-1133.
7
Kalaivani R, Reema R, Srinivasan N. Recognition of sites of functional specialisation in all known eukaryotic protein kinase families. PLoS Comput Biol 2018; 14:e1005975. [PMID: 29438395 PMCID: PMC5826538 DOI: 10.1371/journal.pcbi.1005975] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 08/05/2017] [Revised: 02/26/2018] [Accepted: 01/13/2018] [Indexed: 11/25/2022]
Abstract
The conserved function of protein phosphorylation, catalysed by members of protein kinase superfamily, is regulated in different ways in different kinase families. Further, differences in activating triggers, cellular localisation, domain architecture and substrate specificity between kinase families are also well known. While the transfer of γ-phosphate from ATP to the hydroxyl group of Ser/Thr/Tyr is mediated by a conserved Asp, the characteristic functional and regulatory sites are specialized at the level of families or sub-families. Such family-specific sites of functional specialization are unknown for most families of kinases. In this work, we systematically identify the family-specific residue features by comparing the extent of conservation of physicochemical properties, Shannon entropy and statistical probability of residue distributions between families of kinases. An integrated discriminatory score, which combines these three features, is developed to demarcate the functionally specialized sites in a kinase family from other sites. We achieved an area under ROC curve of 0.992 for the discrimination of kinase families. Our approach was extensively tested on well-studied families CDK and MAPK, wherein specific protein interaction sites and substrate recognition sites were successfully detected (p-value < 0.05). We also find that the known family-specific oncogenic driver mutation sites were scored high by our method. The method was applied to all known kinases encompassing 107 families from diverse eukaryotic organisms leading to a comprehensive list of family-specific functional sites. Apart from other uses, our method facilitates identification of specific protein interaction sites and drug target sites in a kinase family.
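As a rough illustration of one ingredient named in this abstract, the sketch below computes the Shannon entropy of a single alignment column and contrasts a family column with a background column. It is not the authors' integrated discriminatory score: the function names `column_entropy` and `entropy_contrast`, the gap handling, and the simple difference used as a contrast are all assumptions made here for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; an all-gap column scores 0.0.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_contrast(family_column: str, background_column: str) -> float:
    """Hypothetical contrast: background entropy minus family entropy.

    A large positive value means the position is conserved within the
    family but variable across the background set of sequences.
    """
    return column_entropy(background_column) - column_entropy(family_column)


if __name__ == "__main__":
    # A position that is invariant in the family but variable elsewhere.
    print(entropy_contrast("KKKK", "KRDE"))
```

A position that scores high on such a contrast is only a candidate family-specific site; the abstract describes combining entropy with physicochemical conservation and residue-distribution statistics before drawing conclusions.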
Affiliation(s)
- Raju Kalaivani
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
- Raju Reema
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
8
Watters K, Inankur B, Gardiner JC, Warrick J, Sherer NM, Yin J, Palmenberg AC. Differential Disruption of Nucleocytoplasmic Trafficking Pathways by Rhinovirus 2A Proteases. J Virol 2017; 91:e02472-16. [PMID: 28179529 PMCID: PMC5375692 DOI: 10.1128/jvi.02472-16] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/23/2016] [Accepted: 02/01/2017] [Indexed: 01/11/2023]
Abstract
The RNA rhinoviruses (RV) encode 2A proteases (2Apro) that contribute essential polyprotein processing and host cell shutoff functions during infection, including the cleavage of Phe/Gly-containing nucleoporin proteins (Nups) within nuclear pore complexes (NPC). Within the 3 RV species, multiple divergent genotypes encode diverse 2Apro sequences that act differentially on specific Nups. Since only subsets of Phe/Gly motifs, particularly those within Nup62, Nup98, and Nup153, are recognized by transport receptors (karyopherins) when trafficking large molecular cargos through the NPC, the processing preferences of individual 2Apro predict RV genotype-specific targeting of NPC pathways and cargos. To test this idea, transformed HeLa cell lines were created with fluorescent cargos (mCherry) for the importin α/β, transportin 1, and transportin 3 import pathways and the Crm1-mediated export pathway. Live-cell imaging of single cells expressing recombinant RV 2Apro (A16, A45, B04, B14, B52, C02, and C15) showed disruption of each pathway with measurably different efficiencies and reaction rates. The B04 and B52 proteases preferentially targeted Nups in the import pathways, while B04 and C15 proteases were more effective against the export pathway. Virus-type-specific trends were also observed during infection of cells with A16, B04, B14, and B52 viruses or their chimeras, as measured by NF-κB (p65/Rel) translocation into the nucleus and the rates of virus-associated cytopathic effects. This study provides new tools for evaluating the host cell response to RV infections in real time and suggests that differential 2Apro activities explain, in part, strain-dependent host responses and diverse RV disease phenotypes.
IMPORTANCE: Genetic variation among human rhinovirus types includes unexpected diversity in the genes encoding viral proteases (2Apro) that help these viruses achieve antihost responses. When the enzyme activities of 7 different 2Apro were measured comparatively in transformed cells programmed with fluorescent reporter systems and by quantitative cell imaging, the cellular substrates, particularly in the nuclear pore complex, used by these proteases were indeed attacked at different rates and with different affinities. The importance of this finding is that it provides a mechanistic explanation for how different types (strains) of rhinoviruses may elicit different cell responses that directly or indirectly lead to distinct disease phenotypes.
Affiliation(s)
- Kelly Watters
  - Institute for Molecular Virology, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Bahar Inankur
  - Wisconsin Institutes for Discovery and Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Jaye C Gardiner
  - Institute for Molecular Virology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - McArdle Laboratories for Cancer Research, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Jay Warrick
  - Wisconsin Institutes for Medical Research and Department of Biomedical Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Nathan M Sherer
  - Institute for Molecular Virology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - McArdle Laboratories for Cancer Research, University of Wisconsin-Madison, Madison, Wisconsin, USA
- John Yin
  - Wisconsin Institutes for Discovery and Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Ann C Palmenberg
  - Institute for Molecular Virology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
9
Neuwald AF, Altschul SF. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations. PLoS Comput Biol 2016; 12:e1005294. [PMID: 28002465 PMCID: PMC5225019 DOI: 10.1371/journal.pcbi.1005294] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/27/2016] [Revised: 01/10/2017] [Accepted: 12/08/2016] [Indexed: 11/25/2022]
Abstract
Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes’ theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu). Protein sequence data, when gathered in great quantity, contain important but implicit biological information manifest as statistical correlations. Here we describe an approach to access this information by comprehensively modeling and characterizing the distribution of sequences belonging to a major protein superfamily. 
This approach takes as input a large set of unaligned sequences belonging to the superfamily. By applying the minimum description length principle, it seeks the statistical model that best explains the sequences while avoiding over-fitting the data. It concurrently aligns the sequences and, to model evolutionary divergence, partitions them into subgroups that are hierarchically-arranged based upon correlated residue patterns. Auxiliary routines create PyMOL scripts to visualize the locations of correlated residues within available structures. Because these correlations likely arise from structural and biochemical constraints, they can help elucidate protein properties important for functional specificity. Comparing and contrasting sequence and structural features in this way may therefore suggest, in the light of published studies, plausible biological hypotheses for experimental investigation. We illustrate this approach with N-acetyltransferases.
Affiliation(s)
- Andrew F. Neuwald
  - Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, Baltimore, MD, United States of America
- Stephen F. Altschul
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America
10
Neuwald AF. Gleaning structural and functional information from correlations in protein multiple sequence alignments. Curr Opin Struct Biol 2016; 38:1-8. [PMID: 27179293 DOI: 10.1016/j.sbi.2016.04.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/18/2015] [Revised: 04/28/2016] [Accepted: 04/29/2016] [Indexed: 10/24/2022]
Abstract
The availability of vast amounts of protein sequence data facilitates detection of subtle statistical correlations due to imposed structural and functional constraints. Recent breakthroughs using Direct Coupling Analysis (DCA) and related approaches have tapped into correlations believed to be due to compensatory mutations. This has yielded some remarkable results, including substantially improved prediction of protein intra- and inter-domain 3D contacts, of membrane and globular protein structures, of substrate binding sites, and of protein conformational heterogeneity. A complementary approach is Bayesian Partitioning with Pattern Selection (BPPS), which partitions related proteins into hierarchically-arranged subgroups based on correlated residue patterns. These correlated patterns are presumably due to structural and functional constraints associated with evolutionary divergence rather than to compensatory mutations. Hence joint application of DCA- and BPPS-based approaches should help sort out the structural and functional constraints contributing to sequence correlations.
Affiliation(s)
- Andrew F Neuwald
  - Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, 801 West Baltimore St., BioPark II, Room 617, Baltimore, MD 21201, United States
11
Co-conserved MAPK features couple D-domain docking groove to distal allosteric sites via the C-terminal flanking tail. PLoS One 2015; 10:e0119636. [PMID: 25799139 PMCID: PMC4370755 DOI: 10.1371/journal.pone.0119636] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 04/28/2014] [Accepted: 02/02/2015] [Indexed: 11/19/2022]
Abstract
Mitogen activated protein kinases (MAPKs) form a closely related family of kinases that control critical pathways associated with cell growth and survival. Although MAPKs have been extensively characterized at the biochemical, cellular, and structural level, an integrated evolutionary understanding of how MAPKs differ from other closely related protein kinases is currently lacking. Here, we perform statistical sequence comparisons of MAPKs and related protein kinases to identify sequence and structural features associated with MAPK functional divergence. We show, for the first time, that virtually all MAPK-distinguishing sequence features, including an unappreciated short insert segment in the β4-β5 loop, physically couple distal functional sites in the kinase domain to the D-domain peptide docking groove via the C-terminal flanking tail (C-tail). The coupling mediated by MAPK-specific residues confers an allosteric regulatory mechanism unique to MAPKs. In particular, the regulatory αC-helix conformation is controlled by a MAPK-conserved salt bridge interaction between an arginine in the αC-helix and an acidic residue in the C-tail. The salt-bridge interaction is modulated in unique ways in individual sub-families to achieve regulatory specificity. Our study is consistent with a model in which the C-tail co-evolved with the D-domain docking site to allosterically control MAPK activity. Our study provides testable mechanistic hypotheses for biochemical characterization of MAPK-conserved residues and new avenues for the design of allosteric MAPK inhibitors.
12
Stefely JA, Reidenbach AG, Ulbrich A, Oruganty K, Floyd BJ, Jochem A, Saunders JM, Johnson IE, Minogue CE, Wrobel RL, Barber GE, Lee D, Li S, Kannan N, Coon JJ, Bingman CA, Pagliarini DJ. Mitochondrial ADCK3 employs an atypical protein kinase-like fold to enable coenzyme Q biosynthesis. Mol Cell 2014; 57:83-94. [PMID: 25498144 DOI: 10.1016/j.molcel.2014.11.002] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Received: 07/28/2014] [Revised: 10/13/2014] [Accepted: 11/04/2014] [Indexed: 10/24/2022]
Abstract
The ancient UbiB protein kinase-like family is involved in isoprenoid lipid biosynthesis and is implicated in human diseases, but demonstration of UbiB kinase activity has remained elusive for unknown reasons. Here, we quantitatively define UbiB-specific sequence motifs and reveal their positions within the crystal structure of a UbiB protein, ADCK3. We find that multiple UbiB-specific features are poised to inhibit protein kinase activity, including an N-terminal domain that occupies the typical substrate binding pocket and a unique A-rich loop that limits ATP binding by establishing an unusual selectivity for ADP. A single alanine-to-glycine mutation of this loop flips this coenzyme selectivity and enables autophosphorylation but inhibits coenzyme Q biosynthesis in vivo, demonstrating functional relevance for this unique feature. Our work provides mechanistic insight into UbiB enzyme activity and establishes a molecular foundation for further investigation of how UbiB family proteins affect diseases and diverse biological pathways.
Affiliation(s)
- Jonathan A Stefely
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Andrew G Reidenbach
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Arne Ulbrich
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Brendan J Floyd
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Adam Jochem
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Jaclyn M Saunders
- Mitochondrial Protein Partnership, University of Wisconsin-Madison, Madison, WI 53706, USA
- Isabel E Johnson
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Catherine E Minogue
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Russell L Wrobel
- Mitochondrial Protein Partnership, University of Wisconsin-Madison, Madison, WI 53706, USA
- Grant E Barber
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- David Lee
- Department of Medicine and UCSD DXMS Proteomics Resource, University of California, San Diego, La Jolla, CA 92023, USA
- Sheng Li
- Department of Medicine and UCSD DXMS Proteomics Resource, University of California, San Diego, La Jolla, CA 92023, USA
- Natarajan Kannan
- Department of Biochemistry, University of Georgia, Athens, GA 30602, USA
- Joshua J Coon
- Department of Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Craig A Bingman
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Mitochondrial Protein Partnership, University of Wisconsin-Madison, Madison, WI 53706, USA
- David J Pagliarini
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Mitochondrial Protein Partnership, University of Wisconsin-Madison, Madison, WI 53706, USA.
13
Solution structures of Mengovirus Leader protein, its phosphorylated derivatives, and in complex with nuclear transport regulatory protein, RanGTPase. Proc Natl Acad Sci U S A 2014; 111:15792-7. [PMID: 25331866 DOI: 10.1073/pnas.1411098111] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022] Open
Abstract
Cardiovirus Leader (L) proteins induce potent antihost inhibition of active cellular nucleocytoplasmic trafficking by triggering aberrant hyperphosphorylation of nuclear pore proteins (Nup). To achieve this, L binds protein RanGTPase (Ran), a key trafficking regulator, and diverts it into tertiary or quaternary complexes with required kinases. The activity of L is regulated by two phosphorylation events not required for Ran binding. Matched NMR studies on the unphosphorylated, singly, and doubly phosphorylated variants of Mengovirus L (L(M)) show both modifications act together to partially stabilize a short internal α-helix comprising L(M) residues 43-46. This motif implies that ionic and van der Waals forces contributed by phosphorylation help organize downstream residues 48-67 into a new interface. The full structure of L(M) as bound to Ran (unlabeled) and Ran (216 aa) as bound by L(M) (unlabeled) places L(M) into the BP1 binding site of Ran, wrapped by the conformationally flexible COOH tail. The arrangement explains the tight KD for this complex and places the L(M) zinc finger and phosphorylation interface as surface exposed and available for subsequent reactions. The core structure of Ran, outside the COOH tail, is not altered by L(M) binding and remains accessible for canonical RanGTP partner interactions. Pull-down assays identify at least one putative Ran:L(M) partner as an exportin, Crm1, or CAS. A model of Ran:L(M):Crm1, based on the new structures, suggests L(M) phosphorylation status may mediate Ran's selection of exportin(s) and cargo(s), perverting these native trafficking elements into the lethal antihost Nup phosphorylation pathways.
14
Neuwald AF. A Bayesian sampler for optimization of protein domain hierarchies. J Comput Biol 2014; 21:269-86. [PMID: 24494927 DOI: 10.1089/cmb.2013.0099] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022] Open
Abstract
The process of identifying and modeling functionally divergent subgroups for a specific protein domain class and arranging these subgroups hierarchically has, thus far, largely been done via manual curation. How to accomplish this automatically and optimally is an unsolved statistical and algorithmic problem that is addressed here via Markov chain Monte Carlo sampling. Taking as input a (typically very large) multiple-sequence alignment, the sampler creates and optimizes a hierarchy by adding and deleting leaf nodes, by moving nodes and subtrees up and down the hierarchy, by inserting or deleting internal nodes, and by redefining the sequences and conserved patterns associated with each node. All such operations are based on a probability distribution that models the conserved and divergent patterns defining each subgroup. When we view these patterns as sequence determinants of protein function, each node or subtree in such a hierarchy corresponds to a subgroup of sequences with similar biological properties. The sampler can be applied either de novo or to an existing hierarchy. When applied to 60 protein domains from multiple starting points in this way, it converged on similar solutions with nearly identical log-likelihood ratio scores, suggesting that it typically finds the optimal peak in the posterior probability distribution. Similarities and differences between independently generated, nearly optimal hierarchies for a given domain help distinguish robust from statistically uncertain features. Thus, a future application of the sampler is to provide confidence measures for various features of a domain hierarchy.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, Baltimore, Maryland
15
Neuwald AF, Lanczycki CJ, Marchler-Bauer A. Automated hierarchical classification of protein domain subfamilies based on functionally-divergent residue signatures. BMC Bioinformatics 2012; 13:144. [PMID: 22726767 PMCID: PMC3599474 DOI: 10.1186/1471-2105-13-144] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/06/2012] [Accepted: 06/09/2012] [Indexed: 11/17/2022] Open
Abstract
Background: The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches.
Results: Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related, a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day.
Conclusions: This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time-consuming aspects of conserved domain database curation. At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA.
16
Surveying the manifold divergence of an entire protein class for statistical clues to underlying biochemical mechanisms. Stat Appl Genet Mol Biol 2011; 10:Article 36. [PMID: 22331370 DOI: 10.2202/1544-6115.1666] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
Certain residues have no known function yet are co-conserved across distantly related protein families and diverse organisms, suggesting that they perform critical roles associated with as-yet-unidentified molecular properties and mechanisms. This raises the question of how to obtain additional clues regarding these mysterious biochemical phenomena with a view to formulating experimentally testable hypotheses. One approach is to access the implicit biochemical information encoded within the vast amount of genomic sequence data now becoming available. Here, a new Gibbs sampling strategy is formulated and implemented that can partition hundreds of thousands of sequences within a major protein class into multiple, functionally-divergent categories based on those pattern residues that best discriminate between categories. The sampler precisely defines the partition and pattern for each category by explicitly modeling unrelated, non-functional and related-yet-divergent proteins that would otherwise obscure the analysis. To aid biological interpretation, auxiliary routines can characterize pattern residues within available crystal structures and identify those structures most likely to shed light on the roles of pattern residues. This approach can be used to define and annotate automatically subgroup-specific conserved domain profiles based on statistically-rigorous empirical criteria rather than on the subjective and labor-intensive process of manual curation. Incorporating such profiles into domain database search sites (such as the NCBI BLAST site) will provide biologists with previously inaccessible molecular information useful for hypothesis generation and experimental design. Analyses of P-loop GTPases and of AAA+ ATPases illustrate the sampler's ability to obtain such information.
17
Pluder F, Barjaktarovic Z, Azimzadeh O, Mörtl S, Krämer A, Steininger S, Sarioglu H, Leszczynski D, Nylund R, Hakanen A, Sriharshan A, Atkinson MJ, Tapio S. Low-dose irradiation causes rapid alterations to the proteome of the human endothelial cell line EA.hy926. Radiat Environ Biophys 2011; 50:155-166. [PMID: 21104263 DOI: 10.1007/s00411-010-0342-9] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 07/30/2010] [Accepted: 11/01/2010] [Indexed: 05/30/2023]
Abstract
High doses of ionising radiation damage the heart by an as yet unknown mechanism. A concern for radiological protection is the recent epidemiological data indicating that doses as low as 100-500 mGy may induce cardiac damage. The aim of this study was to identify potential molecular targets and/or mechanisms involved in the pathogenesis of low-dose radiation-induced cardiovascular disease. The vascular endothelium plays a pivotal role in the regulation of cardiac function and is therefore a potential target tissue. We report here that low-dose radiation induced rapid and time-dependent changes in the cytoplasmic proteome of the human endothelial cell line EA.hy926. The proteomes were investigated at 4 and 24 h after irradiation at two different dose rates (Co-60 gamma ray total dose 200 mGy; 20 mGy/min and 190 mGy/min) using 2D-DIGE technology. Differentially expressed proteins were identified, after in-gel trypsin digestion, by MALDI-TOF/TOF tandem mass spectrometry, and peptide mass fingerprint analyses. We identified 15 significantly differentially expressed proteins, of which 10 were up-regulated and 5 down-regulated, with more than ±1.5-fold difference compared with unexposed cells. Pathways influenced by the low-dose exposures included the Ran and RhoA pathways, fatty acid metabolism and stress response.
Affiliation(s)
- Franka Pluder
- Institute of Radiation Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstaedter Landstrasse 1, 85764, Neuherberg, Germany
18
Bykova NV, Hoehn B, Rampitsch C, Banks T, Stebbing JA, Fan T, Knox R. Redox-sensitive proteome and antioxidant strategies in wheat seed dormancy control. Proteomics 2011; 11:865-82. [DOI: 10.1002/pmic.200900810] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Received: 12/09/2009] [Revised: 11/15/2010] [Accepted: 11/29/2010] [Indexed: 11/10/2022]
19
Jamali T, Jamali Y, Mehrbod M, Mofrad MRK. Nuclear pore complex: biochemistry and biophysics of nucleocytoplasmic transport in health and disease. Int Rev Cell Mol Biol 2011; 287:233-86. [PMID: 21414590 DOI: 10.1016/b978-0-12-386043-9.00006-2] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.9] [Indexed: 01/24/2023]
Abstract
Nuclear pore complexes (NPCs) are the gateways connecting the nucleoplasm and cytoplasm. These structures are composed of over 30 different proteins and have a mass of 60-125 MDa, depending on the species. NPCs are bilateral pathways that selectively control the passage of macromolecules into and out of the nucleus. Molecules smaller than 40 kDa diffuse through the NPC passively, while larger molecules require facilitated transport provided by their attachment to karyopherins. Kinetic studies have shown that approximately 1000 translocations occur per second per NPC. Maintaining its high selectivity while allowing for rapid translocation makes the NPC an efficient chemical nanomachine. In this review, we approach NPC function from a structural viewpoint. Putting together different pieces of this puzzle, this chapter offers an overall insight into the molecular processes engaged in the import/export of active cargos across the NPC and how different transporters regulate nucleocytoplasmic transport. Finally, the correlation of several diseases and disorders with NPC structural defects and dysfunctions is discussed.
Affiliation(s)
- T Jamali
- Department of Bioengineering, University of California, Berkeley, California, USA
20
Mirza A, Mustafa M, Talevich E, Kannan N. Co-conserved features associated with cis regulation of ErbB tyrosine kinases. PLoS One 2010; 5:e14310. [PMID: 21179209 PMCID: PMC3001462 DOI: 10.1371/journal.pone.0014310] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 05/19/2010] [Accepted: 11/08/2010] [Indexed: 11/18/2022] Open
Abstract
Background: The epidermal growth factor receptor kinases, or ErbB kinases, belong to a large sub-group of receptor tyrosine kinases (RTKs), which share a conserved catalytic core. The catalytic core of ErbB kinases has functionally diverged from that of other RTKs in that it is activated by a unique allosteric mechanism that involves specific interactions between the kinase core and the flanking Juxtamembrane (JM) and COOH-terminal tail (C-terminal tail). Although extensive studies on ErbB and related tyrosine kinases have provided important insights into the structural basis for ErbB kinase functional divergence, the sequence features that contribute to the unique regulation of ErbB kinases have not been systematically explored.
Methodology/Principal Findings: In this study, we use a Bayesian approach to identify the selective sequence constraints that most distinguish ErbB kinases from other receptor tyrosine kinases. We find that strong ErbB kinase-specific constraints are imposed on residues that tether the JM and C-terminal tail to key functional regions of the kinase core. A conserved RIxKExE motif in the JM-kinase linker region and a glutamine in the inter-lobe linker are identified as two of the most distinguishing features of the ErbB family. While the RIxKExE motif tethers the C-terminal tail to the N-lobe of the kinase domain, the glutamine tethers the C-terminal tail to hinge regions critical for inter-lobe movement. Comparison of the active and inactive crystal structures of ErbB kinases indicates that the identified residues are conformationally malleable and can potentially contribute to the cis regulation of the kinase core by the JM and C-terminal tail. ErbB3, and EGFR orthologs in sponges and parasitic worms, diverge from some of the canonical ErbB features, providing insights into sub-family and lineage-specific functional specialization.
Conclusion/Significance: Our analysis pinpoints key residues for mutational analysis, and provides new clues to cancer mutations that alter the canonical modes of ErbB kinase regulation.
Affiliation(s)
- Amar Mirza
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, United States of America
21
Neuwald AF. Bayesian classification of residues associated with protein functional divergence: Arf and Arf-like GTPases. Biol Direct 2010; 5:66. [PMID: 21129209 PMCID: PMC3012027 DOI: 10.1186/1745-6150-5-66] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/07/2010] [Accepted: 12/03/2010] [Indexed: 11/22/2022] Open
Abstract
Background: Certain residues within proteins are highly conserved across very distantly related organisms, yet their (presumably critical) structural or mechanistic roles are completely unknown. To obtain clues regarding such residues within Arf and Arf-like (Arf/Arl) GTPases (which function as on/off switches regulating vesicle trafficking, phospholipid metabolism and cytoskeletal remodeling), I apply a new sampling procedure for comparative sequence analysis, termed multiple category Bayesian Partitioning with Pattern Selection (mcBPPS).
Results: The mcBPPS sampler classified sequences within the entire P-loop GTPase class into multiple categories by identifying those evolutionarily-divergent residues most likely to be responsible for functional specialization. Here I focus on categories of residues that most distinguish various Arf/Arl GTPases from other GTPases. This identified residues whose specific roles have been previously proposed (and in some cases corroborated experimentally) and that thus serve as positive controls, as well as several categories of co-conserved residues whose possible roles are first hinted at here. For example, Arf/Arl/Sar GTPases are most distinguished from other GTPases by a conserved aspartate residue within the phosphate binding loop (P-loop) and by co-conserved residues nearby that, together, can form a network of salt-bridge and hydrogen bond interactions centered on the GTPase active site. Residues corresponding to an N-[VI] motif that is conserved within Arf/Arl GTPases may play a role in the interswitch toggle characteristic of the Arf family, whereas other, co-conserved residues may modulate the flexibility of the guanine binding loop. Arl8 GTPases conserve residues that strikingly diverge from those typically found in other Arf/Arl GTPases and that form structural interactions suggestive of a novel interswitch toggle mechanism.
Conclusions: This analysis suggests specific mutagenesis experiments to explore mechanisms underlying GTP hydrolysis, nucleotide exchange and interswitch toggling within Arf/Arl GTPases. More generally, it illustrates how the mcBPPS sampler can complement traditional evolutionary analyses by providing an objective, quantitative and statistically rigorous way to explore protein functional-divergence in molecular detail. Because the sampler classifies the input sequences at the same time, it can be used to generate subgroup profiles, in which functionally-divergent categories of residues are annotated automatically.
Reviewers: This article was reviewed by Frank Eisenhaber, L Aravind and Daniel Gaston (nominated by Eric Bapteste). For the full reviews, go to the Reviewers' comments section.
Affiliation(s)
- Andrew F Neuwald
- Department of Biochemistry & Molecular Biology, Institute for Genome Sciences, University of Maryland School of Medicine, BioPark II, Room 617, 801 West Baltimore St, Baltimore, MD 21201, USA.
22
Abstract
Phylogenomics reveals extreme gene loss in typhus group (TG) rickettsiae relative to the levels for other rickettsial lineages. We report here a curious protease-encoding gene (ppcE) that is conserved only in TG rickettsiae. As a possible determinant of host pathogenicity, ppcE warrants consideration in the development of therapeutics against epidemic and murine typhus.
23
Neuwald AF. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences. Bioinformatics 2009; 25:1869-75. [PMID: 19505947 DOI: 10.1093/bioinformatics/btp342] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.3] [Indexed: 11/12/2022]
Abstract
Motivation: The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible, for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical.
Results: This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences.
Availability: A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrew F Neuwald
- Department of Biochemistry & Molecular Biology and The Institute for Genome Sciences, University of Maryland, School of Medicine, BioPark II, Baltimore, MD 21201, USA.
24
Neuwald AF. The charge-dipole pocket: a defining feature of signaling pathway GTPase on/off switches. J Mol Biol 2009; 390:142-53. [PMID: 19427324 DOI: 10.1016/j.jmb.2009.05.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 12/16/2008] [Revised: 04/07/2009] [Accepted: 05/01/2009] [Indexed: 11/19/2022]
Abstract
Ras-like GTPases function as on/off switches in intracellular signaling pathways. Their on or off state is communicated through conformational changes in the so-called switch I and II regions. It is commonly believed that the distinguishing molecular features of these GTPases are well known. Here, however, I identify, through a Bayesian iterative analysis of GTPase evolutionary divergence, a previously undescribed switch II structural component that (along with previously described, functionally critical residues) most distinguishes these signaling pathway on/off switches from other GTPases. In certain Ras-like GTPases this newly identified component forms an aromatic pocket around the negative-dipole moment at the end of a switch II helix, with a positively charged residue inserted into the pocket. This helix is oriented in a specific direction away from the GTPase core, but is reoriented dramatically upon disruption of the charge-dipole pocket. The charge-dipole pocket occurs in both the on and off states, and both the charge-dipole pocket and an alternative configuration occur within the unit cell of a single crystal structure of Rab5a GTPase in the off state. Thus, the charge-dipole pocket configuration is closely associated, not with the on or off state, but rather with formation of the outward-oriented helix and, as a result, with restructuring of the switch II N-terminal region, which has a critical role both in sensing the on/off state and in mediating GTP hydrolysis and nucleotide exchange.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, Baltimore, MD 21201, USA.
25
Neuwald AF. The glycine brace: a component of Rab, Rho, and Ran GTPases associated with hinge regions of guanine- and phosphate-binding loops. BMC Struct Biol 2009; 9:11. [PMID: 19265520 PMCID: PMC2656535 DOI: 10.1186/1472-6807-9-11] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 10/24/2008] [Accepted: 03/05/2009] [Indexed: 11/10/2022]
Abstract
Background: Ras-like GTPases function as on-off switches in intracellular signalling pathways and include the Rab, Rho/Rac, Ran, Ras, Arf, Sar and Gα families. How these families have evolutionarily diverged from each other at the sequence level provides clues to underlying mechanisms associated with their functional specialization.
Results: Bayesian analysis of divergent patterns within a multiple alignment of Ras-like GTPase sequences identifies a structural component, termed here the glycine brace, as the feature that most distinguishes Rab, Rho/Rac, Ran and (to some degree) Ras family GTPases from other Ras-like GTPases. The glycine brace consists of four residues: an aromatic residue that forms a stabilizing CH-π interaction with a conserved glycine at the start of the guanine-binding loop; a second aromatic residue, which is nearly always a tryptophan, that likewise forms stabilizing CH-π and NH-π interactions with a glycine at the start of the phosphate-binding P-loop; and two other residues (typically an aspartate and a serine or threonine) that, together with a conserved buried water molecule, form a network of interactions connecting the two aromatic residues.
Conclusion: It is proposed that the two glycine residues function as hinges and that the glycine brace influences guanine nucleotide binding and release by interacting with these hinges.
Affiliation(s)
- Andrew F Neuwald
- Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, 801 West Baltimore St, BioPark II, Baltimore, MD 21201, USA.
26
Hu M, Qin ZS. Query large scale microarray compendium datasets using a model-based Bayesian approach with variable selection. PLoS One 2009; 4:e4495. [PMID: 19214232 PMCID: PMC2637418 DOI: 10.1371/journal.pone.0004495] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/01/2008] [Accepted: 12/06/2008] [Indexed: 11/19/2022] Open
Abstract
In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While working well for small datasets, the heterogeneity introduced from increased sample size inevitably reduces the sensitivity and specificity of these approaches. This is because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficients or mutual information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors.
Affiliation(s)
- Ming Hu
- Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, United States of America
- Zhaohui S. Qin
- Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, United States of America
27
Neuwald AF. Galpha Gbetagamma dissociation may be due to retraction of a buried lysine and disruption of an aromatic cluster by a GTP-sensing Arg Trp pair. Protein Sci 2008; 16:2570-7. [PMID: 17962409 DOI: 10.1110/ps.073098107] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]
Abstract
The heterotrimeric G protein alpha subunit (Galpha) functions as a molecular switch by cycling between inactive GDP-bound and active GTP-bound states. When bound to GDP, Galpha interacts with high affinity to a complex of the beta and gamma subunits (Gbetagamma), but when bound to GTP, Galpha dissociates from this complex to activate downstream signaling pathways. Galpha's state is communicated to other cellular components via conformational changes within its switch I and II regions. To identify key determinants of Galpha's function as a signaling pathway molecular switch, a Bayesian approach was used to infer the selective constraints that most distinguish Galpha and closely related Arf family GTPases from distantly related translational and metabolic GTPases. The strongest of these constraints are imposed on seven residues within or near the switch II region. Likewise, constraints imposed on Galpha but not on other, closely related molecular switches correspond to four nearby residues. These constraints are explained by a proposed mechanism for GTP-induced dissociation of Galpha from Gbetagamma where an Arg-Trp pair senses the presence of bound GTP leading to conformational retraction of a nearby lysine and to disruption of an aromatic cluster. Within a complex of Gialpha, Gibetagamma, and GDP, this lysine establishes greater surface contact with Gibeta than does any other residue in Gialpha, whereas the aromatic cluster packs against a highly conserved tryptophan in Gibeta that establishes greater surface contact with Gialpha than does any other residue in Gibeta. Other structural features associated with Galpha functional divergence further support the proposed mechanism.
28
Neuwald AF. The CHAIN program: forging evolutionary links to underlying mechanisms. Trends Biochem Sci 2007; 32:487-93. [PMID: 17962021 DOI: 10.1016/j.tibs.2007.08.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 08/02/2007] [Revised: 08/13/2007] [Accepted: 08/17/2007] [Indexed: 11/25/2022]
Abstract
Proteins evolve new functions by modifying and extending the molecular machinery of an ancestral protein. Such changes show up as divergent sequence patterns, which are conserved in descendent proteins that maintain the divergent function. After multiply-aligning a set of input sequences, the CHAIN program partitions the sequences into two functionally divergent groups and then outputs an alignment that is annotated to reveal the selective pressures imposed on divergent residue positions. If atomic coordinates are also provided, hydrogen bonds and other atomic interactions associated with various categories of divergent residues are graphically displayed. Such analyses establish links between protein evolutionary divergence and functionally crucial atomic features and, as a result, can suggest plausible molecular mechanisms for experimental testing. This is illustrated here by its application to bacterial clamp-loader ATPases.
Affiliation(s)
- Andrew F Neuwald
- The J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
29
Kannan N, Taylor SS, Zhai Y, Venter JC, Manning G. Structural and functional diversity of the microbial kinome. PLoS Biol 2007; 5:e17. [PMID: 17355172 PMCID: PMC1821047 DOI: 10.1371/journal.pbio.0050017] [Citation(s) in RCA: 206] [Impact Index Per Article: 11.4] [Received: 05/18/2006] [Accepted: 09/20/2006] [Indexed: 11/19/2022] Open
Abstract
The eukaryotic protein kinase (ePK) domain mediates the majority of signaling and coordination of complex events in eukaryotes. By contrast, most bacterial signaling is thought to occur through structurally unrelated histidine kinases, though some ePK-like kinases (ELKs) and small molecule kinases are known in bacteria. Our analysis of the Global Ocean Sampling (GOS) dataset reveals that ELKs are as prevalent as histidine kinases and may play an equally important role in prokaryotic behavior. By combining GOS and public databases, we show that the ePK is just one subset of a diverse superfamily of enzymes built on a common protein kinase-like (PKL) fold. We explored this huge phylogenetic and functional space to cast light on the ancient evolution of this superfamily, its mechanistic core, and the structural basis for its observed diversity. We cataloged 27,677 ePKs and 18,699 ELKs, and classified them into 20 highly distinct families whose known members suggest regulatory functions. GOS data more than tripled the count of ELK sequences and enabled the discovery of novel families and classification and analysis of all ELKs. Comparison between and within families revealed ten key residues that are highly conserved across families. However, all but one of the ten residues has been eliminated in one family or another, indicating great functional plasticity. We show that loss of a catalytic lysine in two families is compensated by distinct mechanisms both involving other key motifs. This diverse superfamily serves as a model for further structural and functional analysis of enzyme evolution.
Affiliation(s)
- Natarajan Kannan
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California, United States of America
- Howard Hughes Medical Institute, University of California San Diego, La Jolla, California, United States of America
- Susan S Taylor
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California, United States of America
- Howard Hughes Medical Institute, University of California San Diego, La Jolla, California, United States of America
- Yufeng Zhai
- Razavi-Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, California, United States of America
- J. Craig Venter
- J. Craig Venter Institute, Rockville, Maryland, United States of America
- Gerard Manning
- Razavi-Newman Center for Bioinformatics, Salk Institute for Biological Studies, La Jolla, California, United States of America
- * To whom correspondence should be addressed.
30
Kannan N, Haste N, Taylor SS, Neuwald AF. The hallmark of AGC kinase functional divergence is its C-terminal tail, a cis-acting regulatory module. Proc Natl Acad Sci U S A 2007; 104:1272-7. [PMID: 17227859 PMCID: PMC1783090 DOI: 10.1073/pnas.0610251104] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 11/18/2022] Open
Abstract
The catalytic activities of eukaryotic protein kinases (EPKs) are regulated by movement of the C-helix, movement of the N and C lobes upon ATP binding, and movement of the activation loop upon phosphorylation. Statistical analysis of the selective constraints associated with AGC kinase functional divergence reveals conserved interactions between these regulatory regions and three regions of the C-terminal tail (C-tail): the N-lobe tether (NLT), the active-site tether (AST), and the C-lobe tether (CLT). The NLT serves as a docking site for an upstream kinase PDK1 and, upon activation, positions the C-helix within the ATP binding pocket. The AST directly interacts with the ATP binding pocket, and the CLT interacts with the interlobe linker and the alphaC-beta4 loop, which appears to serve as a hinge for C-helix movement. The C-tail is a hallmark of AGC functional divergence inasmuch as most of the conserved core residues that distinguish AGC kinases from other EPKs are associated with the NLT, AST, or CLT. Moreover, several AGC catalytic core conserved residues that interact with the C-tail strikingly diverge from the canonical residues observed at corresponding positions in nearly all other EPKs, suggesting that the catalytic core may have coevolved with the C-tail in AGC kinases. These observations, along with the fact that the C-tail is needed for catalytic activity, suggest that the C-tail is a cis-acting regulatory module that can also serve as a regulatory "handle," to which trans-acting cellular components can bind to modulate activity.
Affiliation(s)
- Natarajan Kannan
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- Nina Haste
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- Susan S. Taylor
- Howard Hughes Medical Institute, Department of Chemistry and Biochemistry, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0654
- To whom correspondence may be addressed.
- Andrew F. Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724
- To whom correspondence may be addressed.
31
Neuwald AF. Hypothesis: bacterial clamp loader ATPase activation through DNA-dependent repositioning of the catalytic base and of a trans-acting catalytic threonine. Nucleic Acids Res 2006; 34:5280-90. [PMID: 17012286 PMCID: PMC1636414 DOI: 10.1093/nar/gkl519] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
The prokaryotic DNA polymerase III clamp loader complex loads the β clamp onto DNA to link the replication complex to DNA during processive synthesis and unloads it again once synthesis is complete. This minimal complex consists of one δ, one δ′ and three γ subunits, all of which possess an AAA+ module, though only the γ subunit exhibits ATPase activity. Here clues to underlying clamp loader mechanisms are obtained through Bayesian inference of various categories of selective constraints imposed on the γ and δ′ subunits. It is proposed that a conserved histidine is ionized via electron transfer involving structurally adjacent residues within the sensor 1 region of γ's AAA+ module. The resultant positive charge on this histidine inhibits ATPase activity by drawing the negatively charged catalytic base away from the active site. It is also proposed that this arrangement is disrupted upon interaction of DNA with basic residues in γ implicated previously in DNA binding, regarding which a lysine that is near the sensor 1 region and that is highly conserved both in bacterial and in eukaryotic clamp loader ATPases appears to play a critical role. γ ATPases also appear to utilize a trans-acting threonine that is donated by helix 6 of an adjacent γ or δ′ subunit and that assists in the activation of a water molecule for nucleophilic attack on the γ phosphorus atom of ATP. As eukaryotic and archaeal clamp loaders lack most of these key residues, it appears that eubacteria utilize a fundamentally different mechanism for clamp loader activation than do these other organisms.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road PO Box 100, Cold Spring Harbor, NY 11724, USA
32
Neuwald AF. Bayesian shadows of molecular mechanisms cast in the light of evolution. Trends Biochem Sci 2006; 31:374-82. [PMID: 16766187 DOI: 10.1016/j.tibs.2006.05.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Revised: 04/06/2006] [Accepted: 05/24/2006] [Indexed: 10/24/2022]
Abstract
A great many carefully designed experiments will be required to fully understand biological mechanisms in atomic detail. A complementary approach is to use powerful statistical procedures to rapidly test numerous scientific hypotheses using vast numbers of protein sequences--the cell's own blueprints for specifying biological mechanisms. Bayesian inference of the evolutionary constraints imposed on functionally divergent proteins can reveal key components of the molecular machinery and thereby suggest likely mechanisms to test experimentally. This approach is demonstrated by considering how DNA polymerase clamp-loader AAA+ ATPases couple DNA recognition to ATP hydrolysis and clamp loading.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, PO Box 100, Cold Spring Harbor, NY 11724, USA.
33
Kannan N, Neuwald AF. Did protein kinase regulatory mechanisms evolve through elaboration of a simple structural component? J Mol Biol 2005; 351:956-72. [PMID: 16051269 DOI: 10.1016/j.jmb.2005.06.057] [Citation(s) in RCA: 125] [Impact Index Per Article: 6.3] [Received: 05/14/2005] [Revised: 06/21/2005] [Accepted: 06/23/2005] [Indexed: 10/25/2022]
Abstract
Statistical analysis of the functional constraints acting on eukaryotic protein kinases (EPKs) and on distantly related kinases suggests that EPK regulatory mechanisms evolved around an ancient structural component whose most distinctive features include the HxD-motif adjoining the catalytic loop, the F-helix, an F-helix aspartate, and the DFG-motif adjoined to the activation loop. The HxD-histidine constitutes a convergence point for signal integration, as conserved interactions link it to key catalytic residues, to the F-helix aspartate, and to both ends of the DFG-motif. These and other conserved features appear to be associated with DFG conformational changes and with coordinated movements possibly associated with phosphate transfer and ADP release. The EPKs have acquired structural features that link this core component to likely substrate-interacting regions at either end of the F-helix (most notably involving an F-helix tryptophan) and to three regions undergoing conformational changes upon kinase activation: the activation segment, the C-helix, and the nucleotide-binding pocket.
Affiliation(s)
- Natarajan Kannan
- Cold Spring Harbor Laboratory, 1 Bungtown Road, P.O. Box 100, Cold Spring Harbor, NY 11724, USA
34
Kronfeld K, Hochleitner E, Mendler S, Goldschmidt J, Lichtenfels R, Lottspeich F, Abken H, Seliger B. B7/CD28 costimulation of T cells induces a distinct proteome pattern. Mol Cell Proteomics 2005; 4:1876-87. [PMID: 16113399 DOI: 10.1074/mcp.m500194-mcp200] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/06/2022] Open
Abstract
Effective immune strategies for the eradication of human tumors require a detailed understanding of the interaction of tumor cells with the immune system, which might lead to an optimization of T cell responses. To understand the impact of B7-mediated costimulation on T cell activation, comprehensive proteome analyses of B7-primed T cell populations were performed. Using this approach we identified different classes of proteins in T cells whose expression is either elevated or reduced upon B7-1- or B7-2-mediated CD28 costimulation. The altered proteins include regulators of the cell cycle and cell proliferation, signal transducers, components of the antigen processing machinery, transporters, cytoskeletal proteins, and metabolic enzymes. A number of differentially expressed proteins are further modified by phosphorylation. Our results provide novel insights into the complexity of the CD28 costimulatory pathway of T cells and will help to identify potential targets of therapeutic interventions for modulating anti-tumor T cell activation.
Affiliation(s)
- Kai Kronfeld
- IIIrd Department of Internal Medicine, Johannes Gutenberg University, 55131 Mainz, Germany
35
Neuwald AF. Evolutionary clues to eukaryotic DNA clamp-loading mechanisms: analysis of the functional constraints imposed on replication factor C AAA+ ATPases. Nucleic Acids Res 2005; 33:3614-28. [PMID: 16082778 PMCID: PMC1160110 DOI: 10.1093/nar/gki674] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
Ring-shaped sliding clamps encircle DNA and bind to DNA polymerase, thereby preventing it from falling off during DNA replication. In eukaryotes, sliding clamps are loaded onto DNA by the replication factor C (RFC) complex, which consists of five distinct subunits (A–E), each of which contains an AAA+ module composed of a RecA-like α/β ATPase domain followed by a helical domain. AAA+ ATPases mediate chaperone-like protein remodeling. Despite remarkable progress in our understanding of clamp loaders, it is still unclear how recognition of primed DNA by RFC triggers ATP hydrolysis and how hydrolysis leads to conformational changes that can load the clamp onto DNA. While these questions can, of course, only be resolved experimentally, the design of such experiments is itself non-trivial and requires that one first formulate the right hypotheses based on preliminary observations. The functional constraints imposed on protein sequences during evolution are potential sources of information in this regard, inasmuch as these presumably are due to and thus reflect underlying mechanisms. Here, rigorous statistical procedures are used to measure and compare the constraints imposed on various RFC clamp-loader subunits, each of which performs a related but somewhat different, specialized function. Visualization of these constraints, within the context of the RFC structure, provides clues regarding clamp-loader mechanisms, suggesting, for example, that RFC-A possesses a triggering component for DNA-dependent ATP hydrolysis. It also suggests that, starting with RFC-A, four RFC subunits (A–D) are sequentially activated through a propagated switching mechanism in which a conserved arginine swings away from a position that disrupts the catalytic Walker B region and into contact with DNA threaded through the center of the RFC/clamp complex. Strong constraints near regions of interaction between subunits and with the clamp likewise provide clues regarding possible coupling of hydrolysis-driven conformational changes to the clamp's release and loading onto DNA.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, PO Box 100, Cold Spring Harbor, NY 11724, USA.
36
Kannan N, Neuwald AF. Evolutionary constraints associated with functional specificity of the CMGC protein kinases MAPK, CDK, GSK, SRPK, DYRK, and CK2alpha. Protein Sci 2005; 13:2059-77. [PMID: 15273306 PMCID: PMC2279817 DOI: 10.1110/ps.04637904] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.7] [Indexed: 10/26/2022]
Abstract
Amino acid residues associated with functional specificity of cyclin-dependent kinases (CDKs), mitogen-activated protein kinases (MAPKs), glycogen synthase kinases (GSKs), and CDK-like kinases (CLKs), which are collectively termed the CMGC group, were identified by categorizing and quantifying the selective constraints acting upon these proteins during evolution. Many constraints specific to CMGC kinases correspond to residues between the N-terminal end of the activation segment and a CMGC-conserved insert segment associated with coprotein binding. The strongest such constraint is imposed on a "CMGC-arginine" near the substrate phosphorylation site with a side chain that plays a role both in substrate recognition and in kinase activation. Two nearby buried waters, which are also present in non-CMGC kinases, typically position the main chain of this arginine relative to the catalytic loop. These and other CMGC-specific features suggest a structural linkage between coprotein binding, substrate recognition, and kinase activation. Constraints specific to individual subfamilies point to mechanisms for CMGC kinase specialization. Within casein kinase 2alpha (CK2alpha), for example, the binding of one of the buried waters appears prohibited by the side chain of a leucine that is highly conserved within CK2alpha and that, along with substitution of lysine for the CMGC-arginine, may contribute to the broad substrate specificity of CK2alpha by relaxing characteristically conserved, precise interactions near the active site. This leucine is replaced by a conserved isoleucine or valine in other CMGC kinases, thereby illustrating the potential functional significance of subtle amino acid substitutions. Analysis of other CMGC kinases similarly suggests candidate family-specific residues for experimental follow-up.
Affiliation(s)
- Natarajan Kannan
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
37
Hoeppner DJ, Spector MS, Ratliff TM, Kinchen JM, Granat S, Lin SC, Bhusri SS, Conradt B, Herman MA, Hengartner MO. eor-1 and eor-2 are required for cell-specific apoptotic death in C. elegans. Dev Biol 2004; 274:125-38. [PMID: 15355793 DOI: 10.1016/j.ydbio.2004.06.022] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Received: 11/26/2003] [Revised: 06/15/2004] [Accepted: 06/18/2004] [Indexed: 11/22/2022]
Abstract
Programmed cell death occurs in every multicellular organism and in diverse cell types yet the genetic controls that define which cells will live and which will die remain poorly understood. During development of the nematode Caenorhabditis elegans, the coordinated activity of four gene products, EGL-1, CED-9, CED-4 and CED-3, results in the death of essentially all cells fated to die. To identify novel upstream components of the cell death pathway, we performed a genetic screen for mutations that abolish the death of the hermaphrodite-specific neurons (HSNs), a homologous pair of cells required for egg-laying in the hermaphrodite. We identified and cloned the genes, eor-1 and eor-2, which are required to specify the fate of cell death in male HSNs. In addition to defects in HSN death, mutation of either gene leads to defects in coordinated movement, neuronal migration, male tail development, and viability; all consistent with abnormal neuronal differentiation. eor-1 encodes a putative transcription factor related to the human oncogene PLZF. eor-2 encodes a novel but conserved protein. We propose that eor-1 and eor-2 function together throughout the nervous system to promote terminal differentiation of neurons and function specifically in male HSNs to promote apoptotic death of the HSNs.
38
Neuwald AF, Liu JS. Gapped alignment of protein sequence motifs through Monte Carlo optimization of a hidden Markov model. BMC Bioinformatics 2004; 5:157. [PMID: 15504234 PMCID: PMC538276 DOI: 10.1186/1471-2105-5-157] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Received: 05/19/2004] [Accepted: 10/25/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Certain protein families are highly conserved across distantly related organisms and belong to large and functionally diverse superfamilies. The patterns of conservation present in these protein sequences presumably are due to selective constraints maintaining important but unknown structural mechanisms with some constraints specific to each family and others shared by a larger subset or by the entire superfamily. To exploit these patterns as a source of functional information, we recently devised a statistically based approach called contrast hierarchical alignment and interaction network (CHAIN) analysis, which infers the strengths of various categories of selective constraints from co-conserved patterns in a multiple alignment. The power of this approach strongly depends on the quality of the multiple alignments, which thus motivated development of theoretical concepts and strategies to improve alignment of conserved motifs within large sets of distantly related sequences.
RESULTS: Here we describe a hidden Markov model (HMM), an algebraic system, and Markov chain Monte Carlo (MCMC) sampling strategies for alignment of multiple sequence motifs. The MCMC sampling strategies are useful both for alignment optimization and for adjusting position specific background amino acid frequencies for alignment uncertainties. Associated statistical formulations provide an objective measure of alignment quality as well as automatic gap penalty optimization. Improved alignments obtained in this way are compared with PSI-BLAST based alignments within the context of CHAIN analysis of three protein families: Gialpha subunits, prolyl oligopeptidases, and transitional endoplasmic reticulum (p97) AAA+ ATPases.
CONCLUSION: While not entirely replacing PSI-BLAST based alignments, which likewise may be optimized for CHAIN analysis using this approach, these motif-based methods often more accurately align very distantly related sequences and thus can provide a better measure of selective constraints. In some instances, these new approaches also provide a better understanding of family-specific constraints, as we illustrate for p97 ATPases. Programs implementing these procedures and supplementary information are available from the authors.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, P.O. Box 100, Cold Spring Harbor, NY 11724, USA
- Jun S Liu
- Department of Statistics, Harvard University, 1 Oxford Street, Cambridge MA, 02138, USA
39
Neuwald AF. Evolutionary clues to DNA polymerase III beta clamp structural mechanisms. Nucleic Acids Res 2003; 31:4503-16. [PMID: 12888511 PMCID: PMC169876 DOI: 10.1093/nar/gkg486] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
The prokaryotic DNA polymerase III beta homodimeric clamp links the replication complex to DNA during polynucleotide synthesis. This clamp is loaded onto DNA and unloaded by the clamp loader complex, the delta subunit of which by itself can bind to and open the clamp. beta Clamps from diverse bacteria were examined using contrast hierarchical alignment and interaction network (CHAIN) analysis, a statistical approach that categorizes and measures the evolutionary constraints imposed on protein sequences by natural selection. Some constraints are subtle inasmuch as they are unique to certain bacteria. Examination of corresponding molecular interactions within structures of the Escherichia coli beta dimeric and delta-beta complexes reveals that N320, Y323 and R176, which are subject to very strong constraints, form a substructure that may serve as a platform for leveraging and directing delta-induced conformational changes. N320 may play a prominent role, as it is strategically situated between this substructure and regions linked to delta binding and opening of beta's dimeric interface. R176 appears to act as a relay between the delta binding site and the clamp's central hole. Other residues subject to strong constraints are likewise associated with structurally important features. For example, two pairs of interacting residues, R269/E304 and K74/E300, form salt bridges at the dimeric interface, while the C-terminal residues M362, P363, M364 and R365 appear to play key roles in delta binding. Q149 and K198 appear to sense DNA within the clamp's central hole while other residues may relay this information to the delta binding site. Mutagenesis experiments designed to explore possible mechanisms are proposed.
Affiliation(s)
- Andrew F Neuwald
- Cold Spring Harbor Laboratory, 1 Bungtown Road, PO Box 100, Cold Spring Harbor, NY 11724, USA.