1
Yeung W, Kwon A, Taujale R, Bunn C, Venkat A, Kannan N. Evolution of functional diversity in the holozoan tyrosine kinome. Mol Biol Evol 2021; 38:5625-5639. [PMID: 34515793 PMCID: PMC8662651 DOI: 10.1093/molbev/msab272] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/16/2022]
Abstract
The emergence of multicellularity is strongly correlated with the expansion of tyrosine kinases, a conserved family of signaling enzymes that regulates pathways essential for cell-to-cell communication. Although tyrosine kinases have been classified from several model organisms, a molecular-level understanding of tyrosine kinase evolution across all holozoans is currently lacking. Using a hierarchical sequence constraint-based classification of diverse holozoan tyrosine kinases, we construct a new phylogenetic tree that identifies two ancient clades of cytoplasmic and receptor tyrosine kinases separated by the presence of an extended insert segment in the kinase domain connecting the D and E-helices. Present in nearly all receptor tyrosine kinases, this fast-evolving insertion imparts diverse functionalities, such as post-translational modification sites and regulatory interactions. Eph and EGFR receptor tyrosine kinases are two exceptions which lack this insert, each forming an independent lineage characterized by unique functional features. We also identify common constraints shared across multiple tyrosine kinase families which warrant the designation of three new subgroups: Src module (SrcM), insulin receptor kinase-like (IRKL), and fibroblast, platelet-derived, vascular, and growth factor receptors (FPVR). Subgroup-specific constraints reflect shared autoinhibitory interactions involved in kinase conformational regulation. Conservation analyses describe how diverse tyrosine kinase signaling functions arose through the addition of family-specific motifs upon subgroup-specific features and coevolving protein domains. We propose the oldest tyrosine kinases, IRKL, SrcM, and Csk, originated from unicellular premetazoans and were coopted for complex multicellular functions. The increased frequency of oncogenic variants in more recent tyrosine kinases suggests that lineage-specific functionalities are selectively altered in human cancers.
Affiliation(s)
- Wayland Yeung
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
- Annie Kwon
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
- Rahil Taujale
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
- Claire Bunn
- Department of Genetics, University of Georgia, Athens, Georgia, USA
- Aarya Venkat
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
- Natarajan Kannan
- Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA; Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA
2
Prieto-Echagüe V, Chan PM, Craddock BP, Manser E, Miller WT. PTB domain-directed substrate targeting in a tyrosine kinase from the unicellular choanoflagellate Monosiga brevicollis. PLoS One 2011; 6:e19296. [PMID: 21541291 PMCID: PMC3082566 DOI: 10.1371/journal.pone.0019296] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/21/2010] [Accepted: 03/28/2011] [Indexed: 11/19/2022]
Abstract
Choanoflagellates are considered to be the closest living unicellular relatives of metazoans. The genome of the choanoflagellate Monosiga brevicollis contains a surprisingly high number and diversity of tyrosine kinases, tyrosine phosphatases, and phosphotyrosine-binding domains. Many of the tyrosine kinases possess combinations of domains that have not been observed in any multicellular organism. The role of these protein interaction domains in M. brevicollis kinase signaling is not clear. Here, we have carried out a biochemical characterization of Monosiga HMTK1, a protein containing a putative PTB domain linked to a tyrosine kinase catalytic domain. We cloned, expressed, and purified HMTK1, and we demonstrated that it possesses tyrosine kinase activity. We used immobilized peptide arrays to define a preferred ligand for the third PTB domain of HMTK1. Peptide sequences containing this ligand sequence are phosphorylated efficiently by recombinant HMTK1, suggesting that the PTB domain of HMTK1 has a role in substrate recognition analogous to the SH2 and SH3 domains of mammalian Src family kinases. We suggest that the substrate recruitment function of the noncatalytic domains of tyrosine kinases arose before their roles in autoinhibition.
Affiliation(s)
- Victoria Prieto-Echagüe
- Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York, United States of America
- Perry M. Chan
- sGSK group, Neuroscience Research Partnership/A*Star, Singapore, Singapore
- Barbara P. Craddock
- Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York, United States of America
- Edward Manser
- sGSK group, Neuroscience Research Partnership/A*Star, Singapore, Singapore
- W. Todd Miller
- Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York, United States of America
3
Michel G, Barbeyron T, Kloareg B, Czjzek M. The family 6 carbohydrate-binding modules have coevolved with their appended catalytic modules toward similar substrate specificity. Glycobiology 2009; 19:615-23. [PMID: 19240276 DOI: 10.1093/glycob/cwp028] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
The survey of carbohydrate active enzymes in genomic data uncovered the modular architecture of most of these proteins. Many of the additional modules associated with catalytic modules tightly bind carbohydrates. The primary role of these carbohydrate-binding modules (CBMs) is to enhance the enzymatic activity of the ensemble by bringing their appended catalytic module(s) in intimate contact with their substrates. Biochemical and biophysical approaches have unraveled the subtle interplay of the modules and the structural basis for their ligand specificities, but little attention has been paid to the evolutionary mechanisms leading to the appearance of modular architecture in carbohydrate active enzymes. Focusing on the promiscuous family CBM6 modules, we investigated the evolution of substrate specificities in parallel to that of their respectively appended catalytic modules. An extensive phylogenetic analysis of family CBM6 modules indicates that these noncatalytic modules have diverged into clades which coincide with their substrate selectivity. These data as well as the remarkable congruence of the phylogenetic trees inferred from CBM6s on the one hand and their associated catalytic modules on the other hand show that CBM6s and their associated glycoside hydrolases have coevolved to acquire the same substrate specificity. We also propose an evolutionary scenario explaining the emergence of the modular agarases, by which existent alpha-agarases acquired their agar-binding CBM6 module through a lateral transfer from pre-existing beta-agarases. Altogether, this observed coevolution between CBM6s and their catalytic modules will facilitate the prediction of the substrate specificity of uncharacterized CBM6 modules present in genomic data.
Affiliation(s)
- Gurvan Michel
- UPMC University Paris 06, CNRS, UMR 7139 Marine Plants and Biomolecules, Station Biologique de Roscoff, Roscoff, Bretagne, France
4
Yadav SS, Miller WT. The evolutionarily conserved arrangement of domains in SRC family kinases is important for substrate recognition. Biochemistry 2008; 47:10871-80. [PMID: 18803405 DOI: 10.1021/bi800930e] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022]
Abstract
The SH3-SH2-kinase domain arrangement in nonreceptor tyrosine kinases has been conserved throughout evolution. For Src family kinases, the relative positions of the domains are important for enzyme regulation; they permit the assembly of Src kinases into autoinhibited conformations. The SH3 and SH2 domains of Src family kinases have an additional role in determining the substrate specificity of the kinase. We addressed the question of whether the domain arrangement of Src family kinases has a role in substrate specificity by producing mutants with alternative arrangements. Our results suggest that changes in the positions of domains can lead to specific changes in the phosphorylation of Sam68 and Cas by Src. Phosphorylation of Cas by several mutants triggers downstream signaling leading to cell migration. The placement of the SH2 domain with respect to the catalytic domain of Src appears to be especially important for proper substrate recognition, while the placement of the SH3 domain is more flexible. The results suggest that the involvement of the SH3 and SH2 domains in substrate recognition is one reason for the strict conservation of the SH3-SH2-kinase architecture.
Affiliation(s)
- Shalini S Yadav
- Department of Physiology and Biophysics, School of Medicine, Stony Brook University, Stony Brook, New York 11794-8661, USA
5
Lappalainen I, Thusberg J, Shen B, Vihinen M. Genome wide analysis of pathogenic SH2 domain mutations. Proteins 2008; 72:779-92. [PMID: 18260110 DOI: 10.1002/prot.21970] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 12/30/2022]
Abstract
The authors have made a genome-wide analysis of mutations in Src homology 2 (SH2) domains associated with human disease. Disease-causing mutations have been detected in the SH2 domains of cytoplasmic signaling proteins Bruton tyrosine kinase (BTK), SH2D1A, Ras GTPase activating protein (RasGAP), ZAP-70, SHP-2, STAT1, STAT5B, and the p85alpha subunit of the PIP3. Mutations in the BTK, SH2D1A, ZAP70, STAT1, and STAT5B genes have been shown to cause diverse immunodeficiencies, whereas the mutations in RASA1 and PIK3R1 genes lead to basal carcinoma and diabetes, respectively. PTPN11 mutations cause Noonan syndrome and different types of cancer, depending mainly on whether the mutation is inherited or sporadic. We collected and analyzed all known pathogenic mutations affecting human SH2 domains by bioinformatics methods. Among the investigated protein properties are sequence conservation and covariance, structural stability, side chain rotamers, packing effects, surface electrostatics, hydrogen bond formation, accessible surface area, salt bridges, and residue contacts. The majority of the mutations affect positions essential for phosphotyrosine ligand binding and specificity. The structural basis of the SH2 domain diseases was elucidated based on the bioinformatic analysis.
Affiliation(s)
- Ilkka Lappalainen
- Department of Biological and Environmental Sciences, Division of Biochemistry, FI-00014 University of Helsinki, Finland
6
Phylogeny of Tec Family Kinases: Identification of a Premetazoan Origin of Btk, Bmx, Itk, Tec, Txk, and the Btk Regulator SH3BP5. Advances in Genetics 2008; 64:51-80. [DOI: 10.1016/s0065-2660(08)00803-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022]
7
Calvo E, Flores-Romero P, López JA, Navas A. Identification of Proteins Expressing Differences among Isolates of Meloidogyne spp. (Nematoda: Meloidogynidae) by Nano-Liquid Chromatography Coupled to Ion-Trap Mass Spectrometry. J Proteome Res 2005; 4:1017-21. [PMID: 15952751 DOI: 10.1021/pr0500298] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Total protein variation (up to ninety-five different positions) was revealed by two-dimensional electrophoresis (2-DE) in 18 isolates from populations of M. arenaria (6 isolates), M. incognita (10), M. javanica (1) plus an unclassified isolate in a previously reported study. Isolates of M. arenaria, M. javanica, Meloidogyne sp., and M. incognita formed two separate groups defined on the basis of two sets of protein positions that could be considered as diagnostic characters, but we could not identify these proteins by MALDI-TOF. To identify these marker positions, nano-liquid chromatography as peptides separation method was coupled to an ion-trap mass spectrometer for induced real-time fragmentation of eluted peptides. Group diagnostic proteins for M. incognita and M. arenaria were in-gel digested and on line analyzed by tandem mass spectrometry (LC-MS/MS). Six proteins out of seven selected spots were unambiguously identified by the analysis of the corresponding MS/MS (MS2) spectrum from parent ions fragmentation: Actin, Enolase, CG3752-PA protein similar to Aldehyde Dehydrogenase, HSP-60 and Translation initiation factor eIF-4A. In M. incognita sample, de novo sequencing experiment of doubly charged ion at m/z=936.9 Da in spot 29 identified as enolase, reveals three residue substitutions (K to T, N to T, and D to E) when tentative sequence was compared with that of Anisakis simplex and Onchocerca volvulus enolase, thus three SNPs (single nucleotide polymorphisms) were also possibly identified.
Affiliation(s)
- E Calvo
- Museo Nacional de Ciencias Naturales, CSIC, José Gutierrez Abascal 2, Madrid 28006, Spain
8
Cetkovic H, Grebenjuk VA, Müller WEG, Gamulin V. Src proteins/src genes: from sponges to mammals. Gene 2004; 342:251-61. [PMID: 15527984 DOI: 10.1016/j.gene.2004.07.044] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 05/03/2004] [Revised: 07/08/2004] [Accepted: 07/23/2004] [Indexed: 11/27/2022]
Abstract
The genome of marine sponge Suberites domuncula, a member of the most ancient and most simple metazoan phylum Porifera, encodes at least five genes for Src-type proteins, more than, e.g., Caenorhabditis elegans or Drosophila melanogaster (two in each). Three proteins, SRC1SD, SRC2SD and SRC3SD, were fully characterized. The overall homology (identity+similarity) among the three S. domuncula Srcs (68-71%) is much lower than the sequence conservation between orthologous Src proteins from freshwater sponges (82-85%). It is therefore very likely that several src genes/proteins were already present in the genome of Urmetazoa, the hypothetical metazoan ancestor. We have identified in the S. domuncula expressed sequence tags (ESTs) database further Src homology 2 (SH2) and 3 (SH3) domains that are unrelated to protein tyrosine kinases (PTKs). Src-related SH2 and SH3 domains from different species are much more conserved than SH2 and SH3 domains from different proteins in the same organism (S. domuncula), supporting the view that the common, ancestral src gene was already a multidomain protein composed of SH3, SH2 and tyrosine kinase (TK) domains. Two S. domuncula src genes were fully sequenced: src1SD gene has six and src2SD gene only one intron in front of SH2 domain, located at the same position in both genes. All vertebrate src genes, from fish to human, originated from the same ancestral gene, because they all have 10 introns at conserved positions. However, src genes in invertebrates have fewer introns that are located at different positions. Only the intron in front of the SH2 domain is present at the absolutely conserved position (and phase) in all known src genes, indicating that at least this intron was already present in the ancestral gene, common to all Metazoa. Our results also suggest that TK domain in this ancestral src was encoded on a single exon.
Affiliation(s)
- Helena Cetkovic
- Department of Molecular Biology, Rudjer Boskovic Institute, Bijenicka cesta 54, 10000 Zagreb, Croatia
9
Gao Q, Hua J, Kimura R, Headd JJ, Fu XY, Chin YE. Identification of the linker-SH2 domain of STAT as the origin of the SH2 domain using two-dimensional structural alignment. Mol Cell Proteomics 2004; 3:704-14. [PMID: 15073273 DOI: 10.1074/mcp.m300131-mcp200] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
Abstract
The availability of large volumes of genomic sequences presents an unprecedented proteomic challenge to characterize the structure and function of various protein motifs. Primary structural alignment is often unable to accurately identify a given motif due to sequence divergence; however, with the aid of secondary structural prediction for analysis, it becomes feasible to explore protein motifs on a proteome-wide scale. Here we report the use of secondary structural alignment to characterize the Src homology 2 (SH2) domains of both conventional and divergent sequences and divide them into two groups, Src-type and STAT-type. In addition to the basic "alphabetabetabetaalpha" structure (betaBeta), the Src-type SH2 domain contains an extra beta-strand (betaE or betaE-betaF motif). Alternatively, the linker domain-conjugated SH2 domain in STAT contains the alphaB' motif. Combining BLAST data from betaBeta core motif sequences with predicted secondary structural alignment, we have screened for SH2 domains in various eukaryotic model systems including Arabidopsis, Dictyostelium, and Saccharomyces. Two novel genes carrying the linker-SH2 domain of STAT were discovered and subsequently cloned from Arabidopsis. These genes, designated as STAT-type linker-SH2 domain factors (STATL), are found in a wide array of vascular and nonvascular plants, suggesting that the linker-SH2 domain evolved prior to the divergence of plants and animals. Using this approach, we expanded the number of putative SH2 domain-bearing genes in Dictyostelium and comparatively studied the secondary structural profiles of both typical and atypical SH2 domains. Our results indicate that the linker-SH2 domain of the transcription factor STAT is one of the most ancient and fully developed functional domains, serving as a template for the continuing evolution of the SH2 domain essential for phosphotyrosine signal transduction.
Affiliation(s)
- Qian Gao
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
10
Abstract
The JAK/STAT pathway plays important roles in vertebrate and invertebrate development. The recent cloning and characterisation of the receptor in Drosophila shows that the pathway is conserved across phyla. In this review we describe current knowledge of the pathway and use genome data to discuss what elements are present in Drosophila. We also summarise recent work describing the involvement of the JAK/STAT pathway in oogenesis and spermatogenesis. Interestingly, the JAK/STAT pathway maintains the niche required for germline stem cell maintenance in the testis, providing the first molecular characterisation of a stem cell niche. Drosophila's streamlined pathway offers a simple model to find new elements and analyse the function of existing ones.
11
Popovici C, Leveugle M, Birnbaum D, Coulier F. Coparalogy: physical and functional clusterings in the human genome. Biochem Biophys Res Commun 2001; 288:362-70. [PMID: 11606051 DOI: 10.1006/bbrc.2001.5794] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.7] [Indexed: 12/28/2022]
Abstract
Two rounds of large-scale duplications are thought to have occurred in early vertebrate ancestry; this is now known as the "2R hypothesis." They have led to the constitution of subfamilies of paralogous genes. Chromosomal regions that contain present-day paralogs (paralogous regions or paralogons) have been identified in mammals. We show that sets of paralogons (PGs) can be assembled in a tentative "human genome paralogy map" that includes all autosomes and X. A total of 14 PGs, containing more than 1600 genes, were assembled in this paralogy map. Genes that belong to the same PG are coparalogs. We show that identification of coparalogy can be used (i) to broaden data on gene mapping, (ii) to identify physical gene clusters that derive from early cis-duplications, and (iii) to speculate on coevolution and coregulation of genes sharing a common structure or function (functional clusters). Thus, coparalogy analyses should parallel phylogenetic analyses and can help draw hypotheses on gene and genome evolution.
Affiliation(s)
- C Popovici
- U119 INSERM, IFR57, Laboratoire d'Oncologie Moléculaire, 27 boulevard Leï Roure, 13009 Marseille, France