51
Chitale M, Hawkins T, Park C, Kihara D. ESG: extended similarity group method for automated protein function prediction. Bioinformatics 2009; 25:1739-45. [PMID: 19435743 DOI: 10.1093/bioinformatics/btp309] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.7] [Indexed: 11/14/2022]
Abstract
MOTIVATION The importance of accurate automatic protein function prediction is ever increasing in the face of a large number of newly sequenced genomes and proteomics data that are awaiting biological interpretation. Conventional methods have focused on high sequence similarity-based annotation transfer, which relies on the concept of homology. However, many cases have been reported in which simple transfer of function from the top hits of a homology search causes erroneous annotation. New methods are required to handle sequence similarity in a more robust way, combining signals from strongly and weakly similar proteins to predict function for unknown proteins with high reliability. RESULTS We present the extended similarity group (ESG) method, which performs iterative sequence database searches and annotates a query sequence with Gene Ontology terms. Each annotation is assigned a probability based on its relative similarity score with the multiple-level neighbors in the protein similarity graph. We show how the statistical framework of ESG improves prediction accuracy by iteratively taking into account the neighborhood of the query protein in the sequence similarity space. ESG outperforms conventional PSI-BLAST and the protein function prediction (PFP) algorithm. The iterative search is effective in capturing multiple domains in a query protein, enabling accurate prediction of several functions that originate from different domains. AVAILABILITY The ESG web server is available for automated protein function prediction at http://dragon.bio.purdue.edu/ESG/.
Affiliation(s)
- Meghana Chitale
- Department of Computer Science, Purdue University, IN 47907, USA
52
Schmitt E, Galimand M, Panvert M, Courvalin P, Mechulam Y. Structural bases for 16 S rRNA methylation catalyzed by ArmA and RmtB methyltransferases. J Mol Biol 2009; 388:570-82. [PMID: 19303884 DOI: 10.1016/j.jmb.2009.03.034] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 01/26/2009] [Revised: 03/04/2009] [Accepted: 03/13/2009] [Indexed: 10/21/2022]
Abstract
Aminoglycosides are used extensively for the treatment of severe infections due to Gram-negative bacteria. However, certain species have become highly resistant after acquisition of genes for methyltransferases which catalyze post-transcriptional methylation of N7-G1405 in 16 S rRNA of 30 S ribosomal subunits. Inactivation of this enzymatic activity is therefore an important challenge for development of an effective therapy. The present work describes the crystallographic structures of methyltransferases RmtB and ArmA from clinical isolates. Together with biochemical experiments, the 3D structures indicate that the N-terminal domain specific for this family of methyltransferases is required for enzymatic activity. Site-directed mutagenesis has enabled important residues for catalysis and RNA binding to be identified. These high-resolution structures should underpin the design of potential inhibitors of these enzymes, which could be used to restore the activity of aminoglycosides against resistant pathogens.
Affiliation(s)
- Emmanuelle Schmitt
- Laboratoire de Biochimie, Ecole Polytechnique, Centre National de la Recherche Scientifique, Palaiseau Cedex, France.
53
Delfosse V, Girard E, Birck C, Delmarcelle M, Delarue M, Poch O, Schultz P, Mayer C. Structure of the archaeal pab87 peptidase reveals a novel self-compartmentalizing protease family. PLoS One 2009; 4:e4712. [PMID: 19266066 PMCID: PMC2651629 DOI: 10.1371/journal.pone.0004712] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 01/03/2009] [Accepted: 01/28/2009] [Indexed: 11/18/2022] Open
Abstract
Self-compartmentalizing proteases orchestrate protein turnover through an original architecture characterized by a central catalytic chamber. Here we report the first structure of an archaeal member of a new self-compartmentalizing protease family forming a cubic-shaped octamer with D4 symmetry and referred to as CubicO. We solved the structure of the Pyrococcus abyssi Pab87 protein at 2.2 Å resolution using the anomalous signal of the high-phasing-power lanthanide derivative Lu-HPDO3A. A 20 Å wide channel runs through this supramolecular assembly of 0.4 MDa, giving access to a 60 Å wide central chamber holding the eight active sites. Surprisingly, activity assays revealed that Pab87 degrades specifically d-amino acid containing peptides, which have never been observed in archaea. Genomic context of the Pab87 gene showed that it is surrounded by genes involved in the amino acid/peptide transport or metabolism. We propose that CubicO proteases are involved in the processing of d-peptides from environmental origins.
Affiliation(s)
- Vanessa Delfosse
- Centre de Recherche des Cordeliers, LRMA, INSERM UMR-S 872, Université Pierre et Marie Curie, Paris, France
54
Aniba MR, Siguenza S, Friedrich A, Plewniak F, Poch O, Marchler-Bauer A, Thompson JD. Knowledge-based expert systems and a proof-of-concept case study for multiple sequence alignment construction and analysis. Brief Bioinform 2009; 10:11-23. [PMID: 18971242 PMCID: PMC2638625 DOI: 10.1093/bib/bbn045] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 07/16/2008] [Revised: 10/02/2008] [Indexed: 11/15/2022] Open
Abstract
The traditional approach to bioinformatics analyses relies on independent task-specific services and applications, using different input and output formats, often idiosyncratic, and frequently not designed to inter-operate. In general, such analyses were performed by experts who manually verified the results obtained at each step in the process. Today, the amount of bioinformatics information continuously being produced means that handling the various applications used to study this information presents a major data management and analysis challenge to researchers. It is now impossible to manually analyse all this information and new approaches are needed that are capable of processing the large-scale heterogeneous data in order to extract the pertinent information. We review the recent use of integrated expert systems aimed at providing more efficient knowledge extraction for bioinformatics research. A general methodology for building knowledge-based expert systems is described, focusing on the unstructured information management architecture, UIMA, which provides facilities for both data and process management. A case study involving a multiple alignment expert system prototype called AlexSys is also presented.
Affiliation(s)
- Mohamed Radhouene Aniba
- Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), F-67400 Illkirch, France
55
Benelli D, Marzi S, Mancone C, Alonzi T, la Teana A, Londei P. Function and ribosomal localization of aIF6, a translational regulator shared by archaea and eukarya. Nucleic Acids Res 2008; 37:256-67. [PMID: 19036786 PMCID: PMC2615626 DOI: 10.1093/nar/gkn959] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 12/17/2022] Open
Abstract
The translation factor IF6 is shared by the Archaea and the Eukarya, but is not found in Bacteria. The properties of eukaryal IF6 (eIF6) have been extensively studied, but remain somewhat elusive. eIF6 behaves as a ribosome-anti-association factor and is involved in miRNA-mediated gene silencing; however, it also seems to participate in ribosome synthesis and export. Here we have determined the function and ribosomal localization of the archaeal (Sulfolobus solfataricus) IF6 homologue (aIF6). We find that aIF6 binds specifically to the 50S ribosomal subunits, hindering the formation of 70S ribosomes and strongly inhibiting translation. aIF6 is uniformly expressed along the cell cycle, but it is upregulated following both cold- and heat shock. The aIF6 ribosomal binding site lies in the middle of the 30-S interacting surface of the 50S subunit, including a number of critical RNA and protein determinants involved in subunit association. The data suggest that the IF6 protein evolved in the archaeal–eukaryal lineage to modulate translational efficiency under unfavourable environmental conditions, perhaps acquiring additional functions during eukaryotic evolution.
Affiliation(s)
- Dario Benelli
- Dipartimento Biotecnologie Cellulari ed Ematologia, Policlinico Umberto I, Università di Roma Sapienza, Roma, Italy
56
Pompidor G, Maillard AP, Girard E, Gambarelli S, Kahn R, Covès J. X-ray structure of the metal-sensor CnrX in both the apo- and copper-bound forms. FEBS Lett 2008; 582:3954-8. [PMID: 18992246 DOI: 10.1016/j.febslet.2008.10.042] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 10/02/2008] [Revised: 10/17/2008] [Accepted: 10/24/2008] [Indexed: 10/21/2022]
57
Gallien S, Perrodou E, Carapito C, Deshayes C, Reyrat JM, Van Dorsselaer A, Poch O, Schaeffer C, Lecompte O. Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol. Genome Res 2008; 19:128-35. [PMID: 18955433 DOI: 10.1101/gr.081901.108] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Indexed: 11/25/2022]
Abstract
The progress in sequencing technologies irrigates biology with an ever-increasing number of genome sequences. In most cases, the gene repertoire is predicted in silico and conceptually translated into proteins. As recently highlighted, the predicted genes exhibit frequent errors, particularly in start codons, with a serious impact on subsequent biological studies. A new "ortho-proteogenomic" approach is presented here for the annotation refinement of multiple genomes at once. It combines comparative genomics with an original proteomic protocol that allows the characterization of both N-terminal and internal peptides in a single experiment. This strategy was applied to the Mycobacterium genus with Mycobacterium smegmatis as the reference, and identified 946 distinct proteins, including 443 characterized N termini. These experimental data allowed the correction of 19% of the characterized start codons, the identification of 29 proteins missed during the annotation process, and the curation, thanks to comparative genomics, of 4328 sequences of 16 other Mycobacterium proteomes.
Affiliation(s)
- Sébastien Gallien
- Laboratoire de Spectrométrie de Masse Bio-Organique, IPHC-DSA, ULP, CNRS, UMR7178, 67 087 Strasbourg, France.
58
Lecompte O, Poch O, Laporte J. PtdIns5P regulation through evolution: roles in membrane trafficking? Trends Biochem Sci 2008; 33:453-60. [PMID: 18774718 DOI: 10.1016/j.tibs.2008.07.002] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Received: 05/08/2008] [Revised: 07/01/2008] [Accepted: 07/02/2008] [Indexed: 01/27/2023]
Abstract
Phosphoinositides are lipid second messengers that are essential for many cellular processes, including signal transduction and cell compartmentalization. Among them, phosphatidylinositol 5-phosphate (PtdIns5P) is the least characterized, although several proteins involved in its regulation are implicated in human diseases. We studied the distribution of 32 PtdIns5P-metabolizing proteins in 39 eukaryotic genomes. Phylogenetic profiles identify four groups of co-evolving proteins, confirming known protein complexes and revealing new ones. The complexes comprise a phosphatase, a kinase and a regulator; this indicates that physical interactions between the three partners are necessary for the acute spatial regulation of PtdIns5P turnover. By examining PtdIns5P metabolism in this new perspective, we propose a role for PtdIns5P in membrane trafficking from late endosomal compartments to the plasma membrane.
Affiliation(s)
- Odile Lecompte
- Department of Structural Biology and Genomics, rue Laurent Fries, Illkirch, F-67400 France
59
Campagnoli MF, Ramenghi U, Armiraglio M, Quarello P, Garelli E, Carando A, Avondo F, Pavesi E, Fribourg S, Gleizes PE, Loreni F, Dianzani I. RPS19 mutations in patients with Diamond-Blackfan anemia. Hum Mutat 2008; 29:911-20. [PMID: 18412286 DOI: 10.1002/humu.20752] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.9] [Indexed: 11/09/2022]
Abstract
Diamond-Blackfan anemia (DBA) is an inherited disease characterized by pure erythroid aplasia. Thirty percent (30%) of patients display malformations, especially of the hands, face, heart, and urogenital tract. DBA has an autosomal dominant pattern of inheritance. De novo mutations are common and familial cases display wide clinical heterogeneity. Twenty-five percent (25%) of patients carry a mutation in the ribosomal protein (RP) S19 gene, whereas mutations in RPS24, RPS17, RPL35A, RPL11, and RPL5 are rare. These genes encode for structural proteins of the ribosome. A link between ribosomal functions and erythroid aplasia is apparent in DBA, but its etiology is not clear. Most authors agree that a defect in protein synthesis in a rapidly proliferating tissue, such as the erythroid bone marrow, may explain the defective erythropoiesis. A total of 77 RPS19 mutations have been described. Most are whole gene deletions, translocations, or truncating mutations (nonsense or frameshift), suggesting that haploinsufficiency is the basis of DBA pathology. A total of 22 missense mutations have also been described and several works have provided in vitro functional data for the mutant proteins. This review looks at the data on all these mutations, proposes a functional classification, and describes six new mutations. It is shown that patients with RPS19 mutations display a poorer response to steroids and a worse long-term prognosis compared to other DBA patients.
60
Mathieu-Daudé F, Lafay B, Touzet O, Lelièvre J, Parrado F, Bosseno MF, Rojas AM, Fatha S, Ouaissi A, Brenière SF. Exploring the FL-160-CRP gene family through sequence variability of the complement regulatory protein (CRP) expressed by the trypomastigote stage of Trypanosoma cruzi. Infect Genet Evol 2008; 8:258-66. [DOI: 10.1016/j.meegid.2007.12.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 09/13/2007] [Revised: 12/14/2007] [Accepted: 12/17/2007] [Indexed: 11/25/2022]
61
Kamesh N, Aradhyam GK, Manoj N. The repertoire of G protein-coupled receptors in the sea squirt Ciona intestinalis. BMC Evol Biol 2008; 8:129. [PMID: 18452600 PMCID: PMC2396169 DOI: 10.1186/1471-2148-8-129] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.8] [Received: 12/08/2007] [Accepted: 05/01/2008] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND G protein-coupled receptors (GPCRs) constitute a large family of integral transmembrane receptor proteins that play a central role in signal transduction in eukaryotes. The genome of the protochordate Ciona intestinalis has a compact size with an ancestral complement of many diversified gene families of vertebrates and is a good model system for studying protochordate to vertebrate diversification. An analysis of the Ciona repertoire of GPCRs from a comparative genomic perspective provides insight into the evolutionary origins of the GPCR signalling system in vertebrates. RESULTS We have identified 169 gene products in the Ciona genome that code for putative GPCRs. Phylogenetic analyses reveal that Ciona GPCRs have homologous representatives from the five major GRAFS (Glutamate, Rhodopsin, Adhesion, Frizzled and Secretin) families concomitant with other vertebrate GPCR repertoires. Nearly 39% of Ciona GPCRs have unambiguous orthologs of vertebrate GPCR families, as defined for the human, mouse, puffer fish and chicken genomes. The Rhodopsin family accounts for ~68% of the Ciona GPCR repertoire wherein the LGR-like subfamily exhibits a lineage specific gene expansion of a group of receptors that possess a novel domain organisation hitherto unobserved in metazoan genomes. CONCLUSION Comparison of GPCRs in Ciona to that in human reveals a high level of orthology of a protochordate repertoire with that of vertebrate GPCRs. Our studies suggest that the ascidians contain the basic ancestral complement of vertebrate GPCR genes. This is evident at the subfamily level comparisons since Ciona GPCR sequences are significantly analogous to vertebrate GPCR subfamilies even while exhibiting Ciona specific genes. Our analysis provides a framework to perform future experimental and comparative studies to understand the roles of the ancestral chordate versions of GPCRs that predated the divergence of the urochordates and the vertebrates.
Affiliation(s)
- N Kamesh
- Department of Biotechnology, Bhupat and Jyothi Mehta School of Biosciences Building, Indian Institute of Technology Madras, Chennai 600036, India.
62
Perrodou E, Chica C, Poch O, Gibson TJ, Thompson JD. A new protein linear motif benchmark for multiple sequence alignment software. BMC Bioinformatics 2008; 9:213. [PMID: 18439277 PMCID: PMC2374782 DOI: 10.1186/1471-2105-9-213] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 11/29/2007] [Accepted: 04/25/2008] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Linear motifs (LMs) are abundant short regulatory sites used for modulating the functions of many eukaryotic proteins. They play important roles in post-translational modification, cell compartment targeting, docking sites for regulatory complex assembly and protein processing and cleavage. Methods for LM detection are now being developed that are strongly dependent on scores for motif conservation in homologous proteins. However, most LMs are found in natively disordered polypeptide segments that evolve rapidly, unhindered by structural constraints on the sequence. These regions of modular proteins are difficult to align using classical multiple sequence alignment programs that are specifically optimised to align the globular domains. As a consequence, poor motif alignment quality is hindering efforts to detect new LMs. RESULTS We have developed a new benchmark, as part of the BAliBASE suite, designed to assess the ability of standard multiple alignment methods to detect and align LMs. The reference alignments are organised into different test sets representing real alignment problems and contain examples of experimentally verified functional motifs, extracted from the Eukaryotic Linear Motif (ELM) database. The benchmark has been used to evaluate and compare a number of multiple alignment programs. With distantly related proteins, the worst alignment program correctly aligns 48% of LMs compared to 73% for the best program. However, the performance of all the programs is adversely affected by the introduction of other sequences containing false positive motifs. The ranking of the alignment programs based on LM alignment quality is similar to that observed when considering full-length protein alignments, however little correlation was observed between LM and overall alignment quality for individual alignment test cases. 
CONCLUSION We have shown that none of the programs currently available is capable of reliably aligning LMs in distantly related sequences and we have highlighted a number of specific problems. The results of the tests suggest possible ways to improve program accuracy for difficult, divergent sequences.
Affiliation(s)
- Emmanuel Perrodou
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Department of Structural Biology and Genomics, F-67400 Illkirch, France.
63
Mutagenesis in the alpha3alpha4 GyrA helix and in the Toprim domain of GyrB refines the contribution of Mycobacterium tuberculosis DNA gyrase to intrinsic resistance to quinolones. Antimicrob Agents Chemother 2008; 52:2909-14. [PMID: 18426901 DOI: 10.1128/aac.01380-07] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022] Open
Abstract
The replacement of M74 in GyrA, A83 in GyrA, and R447 in GyrB of Mycobacterium tuberculosis gyrase by their Escherichia coli homologs resulted in active enzymes as quinolone susceptible as the E. coli gyrase. This demonstrates that the primary structure of gyrase determines intrinsic quinolone resistance and was supported by a three-dimensional model of N-terminal GyrA.
64
Lagier-Tourenne C, Tazir M, López LC, Quinzii CM, Assoum M, Drouot N, Busso C, Makri S, Ali-Pacha L, Benhassine T, Anheim M, Lynch DR, Thibault C, Plewniak F, Bianchetti L, Tranchant C, Poch O, DiMauro S, Mandel JL, Barros MH, Hirano M, Koenig M. ADCK3, an ancestral kinase, is mutated in a form of recessive ataxia associated with coenzyme Q10 deficiency. Am J Hum Genet 2008; 82:661-72. [PMID: 18319074 DOI: 10.1016/j.ajhg.2007.12.024] [Citation(s) in RCA: 223] [Impact Index Per Article: 13.9] [Received: 10/19/2007] [Revised: 12/15/2007] [Accepted: 12/28/2007] [Indexed: 01/17/2023] Open
Abstract
Muscle coenzyme Q(10) (CoQ(10) or ubiquinone) deficiency has been identified in more than 20 patients with presumed autosomal-recessive ataxia. However, mutations in genes required for CoQ(10) biosynthetic pathway have been identified only in patients with infantile-onset multisystemic diseases or isolated nephropathy. Our SNP-based genome-wide scan in a large consanguineous family revealed a locus for autosomal-recessive ataxia at chromosome 1q41. The causative mutation is a homozygous splice-site mutation in the aarF-domain-containing kinase 3 gene (ADCK3). Five additional mutations in ADCK3 were found in three patients with sporadic ataxia, including one known to have CoQ(10) deficiency in muscle. All of the patients have childhood-onset cerebellar ataxia with slow progression, and three of six have mildly elevated lactate levels. ADCK3 is a mitochondrial protein homologous to the yeast COQ8 and the bacterial UbiB proteins, which are required for CoQ biosynthesis. Three out of four patients tested showed a low endogenous pool of CoQ(10) in their fibroblasts or lymphoblasts, and two out of three patients showed impaired ubiquinone synthesis, strongly suggesting that ADCK3 is also involved in CoQ(10) biosynthesis. The deleterious nature of the three identified missense changes was confirmed by the introduction of them at the corresponding positions of the yeast COQ8 gene. Finally, a phylogenetic analysis shows that ADCK3 belongs to the family of atypical kinases, which includes phosphoinositide and choline kinases, suggesting that ADCK3 plays an indirect regulatory role in ubiquinone biosynthesis possibly as part of a feedback loop that regulates ATP production.
65
Fuellen G. Homology and phylogeny and their automated inference. Naturwissenschaften 2008; 95:469-81. [PMID: 18288471 DOI: 10.1007/s00114-008-0348-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/13/2007] [Revised: 12/20/2007] [Accepted: 01/12/2008] [Indexed: 11/25/2022]
Abstract
The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this "historical" approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of 'my closest relative looks and behaves like I do', often referred to as 'guilt by association'. To enable knowledge transfer on a large scale, several automated 'phylogenomics pipelines' have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
Affiliation(s)
- Georg Fuellen
- Bioinformatics Research Group, Institute for Mathematics and Computer Science, Ernst-Moritz-Arndt-University Greifswald, Greifswald, Germany.
66
Kuntz S, Kieffer E, Bianchetti L, Lamoureux N, Fuhrmann G, Viville S. Tex19, a mammalian-specific protein with a restricted expression in pluripotent stem cells and germ line. Stem Cells 2007; 26:734-44. [PMID: 18096721 DOI: 10.1634/stemcells.2007-0772] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 01/02/2023]
Abstract
Although the properties of embryonic stem (ES) cells make these cells very attractive in the field of replacement therapy, the molecular mechanisms involved in the maintenance of their pluripotency are not fully characterized. Starting from the observation that most pluripotent markers are also expressed by spermatogonia stem cells, we identified Tex19 as a new potential pluripotency marker. We show that Tex19 is a mammalian-specific protein duplicated in mouse and rat, renamed Tex19.1 and Tex19.2, whereas only one form is found in human. In mouse, both forms are localized on chromosome 11 and transcribed in opposite directions. Tex19 proteins are well conserved, showing two highly conserved domains that do not present any similarity with any other known domains. We show that Tex19.2 is specifically detected in the male somatic gonad lineage, whereas Tex19.1 expression is very similar to that of Oct4. Transcripts are maternally inherited, and expression starts as soon as the early embryo and later is limited to the germ line. Tex19.1 transcripts were also detected in mouse pluripotent stem cells, and expression of Tex19.1, like that of Oct4, decreases after murine embryonic stem and germ cell differentiation. Human TEX19 was more closely related to murine Tex19.1 and was also detected in adult testis and in undifferentiated ES cells. By immunofluorescence, we found that Tex19.1 protein localizes to the nucleus of mouse ES and inner cell mass cells. All these results suggest that Tex19.1, as well as human TEX19, could be a new factor involved in the maintenance of self-renewal or pluripotency of stem cells.
Affiliation(s)
- Sandra Kuntz
- IGBMC, Department of Developmental Biology, 1 Rue Laurent Fries, Illkirch, F-67400 France
67
Marzi S, Myasnikov AG, Serganov A, Ehresmann C, Romby P, Yusupov M, Klaholz BP. Structured mRNAs regulate translation initiation by binding to the platform of the ribosome. Cell 2007; 130:1019-31. [PMID: 17889647 DOI: 10.1016/j.cell.2007.07.008] [Citation(s) in RCA: 105] [Impact Index Per Article: 6.2] [Received: 01/26/2007] [Revised: 05/18/2007] [Accepted: 07/06/2007] [Indexed: 01/04/2023]
Abstract
Gene expression can be regulated at the level of initiation of protein biosynthesis via structural elements present at the 5' untranslated region of mRNAs. These folded mRNA segments may bind to the ribosome, thus blocking translation until the mRNA unfolds. Here, we report a series of cryo-electron microscopy snapshots of ribosomal complexes directly visualizing either the mRNA structure blocked by repressor protein S15 or the unfolded, active mRNA. In the stalled state, the folded mRNA prevents the start codon from reaching the peptidyl-tRNA (P) site inside the ribosome. Upon repressor release, the mRNA unfolds and moves into the mRNA channel allowing translation initiation. A comparative structure and sequence analysis suggests the existence of a universal stand-by site on the ribosome (the 30S platform) dedicated for binding regulatory 5' mRNA elements. Different types of mRNA structures may be accommodated during translation preinitiation and regulate gene expression by transiently stalling the ribosome.
MESH Headings
- 5' Untranslated Regions
- Amino Acid Sequence
- Base Sequence
- Binding Sites
- Cryoelectron Microscopy
- Escherichia coli/genetics
- Escherichia coli/metabolism
- Escherichia coli Proteins/chemistry
- Escherichia coli Proteins/genetics
- Escherichia coli Proteins/metabolism
- Gene Expression Regulation, Bacterial
- Models, Molecular
- Molecular Sequence Data
- Mutation
- Nucleic Acid Conformation
- Peptide Chain Initiation, Translational
- Protein Binding
- Protein Conformation
- RNA, Bacterial/chemistry
- RNA, Bacterial/metabolism
- RNA, Messenger/metabolism
- RNA, Transfer/metabolism
- Regulatory Sequences, Ribonucleic Acid
- Ribosomal Proteins/chemistry
- Ribosomal Proteins/genetics
- Ribosomal Proteins/metabolism
- Ribosomes/chemistry
- Ribosomes/metabolism
- Ribosomes/ultrastructure
- Sequence Homology, Amino Acid
- Sequence Homology, Nucleic Acid
- Structural Homology, Protein
- Time Factors
Affiliation(s)
- Stefano Marzi
- IGBMC (Institute of Genetics and of Molecular and Cellular Biology), Department of Structural Biology and Genomics, Illkirch, F-67404 France
|
68
|
Chalmel F, Léveillard T, Jaillard C, Lardenois A, Berdugo N, Morel E, Koehl P, Lambrou G, Holmgren A, Sahel JA, Poch O. Rod-derived Cone Viability Factor-2 is a novel bifunctional-thioredoxin-like protein with therapeutic potential. BMC Mol Biol 2007; 8:74. [PMID: 17764561 PMCID: PMC2064930 DOI: 10.1186/1471-2199-8-74] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Received: 01/08/2007] [Accepted: 08/31/2007] [Indexed: 11/10/2022] Open
Abstract
Background Cone degeneration is the hallmark of the inherited retinal disease retinitis pigmentosa. We have previously identified a trophic factor, "Rod-derived Cone Viability Factor" (RdCVF), that is secreted by rods and promotes cone viability in a mouse model of the disease. Results Here we report the bioinformatic identification and the experimental analysis of RdCVF2, a second trophic factor belonging to the Rod-derived Cone Viability Factor family. The mouse RdCVF gene is known to be bifunctional, encoding both a long thioredoxin-like isoform (RdCVF-L) and a short isoform with trophic cone photoreceptor viability activity (RdCVF-S). RdCVF2 shares many similarities with RdCVF in terms of gene structure, expression in a rod-dependent manner and protein 3D structure. Furthermore, like RdCVF, the RdCVF2 short isoform exhibits cone rescue activity that is independent of its putative thiol-oxidoreductase activity. Conclusion Taken together, these findings define a new family of bifunctional genes which are expressed in the vertebrate retina, encode trophic cone viability factors, and have major therapeutic potential for human retinal neurodegenerative diseases such as retinitis pigmentosa.
Affiliation(s)
- Frédéric Chalmel
- Divisions of Bioinformatics and Biochemistry, Swiss Institute of Bioinformatics, University of Basel, CH-4056 Basel, Switzerland
- Thierry Léveillard
- Laboratoire de Physiopathologie Cellulaire et Moléculaire de la Rétine, Inserm U592, Université Pierre et Marie Curie, 75571 Paris, France
- Céline Jaillard
- Laboratoire de Physiopathologie Cellulaire et Moléculaire de la Rétine, Inserm U592, Université Pierre et Marie Curie, 75571 Paris, France
- Aurélie Lardenois
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, 67404 Illkirch cedex, France
- Naomi Berdugo
- Laboratoire de Physiopathologie Cellulaire et Moléculaire de la Rétine, Inserm U592, Université Pierre et Marie Curie, 75571 Paris, France
- Fovea-Pharmaceuticals – 12 rue Jean Antoine Le Baif – 75013 Paris
- Emmanuelle Morel
- Laboratoire de Physiopathologie Cellulaire et Moléculaire de la Rétine, Inserm U592, Université Pierre et Marie Curie, 75571 Paris, France
- Patrice Koehl
- Department of Computer Science, Genome Center, University of California, Davis, CA 95616, USA
- George Lambrou
- Novartis Institutes for Biomedical Research, Basel 4002, Switzerland
- Arne Holmgren
- Medical Nobel Institute for Biochemistry, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- José A Sahel
- Laboratoire de Physiopathologie Cellulaire et Moléculaire de la Rétine, Inserm U592, Université Pierre et Marie Curie, 75571 Paris, France
- Institute of Ophthalmology, University College of London, UK
- Olivier Poch
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, 67404 Illkirch cedex, France
|
69
|
Gregory LA, Aguissa-Touré AH, Pinaud N, Legrand P, Gleizes PE, Fribourg S. Molecular basis of Diamond-Blackfan anemia: structure and function analysis of RPS19. Nucleic Acids Res 2007; 35:5913-21. [PMID: 17726054 PMCID: PMC2034476 DOI: 10.1093/nar/gkm626] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Indexed: 12/04/2022] Open
Abstract
Diamond–Blackfan anemia (DBA) is a rare congenital disease linked to mutations in the ribosomal protein genes rps19, rps24 and rps17. It belongs to the emerging class of ribosomal disorders. To understand the impact of DBA mutations on RPS19 function, we have solved the crystal structure of RPS19 from Pyrococcus abyssi. The protein forms a five α-helix bundle organized around a central amphipathic α-helix, which corresponds to the DBA mutation hot spot. From the structure, we classify DBA mutations relative to their respective impact on protein folding (class I) or on surface properties (class II). Class II mutations cluster into two conserved basic patches. In vivo analysis in yeast demonstrates an essential role for class II residues in the incorporation into pre-40S ribosomal particles. These data indicate that missense mutations in DBA primarily affect the capacity of the protein to be incorporated into pre-ribosomes, thus blocking maturation of the pre-40S particles.
Affiliation(s)
- Lynn A. Gregory
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- Almass-Houd Aguissa-Touré
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- Noël Pinaud
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- Pierre Legrand
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- Pierre-Emmanuel Gleizes
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- *To whom correspondence should be addressed. 00 33 5 40 00 30 63; 00 33 5 40 00 30 68. Correspondence may also be addressed to Pierre-Emmanuel Gleizes. Tel/Fax: 00 33 5 61 33 59 26/58 86
- Sébastien Fribourg
- INSERM U869, Institut Européen de Chimie et Biologie, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, F-33076, Laboratoire de Biologie Moléculaire des eucaryotes (UMR5099) and Institut d’Exploration Fonctionnelle des Génomes (IFR109), CNRS and Université Paul Sabatier, 118 route de Narbonne F-31062 Toulouse and Synchrotron SOLEIL L’Orme des Merisiers, Saint Aubin- BP48, 91192 Gif sur Yvette Cedex, France
- *To whom correspondence should be addressed. 00 33 5 40 00 30 63; 00 33 5 40 00 30 68. Correspondence may also be addressed to Pierre-Emmanuel Gleizes. Tel/Fax: 00 33 5 61 33 59 26/58 86
|
70
|
Brown DP, Krishnamurthy N, Sjölander K. Automated protein subfamily identification and classification. PLoS Comput Biol 2007; 3:e160. [PMID: 17708678 PMCID: PMC1950344 DOI: 10.1371/journal.pcbi.0030160] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.6] [Received: 02/17/2006] [Accepted: 06/25/2007] [Indexed: 11/22/2022] Open
Abstract
Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at http://phylogenomics.berkeley.edu/SCI-PHY/ allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. 
Biologists wishing to provide their own subfamily definitions can do so. Source code is available on the Web page. The Berkeley Phylogenomics Group PhyloFacts resource contains pre-calculated subfamily predictions and subfamily HMMs for more than 40,000 protein families and domains at http://phylogenomics.berkeley.edu/phylofacts/.
Affiliation(s)
- Duncan P Brown
- Department of Bioengineering, University of California, Berkeley, California, United States of America
- Nandini Krishnamurthy
- Department of Bioengineering, University of California, Berkeley, California, United States of America
- Kimmen Sjölander
- Department of Bioengineering, University of California, Berkeley, California, United States of America
|
71
|
Legrand P, Pinaud N, Minvielle-Sébastia L, Fribourg S. The structure of the CstF-77 homodimer provides insights into CstF assembly. Nucleic Acids Res 2007; 35:4515-22. [PMID: 17584787 PMCID: PMC1935011 DOI: 10.1093/nar/gkm458] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Indexed: 01/10/2023] Open
Abstract
The cleavage stimulation factor (CstF) is essential for the first step of poly(A) tail formation at the 3' ends of mRNAs. This heterotrimeric complex is built around the 77-kDa protein bridging both CstF-64 and CstF-50 subunits. We have solved the crystal structure of the 77-kDa protein from Encephalitozoon cuniculi at a resolution of 2 Å. The structure folds around 11 Half-a-TPR repeats defining two domains. The crystal structure reveals a tight homodimer exposing phylogenetically conserved areas for interaction with protein partners. Mapping experiments identify the C-terminal region of Rna14p, the yeast counterpart of CstF-77, as the docking domain for Rna15p, the yeast CstF-64 homologue.
Affiliation(s)
- Pierre Legrand
- Institut Européen de Chimie et Biologie, INSERM U869, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, 146 rue Léo Saignat, F-33076, Synchrotron SOLEIL, L’Orme des Merisiers, Saint-Aubin, B.P. 48, 91192 Gif-sur-Yvette Cedex, and Institut de Biochimie et Génétique Cellulaires, CNRS UMR 5095, 1 rue Camille Saint-Saëns, F-33077 Bordeaux cedex
- Noël Pinaud
- Institut Européen de Chimie et Biologie, INSERM U869, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, 146 rue Léo Saignat, F-33076, Synchrotron SOLEIL, L’Orme des Merisiers, Saint-Aubin, B.P. 48, 91192 Gif-sur-Yvette Cedex, and Institut de Biochimie et Génétique Cellulaires, CNRS UMR 5095, 1 rue Camille Saint-Saëns, F-33077 Bordeaux cedex
- Lionel Minvielle-Sébastia
- Institut Européen de Chimie et Biologie, INSERM U869, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, 146 rue Léo Saignat, F-33076, Synchrotron SOLEIL, L’Orme des Merisiers, Saint-Aubin, B.P. 48, 91192 Gif-sur-Yvette Cedex, and Institut de Biochimie et Génétique Cellulaires, CNRS UMR 5095, 1 rue Camille Saint-Saëns, F-33077 Bordeaux cedex
- Sébastien Fribourg
- Institut Européen de Chimie et Biologie, INSERM U869, 2 rue Robert Escarpit Pessac, F-33607, Université Victor Segalen, Bordeaux 2, 146 rue Léo Saignat, F-33076, Synchrotron SOLEIL, L’Orme des Merisiers, Saint-Aubin, B.P. 48, 91192 Gif-sur-Yvette Cedex, and Institut de Biochimie et Génétique Cellulaires, CNRS UMR 5095, 1 rue Camille Saint-Saëns, F-33077 Bordeaux cedex
- *To whom correspondence should be addressed. 00 33 (0)5 40 00 30 63; 00 33 (0)5 40 00 30 68
|
72
|
Vuilleumier R, Boeuf G, Fuentes M, Gehring WJ, Falcón J. Cloning and early expression pattern of two melatonin biosynthesis enzymes in the turbot (Scophthalmus maximus). Eur J Neurosci 2007; 25:3047-57. [PMID: 17561818 DOI: 10.1111/j.1460-9568.2007.05578.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
Melatonin biosynthesis from serotonin involves the sequential activation of the arylalkylamine N-acetyltransferase (AANAT) and hydroxyindole-O-methyltransferase (HIOMT). Photoperiod synchronizes a daily rhythm in pineal and retinal melatonin secretion through controlling AANAT activity. Teleost fish possess two Aanat genes, one expressed in the retina (AANAT1) and the other expressed in the pineal gland (AANAT2). We report here the full-length cloning of Aanat1, Aanat2, SmHiomt and Otx5 (orthodenticle homeobox homolog 5) in the turbot (Scophthalmus maximus, Sm), a flatfish belonging to an evolutionarily recent group of teleosts. The temporal expression pattern of the genes investigated is consistent with the idea that OTX5 is needed for photoreceptor specification, and that the pineal gland differentiates before the retina. SmAanat2 expression remained pineal specific during the period of time investigated, whereas SmOtx5 and SmHiomt expressions were seen in both the retina and pineal gland. Our results do not support the existence of a second SmHiomt, as is the case for SmAanat. Neither SmAanat2 nor SmHiomt mRNAs displayed cyclic accumulation in the pineal organ of embryos and larvae maintained under a light-dark cycle from fertilization onward. This is in marked contrast with the situation observed with zebrafish Aanat2, indicating that the molecular mechanisms controlling the development of the pineal melatonin system have been modified during the evolution of teleosts.
Affiliation(s)
- Robin Vuilleumier
- Biozentrum, University of Basel, Cell and Developmental Biology, Basel, Switzerland
|
73
|
Ranjith-Kumar CT, Miller W, Sun J, Xiong J, Santos J, Yarbrough I, Lamb RJ, Mills J, Duffy KE, Hoose S, Cunningham M, Holzenburg A, Mbow ML, Sarisky RT, Kao CC. Effects of single nucleotide polymorphisms on Toll-like receptor 3 activity and expression in cultured cells. J Biol Chem 2007; 282:17696-705. [PMID: 17434873 DOI: 10.1074/jbc.m700209200] [Citation(s) in RCA: 110] [Impact Index Per Article: 6.5] [Indexed: 12/26/2022] Open
Abstract
Recognition of double-stranded RNA by Toll-like receptor 3 (TLR3) increases the production of cytokines and chemokines through transcriptional activation by the NF-kappaB protein. Over 136 single-nucleotide polymorphisms (SNPs) in TLR3 have been identified in the human population. Of these, four alter the sequence of the TLR3 protein. Molecular modeling suggests that two of the SNPs, N284I and L412F, could affect the packing of the leucine-rich repeating units in TLR3. Notably, L412F is reported to be present in 20% of the population and is higher in the asthmatic population. To examine whether the four SNPs affect TLR3 function, each was cloned and tested for its ability to activate the expression of TLR3-dependent reporter constructs. SNP N284I was nearly completely defective for activating reporter activity, and L412F was reduced in activity. These two SNPs did not obviously affect the level of TLR3 expression or their intracellular location in vesicles. However, N284I and L412F were underrepresented on the cell surface, as determined by flow cytometry analysis, and were not efficiently secreted into the culture medium when expressed as the soluble ectodomain. They were also reduced in their ability to act in a dominant negative fashion on the wild type TLR3 allele. These observations suggest that N284I and L412F affect the activities of TLR3 needed for proper signaling.
Affiliation(s)
- C T Ranjith-Kumar
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
|
74
|
Blast sampling for structural and functional analyses. BMC Bioinformatics 2007; 8:62. [PMID: 17319945 PMCID: PMC1819393 DOI: 10.1186/1471-2105-8-62] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 08/08/2006] [Accepted: 02/23/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The post-genomic era is characterised by a torrent of biological information flooding the public databases. As a direct consequence, similarity searches starting with a single query sequence frequently lead to the identification of hundreds, or even thousands of potential homologues. The huge volume of data renders the subsequent structural, functional and evolutionary analyses very difficult. It is therefore essential to develop new strategies for efficient sampling of this large sequence space, in order to reduce the number of sequences to be processed. At the same time, it is important to retain the most pertinent sequences for structural and functional studies. RESULTS An exhaustive analysis on a large scale test set (284 protein families) was performed to compare the efficiency of four different sampling methods aimed at selecting the most pertinent sequences. These four methods sample the proteins detected by BlastP searches and can be divided into two categories: two customisable methods where the user defines either the maximal number or the percentage of sequences to be selected; two automatic methods in which the number of sequences selected is determined by the program. We focused our analysis on the potential information content of the sampled sets of sequences using multiple alignment of complete sequences as the main validation tool. The study considered two criteria: the total number of sequences in BlastP and their associated E-values. The subsequent analyses investigated the influence of the sampling methods on the E-value distributions, the sequence coverage, the final multiple alignment quality and the active site characterisation at various residue conservation thresholds as a function of these criteria. 
CONCLUSION The comparative analysis of the four sampling methods allows us to propose a suitable sampling strategy that significantly reduces the number of homologous sequences required for alignment, while at the same time maintaining the relevant information concerning the active site residues.
|
75
|
Garnier N, Friedrich A, Bolze R, Bettler E, Moulinier L, Geourjon C, Thompson JD, Deléage G, Poch O. MAGOS: multiple alignment and modelling server. Bioinformatics 2006; 22:2164-5. [PMID: 16820425 DOI: 10.1093/bioinformatics/btl349] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022] Open
Abstract
MAGOS is a web server allowing automated protein modelling coupled to the creation of a hierarchical and annotated multiple alignment of complete sequences. MAGOS is designed for an interactive approach to structural information within the framework of the evolutionary relevance of mined and predicted sequence information. AVAILABILITY The web server is freely available at http://pig-pbil.ibcp.fr/magos.
Affiliation(s)
- N Garnier
- Institut de Biologie et Chimie des Protéines (IBCP UMR 5086),CNRS, Univ. Lyon1, IFR128 BioSciences Lyon-Gerland, 7, passage du Vercors, 69367 Lyon cedex 07, France
|
76
|
Thompson JD, Muller A, Waterhouse A, Procter J, Barton GJ, Plewniak F, Poch O. MACSIMS: multiple alignment of complete sequences information management system. BMC Bioinformatics 2006; 7:318. [PMID: 16792820 PMCID: PMC1539025 DOI: 10.1186/1471-2105-7-318] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Received: 04/18/2006] [Accepted: 06/23/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In the post-genomic era, systems-level studies are being performed that seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics or transcriptomics. New information management systems are now needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for the integration of this information in the context of the protein family. RESULTS MACSIMS is a multiple alignment-based information management program that combines the advantages of both knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified and the retrieved data are evaluated and propagated from known to unknown sequences within these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins that are known to be involved in human genetic diseases. The number of sequence features associated with these proteins was increased by 60%, compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIMS results, while a graphical display using the JalView program allows manual analysis. CONCLUSION MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at http://bips.u-strasbg.fr/MACSIMS/.
Affiliation(s)
- Julie D Thompson
- Laboratoire de Biologie et Genomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
- Arnaud Muller
- The Laboratory of Molecular Biology, Genetic Analysis & Modelling, Luxembourg
- Andrew Waterhouse
- Post Genomics & Molecular Interactions Centre, School of Life Sciences, University of Dundee, UK
- Jim Procter
- Post Genomics & Molecular Interactions Centre, School of Life Sciences, University of Dundee, UK
- Geoffrey J Barton
- Post Genomics & Molecular Interactions Centre, School of Life Sciences, University of Dundee, UK
- Frédéric Plewniak
- Laboratoire de Biologie et Genomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
- Olivier Poch
- Laboratoire de Biologie et Genomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch, France
|
77
|
Busso D, Poussin-Courmontagne P, Rosé D, Ripp R, Litt A, Thierry JC, Moras D. Structural genomics of eukaryotic targets at a laboratory scale. J Struct Funct Genomics 2006; 6:81-8. [PMID: 16211503 DOI: 10.1007/s10969-005-1909-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 07/28/2004] [Accepted: 01/16/2005] [Indexed: 11/29/2022]
Abstract
Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments resulted mainly from miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is involved in a structural genomics program on higher eukaryotes whose goal is to solve crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.
Affiliation(s)
- Didier Busso
- Département de Biologie et de Génomique Structurales, IGBMC, CNRS/INSERM/Université Louis Pasteur, Parc d'Innovation, 1 rue Laurent Fries, BP10142, 67404, Illkirch, cedex, France.
|
78
|
Müller SA, Pozidis C, Stone R, Meesters C, Chami M, Engel A, Economou A, Stahlberg H. Double hexameric ring assembly of the type III protein translocase ATPase HrcN. Mol Microbiol 2006; 61:119-25. [PMID: 16824099 DOI: 10.1111/j.1365-2958.2006.05219.x] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Abstract
The specialized type III secretion (T3S) apparatus of pathogenic and symbiotic Gram-negative bacteria comprises a complex transmembrane organelle and an ATPase homologous to the F1-ATPase beta subunit. The T3S ATPase HrcN of Pseudomonas syringae associates with the inner membrane, and its ATP hydrolytic activity is stimulated by dodecamerization. The structure of dodecameric HrcN (HrcN12) determined to 1.6 nm by cryo-electron microscopy is presented. HrcN12 comprises two hexameric rings that are probably stacked face-to-face by the association of their C-terminal domains. It is 11.5 +/- 1.0 nm in diameter, 12.0 +/- 2.0 nm high and has a 2.0-3.8 nm wide inner channel. This structure is compared to a homology model based on the structure of the F1-beta-ATPase. A model for its incorporation within the T3S apparatus is presented.
Affiliation(s)
- Shirley A Müller
- M. E. Müller Institute for Structural Biology, Biozentrum, University of Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
|
79
|
Lam CS, Rastegar S, Strähle U. Distribution of cannabinoid receptor 1 in the CNS of zebrafish. Neuroscience 2005; 138:83-95. [PMID: 16368195 DOI: 10.1016/j.neuroscience.2005.10.069] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Received: 08/22/2005] [Revised: 10/21/2005] [Accepted: 10/25/2005] [Indexed: 12/11/2022]
Abstract
The cannabinoid receptor 1 (Cb1) mediates the psychoactive effect of marijuana. In mammals, there is abundant evidence advocating the importance of cannabinoid signaling; activation of Cb1 exerts diverse functions, chiefly by its ability to modulate neurotransmission. Thus, much attention has been devoted to understanding its role in health and disease and to evaluating its therapeutic potential. Here, we have cloned zebrafish cb1 and investigated its expression in developing and adult zebrafish brain. Sequence analysis showed that there is a high degree of conservation, especially in residues demonstrated to be critical for function in mammals. In situ hybridization revealed that zebrafish cb1 appears first in the preoptic area at 24 hours post-fertilization. Subsequently, transcripts are detected in the dorsal telencephalon, hypothalamus, pretectum and torus longitudinalis. A similar pattern of expression is recapitulated in the adult brain. While cb1 is intensively stained in the medial zone of the dorsal telencephalon, expression elsewhere is weak by comparison. In particular, localization of cb1 in the telencephalic periventricular matrix is suggestive of the involvement of Cb1 in neurogenesis, bearing strong resemblance in terms of expression and function to the proliferative mammalian hippocampal formation. In addition, a gradient-like expression of cb1 is detected in the torus longitudinalis, a teleost specific neural tissue. In relation to dopaminergic neurons in the diencephalic posterior tuberculum (considered to be the teleostean homologue of the mammalian midbrain dopaminergic system), both cb1 and tyrosine hydroxylase-expressing cells occupy non-overlapping domains. However, there is evidence that they are co-localized in the caudal zone of the hypothalamus, implying a direct modulation of dopamine release in this particular region. Collectively, our data indicate the propensity of zebrafish cb1 to participate in multiple neurological processes.
Collapse
Affiliation(s)
- C S Lam
- Institute for Toxicology and Genetics, Forschungszentrum Karlsruhe, Postfach 3640, 76021 Karlsruhe, University of Heidelberg, Baden-Württemberg, Germany
|
80
|
Bianchetti L, Thompson JD, Lecompte O, Plewniak F, Poch O. vALId: validation of protein sequence quality based on multiple alignment data. J Bioinform Comput Biol 2005; 3:929-47. [PMID: 16078368 DOI: 10.1142/s0219720005001326] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 09/30/2004] [Revised: 02/02/2005] [Accepted: 02/06/2005] [Indexed: 11/18/2022]
Abstract
The validation of sequences is essential to perform accurate phylogeny and structure/function analysis. However, among the thousands of protein sequences available in the public databases, most have been predicted in silico and have not systematically undergone a quality verification. It has recently become evident that they often contain sequence errors. To address the problem of automatic protein quality control, we have developed vALId, an interactive web-interfaced software. Taking advantage of high quality multiple alignments of complete protein sequences (MACS), vALId first warns about the presence of suspicious insertions, deletions (indels) and divergent segments, and second, proposes corrections based on transcripts and genome contigs. In a first evaluation test, hundreds of indels and divergent segments were randomly generated in a manually refined MACS. The sensitivity (Sn) and specificity (Sp) of indel detection were excellent (0.96) while the mean Sn (0.49) and Sp (0.56) of divergent segment delineation depended on the percent identity between sequence neighbors. In a second test, 6195 sequences in 100 MACS corresponding to different functional and structural protein families were analyzed. 65% of the sequences were in silico predictions and 44% of eukaryote predicted proteins were partially incorrect with at least one suspicious indel or divergent segment.
Affiliation(s)
- Laurent Bianchetti
- Plate-Forme de Bioinformatique de Strasbourg, Laboratoire de Bioinformatique et Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire (CNRS/INSERM/ULP), Illkirch Cedex, France.
|
81
|
Myasnikov AG, Marzi S, Simonetti A, Giuliodori AM, Gualerzi CO, Yusupova G, Yusupov M, Klaholz BP. Conformational transition of initiation factor 2 from the GTP- to GDP-bound state visualized on the ribosome. Nat Struct Mol Biol 2005; 12:1145-9. [PMID: 16284619 DOI: 10.1038/nsmb1012] [Citation(s) in RCA: 121] [Impact Index Per Article: 6.4] [Received: 07/27/2005] [Accepted: 10/03/2005] [Indexed: 11/08/2022]
Abstract
Initiation of protein synthesis is a universally conserved event that requires initiation factors IF1, IF2 and IF3 in prokaryotes. IF2 is a GTPase essential for binding initiator transfer RNA to the 30S ribosomal subunit and recruiting the 50S subunit into the 70S initiation complex. We present two cryo-EM structures of the assembled 70S initiation complex comprising mRNA, fMet-tRNA(fMet) and IF2 with either a non-hydrolyzable GTP analog or GDP. Transition from the GTP-bound to the GDP-bound state involves substantial conformational changes of IF2 and of the entire ribosome. In the GTP analog-bound state, IF2 interacts mostly with the 30S subunit and extends to the initiator tRNA in the peptidyl (P) site, whereas in the GDP-bound state IF2 steps back and adopts a 'ready-to-leave' conformation. Our data also provide insights into the molecular mechanism guiding release of IF1 and IF3.
Affiliation(s)
- Alexander G Myasnikov
- Department of Structural Biology and Genomics, Institute of Genetics and Molecular and Cellular Biology, Centre National de la Recherche Scientifique/Institut National de la Santé et de la Recherche Médicale, Université Louis Pasteur, Illkirch, France
|
82
|
Uhring M, Bey G, Lecompte O, Cavarelli J, Moras D, Poch O. Cloning, purification and crystallization of a Walker-type Pyrococcus abyssi ATPase family member. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005; 61:925-7. [PMID: 16511197 PMCID: PMC1991322 DOI: 10.1107/s174430910502868x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2005] [Accepted: 09/12/2005] [Indexed: 11/11/2022]
Abstract
Several ATPase proteins play essential roles in the initiation of chromosomal DNA replication in archaea. Walker-type ATPases are defined by their conserved Walker A and B motifs, which are associated with nucleotide binding and ATP hydrolysis. A family of 28 ATPase proteins with non-canonical Walker A sequences has been identified by a bioinformatics study of comparative genomics in Pyrococcus genomes. A high-throughput structural study on P. abyssi has been started in order to establish the structure of these proteins. 16 genes have been cloned and characterized. Six out of the seven soluble constructs were purified in Escherichia coli and one of them, PABY2304, has been crystallized. X-ray diffraction data were collected from selenomethionine-derivative crystals using synchrotron radiation. The crystals belong to the orthorhombic space group C2, with unit-cell parameters a = 79.41, b = 48.63, c = 108.77 Å, and diffract to beyond 2.6 Å resolution.
Affiliation(s)
- Muriel Uhring
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
- Gilbert Bey
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
- Odile Lecompte
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
- Jean Cavarelli
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
- Dino Moras
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
- Olivier Poch
- Département de Biologie et Génomique Structurales, UMR 7104, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP Strasbourg, 1 Rue Laurent Fries, 67404 Illkirch, France
|
83
|
Tao AL, He SH. Cloning, expression, and characterization of pollen allergens from Humulus scandens (Lour) Merr and Ambrosia artemisiifolia L. Acta Pharmacol Sin 2005; 26:1225-32. [PMID: 16174439 DOI: 10.1111/j.1745-7254.2005.00194.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023] Open
Abstract
AIM: To clone the pollen allergen genes in Humulus scandens (Lour) Merr (LvCao in Chinese) and short ragweed (Ambrosia artemisiifolia L) for recombinant allergen production and immunotherapy. METHODS: The allergen genes were selectively amplified in the weed pollen cDNA pool by using a special PCR profile, with the primers designed by a modeling procedure. Following truncated gene cloning and confirmation of the pollen source, unknown 3' cDNA ends were identified by using the 3'-RACE method. The gene function conferred by the full-length coding region was evaluated by a homologue search in the GenBank database. Recombinant proteins expressed in Escherichia coli pET-44 RosettaBlue cells were subsequently characterized by N-terminal end sequencing, IgE binding, and cross-reactivity. RESULTS: Three full-length cDNAs were obtained in each weed. Multiple alignment analysis revealed that the deduced amino acid sequences were 83% identical to each other and 56%-90% identical to panallergen profilins from other species. Five recombinant proteins were abundantly expressed in non-fusion forms and were confirmed by using the N-terminal end sequence identity. Sera from patients who were allergic to A artemisiifolia reacted not only with rAmb a 8(D03) derived from A artemisiifolia, but also with recombinant protein rHum s 1(LCM9) derived from H scandens, which confirmed the allergenicity and cross-reactivity of the recombinant proteins from the 2 sources. Comparison of the degenerate primers used for truncated gene cloning with the full-length cDNA demonstrated that alternative nucleotide degeneracy occurred. CONCLUSION: This study demonstrates a useful method for cloning homologous allergen genes across different species, particularly for little-studied species. The recombinant allergens obtained might be useful for the immunotherapeutic treatment of H scandens and/or A artemisiifolia pollen allergies.
Affiliation(s)
- Ai-lin Tao
- Allergy and Inflammation Research Institute, Shantou University Medical College, Shantou 515031, China
|
84
|
Schmitt E, Panvert M, Blanquet S, Mechulam Y. Structural Basis for tRNA-Dependent Amidotransferase Function. Structure 2005; 13:1421-33. [PMID: 16216574 DOI: 10.1016/j.str.2005.06.016] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 06/03/2005] [Revised: 06/20/2005] [Accepted: 06/30/2005] [Indexed: 11/26/2022]
Abstract
Besides direct charging of tRNAs by aminoacyl-tRNA synthetases, indirect routes also ensure attachment of some amino acids onto tRNA. Such routes may explain how new amino acids entered into protein synthesis. In archaea and in most bacteria, tRNA(Gln) is first misaminoacylated by glutamyl-tRNA synthetase. Glu-tRNA(Gln) is then matured into Gln-tRNA(Gln) by a tRNA-dependent amidotransferase. We report the structure of a tRNA-dependent amidotransferase, that of GatDE from Pyrococcus abyssi. The 3.0 Å resolution crystal structure shows a tetramer with two GatD molecules as the core and two GatE molecules at the periphery. The fold of GatE cannot be related to that of any tRNA binding enzyme. The ammonium donor site on GatD and the tRNA site on GatE are markedly distant. Comparison of GatD and L-asparaginase structures shows how the motion of a beta hairpin region containing a crucial catalytic threonine may control the overall reaction cycle of GatDE.
MESH Headings
- Amino Acid Sequence
- Amino Acyl-tRNA Synthetases/chemistry
- Amino Acyl-tRNA Synthetases/metabolism
- Binding Sites
- Conserved Sequence
- Crystallography, X-Ray
- Dimerization
- Glutamate-tRNA Ligase/chemistry
- Glutamate-tRNA Ligase/metabolism
- Models, Molecular
- Molecular Sequence Data
- Nitrogenous Group Transferases/chemistry
- Nitrogenous Group Transferases/genetics
- Nitrogenous Group Transferases/metabolism
- Protein Biosynthesis
- Protein Folding
- Protein Structure, Secondary
- Protein Structure, Tertiary
- Protein Subunits/chemistry
- Pyrococcus abyssi/enzymology
- RNA, Archaeal/chemistry
- RNA, Archaeal/genetics
- RNA, Archaeal/metabolism
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Bacterial/metabolism
- RNA, Transfer/chemistry
- RNA, Transfer/metabolism
- RNA, Transfer, Gln/metabolism
- Sequence Homology, Amino Acid
- Threonine/chemistry
- X-Ray Diffraction
Affiliation(s)
- Emmanuelle Schmitt
- Laboratoire de Biochimie, Unité Mixte de Recherche 7654, CNRS-Ecole Polytechnique, F-91128 Palaiseau cedex, France.
|
85
|
Muller J, Oma Y, Vallar L, Friederich E, Poch O, Winsor B. Sequence and comparative genomic analysis of actin-related proteins. Mol Biol Cell 2005; 16:5736-48. [PMID: 16195354 PMCID: PMC1289417 DOI: 10.1091/mbc.e05-06-0508] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.7] [Indexed: 11/11/2022] Open
Abstract
Actin-related proteins (ARPs) are key players in cytoskeleton activities and nuclear functions. Two complexes, ARP2/3 and ARP1/11, also known as dynactin, are implicated in actin dynamics and in microtubule-based trafficking, respectively. ARP4 to ARP9 are components of many chromatin-modulating complexes. Conventional actins and ARPs codefine a large family of homologous proteins, the actin superfamily, with a tertiary structure known as the actin fold. Because ARPs and actin share high sequence conservation, clear family definition requires distinct features to easily and systematically identify each subfamily. In this study we performed an in-depth sequence and comparative genomic analysis of ARP subfamilies. A high-quality multiple alignment of approximately 700 complete protein sequences homologous to actin, including 148 ARP sequences, allowed us to extend the ARP classification to new organisms. Sequence alignments revealed conserved residues, motifs, and inserted sequence signatures to define each ARP subfamily. These discriminative characteristics allowed us to develop ARPAnno (http://bips.u-strasbg.fr/ARPAnno), a new web server dedicated to the annotation of ARP sequences. Analyses of sequence conservation among actins and ARPs highlight part of the actin fold and suggest interactions between ARPs and actin-binding proteins. Finally, analysis of ARP distribution across eukaryotic phyla emphasizes the central importance of nuclear ARPs, particularly the multifunctional ARP4.
Affiliation(s)
- Jean Muller
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, 67404 Illkirch Cedex, France.
|
86
|
Vingadassalom D, Kolb A, Mayer C, Rybkine T, Collatz E, Podglajen I. An unusual primary sigma factor in the Bacteroidetes phylum. Mol Microbiol 2005; 56:888-902. [PMID: 15853878 DOI: 10.1111/j.1365-2958.2005.04590.x] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
The presence of housekeeping gene promoters with a unique consensus sequence in Bacteroides fragilis, previously described by Bayley et al. (2000, FEMS Microbiol Lett 193: 149-154), suggested the existence of a particular primary sigma factor. The single rpoD-like gene observed in the B. fragilis genome, and similarly in those of other members of the Bacteroidetes phylum, was found to be essential. It encodes a protein, sigma(ABfr), of only 32.7 kDa that is produced with equal abundance during all phases of growth and was concluded to be the primary sigma factor. sigma(ABfr) and its orthologues in the Bacteroidetes are unusual primary sigma factors in that they lack region 1.1, have a unique signature made up of 29 strictly identical amino acids and are the only RpoD factors that cluster with the RpoS factors. Although binding to the Escherichia coli core RNA polymerase, sigma(ABfr) does not support transcription initiation from any promoter when it is part of the heterologous holoenzyme, while in the reconstituted homologous holoenzyme it does so only from typical B. fragilis, including rrs, promoters but not from the lacUV5 or RNA I promoters.
Affiliation(s)
- Didier Vingadassalom
- INSERM E0004, Laboratoire de Recherche Moléculaire sur les Antibiotiques, Université Paris VI, 75270 Paris, France
|
87
|
Waagmeester A, Thompson J, Reyrat JM. Identifying sigma factors in Mycobacterium smegmatis by comparative genomic analysis. Trends Microbiol 2005; 13:505-9. [PMID: 16140533 DOI: 10.1016/j.tim.2005.08.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Received: 05/13/2005] [Revised: 08/05/2005] [Accepted: 08/23/2005] [Indexed: 11/28/2022]
Abstract
Mycobacterium smegmatis is a saprophytic species that has been used for 15 years as a model to perform heterologous regulation and virulence studies of Mycobacterium tuberculosis. Members of the extracytoplasmic sigma factors family, which are required for adaptive responses to various environmental stresses, are responsible for some of the virulence traits of M. tuberculosis. A bioinformatic search on the genome of M. smegmatis has predicted the existence of 26 sigma factors, which is twice the number that are present in M. tuberculosis. A phylogenetic analysis has shown that despite this high number of sigma factors the orthologs of the genes sigC, sigI and sigK of M. tuberculosis are absent in the M. smegmatis genome. Several sigma factors are specific for M. smegmatis, with a special enrichment in the sigH and, to a lesser extent, in the sigJ and sigL subfamily, pinpointing the potential variability of the repertoire of adaptive response in this saprophytic species.
Affiliation(s)
- Andra Waagmeester
- INSERM-UMR 570, Groupe Avenir, Université Paris V-Descartes, Faculté de Médecine, Site Necker, F-75730 Paris Cedex 15, France
|
88
|
Thompson JD, Holbrook SR, Katoh K, Koehl P, Moras D, Westhof E, Poch O. MAO: a Multiple Alignment Ontology for nucleic acid and protein sequences. Nucleic Acids Res 2005; 33:4164-71. [PMID: 16043635 PMCID: PMC1180671 DOI: 10.1093/nar/gki735] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
Abstract
The application of high-throughput techniques such as genomics, proteomics or transcriptomics means that vast amounts of heterogeneous data are now available in the public databases. Bioinformatics is responding to the challenge with new integrated management systems for data collection, validation and analysis. Multiple alignments of genomic and protein sequences provide an ideal environment for the integration of this mass of information. In the context of the sequence family, structural and functional data can be evaluated and propagated from known to unknown sequences. However, effective integration is being hindered by syntactic and semantic differences between the different data resources and the alignment techniques employed. One solution to this problem is the development of an ontology that systematically defines the terms used in a specific domain. Ontologies are used to share data from different resources, to automatically analyse information and to represent domain knowledge for non-experts. Here, we present MAO, a new ontology for multiple alignments of nucleic and protein sequences. MAO is designed to improve interoperation and data sharing between different alignment protocols for the construction of a high quality, reliable multiple alignment in order to facilitate knowledge extraction and the presentation of the most pertinent information to the biologist.
Affiliation(s)
- Julie D Thompson
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1 rue Laurent Fries, B.P. 10142, 67404 Illkirch Cedex, France.
|
89
|
Chalmel F, Lardenois A, Thompson JD, Muller J, Sahel JA, Léveillard T, Poch O. GOAnno: GO annotation based on multiple alignment. Bioinformatics 2005; 21:2095-6. [PMID: 15647299 DOI: 10.1093/bioinformatics/bti252] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
GOAnno is a web tool that automatically annotates proteins according to the Gene Ontology (GO) using evolutionary information available in hierarchized multiple alignments. GO terms present in the aligned functional subfamily can be cross-validated and propagated to obtain highly reliable predicted GO annotation based on the GOAnno algorithm. AVAILABILITY: The web tool and a reduced version for local installation are freely available at http://igbmc.u-strasbg.fr/GOAnno/GOAnno.html. SUPPLEMENTARY INFORMATION: The website supplies a detailed explanation and illustration of the algorithm at http://igbmc.u-strasbg.fr/GOAnno/GOAnnoHelp.html.
Affiliation(s)
- F Chalmel
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, Illkirch, France.
|
90
|
Biarrotte-Sorin S, Maillard AP, Delettré J, Sougakoff W, Arthur M, Mayer C. Crystal structures of Weissella viridescens FemX and its complex with UDP-MurNAc-pentapeptide: insights into FemABX family substrates recognition. Structure 2004; 12:257-67. [PMID: 14962386 DOI: 10.1016/j.str.2004.01.006] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Received: 10/03/2003] [Revised: 10/28/2003] [Accepted: 10/28/2003] [Indexed: 11/16/2022]
Abstract
Members of the FemABX protein family are novel therapeutic targets, as they are involved in the synthesis of the bacterial cell wall. They catalyze the addition of amino acid(s) on the peptidoglycan precursor using aminoacylated tRNA as a substrate. We report here the high-resolution structure of Weissella viridescens L-alanine transferase FemX and its complex with the UDP-MurNAc-pentapeptide. This is the first structure example of a FemABX family member that does not possess a coiled-coil domain. FemX consists of two structurally equivalent domains, separated by a cleft containing the binding site of the UDP-MurNAc-pentapeptide and a long channel that traverses one of the two domains. Our structural studies bring new insights into the evolution of the FemABX and the related GNAT superfamilies, shed light on the recognition site of the aminoacylated tRNA in Fem proteins, and allowed manual docking of the acceptor end of the alanyl-tRNA(Ala).
Affiliation(s)
- Sabrina Biarrotte-Sorin
- Laboratoire de Minéralogie-Cristallographie de Paris, Université Paris 6, 4 place Jussieu, Paris Cedex 05, 75252, France
|
91
|
Degot S, Le Hir H, Alpy F, Kedinger V, Stoll I, Wendling C, Seraphin B, Rio MC, Tomasetto C. Association of the breast cancer protein MLN51 with the exon junction complex via its speckle localizer and RNA binding module. J Biol Chem 2004; 279:33702-15. [PMID: 15166247 DOI: 10.1074/jbc.m402754200] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.4] [Indexed: 01/23/2023] Open
Abstract
MLN51 is a nucleocytoplasmic shuttling protein that is overexpressed in breast cancer. The function of MLN51 in mammals remains elusive. Its fly homolog, named barentsz, as well as the proteins mago nashi and tsunagi have been shown to be required for proper oskar mRNA localization to the posterior pole of the oocyte. Magoh and Y14, the human homologs of mago nashi and tsunagi, are core components of the exon junction complex (EJC). The EJC is assembled on spliced mRNAs and plays important roles in post-splicing events including mRNA export, nonsense-mediated mRNA decay, and translation. In the present study, we show that human MLN51 is an RNA-binding protein present in ribonucleoprotein complexes. By co-immunoprecipitation assays, endogenous MLN51 protein is found to be associated with EJC components, including Magoh, Y14, and NFX1/TAP, and subcellular localization studies indicate that MLN51 transiently co-localizes with Magoh in nuclear speckles. Moreover, we demonstrate that MLN51 specifically associates with spliced mRNAs in co-precipitation experiments, both in the nucleus and in the cytoplasm, at the position where the EJC is deposited. Most interestingly, we have identified a region within MLN51 sufficient to bind RNA, to interact with Magoh and spliced mRNA, and to address the protein to nuclear speckles. This conserved region of MLN51 was therefore named SELOR for speckle localizer and RNA binding module. Altogether, our data demonstrate that MLN51 associates with EJC in the nucleus and remains stably associated with mRNA in the cytoplasm, suggesting that its overexpression might alter mRNA metabolism in cancer.
Affiliation(s)
- Sébastien Degot
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Département de Pathologie Moléculaire, UPR 6520 CNRS/U596 INSERM/Université Louis Pasteur, BP 10142, 67404 Illkirch, France
|
92
|
Klaholz BP, Myasnikov AG, Van Heel M. Visualization of release factor 3 on the ribosome during termination of protein synthesis. Nature 2004; 427:862-5. [PMID: 14985767 DOI: 10.1038/nature02332] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.7] [Received: 10/09/2003] [Accepted: 01/08/2004] [Indexed: 11/09/2022]
Abstract
Termination of protein synthesis by the ribosome requires two release factor (RF) classes. The class II RF3 is a GTPase that removes class I RFs (RF1 or RF2) from the ribosome after release of the nascent polypeptide. RF3 in the GDP state binds to the ribosomal class I RF complex, followed by an exchange of GDP for GTP and release of the class I RF. As GTP hydrolysis triggers release of RF3 (ref. 4), we trapped RF3 on Escherichia coli ribosomes using a nonhydrolysable GTP analogue. Here we show by cryo-electron microscopy that the complex can adopt two different conformational states. In 'state 1', RF3 is pre-bound to the ribosome, whereas in 'state 2' RF3 contacts the ribosome GTPase centre. The transfer RNA molecule translocates from the peptidyl site in state 1 to the exit site in state 2. This translocation is associated with a large conformational rearrangement of the ribosome. Because state 1 seems able to accommodate simultaneously both RF3 and RF2, whose position is known from previous studies, we can infer the release mechanism of class I RFs.
Affiliation(s)
- Bruno P Klaholz
- Department of Biological Sciences, Imperial College London, London SW7 2AY, UK.
|
93
|
Thompson JD, Prigent V, Poch O. LEON: multiple aLignment Evaluation Of Neighbours. Nucleic Acids Res 2004; 32:1298-307. [PMID: 14982955 PMCID: PMC390283 DOI: 10.1093/nar/gkh294] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 12/11/2003] [Revised: 01/16/2004] [Accepted: 01/29/2004] [Indexed: 11/13/2022] Open
Abstract
Sequence alignments are fundamental to a wide range of applications, including database searching, functional residue identification and structure prediction techniques. These applications predict or propagate structural/functional/evolutionary information based on a presumed homology between the aligned sequences. If the initial hypothesis of homology is wrong, no subsequent application, however sophisticated, can be expected to yield accurate results. Here we present a novel method, LEON, to predict homology between proteins based on a multiple alignment of complete sequences (MACS). In MACS, weak signals from distantly related proteins can be considered in the overall context of the family. Intermediate sequences and the combination of individual weak matches are used to increase the significance of low-scoring regions. Residue composition is also taken into account by incorporation of several existing methods for the detection of compositionally biased sequence segments. The accuracy and reliability of the predictions are demonstrated in large-scale comparisons with structural and sequence family databases, where the specificity was shown to be >99% and the sensitivity was estimated to be approximately 76%. LEON can thus be used to reliably identify the complex relationships between large multidomain proteins and should be useful for automatic high-throughput genome annotations, 2D/3D structure predictions, protein-protein interaction predictions, etc.
Affiliation(s)
- Julie D Thompson
- Laboratoire de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, BP 163, 67404 Illkirch Cedex, France
|
94
|
Duval D, Duval G, Kedinger C, Poch O, Boeuf H. The 'PINIT' motif, of a newly identified conserved domain of the PIAS protein family, is essential for nuclear retention of PIAS3L. FEBS Lett 2003; 554:111-8. [PMID: 14596924 DOI: 10.1016/s0014-5793(03)01116-5] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.9] [Indexed: 11/17/2022]
Abstract
PIAS proteins, cytokine-dependent STAT-associated repressors, exhibit intrinsic E3-type SUMO ligase activities and form a family of transcriptional modulators. Three conserved domains have been identified so far in this protein family, the SAP box, the MIZ-Zn finger/RING module and the acidic C-terminal domain, which are essential for protein interactions, DNA binding or SUMO ligase activity. We have identified a novel conserved domain of 180 residues in PIAS proteins and shown that its 'PINIT' motif as well as other conserved motifs (in the SAP box and in the RING domain) are independently involved in nuclear retention of PIAS3L, the long form of PIAS3, that we have characterized in mouse embryonic stem cells.
Affiliation(s)
- D Duval
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, P.O. Box 10142, C.U. de Strasbourg, 67404 Illkirch, France
|