1
Vitale L, Caracausi M, Casadei R, Pelleri MC, Piovesan A. Difficulty in obtaining the complete mRNA coding sequence at 5' region (5' end mRNA artifact): Causes, consequences in biology and medicine and possible solutions for obtaining the actual amino acid sequence of proteins (Review). Int J Mol Med 2017; 39:1063-1071. [PMID: 28393177 DOI: 10.3892/ijmm.2017.2942] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/22/2016] [Accepted: 03/16/2017] [Indexed: 11/06/2022]
Abstract
The known difficulty in obtaining the actual full length, complete sequence of a messenger RNA (mRNA) may lead to the erroneous determination of its coding sequence at the 5' region (5' end mRNA artifact), and consequently to the wrong assignment of the translation start codon, leading to the inaccurate prediction of the encoded polypeptide at its amino terminus. Among the known human genes whose study was affected by this artifact, we can include disco interacting protein 2 homolog A (DIP2A; KIAA0184), Down syndrome critical region 1 (DSCR1), SON DNA binding protein (SON), trefoil factor 3 (TFF3) and URB1 ribosome biogenesis 1 homolog (URB1; KIAA0539) on chromosome 21, as well as receptor for activated C kinase 1 (RACK1, also known as GNB2L1), glutaminyl‑tRNA synthetase (QARS) and tyrosyl-DNA phosphodiesterase 2 (TDP2) along with another 474 loci, including interleukin 16 (IL16). In this review, we discuss the causes of this issue, its quantitative incidence in biomedical research, the consequences in biology and medicine, and the possible solutions for obtaining the actual amino acid sequence of proteins in the post-genomics era.
Affiliation(s)
- Lorenza Vitale
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, I‑40126 Bologna, Italy
- Maria Caracausi
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, I‑40126 Bologna, Italy
- Raffaella Casadei
- Department for Life Quality Studies, University of Bologna, I‑47921 Rimini, Italy
- Maria Chiara Pelleri
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, I‑40126 Bologna, Italy
- Allison Piovesan
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, I‑40126 Bologna, Italy
2
Improving mRNA 5' coding sequence determination in the mouse genome. Mamm Genome 2014; 25:149-59. [PMID: 24504701 DOI: 10.1007/s00335-013-9498-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/26/2013] [Accepted: 12/09/2013] [Indexed: 10/25/2022]
Abstract
The incomplete determination of the mRNA 5' end sequence may lead to the incorrect assignment of the first AUG codon and to errors in the prediction of the encoded protein product. Due to the significance of the mouse as a model organism in biomedical research, we performed a systematic identification of coding regions at the 5' end of all known mouse mRNAs, using an automated expressed sequence tag (EST)-based approach which we have previously described. By parsing almost 4 million BLAT alignments we found 351 mouse loci, out of 20,221 analyzed, in which an extension of the mRNA 5' coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for Apc2 and Mknk2 cDNAs. We also generated a list of 16,330 mouse mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5' end in the current form. Systematic searches in the main mouse genome databases and genome browsers showed that 82% of our results are original and have not been identified by their annotation pipelines. Moreover, the same information is not easily derivable from RNA-Seq data, due to short sequence length and laboriousness in building full-length transcript structures. In conclusion, our results improve the determination of full-length 5' coding sequences and might be useful in order to reduce errors when studying mouse gene structure and function in biomedical research.
3
Meertens L, Carnec X, Lecoin MP, Ramdasi R, Guivel-Benhassine F, Lew E, Lemke G, Schwartz O, Amara A. The TIM and TAM families of phosphatidylserine receptors mediate dengue virus entry. Cell Host Microbe 2013; 12:544-57. [PMID: 23084921 DOI: 10.1016/j.chom.2012.08.009] [Citation(s) in RCA: 380] [Impact Index Per Article: 31.7] [Received: 05/04/2012] [Revised: 06/26/2012] [Accepted: 08/17/2012] [Indexed: 11/24/2022]
Abstract
Dengue viruses (DVs) are responsible for the most medically relevant arboviral diseases. However, the molecular interactions mediating DV entry are poorly understood. We determined that TIM and TAM proteins, two receptor families that mediate the phosphatidylserine (PtdSer)-dependent phagocytic removal of apoptotic cells, serve as DV entry factors. Cells poorly susceptible to DV are robustly infected after ectopic expression of TIM or TAM receptors. Conversely, DV infection of susceptible cells is inhibited by anti-TIM or anti-TAM antibodies or knockdown of TIM and TAM expression. TIM receptors facilitate DV entry by directly interacting with virion-associated PtdSer. TAM-mediated infection relies on indirect DV recognition, in which the TAM ligand Gas6 acts as a bridging molecule by binding to PtdSer within the virion. This dual mode of virus recognition by TIM and TAM receptors reveals how DVs usurp the apoptotic cell clearance pathway for infectious entry.
Affiliation(s)
- Laurent Meertens
- INSERM U944, Laboratoire de Pathologie et Virologie Moléculaire, Hôpital Saint-Louis, 1 Avenue Claude Vellefaux, 75010 Paris, France
4
Casadei R, Piovesan A, Vitale L, Facchin F, Pelleri MC, Canaider S, Bianconi E, Frabetti F, Strippoli P. Genome-scale analysis of human mRNA 5' coding sequences based on expressed sequence tag (EST) database. Genomics 2012; 100:125-30. [PMID: 22659028 DOI: 10.1016/j.ygeno.2012.05.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 07/28/2011] [Revised: 05/21/2012] [Accepted: 05/23/2012] [Indexed: 11/19/2022]
Abstract
The "5' end mRNA artifact" issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5' end sequence. We performed a systematic identification of coding regions at the 5' end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7 million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5' coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5' in the current form.
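As an illustration of the completeness criterion mentioned in this abstract, the following minimal Python sketch checks whether an in-frame stop codon lies upstream of an annotated start codon. It is not the authors' pipeline: the function name `upstream_in_frame_stop`, the 0-based coordinate convention and the toy sequences are assumptions made for this example only.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_stop(mrna: str, start: int) -> Optional[int]:
    """Return the 0-based index of the nearest in-frame stop codon
    upstream of the annotated start codon, or None if there is none.

    `start` is the 0-based index of the A in the annotated ATG/AUG
    (a hypothetical convention chosen for this sketch).
    """
    # Work on a DNA-style alphabet so that RNA input (U) is accepted too.
    seq = mrna.upper().replace("U", "T")
    if seq[start:start + 3] != "ATG":
        raise ValueError("annotated start does not point at an ATG/AUG codon")
    # Walk upstream one codon at a time, staying in the start codon's frame.
    for pos in range(start - 3, -1, -3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
    return None


# Toy sequences (invented for illustration, not real transcripts).
print(upstream_in_frame_stop("GGTAAGCCATGGCC", 8))  # in-frame TAA upstream
print(upstream_in_frame_stop("GGCGCCATGGCC", 6))    # no in-frame stop upstream
```

When the function returns None, completeness is simply undetermined: the frame stays open up to the first known base, so the true start codon could lie further upstream in sequence that was never cloned.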
Affiliation(s)
- Raffaella Casadei
- Center for Research in Molecular Genetics Fondazione CARISBO, Department of Histology, Embryology and Applied Biology, University of Bologna, via Belmeloro 8, 40126 Bologna, Italy
5
Armengaud J. Proteogenomics and systems biology: quest for the ultimate missing parts. Expert Rev Proteomics 2010; 7:65-77. [DOI: 10.1586/epr.09.104] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 08/30/2023]
6
Florent I, Porcel BM, Guillaume E, Da Silva C, Artiguenave F, Maréchal E, Bréhélin L, Gascuel O, Charneau S, Wincker P, Grellier P. A Plasmodium falciparum FcB1-schizont-EST collection providing clues to schizont specific gene structure and polymorphism. BMC Genomics 2009; 10:235. [PMID: 19454033 PMCID: PMC2695484 DOI: 10.1186/1471-2164-10-235] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 09/11/2008] [Accepted: 05/19/2009] [Indexed: 12/16/2022]
Abstract
BACKGROUND The Plasmodium falciparum genome (3D7 strain) published in 2002, revealed ~5,400 genes, mostly based on in silico predictions. Experimental data is therefore required for structural and functional assessments of P. falciparum genes and expression, and polymorphic data are further necessary to exploit genomic information to further qualify therapeutic target candidates. Here, we undertook a large scale analysis of a P. falciparum FcB1-schizont-EST library previously constructed by suppression subtractive hybridization (SSH) to study genes expressed during merozoite morphogenesis, with the aim of: 1) obtaining an exhaustive collection of schizont specific ESTs, 2) experimentally validating or correcting P. falciparum gene models and 3) pinpointing genes displaying protein polymorphism between the FcB1 and 3D7 strains. RESULTS A total of 22,125 clones randomly picked from the SSH library were sequenced, yielding 21,805 usable ESTs that were then clustered on the P. falciparum genome. This allowed identification of 243 protein coding genes, including 121 previously annotated as hypothetical. Statistical analysis of GO terms, when available, indicated significant enrichment in genes involved in "entry into host-cells" and "actin cytoskeleton". Although most ESTs do not span full-length gene reading frames, detailed sequence comparison of FcB1-ESTs versus 3D7 genomic sequences allowed the confirmation of exon/intron boundaries in 29 genes, the detection of new boundaries in 14 genes and identification of protein polymorphism for 21 genes. In addition, a large number of non-protein coding ESTs were identified, mainly matching with the two A-type rRNA units (on chromosomes 5 and 7) and to a lower extent, two atypical rRNA loci (on chromosomes 1 and 8), TARE subtelomeric regions (several chromosomes) and the recently described telomerase RNA gene (chromosome 9). 
CONCLUSION This FcB1-schizont-EST analysis confirmed the actual expression of 243 protein coding genes, allowing the correction of structural annotations for a quarter of these sequences. In addition, this analysis demonstrated the actual transcription of several remarkable non-protein coding loci: 2 atypical rRNA, TARE region and telomerase RNA gene. Together with other collections of P. falciparum ESTs, usually generated from mixed parasite stages, this collection of FcB1-schizont-ESTs provides valuable data to gain further insight into the P. falciparum gene structure, polymorphism and expression.
Affiliation(s)
- Isabelle Florent
- FRE3206 CNRS/MNHN, USM504, Biologie Fonctionnelle des Protozoaires, RDDM, Muséum National d'Histoire Naturelle, Paris, France.
7
FragIdent--automatic identification and characterisation of cDNA-fragments. BMC Genomics 2009; 10:95. [PMID: 19254371 PMCID: PMC2672089 DOI: 10.1186/1471-2164-10-95] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2008] [Accepted: 03/02/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Many genetic studies and functional assays are based on cDNA fragments. After the generation of cDNA fragments from an mRNA sample, their content is at first unknown and must be assigned by sequencing reactions or hybridisation experiments. Even in characterised libraries, a considerable number of clones are wrongly annotated. Furthermore, mix-ups can happen in the laboratory. It is therefore essential to the relevance of experimental results to confirm or determine the identity of the employed cDNA fragments. However, the manual approach for the characterisation of these fragments using BLAST web interfaces is not suited for larger number of sequences and so far, no user-friendly software is publicly available. RESULTS Here we present the development of FragIdent, an application for the automatic identification of open reading frames (ORFs) within cDNA-fragments. The software performs BLAST analyses to identify the genes represented by the sequences and suggests primers to complete the sequencing of the whole insert. Gene-specific information as well as the protein domains encoded by the cDNA fragment are retrieved from Internet-based databases and included in the output. The application features an intuitive graphical interface and is designed for researchers without any bioinformatics skills. It is suited for projects comprising up to several hundred different clones. CONCLUSION We used FragIdent to identify 84 cDNA clones from a yeast two-hybrid experiment. Furthermore, we identified 131 protein domains within our analysed clones. The source code is freely available from our homepage at http://compbio.charite.de/genetik/FragIdent/.
8
Espagne E, Lespinet O, Malagnac F, Da Silva C, Jaillon O, Porcel BM, Couloux A, Aury JM, Ségurens B, Poulain J, Anthouard V, Grossetete S, Khalili H, Coppin E, Déquard-Chablat M, Picard M, Contamine V, Arnaise S, Bourdais A, Berteaux-Lecellier V, Gautheret D, de Vries RP, Battaglia E, Coutinho PM, Danchin EG, Henrissat B, Khoury RE, Sainsard-Chanet A, Boivin A, Pinan-Lucarré B, Sellem CH, Debuchy R, Wincker P, Weissenbach J, Silar P. The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biol 2008; 9:R77. [PMID: 18460219 PMCID: PMC2441463 DOI: 10.1186/gb-2008-9-5-r77] [Citation(s) in RCA: 234] [Impact Index Per Article: 13.8] [Received: 11/26/2007] [Revised: 02/12/2008] [Accepted: 05/06/2008] [Indexed: 12/13/2022]
Abstract
A 10X draft sequence of Podospora anserina genome shows highly dynamic evolution since its divergence from Neurospora crassa. Background The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. Results We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed sequence tag collection. Similar to higher eukaryotes, the P. anserina transcription/splicing machinery generates numerous non-conventional transcripts. Comparison of the P. anserina genome and orthologous gene set with the one of its close relatives, Neurospora crassa, shows that synteny is poorly conserved, the main result of evolution being gene shuffling in the same chromosome. The P. anserina genome contains fewer repeated sequences and has evolved new genes by duplication since its separation from N. crassa, despite the presence of the repeat induced point mutation mechanism that mutates duplicated sequences. We also provide evidence that frequent gene loss took place in the lineages leading to P. anserina and N. crassa. P. anserina contains a large and highly specialized set of genes involved in utilization of natural carbon sources commonly found in its natural biotope. It includes genes potentially involved in lignin degradation and efficient cellulose breakdown. Conclusion The features of the P. anserina genome indicate a highly dynamic evolution since the divergence of P. anserina and N. crassa, leading to the ability of the former to use specific complex carbon sources that match its needs in its natural biotope.
Affiliation(s)
- Eric Espagne
- Univ Paris-Sud, Institut de Génétique et Microbiologie, UMR8621, 91405 Orsay cedex, France
9
Frabetti F, Casadei R, Lenzi L, Canaider S, Vitale L, Facchin F, Carinci P, Zannotti M, Strippoli P. Systematic analysis of mRNA 5' coding sequence incompleteness in Danio rerio: an automated EST-based approach. Biol Direct 2007; 2:34. [PMID: 18042283 PMCID: PMC2222617 DOI: 10.1186/1745-6150-2-34] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 03/13/2007] [Accepted: 11/27/2007] [Indexed: 11/15/2022]
Abstract
Background All standard methods for cDNA cloning are affected by a potential inability to effectively clone the 5' region of mRNA. The aim of this work was to estimate mRNA open reading frame (ORF) 5' region sequence completeness in the model organism Danio rerio (zebrafish). Results We implemented a novel automated approach (5'_ORF_Extender) that systematically compares available expressed sequence tags (ESTs) with all the zebrafish experimentally determined mRNA sequences, identifies additional sequence stretches at 5' region and scans for the presence of all conditions needed to define a new, extended putative ORF. Our software was able to identify 285 (3.3%) mRNAs with putatively incomplete ORFs at 5' region and, in three example cases selected (selt1a, unc119.2, nppa), the extended coding region at 5' end was cloned by reverse transcription-polymerase chain reaction (RT-PCR). Conclusion The implemented method, which could also be useful for the analysis of other genomes, allowed us to describe the relevance of the "5' end mRNA artifact" problem for genomic annotation and functional genomic experiment design in zebrafish. Open peer review This article was reviewed by Alexey V. Kochetov (nominated by Mikhail Gelfand), Shamil Sunyaev, and Gáspár Jékely. For the full reviews, please go to the Reviewers' Comments section.
Affiliation(s)
- Flavia Frabetti
- Center for Research in Molecular Genetics "Fondazione CARISBO", Department of Histology, Embryology and Applied Biology, University of Bologna, via Belmeloro 8, 40126 Bologna (BO), Italy.
10
Peters BA, St. Croix B, Sjöblom T, Cummins JM, Silliman N, Ptak J, Saha S, Kinzler KW, Hatzis C, Velculescu VE. Large-scale identification of novel transcripts in the human genome. Genome Res 2007; 17:287-92. [PMID: 17267814 PMCID: PMC1800919 DOI: 10.1101/gr.5486607] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
Although the sequencing of the human genome has been completed, the number and identity of genes contained within it remains to be fully determined. We used LongSAGE to analyze 660,357 human transcripts from human brain mRNA and identified expression of 17,409 known genes and >15,000 different transcripts that were not annotated in genome databases. Analysis of a subset of these unannotated transcripts suggests that 85% were differentially expressed in various tissue types and that fewer than 20% would have been detected by ab initio gene predictions. These studies suggest that the human genome contains on the order of twice as many transcribed regions as are currently annotated and that experimental approaches will be required to fully elucidate the novel genes corresponding to these transcripts.
Affiliation(s)
- Brock A. Peters
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University, Baltimore, Maryland 21231, USA
- Brad St. Croix
- Tumor Angiogenesis Section, Mouse Cancer Genetics Program, National Cancer Institute, Frederick, Maryland 21702, USA
- Tobias Sjöblom
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Jordan M. Cummins
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Natalie Silliman
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Janine Ptak
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Saurabh Saha
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Kenneth W. Kinzler
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Department of Pharmacology and Molecular Sciences, Johns Hopkins University, Baltimore, Maryland 21231, USA
- Victor E. Velculescu
- The Ludwig Center for Cancer Genetics and Therapeutics, The Johns Hopkins University Kimmel Cancer Center, Baltimore, Maryland 21231, USA
- Corresponding author. Fax (410) 955-0548
11
Kochetov AV, Sarai A, Rogozin IB, Shumny VK, Kolchanov NA. The role of alternative translation start sites in the generation of human protein diversity. Mol Genet Genomics 2005; 273:491-6. [PMID: 15959805 DOI: 10.1007/s00438-005-1152-7] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.6] [Received: 10/18/2004] [Accepted: 03/29/2005] [Indexed: 11/29/2022]
Abstract
According to the scanning model, 40S ribosomal subunits initiate translation at the first (5' proximal) AUG codon they encounter. However, if the first AUG is in a suboptimal context, it may not be recognized, and translation can then initiate at downstream AUG(s). In this way, a single RNA can produce several variant products. Earlier experiments suggested that some of these additional protein variants might be functionally important. We have analysed human mRNAs that have AUG triplets in 5' untranslated regions and mRNAs in which the annotated translational start codon is located in a suboptimal context. It was found that 3% of human mRNAs have the potential to encode N-terminally extended variants of the annotated proteins and 12% could code for N-truncated variants. The predicted subcellular localizations of these protein variants were compared: 31% of the N-extended proteins and 30% of the N-truncated proteins were predicted to localize to subcellular compartments that differed from those targeted by the annotated protein forms. These results suggest that additional AUGs may frequently be exploited for the synthesis of proteins that possess novel functional properties.
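The leaky-scanning idea in this abstract can be sketched in Python as follows. The abstract does not define "suboptimal context"; this sketch assumes the widely used Kozak-style rule (a purine at position -3 and a G at position +4, counting the A of the AUG as +1). The function names and the three-level labels are hypothetical and are not taken from the paper.

```python
from typing import List


def start_context(mrna: str, aug: int) -> str:
    """Classify the context of the AUG whose A sits at 0-based index `aug`.

    Assumed rule: 'strong' if both a purine at -3 and a G at +4 are
    present, 'adequate' if only one is, 'weak' if neither is.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug:aug + 3] != "AUG":
        raise ValueError("position does not point at an AUG codon")
    # -3 is three bases upstream of the A; +4 is the base right after the G.
    purine_at_minus3 = aug >= 3 and seq[aug - 3] in "AG"
    g_at_plus4 = aug + 3 < len(seq) and seq[aug + 3] == "G"
    score = int(purine_at_minus3) + int(g_at_plus4)
    return ("weak", "adequate", "strong")[score]


def candidate_starts(mrna: str) -> List[int]:
    """List AUG positions a scanning ribosome might use, 5' to 3'.

    Simplification: scanning is assumed to stop at the first AUG in a
    'strong' context; every AUG met before that is kept as a candidate.
    """
    seq = mrna.upper().replace("T", "U")
    starts = []
    pos = seq.find("AUG")
    while pos != -1:
        starts.append(pos)
        if start_context(seq, pos) == "strong":
            break
        pos = seq.find("AUG", pos + 1)
    return starts


# Toy sequence (invented): a weak first AUG followed by a strong one.
toy = "CCUCCAUGCCCACCAUGGCC"
for p in candidate_starts(toy):
    print(p, start_context(toy, p))
```

The sketch ignores reading frame and reinitiation, so it only marks where initiation could plausibly occur; deciding whether a downstream AUG yields an N-truncated variant would additionally require checking that it is in frame with the annotated coding sequence.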
Affiliation(s)
- Alex V Kochetov
- Institute of Cytology and Genetics, Novosibirsk 630090, Russia.
12
Gomez SM, Eiglmeier K, Segurens B, Dehoux P, Couloux A, Scarpelli C, Wincker P, Weissenbach J, Brey PT, Roth CW. Pilot Anopheles gambiae full-length cDNA study: sequencing and initial characterization of 35,575 clones. Genome Biol 2005; 6:R39. [PMID: 15833126 PMCID: PMC1088967 DOI: 10.1186/gb-2005-6-4-r39] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 10/01/2004] [Revised: 01/07/2005] [Accepted: 02/17/2005] [Indexed: 11/10/2022]
Abstract
We describe the preliminary analysis of over 35,000 clones from a full-length enriched cDNA library from the malaria mosquito vector Anopheles gambiae. The clones define nearly 3,700 genes, of which around 2,600 significantly improve current gene definitions. An additional 17% of the genes were not previously annotated, suggesting that an equal percentage may be missing from the current Anopheles genome annotation.
Affiliation(s)
- Shawn M Gomez
- Unité de Biochimie et Biologie Moléculaire des Insectes and CNRS FRE 2849, Institut Pasteur, 75724 Paris Cedex 15, France
- Karin Eiglmeier
- Unité de Biochimie et Biologie Moléculaire des Insectes and CNRS FRE 2849, Institut Pasteur, 75724 Paris Cedex 15, France
- Beatrice Segurens
- Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France
- Pierre Dehoux
- Plate-forme Intégration et Analyse Génomiques, Institut Pasteur, 75724 Paris Cedex 15, France
- Arnaud Couloux
- Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France
- Claude Scarpelli
- Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France
- Patrick Wincker
- Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France
- Jean Weissenbach
- Genoscope/Centre National de Séquençage and CNRS UMR 8030, 91057 Evry Cedex, France
- Paul T Brey
- Unité de Biochimie et Biologie Moléculaire des Insectes and CNRS FRE 2849, Institut Pasteur, 75724 Paris Cedex 15, France
- Charles W Roth
- Unité de Biochimie et Biologie Moléculaire des Insectes and CNRS FRE 2849, Institut Pasteur, 75724 Paris Cedex 15, France
13
Furutani-Seiki M, Wittbrodt J. Medaka and zebrafish, an evolutionary twin study. Mech Dev 2005; 121:629-37. [PMID: 15210172 DOI: 10.1016/j.mod.2004.05.010] [Citation(s) in RCA: 162] [Impact Index Per Article: 8.1] [Received: 05/17/2004] [Accepted: 05/17/2004] [Indexed: 02/08/2023]
Abstract
Comparison of two related species is one of the most successful approaches to decipher general genetic principles in eukaryotes. This is best illustrated in yeast, where the model systems Saccharomyces cerevisiae and Schizosaccharomyces pombe have been examined. Powerful forward genetics in both species, species-specific differences in biological features and the phylogenetic distance between the two species make them well suited for a comparative approach. Recent whole genome sequencing has also facilitated comparative genomics of these simple eukaryotes. It is now possible to go a step further using higher eukaryotes. A duplication of the genome at the base of the teleost radiation facilitated the evolution of almost 25,000 fish species, more than half of all vertebrate species. Two teleost genetic model systems have emerged in the past few decades: zebrafish, in which large-scale mutagenesis has been successfully performed, and Medaka, a Japanese killifish with a century of history in genetics and now, as reported in this issue, many induced mutations. In this review we will illustrate how comparison of these two model species, Medaka and zebrafish, can reveal conserved and species-specific genetic and molecular mechanisms underlying vertebrate development.
Affiliation(s)
- Makoto Furutani-Seiki
- SORST, Kondoh research team, Japan Science and Technology Agency (JST), Kyoto, Japan.
14
Kochetov AV. AUG codons at the beginning of protein coding sequences are frequent in eukaryotic mRNAs with a suboptimal start codon context. Bioinformatics 2004; 21:837-40. [PMID: 15531618 DOI: 10.1093/bioinformatics/bti136] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. However, mRNAs with a suboptimal context of start AUG codon are relatively abundant. It is likely that at least some mRNAs with suboptimal start codon context contain the other signals providing additional information for efficient AUG recognition. RESULTS Frequency of AUG codons at the beginning of the coding part of eukaryotic mRNAs was analyzed in relation to the context of translation start codon. It was found that the observed downstream AUG content in the mRNAs with optimal start codon context was close to the expected value, whereas it was significantly higher in the mRNAs with a suboptimal context. It is likely that downstream AUG codons can often be utilized as additional start sites to increase translation rate of mRNAs with a suboptimal context of the annotated start codon and many eukaryotic proteins can be characterized by some N-end heterogeneity.
Affiliation(s)
- Alex V Kochetov
- Institute of Cytology and Genetics Lavrentieva 10, Novosibirsk 630090 Russia.