1
Le NMT, So KK, Chun J, Kim DH. Expression of virus-like particles (VLPs) of foot-and-mouth disease virus (FMDV) using Saccharomyces cerevisiae. Appl Microbiol Biotechnol 2024; 108:81. [PMID: 38194136 PMCID: PMC10776484 DOI: 10.1007/s00253-023-12902-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 09/19/2023] [Accepted: 09/30/2023] [Indexed: 01/10/2024]
Abstract
We engineered Saccharomyces cerevisiae to express structural proteins of foot-and-mouth disease virus (FMDV) and produce virus-like particles (VLPs). The gene, which encodes four structural capsid proteins (VP0 (VP4 and VP2), VP3, and VP1), followed by a translational "ribosomal skipping" sequence consisting of 2A and protease 3C, was codon-optimized and chemically synthesized. The cloned gene was used to transform S. cerevisiae 2805 strain. Western blot analysis revealed that the polyprotein consisting of VP0, VP3, and VP1 was processed into the discrete capsid proteins. Western blot analysis of 3C confirmed the presence of discrete 3C protein, suggesting that the 2A sequence functioned as a "ribosomal skipping" signal in the yeast for an internal re-initiation of 3C translation from a monocistronic transcript, thereby indicating polyprotein processing by the discrete 3C protease. Moreover, a band corresponding to only VP2, which was known to be non-enzymatically processed from VP0 to both VP4 and VP2 during viral assembly, further validated the assembly of processed capsid proteins into VLPs. Electron microscopy showed the presence of the characteristic icosahedral VLPs. Our results clearly demonstrate that S. cerevisiae processes the viral structural polyprotein using a viral 3C protease and the resulting viral capsid subunits are assembled into virion particles. KEY POINTS: • Ribosomal skipping by self-cleaving FMDV peptide in S. cerevisiae. • Proteolytic processing of a structural polyprotein from a monocistronic transcript. • Assembly of the processed viral capsid proteins into a virus-like particle.
Affiliation(s)
- Ngoc My Tieu Le
- Department of Bioactive Material Sciences, Jeonbuk National University, Jeonju, 54896, Jeollabuk-do, Republic of Korea
- Kum-Kang So
- Institute for Molecular Biology and Genetics, Department of Molecular Biology, Jeonbuk National University, Jeonju, Jeollabuk-Do, Republic of Korea
- Jeesun Chun
- Institute for Molecular Biology and Genetics, Department of Molecular Biology, Jeonbuk National University, Jeonju, Jeollabuk-Do, Republic of Korea
- Dae-Hyuk Kim
- Department of Bioactive Material Sciences, Jeonbuk National University, Jeonju, 54896, Jeollabuk-do, Republic of Korea.
- Institute for Molecular Biology and Genetics, Department of Molecular Biology, Jeonbuk National University, Jeonju, Jeollabuk-Do, Republic of Korea.
2
Gwon Y, So KK, Chun J, Kim DH. Metabolic engineering of Saccharomyces cerevisiae for the biosynthesis of a fungal pigment from the phytopathogenic fungus Cladosporium phlei. J Biol Eng 2024; 18:33. [PMID: 38741106 DOI: 10.1186/s13036-024-00429-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 05/03/2024] [Indexed: 05/16/2024]
Abstract
BACKGROUND: Cladosporium phlei is a phytopathogenic fungus that produces a pigment called phleichrome. This fungal perylenequinone plays an important role in the production of a photosensitizer that is a necessary component of photodynamic therapy. We applied synthetic biology to produce phleichrome using Saccharomyces cerevisiae. RESULTS: The gene Cppks1, which encodes a non-reducing polyketide synthase (NR-PKS) responsible for the biosynthesis of phleichrome in C. phlei, was cloned into a yeast episomal vector and used to transform S. cerevisiae. In addition, a gene encoding a phosphopantetheinyl transferase (PPTase) of Aspergillus nidulans was cloned into a yeast integrative vector and also introduced into S. cerevisiae for the enzymatic activation of the protein product of Cppks1. Co-transformed yeasts were screened on a leucine/uracil-deficient selective medium, and the presence of both the integrative and the episomal recombinant plasmids in the yeast was confirmed by colony PCR. The episomal vector for Cppks1 expression was so dramatically unstable during cultivation that most cells lost their episomal vector rapidly in nonselective media. This loss was also observed to a lesser degree in selective media. These data strongly suggest that the presence of the Cppks1 gene exerts a significant detrimental effect on the growth of transformed yeast cells and that selection pressure is required to maintain the Cppks1-expressing vector. The co-transformants on the selective medium showed distinctive changes in pigmentation after a period of prolonged cultivation at 20 °C and 25 °C, but not at 30 °C. Furthermore, thin layer chromatography (TLC) revealed the presence of a spot corresponding to purified phleichrome in the extract from the cells of the co-transformants. Liquid chromatography-tandem mass spectrometry (LC/MS/MS) verified that the newly expressed pigment was indeed phleichrome.
CONCLUSION: Our results indicate that metabolic engineering by multiple gene expression is possible and capable of producing the fungal pigment phleichrome in S. cerevisiae. This result adds to our understanding of the characteristics of fungal PKS genes, which exhibit complex structures and diverse biological activities.
Affiliation(s)
- Yeji Gwon
- Department of Bioactive Material Sciences, Jeonbuk National University, Jeonju, 54896, Republic of Korea
- Kum-Kang So
- Institute for Molecular Biology and Genetics, Jeonbuk National University, Jeonju, 54896, Republic of Korea
- Department of Molecular Biology, Jeonbuk National University, Jeonju, 54896, Republic of Korea
- Jeesun Chun
- Institute for Molecular Biology and Genetics, Jeonbuk National University, Jeonju, 54896, Republic of Korea
- Department of Molecular Biology, Jeonbuk National University, Jeonju, 54896, Republic of Korea
- Dae-Hyuk Kim
- Department of Bioactive Material Sciences, Jeonbuk National University, Jeonju, 54896, Republic of Korea.
- Institute for Molecular Biology and Genetics, Jeonbuk National University, Jeonju, 54896, Republic of Korea.
- Department of Molecular Biology, Jeonbuk National University, Jeonju, 54896, Republic of Korea.
3
Wacholder A, Parikh SB, Coelho NC, Acar O, Houghton C, Chou L, Carvunis AR. A vast evolutionarily transient translatome contributes to phenotype and fitness. Cell Syst 2023; 14:363-381.e8. [PMID: 37164009 PMCID: PMC10348077 DOI: 10.1016/j.cels.2023.04.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/18/2022] [Revised: 01/30/2023] [Accepted: 04/06/2023] [Indexed: 05/12/2023]
Abstract
Translation is the process by which ribosomes synthesize proteins. Ribosome profiling recently revealed that many short sequences previously thought to be noncoding are pervasively translated. To identify protein-coding genes in this noncanonical translatome, we combine an integrative framework for extremely sensitive ribosome profiling analysis, iRibo, with high-powered selection inferences tailored for short sequences. We construct a reference translatome for Saccharomyces cerevisiae comprising 5,400 canonical and almost 19,000 noncanonical translated elements. Only 14 noncanonical elements were evolving under detectable purifying selection. A representative subset of translated elements lacking signatures of selection demonstrated involvement in processes including DNA repair, stress response, and post-transcriptional regulation. Our results suggest that most translated elements are not conserved protein-coding genes and contribute to genotype-phenotype relationships through fast-evolving molecular mechanisms.
Affiliation(s)
- Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Saurin Bipin Parikh
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Integrative Systems Biology Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Nelson Castilho Coelho
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Omer Acar
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Joint CMU-Pitt PhD Program in Computational Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Carly Houghton
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Joint CMU-Pitt PhD Program in Computational Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Lin Chou
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Integrative Systems Biology Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
4
Deciphering the Origin, Evolution, and Physiological Function of the Subtelomeric Aryl-Alcohol Dehydrogenase Gene Family in the Yeast Saccharomyces cerevisiae. Appl Environ Microbiol 2017; 84:AEM.01553-17. [PMID: 29079624 PMCID: PMC5734042 DOI: 10.1128/aem.01553-17] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 07/16/2017] [Accepted: 10/23/2017] [Indexed: 12/02/2022]
Abstract
Homology searches indicate that Saccharomyces cerevisiae strain BY4741 contains seven redundant genes that encode putative aryl-alcohol dehydrogenases (AAD). Yeast AAD genes are located in subtelomeric regions of different chromosomes, and their functional role(s) remain enigmatic. Here, we show that two of these genes, AAD4 and AAD14, encode functional enzymes that reduce aliphatic and aryl-aldehydes concomitant with the oxidation of cofactor NADPH, and that Aad4p and Aad14p exhibit different substrate preference patterns. Other yeast AAD genes are undergoing pseudogenization. The 5′ sequence of AAD15 has been deleted from the genome. Repair of an AAD3 missense mutation at the catalytically essential Tyr73 residue did not result in a functional enzyme. However, ancestral-state reconstruction by fusing Aad6 with Aad16 and by N-terminal repair of Aad10 restores NADPH-dependent aryl-alcohol dehydrogenase activities. Phylogenetic analysis indicates that AAD genes are narrowly distributed in wood-saprophyte fungi and in yeast that occupy lignocellulosic niches. Because yeast AAD genes exhibit activity on veratraldehyde, cinnamaldehyde, and vanillin, they could serve to detoxify aryl-aldehydes released during lignin degradation. However, none of these compounds induce yeast AAD gene expression, and Aad activities do not relieve aryl-aldehyde growth inhibition. Our data suggest an ancestral role for AAD genes in lignin degradation that is degenerating as a result of yeast's domestication and use in brewing, baking, and other industrial applications. IMPORTANCE Functional characterization of hypothetical genes remains one of the chief tasks of the postgenomic era. Although the first Saccharomyces cerevisiae genome sequence was published over 20 years ago, 22% of its estimated 6,603 open reading frames (ORFs) remain unverified. One outstanding example of this category of genes is the enigmatic seven-member AAD family. 
Here, we demonstrate that proteins encoded by two members of this family exhibit aliphatic and aryl-aldehyde reductase activity, and further that such activity can be recovered from pseudogenized AAD genes via ancestral-state reconstruction. The phylogeny of yeast AAD genes suggests that these proteins may have played an important ancestral role in detoxifying aromatic aldehydes in ligninolytic fungi. However, in yeast adapted to niches rich in sugars, AAD genes become subject to mutational erosion. Our findings shed new light on the selective pressures and molecular mechanisms by which genes undergo pseudogenization.
5
Crappé J, Van Criekinge W, Menschaert G. Little things make big things happen: A summary of micropeptide encoding genes. EuPA Open Proteomics 2014. [DOI: 10.1016/j.euprot.2014.02.006] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
6
Physical methods for genetic transformation of fungi and yeast. Phys Life Rev 2014; 11:184-203. [DOI: 10.1016/j.plrev.2014.01.007] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 01/15/2014] [Accepted: 01/21/2014] [Indexed: 01/27/2023]
7
Su M, Ling Y, Yu J, Wu J, Xiao J. Small proteins: untapped area of potential biological importance. Front Genet 2013; 4:286. [PMID: 24379829 PMCID: PMC3864261 DOI: 10.3389/fgene.2013.00286] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 10/11/2013] [Accepted: 11/27/2013] [Indexed: 01/13/2023]
Abstract
Polypeptides containing ≤100 amino acid residues (AAs) are generally considered to be small proteins (SPs). Many studies have shown that some SPs are involved in important biological processes, including cell signaling, metabolism, and growth. SPs generally have a simple domain, which makes them advantageous as model systems for overcoming folding speed limits in protein folding simulation and for drug design. However, SPs were once thought to be trivial molecules in biological processes compared to large proteins. Because of the constraints of experimental methods and bioinformatics analysis, many genome projects have used a length threshold of 100 amino acid residues to minimize erroneous predictions, and SPs are relatively under-represented in earlier studies. General protein discovery methods have difficulty predicting and validating SPs, and very few effective tools and algorithms have been developed specifically for SP identification. In this review, we mainly consider the diverse strategies applied to SP prediction and discuss the challenge of differentiating SP-coding genes from artifacts. We also summarize current large-scale discovery of SPs in species at the genome level. In addition, we present an overview of SPs with regard to biological significance, structural application, and evolutionary characterization in an effort to gain insight into the significance of SPs.
Affiliation(s)
- Mingming Su
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China ; Graduate University of Chinese Academy of Sciences Beijing, China
- Yunchao Ling
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China ; Graduate University of Chinese Academy of Sciences Beijing, China
- Jun Yu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China
- Jiayan Wu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China
- Jingfa Xiao
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences Beijing, China
8
Re-annotation of protein-coding genes in the genome of Saccharomyces cerevisiae based on support vector machines. PLoS One 2013; 8:e64477. [PMID: 23874379 PMCID: PMC3707884 DOI: 10.1371/journal.pone.0064477] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 01/20/2013] [Accepted: 04/15/2013] [Indexed: 11/19/2022]
Abstract
The annotation of the well-studied organism Saccharomyces cerevisiae has been improving over the past decade, although debates remain unresolved over the number of biologically significant open reading frames (ORFs) in the yeast genome. We revisited the total count of protein-coding genes in the S. cerevisiae S288c genome using a theoretical approach that combines the Support Vector Machine (SVM) method with six widely used measurements of sequence statistical features. The accuracy of our method is over 99.5% in 10-fold cross-validation. Based on the annotation data in the Saccharomyces Genome Database (SGD), we studied the coding capacity of all 1744 ORFs that lack experimental results and suggested that, after removing 488 spurious ORFs, the overall number of chromosomal ORFs encoding proteins in yeast should be 6091. The importance of the present work lies in at least two aspects. First, cross-validation and retrospective examination showed the fidelity of our method in recognizing ORFs that likely encode proteins. Second, we have provided a web service that can be accessed at http://cobi.uestc.edu.cn/services/yeast/, which enables the prediction of protein-coding ORFs of the genus Saccharomyces with a high accuracy.
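The 10-fold cross-validation mentioned in this abstract can be illustrated with a small sketch. It is not the authors' pipeline: the function names (`k_fold_indices`, `cross_validate`, `train_threshold`) are hypothetical, and a one-feature threshold rule stands in for the SVM and its six sequence features, which the abstract does not specify in detail. Only the fold-splitting and held-out scoring procedure is shown.

```python
import random
from typing import Callable, List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_threshold(xs: Sequence[float], ys: Sequence[bool]) -> Callable[[float], bool]:
    """Placeholder classifier (assumption, not an SVM): cut halfway between class means."""
    pos = [x for x, y in zip(xs, ys) if y]
    neg = [x for x, y in zip(xs, ys) if not y]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x > cut


def cross_validate(features: Sequence[float], labels: Sequence[bool],
                   train=train_threshold, k: int = 10, seed: int = 0) -> float:
    """Train on k-1 folds, score on the held-out fold, and pool accuracy over all folds."""
    correct = 0
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(model(features[i]) == labels[i] for i in held_out)
    return correct / len(labels)
```

Every example is scored exactly once, by a model that never saw it during training, which is what makes the pooled accuracy an estimate of performance on unseen ORFs.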
9
Ladoukakis E, Pereira V, Magny EG, Eyre-Walker A, Couso JP. Hundreds of putatively functional small open reading frames in Drosophila. Genome Biol 2011; 12:R118. [PMID: 22118156 PMCID: PMC3334604 DOI: 10.1186/gb-2011-12-11-r118] [Citation(s) in RCA: 120] [Impact Index Per Article: 9.2] [Received: 05/25/2011] [Revised: 11/04/2011] [Accepted: 11/25/2011] [Indexed: 12/22/2022]
Abstract
Background: The relationship between DNA sequence and encoded information is still an unsolved puzzle. The number of protein-coding genes in higher eukaryotes identified by genome projects is lower than was expected, while a considerable amount of putatively non-coding transcription has been detected. Functional small open reading frames (smORFs) are known to exist in several organisms. However, coding sequence detection methods are biased against detecting such very short open reading frames. Thus, a substantial number of non-canonical coding regions encoding short peptides might await characterization. Results: Using bio-informatics methods, we have searched for smORFs of less than 100 amino acids in the putatively non-coding euchromatic DNA of Drosophila melanogaster, and initially identified nearly 600,000 of them. We have studied the pattern of conservation of these smORFs as coding entities between D. melanogaster and Drosophila pseudoobscura, their presence in syntenic and in transcribed regions of the genome, and their ratio of conservative versus non-conservative nucleotide changes. For negative controls, we compared the results with those obtained using random short sequences, while a positive control was provided by smORFs validated by proteomics data. Conclusions: The combination of these analyses led us to postulate the existence of at least 401 functional smORFs in Drosophila, with the possibility that as many as 4,561 such functional smORFs may exist.
10
Teste MA, François JM, Parrou JL. Characterization of a new multigene family encoding isomaltases in the yeast Saccharomyces cerevisiae, the IMA family. J Biol Chem 2010; 285:26815-26824. [PMID: 20562106 DOI: 10.1074/jbc.m110.145946] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Indexed: 01/05/2023]
Abstract
It has been known for a long time that the yeast Saccharomyces cerevisiae can assimilate alpha-methylglucopyranoside and isomaltose. We here report the identification of 5 genes (YGR287c, YIL172c, YJL216c, YJL221c and YOL157c), which, similar to the SUCx, MALx, or HXTx multigene families, are located in the subtelomeric regions of different chromosomes. They share high nucleotide sequence identities between themselves (66-100%) and with the MALx2 genes (63-74%). Comparison of their amino acid sequences underlined a substitution of threonine by valine in region II, one of the four highly conserved regions of the alpha-glucosidase family. This change was previously shown to be sufficient to discriminate alpha-1,4- to alpha-1,6-glucosidase activity in YGR287c (Yamamoto, K., Nakayama, A., Yamamoto, Y., and Tabata, S. (2004) Eur. J. Biochem. 271, 3414-3420). We showed that each of these five genes encodes a protein with alpha-glucosidase activity on isomaltose, and we therefore renamed these genes IMA1 to IMA5 for IsoMAltase. Our results also illustrated that sequence polymorphisms among this family led to interesting variability of gene expression patterns and of catalytic efficiencies on different substrates, which altogether should account for the absence of functional redundancy for growth on isomaltose. Indeed, deletion studies revealed that IMA1/YGR287c encodes the major isomaltase and that growth on isomaltose required the presence of AGT1, which encodes an alpha-glucoside transporter. Expressions of IMA1 and IMA5/YJL216c were strongly induced by maltose, isomaltose, and alpha-methylglucopyranoside, in accordance with their regulation by the Malx3p-transcription system. The physiological relevance of this IMAx multigene family in S. cerevisiae is discussed.
Affiliation(s)
- Marie-Ange Teste
- CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France; Université de Toulouse, INSA, UPS, INP, LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
- Jean Marie François
- CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France; Université de Toulouse, INSA, UPS, INP, LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
- Jean-Luc Parrou
- CNRS, UMR5504, F-31400 Toulouse, France; INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France; Université de Toulouse, INSA, UPS, INP, LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France.
11
Luo L, Li H, Zhang L. ORF organization and gene recognition in the yeast genome. Comp Funct Genomics 2010; 4:318-28. [PMID: 18629282 PMCID: PMC2448446 DOI: 10.1002/cfg.292] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/31/2002] [Revised: 03/03/2003] [Accepted: 03/10/2003] [Indexed: 11/10/2022]
Abstract
Some rules on gene recognition and ORF organization in the Saccharomyces cerevisiae genome are demonstrated by statistical analyses of sequence data. This study includes: (a) The random frame rule: the six reading frames W1, W2, W3, C1, C2 and C3 in the double-stranded genome are randomly occupied by ORFs (related phenomena on ORF overlapping are also discussed). (b) The inhomogeneity rule: coding and non-coding ORFs differ in inhomogeneity of base composition in the three codon positions. By use of the inhomogeneity index (IHI), one can make a distinction between coding (IHI > 14) and non-coding (IHI ≤ 14) ORFs at 95% accuracy. We find that 'spurious' ORFs (with IHI ≤ 14) are distributed mainly in three classes of ORFs, namely, those with 'similarity to unknown proteins', those with 'no similarity', or 'questionable ORFs'. The total number of spurious ORFs (which are unlikely to be regarded as coding ORFs) is estimated to be 470. (c) The evaluation of ORF length distribution shows that below 200 amino acids the occurrence of ATG initiator ORFs is close to random.
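To make the six reading frames and the ATG-initiated ORF lengths in this abstract concrete, here is a minimal sketch of a six-frame ORF scan. It is an illustration only, not the authors' method: the names `Orf`, `scan_strand`, and `six_frame_orfs` are hypothetical, the mapping of W1 to W3 onto offsets of the given strand and C1 to C3 onto offsets of its reverse complement is an assumed convention, and the IHI statistic is not implemented because the abstract does not define it.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    frame: str      # "W1".."W3" (given strand) or "C1".."C3" (reverse complement)
    start: int      # 0-based offset of the ATG on the scanned strand
    length_aa: int  # codons from the ATG up to, but excluding, the stop codon


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def scan_strand(seq: str, label: str) -> List[Orf]:
    """Collect ATG-to-stop ORFs in the three frames of one strand."""
    orfs = []
    for offset in range(3):
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens an ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                orfs.append(Orf(f"{label}{offset + 1}", start, (i - start) // 3))
                start = None  # ORF closed; look for the next ATG
    return orfs


def six_frame_orfs(seq: str) -> List[Orf]:
    """Scan both strands; frame labels follow the assumed W/C convention above."""
    seq = seq.upper()
    return scan_strand(seq, "W") + scan_strand(reverse_complement(seq), "C")
```

Filtering the result on `length_aa`, for example keeping ORFs shorter than 200 residues, gives the short-ORF population whose occurrence the abstract describes as close to random.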
Affiliation(s)
- Liaofu Luo
- Laboratory of Theoretical Biophysics, Faculty of Science and Technology, Inner Mongolia University, Hohhot 010021, China
12
Luo Z, van Vuuren HJJ. Functional analyses of PAU genes in Saccharomyces cerevisiae. Microbiology (Reading) 2009; 155:4036-4049. [DOI: 10.1099/mic.0.030726-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
PAU genes constitute the largest gene family in Saccharomyces cerevisiae, with 24 members mostly located in the subtelomeric regions of chromosomes. Little information is available about PAU genes, other than expression data for some members. In this study, we systematically compared the sequences of all 24 members, examined the expression of PAU3, PAU5, DAN2, PAU17 and PAU20 in response to stresses, and investigated the stability of all Pau proteins. The chromosomal localization, synteny and sequence analyses revealed that PAU genes could have been amplified by segmental and retroposition duplication through mechanisms of chromosomal end translocation and Ty-associated recombination. The coding sequences diverged through nucleotide substitution and insertion/deletion of one to four codons, thus causing changes in amino acids, truncation or extension of Pau proteins. Pairwise comparison of non-coding regions revealed little homology in flanking sequences of some members. All 24 PAU promoters contain a TATA box, and 22 PAU promoters contain at least one copy of the anaerobic response element and the aerobic repression motif. Differential expression was observed among PAU3, PAU5, PAU17, PAU20 and DAN2 in response to stress, with PAU5 having the highest capacity to be induced by anaerobic conditions, low temperature and wine fermentations. Furthermore, Pau proteins with 124 aa were less stable than those with 120 or 122 aa. Our results indicate that duplicated PAU genes have been evolving, and the individual Pau proteins might possess specific roles for the adaptation of S. cerevisiae to certain environmental stresses.
Affiliation(s)
- Zongli Luo
- Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Hennie J. J. van Vuuren
- Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
13
Abstract
The yeast genetics community has embraced genomic biology, and there is a general understanding that obtaining a full encyclopedia of functions of the approximately 6000 genes is a worthwhile goal. The yeast literature comprises over 40,000 research papers, and the number of yeast researchers exceeds the number of genes. There are mutated and tagged alleles for virtually every gene, and hundreds of high-throughput data sets and computational analyses have been described. Why, then, are there >1000 genes still listed as uncharacterized on the Saccharomyces Genome Database, 10 years after sequencing the genome of this powerful model organism? Examination of the currently uncharacterized gene set suggests that while some are small or newly discovered, the vast majority were evident from the initial genome sequence. Most are present in multiple genomics data sets, which may provide clues to function. In addition, roughly half contain recognizable protein domains, and many of these suggest specific metabolic activities. Notably, the uncharacterized gene set is highly enriched for genes whose only homologs are in other fungi. Achieving a full catalog of yeast gene functions may require a greater focus on the life of yeast outside the laboratory.
Affiliation(s)
- Lourdes Peña-Castillo
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada
14
Haraguchi N, Andoh T, Frendewey D, Tani T. Mutations in the SF1-U2AF59-U2AF23 Complex Cause Exon Skipping in Schizosaccharomyces pombe. J Biol Chem 2007; 282:2221-8. [PMID: 17130122 DOI: 10.1074/jbc.m609430200] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
Abstract
To identify genes involved in the mechanism to ensure ordered 5' to 3' exon joining in constitutively spliced pre-mRNAs, we screened for mutants that cause exon skipping in the fission yeast Schizosaccharomyces pombe using a reporter plasmid, which contains the ura4+ gene with the nda3 intron 1-exon 2-intron 2 sequence. The reporter plasmid was designed to produce the functional ura4+ mRNA, when the central nda3 exon is skipped during the splicing reaction. We mutagenized cells harboring the plasmid by UV irradiation and isolated 34 ura+ mutants that grew on minimal medium. Of those, eight mutants were found to be temperature sensitive (ts) for growth. Complementation analyses revealed that the ts mutants belong to three distinct complementation groups named ods (ordered splicing) 1, 2, and 3. RT-PCR analyses showed that products of exon skipping were actually generated in the ods mutants. We cloned the genes responsible for the ods mutations, and found that ods1+, ods2+, and ods3+ encode splicing factors Prp2p/U2AF59, U2AF23, and SF1, respectively, which form a SF1-U2AF59-U2AF23 complex involved in recognition of the branch-point and 3' splice site sequences in a pre-mRNA. We also showed that mutations in the SF1-U2AF59-U2AF23 binding sequences in the reporter plasmid result in exon skipping in wild-type S. pombe cells. In addition, drugs that decrease the rate of transcription elongation were found to suppress the exon skipping in the ods mutants. These results suggest that co-transcriptional recognition of a nascent pre-mRNA by the SF1-U2AF59-U2AF23 complex is essential for ordered exon joining in constitutive splicing in S. pombe.
Affiliation(s)
- Noriko Haraguchi
- Department of Biological Sciences, Graduate School of Science and Technology, Kumamoto University, Kumamoto 860-8555, Japan
15
Fisk DG, Ball CA, Dolinski K, Engel SR, Hong EL, Issel-Tarver L, Schwartz K, Sethuraman A, Botstein D, Cherry JM. Saccharomyces cerevisiae S288C genome annotation: a working hypothesis. Yeast 2006; 23:857-65. [PMID: 17001629 PMCID: PMC3040122 DOI: 10.1002/yea.1400] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Indexed: 11/11/2022]
Abstract
The S. cerevisiae genome is the most well-characterized eukaryotic genome and one of the simplest in terms of identifying open reading frames (ORFs), yet its primary annotation has been updated continually in the decade since its initial release in 1996 (Goffeau et al., 1996). The Saccharomyces Genome Database (SGD; www.yeastgenome.org) (Hirschman et al., 2006), the community-designated repository for this reference genome, strives to ensure that the S. cerevisiae annotation is as accurate and useful as possible. At SGD, the S. cerevisiae genome sequence and annotation are treated as a working hypothesis, which must be repeatedly tested and refined. In this paper, in celebration of the tenth anniversary of the completion of the S. cerevisiae genome sequence, we discuss the ways in which the S. cerevisiae sequence and annotation have changed, consider the multiple sources of experimental and comparative data on which these changes are based, and describe our methods for evaluating, incorporating and documenting these new data.
Affiliation(s)
- Dianna G. Fisk
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- Catherine A. Ball
- Department of Biochemistry, School of Medicine, Stanford University, Stanford, CA 94305-5307, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
- Stacia R. Engel
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- Eurie L. Hong
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- Katja Schwartz
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- Anand Sethuraman
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Department of Molecular Biology, Princeton University, Princeton, NJ 08544, USA
- J. Michael Cherry
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
- Correspondence to: J. Michael Cherry, Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
16
|
Kastenmayer JP, Ni L, Chu A, Kitchen LE, Au WC, Yang H, Carter CD, Wheeler D, Davis RW, Boeke JD, Snyder MA, Basrai MA. Functional genomics of genes with small open reading frames (sORFs) in S. cerevisiae. Genome Res 2006; 16:365-73. [PMID: 16510898 PMCID: PMC1415214 DOI: 10.1101/gr.4355406] [Citation(s) in RCA: 157] [Impact Index Per Article: 8.7] [Indexed: 11/24/2022]
Abstract
Genes with small open reading frames (sORFs; <100 amino acids) represent an untapped source of important biology. sORFs largely escaped analysis because they were difficult to predict computationally and less likely to be targeted by genetic screens. Thus, the substantial number of sORFs and their potential importance have only recently become clear. To investigate sORF function, we undertook the first functional studies of sORFs in any system, using the model eukaryote Saccharomyces cerevisiae. Based on independent experimental approaches and computational analyses, evidence exists for 299 sORFs in the S. cerevisiae genome, representing approximately 5% of the annotated ORFs. We determined that a similar percentage of sORFs are annotated in other eukaryotes, including humans, and 184 of the S. cerevisiae sORFs exhibit similarity with ORFs in other organisms. To investigate sORF function, we constructed a collection of gene-deletion mutants of 140 newly identified sORFs, each of which contains a strain-specific "molecular barcode," bringing the total number of sORF deletion strains to 247. Phenotypic analyses of the new gene-deletion strains identified 22 sORFs required for haploid growth, growth at high temperature, growth in the presence of a nonfermentable carbon source, or growth in the presence of DNA damage and replication-arrest agents. We provide a collection of sORF deletion strains that can be integrated into the existing deletion collection as a resource for the yeast community for elucidating gene function. Moreover, our analyses of the S. cerevisiae sORFs establish that sORFs are conserved across eukaryotes and have important biological functions.
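The size criterion in this abstract (sORFs encode fewer than 100 amino acids) can be illustrated with a minimal sketch. The function names `find_orfs` and `small_orfs` are hypothetical, and the scan is deliberately simplified: it assumes a plain ATG-to-stop definition on the forward strand only and ignores introns, so it is not the prediction procedure used in the study.

```python
# Illustrative only: a naive forward-strand ORF scan, not the authors' method.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq):
    """Return (start, end, n_codons) for each ATG..stop ORF in the three
    forward reading frames. `end` is exclusive and includes the stop codon;
    `n_codons` counts coding codons only (stop codon excluded)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, (i - start) // 3))
                start = None  # look for the next ORF in this frame
    return orfs


def small_orfs(seq, max_aa=100):
    """Keep ORFs encoding fewer than `max_aa` amino acids (threshold taken
    from the abstract's definition of an sORF)."""
    return [orf for orf in find_orfs(seq) if orf[2] < max_aa]
```

For example, `small_orfs("CCATGAAATTTTAGCC")` reports one three-codon ORF starting at index 2. A real annotation pipeline would also scan the reverse strand and weigh experimental and comparative evidence, as the abstract describes.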
Affiliation(s)
- James P Kastenmayer
- Genetics Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20889, USA
|
17
|
Guerra OG, Rubio IGS, da Silva Filho CG, Bertoni RA, Dos Santos Govea RC, Vicente EJ. A novel system of genetic transformation allows multiple integrations of a desired gene in Saccharomyces cerevisiae chromosomes. J Microbiol Methods 2006; 67:437-45. [PMID: 16831478 DOI: 10.1016/j.mimet.2006.04.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 02/23/2006] [Revised: 04/20/2006] [Accepted: 04/21/2006] [Indexed: 11/23/2022]
Abstract
Increasing industrial competitiveness and productivity demand that recombinant yeast strains, used in many different processes, be constantly adapted and/or genetically improved to suit changing requirements. Among yeasts, Saccharomyces cerevisiae is the best-studied organism, and the most frequently employed yeast in industrial processes. In the present study, laboratory strains and industrial S. cerevisiae strains were stably transformed with a novel vector containing the glucoamylase cDNA of Aspergillus awamori flanked by delta-sequences (deltaGlucodelta), and lacking a positive selection marker. Co-transformation with known plasmids allowed selection by auxotrophic complementation of the leu2 mutation and/or geneticin resistance (G418). In all cases, several copies of the deltaGlucodelta vector were inserted into the genome of the yeast cell without selective pressure, showing 100% stability after 80 generations. Transformation frequency of the new vector was similar for S. cerevisiae laboratory strains and industrial wild-type S. cerevisiae strains. This novel genetic transformation system is versatile and suitable to introduce several stable copies of a desired expression cassette into the genome of different S. cerevisiae yeast strains.
Affiliation(s)
- Odanir Garcia Guerra
- Department of Microbiology, Biomedical Sciences Institute, University of São Paulo-USP, Av. Prof. Lineu Prestes 1374, Cidade Universitária, São Paulo, Brazil
|
18
|
Hirschman JE, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hong EL, Livstone MS, Nash R, Park J, Oughtred R, Skrzypek M, Starr B, Theesfeld CL, Williams J, Andrada R, Binkley G, Dong Q, Lane C, Miyasato S, Sethuraman A, Schroeder M, Thanawala MK, Weng S, Dolinski K, Botstein D, Cherry JM. Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome. Nucleic Acids Res 2006; 34:D442-5. [PMID: 16381907 PMCID: PMC1347479 DOI: 10.1093/nar/gkj117] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.6] [Indexed: 11/27/2022]
Abstract
Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) has created the Genome Snapshot. The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S. cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool, and all the data presented on this page are available from the SGD ftp site.
Affiliation(s)
- Michael S. Livstone
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Mark Schroeder
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- David Botstein
- Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Washington Road, Princeton, NJ 08544, USA
- J. Michael Cherry
- To whom correspondence should be addressed. Tel: +1 650 723 7541; Fax: +1 650 725 1534
|
19
|
Abstract
Recent sequencing efforts and experiments have advanced our understanding of genome evolution in yeasts, particularly the Saccharomyces yeasts. The ancestral genome of the Saccharomyces sensu stricto complex has been subject to both whole-genome duplication, followed by massive sequence loss and divergence, and segmental duplication. In addition, the subtelomeric regions are subject to further duplications and rearrangements via ectopic exchanges. Translocations and other gross chromosomal rearrangements that break down syntenic relationships occur; however, they do not appear to be a driving force of speciation. Analysis of single genomes has been fruitful for generating hypotheses such as the whole-genome duplication, but comparative genomics between close and more distant species has proven to be a powerful tool for testing these hypotheses as well as elucidating evolutionary processes acting on the genome. Future work on population genomics and experimental evolution will keep yeast at the forefront of studies in genome evolution.
Affiliation(s)
- Gianni Liti
- Institute of Genetics, University of Nottingham, Queen's Medical Centre, Nottingham NG7 2UH, United Kingdom.
|
20
|
Gelperin DM, White MA, Wilkinson ML, Kon Y, Kung LA, Wise KJ, Lopez-Hoyo N, Jiang L, Piccirillo S, Yu H, Gerstein M, Dumont ME, Phizicky EM, Snyder M, Grayhack EJ. Biochemical and genetic analysis of the yeast proteome with a movable ORF collection. Genes Dev 2005; 19:2816-26. [PMID: 16322557 PMCID: PMC1315389 DOI: 10.1101/gad.1362105] [Citation(s) in RCA: 390] [Impact Index Per Article: 20.5] [Received: 08/08/2005] [Accepted: 09/26/2005] [Indexed: 11/24/2022]
Abstract
Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was constructed, each expressing a sequence-verified ORF as a C-terminal ORF fusion protein, under regulated control. Analysis of 5573 MORFs demonstrates that nearly all verified ORFs are expressed, suggests the authenticity of 48 ORFs characterized as dubious, and implicates specific processes including cytoskeletal organization and transcriptional control in growth inhibition caused by overexpression. Global analysis of glycosylated proteins identifies 109 new confirmed N-linked and 345 candidate glycoproteins, nearly doubling the known yeast glycome.
Affiliation(s)
- Daniel M Gelperin
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, New York 14642, USA
|
21
|
Keeling PJ, Fast NM, Law JS, Williams BAP, Slamovits CH. Comparative genomics of microsporidia. Folia Parasitol (Praha) 2005; 52:8-14. [PMID: 16004359 DOI: 10.14411/fp.2005.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
Abstract
Microsporidia have been known for some time to possess among the smallest genomes of any eukaryote. There is now a completely sequenced microsporidian genome, as well as several other large-scale sequencing efforts, so the nature of these genomes is becoming apparent. This paper reviews some of the characteristics of microsporidian genomes in general, and some of the recent discoveries made through comparative genomic analyses. In general, microsporidian genomes are both reduced and compacted. Reduction takes place through gene loss, which is understandable in obligate intracellular parasites that rely on their host for many metabolites. Compaction is a more complex process, and is as yet not fully understood. It is clear from genomes surveyed thus far that the remaining genes are tightly packed and that there is little non-coding sequence, resulting in some extraordinary arrangements, including overlapping genes. Compaction also seems to affect certain aspects of genome evolution, like the frequency of rearrangements. The force behind this compaction is not known, and is especially interesting in light of the fact that surveys of genomes that are significantly different in size yield similar complements of protein-coding genes. There are some interesting exceptions, including catalase, photolyase and some mitochondrial proteins, but the rarity of these raises an interesting question as to what accounts for the significant differences seen in the genome sizes among microsporidia.
Affiliation(s)
- Patrick J Keeling
- Canadian Institute for Advanced Research, Botany Department, University of British Columbia, 3529-6270 University Boulevard, Vancouver, BC V6T 1Z4, Canada.
|
22
|
Balakrishnan R, Christie KR, Costanzo MC, Dolinski K, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Nash R, Oughtred R, Skrzypek M, Theesfeld CL, Binkley G, Dong Q, Lane C, Sethuraman A, Weng S, Botstein D, Cherry JM. Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD). Nucleic Acids Res 2005; 33:D374-7. [PMID: 15608219 PMCID: PMC539977 DOI: 10.1093/nar/gki023] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is a scientific database of gene, protein and genomic information for the yeast Saccharomyces cerevisiae. SGD has recently developed two new resources that facilitate nucleotide and protein sequence comparisons between S. cerevisiae and other organisms. The Fungal BLAST tool provides directed searches against all fungal nucleotide and protein sequences available from GenBank, divided into categories according to organism, status of completeness and annotation, and source. The Model Organism BLASTP Best Hits resource displays, for each S. cerevisiae protein, the single most similar protein from several model organisms and presents links to the database pages of those proteins, facilitating access to curated information about potential orthologs of yeast proteins.
Affiliation(s)
- Rama Balakrishnan
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
|
23
|
Loewen CJR, Levine TP. A highly conserved binding site in vesicle-associated membrane protein-associated protein (VAP) for the FFAT motif of lipid-binding proteins. J Biol Chem 2005; 280:14097-104. [PMID: 15668246 DOI: 10.1074/jbc.m500147200] [Citation(s) in RCA: 178] [Impact Index Per Article: 9.4] [Indexed: 11/06/2022]
Abstract
A variety of lipid-binding proteins contain a recently described motif, designated FFAT (two phenylalanines in an acidic tract), which binds to vesicle-associated-membrane protein-associated protein (VAP). VAP is a conserved integral membrane protein of the endoplasmic reticulum that contains at its amino terminus a domain related to the major sperm protein of nematode worms. Here we have studied the FFAT-VAP interaction in Saccharomyces cerevisiae, where the VAP homologue Scs2 regulates phospholipid metabolism via an interaction with the FFAT motif of Opi1. By introducing mutations at random into Scs2, we found that mutations that abrogated binding to FFAT were clustered in the most highly conserved region. Using site-directed mutagenesis, we identified several critical residues, including two lysines widely separated in the primary sequence. By examining all other conserved basic residues, we identified a third residue that was moderately important for binding FFAT. Modeling VAP on the known structure of major sperm protein showed that the critical residues form a patch on a positively charged face of the protein. In vivo functional studies of SCS22, a second SCS2-like gene in S. cerevisiae, showed that SCS2 was the dominant gene in the regulation of Opi1, with a minor contribution from SCS22. We then established that reduction in the affinity of Scs2 mutants for FFAT correlated well with loss of function, indicating the importance of these residues for binding FFAT motifs. Finally, we found that human VAP-A could substitute for Scs2 but that it functioned poorly, suggesting that other factors modulate the binding of Scs2 to proteins with FFAT motifs.
Affiliation(s)
- Christopher J R Loewen
- Division of Cell Biology, Institute of Ophthalmology, University College London, Bath Street, London EC1V 9EL, United Kingdom
|
24
|
Heidtman M, Chen CZ, Collins RN, Barlowe C. Yos1p is a novel subunit of the Yip1p-Yif1p complex and is required for transport between the endoplasmic reticulum and the Golgi complex. Mol Biol Cell 2005; 16:1673-83. [PMID: 15659647 PMCID: PMC1073651 DOI: 10.1091/mbc.e04-10-0873] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Indexed: 11/11/2022]
Abstract
Yeast Yip1p is a member of a conserved family of transmembrane proteins that interact with Rab GTPases. Previous studies also have indicated a role for Yip1p in the biogenesis of endoplasmic reticulum (ER)-derived COPII transport vesicles. In this report, we describe the identification and characterization of the uncharacterized open reading frame YER074W-A as a novel multicopy suppressor of the thermosensitive yip1-4 strain. We have termed this gene Yip One Suppressor 1 (YOS1). Yos1p is essential for growth and for function of the secretory pathway; depletion or inactivation of Yos1p blocks transport between the ER and the Golgi complex. YOS1 encodes an integral membrane protein of 87 amino acids that is conserved in eukaryotes. Yos1p localizes to ER and Golgi membranes and is efficiently packaged into ER-derived COPII transport vesicles. Yos1p associates with Yip1p and Yif1p, indicating Yos1p is a novel subunit of the Yip1p-Yif1p complex.
Affiliation(s)
- Matthew Heidtman
- Department of Biochemistry, Dartmouth Medical School, Hanover, NH 03755, USA
|
25
|
Wei GH, Liu DP, Liang CC. Charting gene regulatory networks: strategies, challenges and perspectives. Biochem J 2004; 381:1-12. [PMID: 15080794 PMCID: PMC1133755 DOI: 10.1042/bj20040311] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Received: 02/27/2004] [Revised: 04/13/2004] [Accepted: 04/13/2004] [Indexed: 11/17/2022]
Abstract
One of the foremost challenges in the post-genomic era will be to chart the gene regulatory networks of cells, including aspects such as genome annotation, identification of cis-regulatory elements and transcription factors, information on protein-DNA and protein-protein interactions, and data mining and integration. Some of these broad sets of data have already been assembled for building networks of gene regulation. Even though these datasets are still far from comprehensive, and the approach faces many important and difficult challenges, some strategies have begun to make connections between disparate regulatory events and to foster new hypotheses. In this article we review several different genomics and proteomics technologies, and present bioinformatics methods for exploring these data in order to make novel discoveries.
Affiliation(s)
- Gong-Hong Wei
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
- De-Pei Liu
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
- To whom correspondence should be addressed
- Chih-Chuan Liang
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
|
26
|
Boyer J, Badis G, Fairhead C, Talla E, Hantraye F, Fabre E, Fischer G, Hennequin C, Koszul R, Lafontaine I, Ozier-Kalogeropoulos O, Ricchetti M, Richard GF, Thierry A, Dujon B. Large-scale exploration of growth inhibition caused by overexpression of genomic fragments in Saccharomyces cerevisiae. Genome Biol 2004; 5:R72. [PMID: 15345056 PMCID: PMC522879 DOI: 10.1186/gb-2004-5-9-r72] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 04/24/2004] [Revised: 07/13/2004] [Accepted: 07/26/2004] [Indexed: 03/24/2023]
Abstract
We have screened the genome of Saccharomyces cerevisiae for fragments that confer a growth-retardation phenotype when overexpressed in a multicopy plasmid with a tetracycline-regulatable (Tet-off) promoter. We selected 714 such fragments with a mean size of 700 base-pairs out of around 84,000 clones tested. These include 493 in-frame open reading frame fragments corresponding to 454 distinct genes (of which 91 are of unknown function), and 162 out-of-frame, antisense and intergenic genomic fragments, representing the largest collection of toxic inserts published so far in yeast.
Affiliation(s)
- Jeanne Boyer
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Gwenaël Badis
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Unité de Génétique des Interactions Macromoléculaires (URA2171 CNRS), Department of Structure and Dynamics of Genomes, Institut Pasteur, 25 rue du Dr Roux, 75724 Paris-Cedex 15, France
- Cécile Fairhead
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Emmanuel Talla
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- CNRS-Laboratoire de Chimie Bactérienne, 31 Chemin Joseph Aiguier, 13402 Marseille-Cedex 20, France
- Florence Hantraye
- Unité de Génétique des Interactions Macromoléculaires (URA2171 CNRS), Department of Structure and Dynamics of Genomes, Institut Pasteur, 25 rue du Dr Roux, 75724 Paris-Cedex 15, France
- Emmanuelle Fabre
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Gilles Fischer
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Christophe Hennequin
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Laboratoire de Parasitologie, Faculté de Médecine St-Antoine, 27 rue de Chaligny, 75012 Paris, France
- Romain Koszul
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Ingrid Lafontaine
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Miria Ricchetti
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Unité de Génétique et Biochimie du Développement, Institut Pasteur, 25 rue du Dr Roux 75724 Paris-Cedex 15, France
- Guy-Franck Richard
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Agnès Thierry
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
- Bernard Dujon
- Unité de Génétique Moléculaire des Levures (URA2171 CNRS and UFR 927 Université Pierre et Marie Curie)
|
27
|
Kellis M, Patterson N, Birren B, Berger B, Lander ES. Methods in comparative genomics: genome correspondence, gene identification and regulatory motif discovery. J Comput Biol 2004; 11:319-55. [PMID: 15285895 DOI: 10.1089/1066527041410319] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.4] [Indexed: 11/12/2022]
Abstract
In Kellis et al. (2003), we reported the genome sequences of S. paradoxus, S. mikatae, and S. bayanus and compared these three yeast species to their close relative, S. cerevisiae. Genomewide comparative analysis allowed the identification of functionally important sequences, both coding and noncoding. In this companion paper we describe the mathematical and algorithmic results underpinning the analysis of these genomes. (1) We present methods for the automatic determination of genome correspondence. The algorithms enabled the automatic identification of orthologs for more than 90% of genes and intergenic regions across the four species despite the large number of duplicated genes in the yeast genome. The remaining ambiguities in the gene correspondence revealed recent gene family expansions in regions of rapid genomic change. (2) We present methods for the identification of protein-coding genes based on their patterns of nucleotide conservation across related species. We observed the pressure to conserve the reading frame of functional proteins and developed a test for gene identification with high sensitivity and specificity. We used this test to revisit the genome of S. cerevisiae, reducing the overall gene count by 500 genes (10% of previously annotated genes) and refining the gene structure of hundreds of genes. (3) We present novel methods for the systematic de novo identification of regulatory motifs. The methods do not rely on previous knowledge of gene function and in that way differ from the current literature on computational motif discovery. Based on genomewide conservation patterns of known motifs, we developed three conservation criteria that we used to discover novel motifs. We used an enumeration approach to select strongly conserved motif cores, which we extended and collapsed into a small number of candidate regulatory motifs. These include most previously known regulatory motifs as well as several noteworthy novel motifs. 
The majority of discovered motifs are enriched in functionally related genes, allowing us to infer a candidate function for novel motifs. Our results demonstrate the power of comparative genomics to further our understanding of any species. Our methods are validated by the extensive experimental knowledge in yeast and will be invaluable in the study of complex genomes like that of the human.
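The idea of testing whether the reading frame is conserved across related species, mentioned in point (2) of this abstract, can be sketched as follows. This is a simplified illustration under stated assumptions, not the test described in the paper: the function name `reading_frame_conservation` is hypothetical, the input is assumed to be a pairwise alignment with `-` as the gap character, and the score is simply the largest fraction of aligned nucleotide pairs that share one codon-phase offset.

```python
# Illustrative only: a toy reading-frame conservation score for a pairwise
# alignment. The published test may differ in every detail.
def reading_frame_conservation(ref_aln, other_aln):
    """Return the fraction of aligned (non-gap) nucleotide pairs whose
    codon phase agrees, maximized over the three possible frame offsets."""
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = other_pos = 0      # ungapped coordinates in each sequence
    offset_counts = [0, 0, 0]    # aligned pairs seen at each phase offset
    aligned = 0
    for a, b in zip(ref_aln, other_aln):
        a_gap, b_gap = a == "-", b == "-"
        if not a_gap and not b_gap:
            aligned += 1
            offset_counts[(other_pos - ref_pos) % 3] += 1
        if not a_gap:
            ref_pos += 1
        if not b_gap:
            other_pos += 1
    return max(offset_counts) / aligned if aligned else 0.0
```

A gap whose length is a multiple of three leaves the score at 1.0, whereas a single-nucleotide gap shifts the phase of everything downstream and lowers it, which is the intuition behind using frame conservation as evidence for a protein-coding region.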
Affiliation(s)
- Manolis Kellis
- Whitehead Institute Center for Genome Research, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
|
28
|
Abstract
For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown that the minimum number of genes from each species that need to be compared to produce a reliable phylogeny is about 20. Yeast has also become an attractive model to study speciation in eukaryotes, especially to understand molecular mechanisms behind the establishment of reproductive isolation. Comparison of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide the background to use more yeast species in model studies, to combat pathogens and for efficient manipulation of industrial strains.
Affiliation(s)
- Jure Piskur
- BioCentrum-DTU, Building 301, Technical University of Denmark, DK-2800 Kgl. Lyngby, Denmark.
|
29
|
Lafontaine I, Fischer G, Talla E, Dujon B. Gene relics in the genome of the yeast Saccharomyces cerevisiae. Gene 2004; 335:1-17. [PMID: 15194185 DOI: 10.1016/j.gene.2004.03.028] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 10/27/2003] [Revised: 02/09/2004] [Accepted: 03/29/2004] [Indexed: 10/26/2022]
Abstract
There is increasing evidence that DNA duplication is a common and ongoing process that plays a major role in molecular evolution of genomes and that a large fraction of the duplicated gene copies becomes non-functional by accumulation of deleterious mutations. In order to describe this phenomenon, we systematically searched the 6404 intergenic regions (IRs) of the genome of Saccharomyces cerevisiae for traces of coding sequences presenting degenerated but still recognizable sequence similarity with active open reading frames (5823 annotated ORFs). We detected a total of 124 anciently coding regions, or "gene relics", showing similarity to a total of 149 distinct active ORFs. This set of relics shows a continuum of sequence degeneration, from those whose sequence is slightly altered compared to the functional ORF (classically defined as pseudogenes) to those that contain so many deleterious mutations as to reach the limit of recognition. Gene relics are more concentrated in the subtelomeric regions of the chromosomes, reflecting the high plasticity of these regions. The presence of relics also revealed ancestral duplication events of chromosomal segments that were previously undetected. Some of these segments are intermingled with the more easily recognizable ancestral blocks of duplication, indicating successive duplication events. We present a compilation of all the data available, leading to a total of 278 pseudogenes in the genome of S. cerevisiae.
Affiliation(s)
- Ingrid Lafontaine
- Unité de Génétique Moléculaire des Levures, CNRS URA 2171, Institut Pasteur, Université Pierre et Marie Curie UFR 927, 25, rue du Docteur Roux 75724, Paris, Cedex 15, France.
|
30
|
Leh-Louis V, Wirth B, Despons L, Wain-Hobson S, Potier S, Souciet JL. Differential evolution of the Saccharomyces cerevisiae DUP240 paralogs and implication of recombination in phylogeny. Nucleic Acids Res 2004; 32:2069-78. [PMID: 15087486 PMCID: PMC407815 DOI: 10.1093/nar/gkh529] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 11/18/2003] [Revised: 02/05/2004] [Accepted: 03/16/2004] [Indexed: 11/14/2022] Open
Abstract
Multigene families are observed in all genomes sequenced so far and are the reflection of key evolutionary mechanisms. The DUP240 family, identified in Saccharomyces cerevisiae strain S288C, is composed of 10 paralogs: seven are organized as two tandem repeats and three are solo ORFs. To investigate the evolution of the three solo paralogs, YAR023c, YCR007c and YHL044w, we performed a comparative analysis between 15 S. cerevisiae strains. These three ORFs are present in all strains and the conservation of synteny indicates that they are not frequently involved in chromosomal reshaping, in contrast to the DUP240 ORFs organized in tandem repeats. Our analysis of nucleotide and amino acid variations indicates that YAR023c and YHL044w fix mutations more easily than YCR007c, although they all belong to the same multigene family. This comparative analysis was also conducted with five arbitrarily chosen Ascomycetes-specific genes and five arbitrarily chosen common genes (genes that have a homolog in at least one non-Ascomycetes organism). Ascomycetes-specific genes appear to be diverging faster than common genes in the S. cerevisiae species, a situation that was previously described between different yeast species. Our results point to the strong contribution, during DNA sequence evolution, of allelic recombination besides nucleotide substitution.
Affiliation(s)
- V Leh-Louis
- Laboratoire de Microbiologie et Génétique, FRE 2326 Université Louis Pasteur/CNRS, Institut de Botanique, F-67083 Strasbourg Cedex, France
|
31
|
Harrison PM, Carriero N, Liu Y, Gerstein M. A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs. J Mol Biol 2003; 333:885-92. [PMID: 14583187 DOI: 10.1016/j.jmb.2003.09.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
Abstract
Prokaryote gene annotation is complicated by large numbers of short open reading frames (ORFs) that arise naturally from genetic code design. Historically, many hypothetical ORFs have been annotated as genes in microbes, usually with an arbitrary length threshold (e.g. greater than 100 codons). Given the use of such thresholds, what is the extent of genuine undiscovered short genes in the current sampling of prokaryote genomes? To assess rigorously the potential under-annotation of short ORFs with homology, we exhaustively compared the polyORFome--all possible ORFs in 64 prokaryotes (53 bacteria and 11 archaea) plus budding yeast--to itself and to all known proteins. The novelty of our analysis is that, firstly, sequence comparisons to/between both annotated and un-annotated ORFs are considered, and secondly a two-step disabled-homology filter is applied to set aside putative pseudogenes and spurious ORFs. We find that un-annotated homologous short ORFs (uhORFs) correspond to a small but non-negligible fraction of the annotated prokaryote proteomes (0.5-3.8%, depending on selection criteria). Moreover, the disabled-homology filter indicates that about a third of uhORFs correspond to putative pseudogenes or spurious ORFs. Our analysis shows that the use of annotation length thresholds is unnecessary, as there are manageable numbers of short ORF homologies conserved (without disablements) across microbial genomes. Data on uhORFs are available from http://pseudogene.org/polyo
Affiliation(s)
- Paul M Harrison
- Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, P.O. Box 208114, New Haven, CT 06520-8114, USA.
|
32
|
Torkko JM, Koivuranta KT, Kastaniotis AJ, Airenne TT, Glumoff T, Ilves M, Hartig A, Gurvitz A, Hiltunen JK. Candida tropicalis expresses two mitochondrial 2-enoyl thioester reductases that are able to form both homodimers and heterodimers. J Biol Chem 2003; 278:41213-20. [PMID: 12890667 DOI: 10.1074/jbc.m307664200] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022] Open
Abstract
Here we report on the cloning of a Candida tropicalis gene, ETR2, that is closely related to ETR1. Both genes encode enzymatically active 2-enoyl thioester reductases involved in mitochondrial synthesis of fatty acids (fatty acid synthesis type II) and respiratory competence. The 5'- and 3'-flanking (coding) regions of ETR2 and ETR1 are about 90% (97%) identical, indicating that the genes have evolved via gene duplication. The gene products differ in three amino acid residues: Ile67 (Val), Ala92 (Thr), and Lys251 (Arg) in Etr2p (Etr1p). Quantitative PCR analysis and reverse transcriptase-PCR indicated that both genes were expressed about equally in fermenting yeast cells, whereas ETR1 was expressed predominantly in respiring cells. Like the situation with ETR1, expression of ETR2 in respiration-deficient Saccharomyces cerevisiae mutant cells devoid of Ybr026p/Etr1p was able to restore growth on glycerol. Triclosan, which is used as an antibacterial agent against fatty acid synthesis type II 2-enoyl thioester reductases, inhibited growth of FabI-overexpressing mutant yeast cells but did not inhibit respiratory growth of the ETR2- or ETR1-complemented mutant yeast cells. Crystal structures obtained by co-crystallizing Etr2p and Etr1p indicated that all possible dimer variants occur in the same asymmetric unit, suggesting that similar dimer formation also takes place in vivo.
Affiliation(s)
- Juha M Torkko
- Biocenter Oulu, Department of Biochemistry, University of Oulu, Finland
|
33
|
Talla E, Tekaia F, Brino L, Dujon B. A novel design of whole-genome microarray probes for Saccharomyces cerevisiae which minimizes cross-hybridization. BMC Genomics 2003; 4:38. [PMID: 14499002 PMCID: PMC239980 DOI: 10.1186/1471-2164-4-38] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Received: 07/09/2003] [Accepted: 09/22/2003] [Indexed: 12/19/2022] Open
Abstract
Background: Numerous DNA microarray hybridization experiments have been performed in yeast in recent years using either synthetic oligonucleotides or PCR-amplified coding sequences as probes. The design and quality of the microarray probes are of critical importance for hybridization experiments as well as subsequent analysis of the data. Results: We present here a novel design of Saccharomyces cerevisiae microarrays based on a refined annotation of the genome and with the aim of reducing cross-hybridization between related sequences. An effort was made to design probes of similar lengths, preferably located in the 3'-end of reading frames. The sequence of each gene was compared against the entire yeast genome and optimal sub-segments giving no predicted cross-hybridization were selected. A total of 5660 novel probes (more than 97% of the yeast genes) were designed. For the remaining 143 genes, cross-hybridization was unavoidable. Using a set of 18 deletant strains, we have experimentally validated our cross-hybridization procedure. Sensitivity, reproducibility and dynamic range of these new microarrays have been measured. Based on this experience, we have written a novel program to design long oligonucleotides for microarray hybridizations of complete genome sequences. Conclusions: A validated procedure to predict cross-hybridization in microarray probe design was defined in this work. Subsequently, a novel Saccharomyces cerevisiae microarray (which minimizes cross-hybridization) was designed and constructed. Arrays are available at Eurogentec S. A. Finally, we propose a novel design program, OliD, which allows automatic oligonucleotide design for microarrays. The OliD program is available from the authors.
Affiliation(s)
- Emmanuel Talla
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Fredj Tekaia
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Laurent Brino
- Eurogentec s.a., Parc Scientifique du Sart Tilman, B-4102 Seraing, Belgium
- Bernard Dujon
- Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
|
34
|
Howe KJ, Kane CM, Ares M. Perturbation of transcription elongation influences the fidelity of internal exon inclusion in Saccharomyces cerevisiae. RNA (NEW YORK, N.Y.) 2003; 9:993-1006. [PMID: 12869710 PMCID: PMC1370465 DOI: 10.1261/rna.5390803] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.9] [Received: 02/25/2003] [Accepted: 05/13/2003] [Indexed: 05/17/2023]
Abstract
Unknown mechanisms exist to ensure that exons are not skipped during biogenesis of mRNA. Studies have connected transcription elongation with regulated alternative exon inclusion. To determine whether the relative rates of transcription elongation and spliceosome assembly might play a general role in enforcing constitutive exon inclusion, we measured exon skipping for a natural two-intron gene in which the internal exon is constitutively included in the mRNA. Mutations in this gene that subtly reduce recognition of the intron 1 branchpoint cause exon skipping, indicating that rapid recognition of the first intron is important for enforcing exon inclusion. To test the role of transcription elongation, we treated cells to increase or decrease the rate of transcription elongation. Consistent with the "first come, first served" model, we found that exon skipping in vivo is inhibited when transcription is slowed by RNAP II mutants or when cells are treated with inhibitors of elongation. Expression of the elongation factor TFIIS stimulates exon skipping, and this effect is eliminated when lac repressor is targeted to DNA encoding the second intron. A mutation in U2 snRNA promotes exon skipping, presumably because a delay in recognition of the first intron allows elongating RNA polymerase to transcribe the downstream intron. This indicates that the relative rates of elongation and splicing are tuned so that the fidelity of exon inclusion is enhanced. These findings support a general role for kinetic coordination of transcription elongation and splicing during the transcription-dependent control of splicing.
Affiliation(s)
- Kenneth James Howe
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, California 94720, USA
|
35
|
Brachat S, Dietrich FS, Voegeli S, Zhang Z, Stuart L, Lerch A, Gates K, Gaffney T, Philippsen P. Reinvestigation of the Saccharomyces cerevisiae genome annotation by comparison to the genome of a related fungus: Ashbya gossypii. Genome Biol 2003; 4:R45. [PMID: 12844361 PMCID: PMC193632 DOI: 10.1186/gb-2003-4-7-r45] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.0] [Received: 02/13/2003] [Revised: 05/07/2003] [Accepted: 05/28/2003] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND: The recently sequenced genome of the filamentous fungus Ashbya gossypii revealed remarkable similarities to that of the budding yeast Saccharomyces cerevisiae both at the level of homology and synteny (conservation of gene order). Thus, it became possible to reinvestigate the S. cerevisiae genome in the syntenic regions leading to an improved annotation. RESULTS: We have identified 23 novel S. cerevisiae open reading frames (ORFs) as syntenic homologs of A. gossypii genes; for all but one, homologs are present in other eukaryotes including humans. Other comparisons identified 13 overlooked introns and suggested 69 potential sequence corrections resulting in ORF extensions or ORF fusions with improved homology to the syntenic A. gossypii homologs. Of the proposed corrections, 25 were tested and confirmed by resequencing. In addition, homologs of nearly 1,000 S. cerevisiae ORFs, presently annotated as hypothetical, were found in A. gossypii at syntenic positions and can therefore be considered as authentic genes. Finally, we suggest that over 400 S. cerevisiae ORFs that overlap other ORFs in S. cerevisiae and for which no homolog can be detected in A. gossypii should be regarded as spurious. CONCLUSIONS: Although the S. cerevisiae genome is rightly considered as one of the most accurately sequenced and annotated eukaryotic genomes, we have shown that it still benefits substantially from comparison to the completed sequence and syntenic gene map of A. gossypii, an evolutionarily related fungus. This type of approach will strongly support the annotation of more complex genomes such as the human and murine genomes.
Affiliation(s)
- Sophie Brachat
- Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Fred S Dietrich
- Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710-3568, USA
- Sylvia Voegeli
- Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Zhihong Zhang
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710-3568, USA
- Larissa Stuart
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, NC 27710-3568, USA
- Anita Lerch
- Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
- Tom Gaffney
- Syngenta, Research Triangle Park, NC 27709, USA
- Peter Philippsen
- Institute of Applied Microbiology, Biozentrum der Universität Basel, Klingelbergstrasse 70, CH-4056 Basel, Switzerland
|
36
|
Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 2003; 423:241-54. [PMID: 12748633 DOI: 10.1038/nature01644] [Citation(s) in RCA: 1305] [Impact Index Per Article: 62.1] [Received: 02/27/2003] [Accepted: 04/01/2003] [Indexed: 11/09/2022]
Abstract
Identifying the functional elements encoded in a genome is one of the principal challenges in modern biology. Comparative genomics should offer a powerful, general approach. Here, we present a comparative analysis of the yeast Saccharomyces cerevisiae based on high-quality draft sequences of three related species (S. paradoxus, S. mikatae and S. bayanus). We first aligned the genomes and characterized their evolution, defining the regions and mechanisms of change. We then developed methods for direct identification of genes and regulatory motifs. The gene analysis yielded a major revision to the yeast gene catalogue, affecting approximately 15% of all genes and reducing the total count by about 500 genes. The motif analysis automatically identified 72 genome-wide elements, including most known regulatory motifs and numerous new motifs. We inferred a putative function for most of these motifs, and provided insights into their combinatorial interactions. The results have implications for genome analysis of diverse organisms, including the human.
Affiliation(s)
- Manolis Kellis
- Whitehead/MIT Center for Genome Research, Nine Cambridge Center, Cambridge, Massachusetts 02142, USA.
|
37
|
Reboul J, Vaglio P, Rual JF, Lamesch P, Martinez M, Armstrong CM, Li S, Jacotot L, Bertin N, Janky R, Moore T, Hudson JR, Hartley JL, Brasch MA, Vandenhaute J, Boulton S, Endress GA, Jenna S, Chevet E, Papasotiropoulos V, Tolias PP, Ptacek J, Snyder M, Huang R, Chance MR, Lee H, Doucette-Stamm L, Hill DE, Vidal M. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat Genet 2003; 34:35-41. [PMID: 12679813 DOI: 10.1038/ng1140] [Citation(s) in RCA: 310] [Impact Index Per Article: 14.8] [Received: 01/03/2003] [Accepted: 03/14/2003] [Indexed: 11/08/2022]
Abstract
To verify the genome annotation and to create a resource to functionally characterize the proteome, we attempted to Gateway-clone all predicted protein-encoding open reading frames (ORFs), or the 'ORFeome,' of Caenorhabditis elegans. We successfully cloned approximately 12,000 ORFs (ORFeome 1.1), of which roughly 4,000 correspond to genes that are untouched by any cDNA or expressed-sequence tag (EST). More than 50% of predicted genes needed corrections in their intron-exon structures. Notably, approximately 11,000 C. elegans proteins can now be expressed under many conditions and characterized using various high-throughput strategies, including large-scale interactome mapping. We suggest that similar ORFeome projects will be valuable for other organisms, including humans.
Affiliation(s)
- Jérôme Reboul
- Dana-Farber Cancer Institute and Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
|
38
|
Affiliation(s)
- Michael Snyder
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06520, USA
|
39
|
Decottignies A, Sanchez-Perez I, Nurse P. Schizosaccharomyces pombe essential genes: a pilot study. Genome Res 2003; 13:399-406. [PMID: 12618370 PMCID: PMC430286 DOI: 10.1101/gr.636103] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.0] [Received: 07/17/2002] [Accepted: 12/10/2002] [Indexed: 11/24/2022]
Abstract
After completion of the Schizosaccharomyces pombe genome sequence, we have carried out a pilot gene deletion project to assess the feasibility of a genome-wide deletion project and to estimate the percentage of essential genes. Using a PCR-based gene deletion procedure, we investigated 100 genes within a 253-kb region of chromosome II. Eight of nine genes located within a region of 18 kb could not be deleted, suggesting that systematic deletion of all fission yeast genes may be difficult to achieve using this PCR approach. The percentage of essential genes was found to be 17.5%. Further deletion of selected S. pombe genes revealed that whether a gene is essential or not is correlated with the timing of its appearance on the tree of life and its conservation within all branches of the tree. None of the investigated ancient genes in fission yeast that have been lost in the Saccharomyces cerevisiae lineage are essential. In agreement with S. cerevisiae and Caenorhabditis elegans genome analyses, our data suggest that natural selection has preferentially kept the genes required for vital functions. We propose that many of the essential eukaryotic genes appeared with the first eukaryotic cell and have remained conserved in all species.
|
40
|
Kessler MM, Zeng Q, Hogan S, Cook R, Morales AJ, Cottarel G. Systematic discovery of new genes in the Saccharomyces cerevisiae genome. Genome Res 2003; 13:264-71. [PMID: 12566404 PMCID: PMC420365 DOI: 10.1101/gr.232903] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.5] [Received: 03/01/2002] [Accepted: 11/07/2002] [Indexed: 11/24/2022]
Abstract
We used genome-wide comparative analysis of predicted protein sequences to identify many novel small genes, named smORFs for small open reading frames, within the budding yeast genome. Further analysis of 117 of these new genes showed that 84 are transcribed. We extended our analysis of one smORF conserved from yeast to human. This investigation provides an updated and comprehensive annotation of the yeast genome, validates additional concepts in the study of genomes in silico, and increases the expected numbers of coding sequences in a genome with the corresponding impact on future functional genomics and proteomics studies.
Affiliation(s)
- Marco M Kessler
- Genome Therapeutics Corporation, Waltham, Massachusetts 02453, USA
|
41
|
Cheeseman IM, Anderson S, Jwa M, Green EM, Kang JS, Yates JR, Chan CSM, Drubin DG, Barnes G. Phospho-regulation of kinetochore-microtubule attachments by the Aurora kinase Ipl1p. Cell 2002; 111:163-72. [PMID: 12408861 DOI: 10.1016/s0092-8674(02)00973-x] [Citation(s) in RCA: 488] [Impact Index Per Article: 22.2] [Indexed: 10/26/2022]
Abstract
The Aurora kinase Ipl1p plays a crucial role in regulating kinetochore-microtubule attachments in budding yeast, but the underlying basis for this regulation is not known. To identify Ipl1p targets, we first purified 28 kinetochore proteins from yeast protein extracts. These studies identified five previously uncharacterized kinetochore proteins and defined two additional kinetochore subcomplexes. We then used mass spectrometry to identify 18 phosphorylation sites in 7 of these 28 proteins. Ten of these phosphorylation sites are targeted directly by Ipl1p, allowing us to identify a consensus phosphorylation site for an Aurora kinase. Our systematic mutational analysis of the Ipl1p phosphorylation sites demonstrated that the essential microtubule binding protein Dam1p is a key Ipl1p target for regulating kinetochore-microtubule attachments in vivo.
Affiliation(s)
- Iain M Cheeseman
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
|
42
|
Rodríguez-Navarro S, Llorente B, Rodríguez-Manzaneque MT, Ramne A, Uber G, Marchesan D, Dujon B, Herrero E, Sunnerhagen P, Pérez-Ortín JE. Functional analysis of yeast gene families involved in metabolism of vitamins B1 and B6. Yeast 2002; 19:1261-76. [PMID: 12271461 DOI: 10.1002/yea.916] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.5] [Indexed: 11/06/2022] Open
Abstract
In order to clarify their physiological functions, we have undertaken a characterization of the three-membered gene families SNZ1-3 and SNO1-3. In media lacking vitamin B(6), SNZ1 and SNO1 were both required for growth in certain conditions, but neither SNZ2, SNZ3, SNO2 nor SNO3 were required. Copies 2 and 3 of the gene products have, in spite of their extremely close sequence similarity, slightly different functions in the cell. We have also found that copies 2 and 3 are activated by the lack of thiamine and that the Snz proteins physically interact with the thiamine biosynthesis Thi5 protein family. Whereas copy 1 is required for conditions in which B(6) is essential for growth, copies 2 and 3 seem more related with B(1) biosynthesis during the exponential phase.
Affiliation(s)
- Susana Rodríguez-Navarro
- Departamento de Bioquímica y Biología Molecular, Universitat de València, C/Dr Moliner 50, E-46100, Burjassot, Spain
|
43
|
Teixeira MT, Dujon B, Fabre E. Genome-wide nuclear morphology screen identifies novel genes involved in nuclear architecture and gene-silencing in Saccharomyces cerevisiae. J Mol Biol 2002; 321:551-61. [PMID: 12206772 DOI: 10.1016/s0022-2836(02)00652-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.6] [Indexed: 10/27/2022]
Abstract
Organisation of the cell nucleus is crucial for the regulation of gene expression but little is known about how nuclei are structured. To address this issue, we designed a genomic screen to identify factors involved in nuclear architecture in Saccharomyces cerevisiae. This screen is based on microscopic monitoring of nuclear pore complexes and nucleolar proteins fused with the green fluorescent protein in a collection of approximately 400 individual deletion mutants. Among the 12 genes identified by this screen, most affect both the nuclear envelope and the nucleolar morphology. Corresponding gene products are localised preferentially to the nucleus or close to the nuclear periphery. Interestingly, these nuclear morphology alterations were associated with chromatin-silencing defects. These genes provide a molecular context to explore the functional link between nuclear architecture and gene silencing.
Affiliation(s)
- Maria Teresa Teixeira
- Département de Structure et Dynamique des Génomes, Unité de Génétique Moléculaire des Levures, URA 2171 CNRS and UFR 927 Univ. P. M Curie, Institut Pasteur, 25 Rue du Docteur Roux, 75724 Cedex 15, Paris, France
|
44
|
Oshiro G, Wodicka LM, Washburn MP, Yates JR, Lockhart DJ, Winzeler EA. Parallel identification of new genes in Saccharomyces cerevisiae. Genome Res 2002; 12:1210-20. [PMID: 12176929 PMCID: PMC186640 DOI: 10.1101/gr.226802] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.5] [Received: 12/07/2001] [Accepted: 05/17/2002] [Indexed: 01/13/2023]
Abstract
Short open reading frames (ORFs) occur frequently in primary genome sequence. Distinguishing bona fide small genes from the tens of thousands of short ORFs is one of the most challenging aspects of genome annotation. Direct experimental evidence is often required. Here we use a combination of expression profiling and mass spectrometry to verify the independent transcription of 138 and the translation of 50 previously nonannotated genes in the Saccharomyces cerevisiae genome. Through combined evidence, we propose the addition of 62 new genes to the genome and provide experimental support for the inclusion of 10 previously identified genes.
Affiliation(s)
- Guy Oshiro
- Genomics Institute of the Novartis Research Foundation, San Diego, California 92121, USA
|
45
|
Poirey R, Despons L, Leh V, Lafuente MJ, Potier S, Souciet JL, Jauniaux JC. Functional analysis of the Saccharomyces cerevisiae DUP240 multigene family reveals membrane-associated proteins that are not essential for cell viability. MICROBIOLOGY (READING, ENGLAND) 2002; 148:2111-2123. [PMID: 12101299 DOI: 10.1099/00221287-148-7-2111] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
The DUP240 gene family of Saccharomyces cerevisiae is composed of 10 members. They encode proteins of about 240 amino acids which contain two predicted transmembrane domains. Database searches identified only one homologue in the closely related species Saccharomyces bayanus, indicating that the DUP240 genes encode proteins specific to Saccharomyces sensu stricto. The short-flanking homology PCR gene-replacement strategy with a variety of selective markers for replacements, and classical genetic methods, were used to generate strains deleted for all 10 DUP240 genes. All of the knock-out strains were viable and had similar growth kinetics to the wild-type. Two-hybrid screens, hSos1p fusions and GFP fusions were carried out; the results indicated that the Dup240 proteins are membrane associated, and that some of them are concentrated around the plasma membrane.
Affiliation(s)
- Rémy Poirey
- Laboratoire de Génétique et Microbiologie, UPRES-A 7010 ULP/CNRS, Institut de Botanique, 28 rue Goethe, F-67083 Strasbourg cedex, France
- Angewandte Tumorvirologie, Abteilung F0100 and Virologie Appliquée à l'Oncologie (Unité INSERM 375), Deutsches Krebsforschungszentrum, P. 1011949, D-69009 Heidelberg, Germany
- Laurence Despons
- Laboratoire de Génétique et Microbiologie, UPRES-A 7010 ULP/CNRS, Institut de Botanique, 28 rue Goethe, F-67083 Strasbourg cedex, France
- Véronique Leh
- Laboratoire de Génétique et Microbiologie, UPRES-A 7010 ULP/CNRS, Institut de Botanique, 28 rue Goethe, F-67083 Strasbourg cedex, France
- Maria-Jose Lafuente
- Angewandte Tumorvirologie, Abteilung F0100 and Virologie Appliquée à l'Oncologie (Unité INSERM 375), Deutsches Krebsforschungszentrum, P. 1011949, D-69009 Heidelberg, Germany
- Serge Potier
- Laboratoire de Génétique et Microbiologie, UPRES-A 7010 ULP/CNRS, Institut de Botanique, 28 rue Goethe, F-67083 Strasbourg cedex, France
- Jean-Luc Souciet
- Laboratoire de Génétique et Microbiologie, UPRES-A 7010 ULP/CNRS, Institut de Botanique, 28 rue Goethe, F-67083 Strasbourg cedex, France
- Jean-Claude Jauniaux
- Angewandte Tumorvirologie, Abteilung F0100 and Virologie Appliquée à l'Oncologie (Unité INSERM 375), Deutsches Krebsforschungszentrum, P. 1011949, D-69009 Heidelberg, Germany
|
46
|
Gerton JL, DeRisi JL. Mnd1p: an evolutionarily conserved protein required for meiotic recombination. Proc Natl Acad Sci U S A 2002; 99:6895-900. [PMID: 12011448 PMCID: PMC124500 DOI: 10.1073/pnas.102167899] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022] Open
Abstract
We used a functional genomics approach to identify a gene required for meiotic recombination, YGL183c or MND1. MND1 was spliced in meiotic cells, extending the annotated YGL183c ORF N terminus by 45 aa. Saccharomyces cerevisiae mnd1-1 mutants, in which the majority of the MND1 coding sequence was removed, arrested before the first meiotic division with a phenotype reminiscent of dmc1 mutants. Physical and genetic analysis showed that these cells initiated recombination, but did not form heteroduplex DNA or double Holliday junctions, suggesting that Mnd1p is involved in strand invasion. Orthologs of MND1 were identified in protists, several yeasts, plants, and mammals, suggesting that its function has been conserved throughout evolution.
Affiliation(s)
- Jennifer L Gerton
- Department of Biochemistry and Biophysics, University of California, 513 Parnassus Avenue, Box 0448, San Francisco, CA 94143-0448, USA
|
47
|
Mackiewicz P, Kowalczuk M, Mackiewicz D, Nowicka A, Dudkiewicz M, Laszkiewicz A, Dudek MR, Cebrat S. How many protein-coding genes are there in the Saccharomyces cerevisiae genome? Yeast 2002; 19:619-29. [PMID: 11967832 DOI: 10.1002/yea.865] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022] Open
Abstract
We have compared the estimates of the total number of protein-coding genes in the Saccharomyces cerevisiae genome that have been obtained by many laboratories since the yeast genome sequence was published in 1996. We propose that there are 5300-5400 genes in the genome. This confirms the first estimate of the number of intronless ORFs longer than 100 codons, which was based on the features of the set of genes with phenotypes known in 1997. That estimate assumed that the set of the first 2300 genes with known phenotypes was representative of the whole set of protein-coding genes in the genome. The same method, applied in this paper to approximate the total number of protein-coding sequences among more than 40 000 ORFs longer than 20 codons, gives a result that is only slightly higher. This suggests that there are still some non-coding ORFs in the databases and a few dozen small ORFs, not yet annotated, which probably code for proteins.
Affiliation(s)
- Pawel Mackiewicz
- Institute of Microbiology, Wroclaw University, ul. Przybyszewskiego 63/77, 51-148 Wroclaw, Poland
|
48
|
Harrison PM, Kumar A, Lang N, Snyder M, Gerstein M. A question of size: the eukaryotic proteome and the problems in defining it. Nucleic Acids Res 2002; 30:1083-90. [PMID: 11861898 PMCID: PMC101239 DOI: 10.1093/nar/30.5.1083] [Citation(s) in RCA: 148] [Impact Index Per Article: 6.7] [Received: 09/28/2001] [Revised: 12/20/2001] [Accepted: 01/02/2002] [Indexed: 11/14/2022] Open
Abstract
We discuss the problems in defining the extent of the proteomes for completely sequenced eukaryotic organisms (i.e. the total number of protein-coding sequences), focusing on yeast, worm, fly and human. (i) Six years after completion of its genome sequence, the true size of the yeast proteome is still not defined. New small genes are still being discovered, and a large number of existing annotations are being called into question, with these questionable ORFs (qORFs) comprising up to one-fifth of the 'current' proteome. We discuss these in the context of an ideal genome-annotation strategy that considers the proteome as a rigorously defined subset of all possible coding sequences ('the orfome'). (ii) Despite the greater apparent complexity of the fly (more cells, more complex physiology, longer lifespan), the nematode worm appears to have more genes. To explain this, we compare the annotated proteomes of worm and fly, relating to both genome-annotation and genome evolution issues. (iii) The unexpectedly small size of the gene complement estimated for the complete human genome provoked much public debate about the nature of biological complexity. However, in the first instance, for the human genome, the relationship between gene number and proteome size is far from simple. We survey the current estimates for the numbers of human genes and, from this, we estimate a range for the size of the human proteome. The determination of this is substantially hampered by the unknown extent of the cohort of pseudogenes ('dead' genes), in combination with the prevalence of alternative splicing. (Further information relating to yeast is available at http://genecensus.org/yeast/orfome)
Affiliation(s)
- Paul M Harrison
- Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, PO Box 208114, New Haven, CT 06520-8114, USA
|
49
|
Harrison P, Kumar A, Lan N, Echols N, Snyder M, Gerstein M. A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution. J Mol Biol 2002; 316:409-19. [PMID: 11866506 DOI: 10.1006/jmbi.2001.5343] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.1] [Indexed: 11/22/2022]
Abstract
We surveyed the sequenced Saccharomyces cerevisiae genome (strain S288C) comprehensively for open reading frames (ORFs) that could encode full-length proteins but contain obvious mid-sequence disablements (frameshifts or premature stop codons). These pseudogenic features are termed disabled ORFs (dORFs). Using homology to annotated yeast ORFs and non-yeast proteins plus a simple region extension procedure, we have found 183 dORFs. Combined with the 38 existing annotations for potential dORFs, we have a total pool of up to 221 dORFs, corresponding to less than approximately 3% of the proteome. Additionally, we found 20 pairs of annotated ORFs for yeast that could be merged into a single ORF (termed a mORF) by read-through of the intervening stop codon, and may comprise a complete ORF in other yeast strains. Focussing on a core pool of 98 dORFs with a verifying protein homology, we find that most dORFs are substantially decayed, with approximately 90% having two or more disablements, and approximately 60% having four or more. dORFs are much more yeast-proteome specific than live yeast genes (having about half the chance that they are related to a non-yeast protein). They show a dramatically increased density at the telomeres of chromosomes, relative to genes. A microarray study shows that some dORFs are expressed even though they carry multiple disablements, and thus may be more resistant to nonsense-mediated decay. Many of the dORFs may be involved in responding to environmental stresses, as the largest functional groups include growth inhibition, flocculation, and the SRP/TIP1 family. Our results have important implications for proteome evolution. The characteristics of the dORF population suggest the sorts of genes that are likely to fall in and out of usage (and vary in copy number) in a strain-specific way and highlight the role of subtelomeric regions in engendering this diversity. 
Our results also have important implications for the effects of the [PSI+] prion. The dORFs disabled by only a single stop and the mORFs (together totalling 35) provide an estimate for the extent of the sequence population that can be resurrected readily through the demonstrated ability of the [PSI+] prion to cause nonsense-codon read-through. Also, the dORFs and mORFs that we find have properties (e.g. growth inhibition, flocculation, vanadate resistance, stress response) that are potentially related to the ability of [PSI+] to engender substantial phenotypic variation in yeast strains under different environmental conditions. (See genecensus.org/pseudogene for further information.)
Affiliation(s)
- Paul Harrison
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520-8114, USA
|
50
|
Wood V, Gwilliam R, Rajandream MA, Lyne M, Lyne R, Stewart A, Sgouros J, Peat N, Hayles J, Baker S, Basham D, Bowman S, Brooks K, Brown D, Brown S, Chillingworth T, Churcher C, Collins M, Connor R, Cronin A, Davis P, Feltwell T, Fraser A, Gentles S, Goble A, Hamlin N, Harris D, Hidalgo J, Hodgson G, Holroyd S, Hornsby T, Howarth S, Huckle EJ, Hunt S, Jagels K, James K, Jones L, Jones M, Leather S, McDonald S, McLean J, Mooney P, Moule S, Mungall K, Murphy L, Niblett D, Odell C, Oliver K, O'Neil S, Pearson D, Quail MA, Rabbinowitsch E, Rutherford K, Rutter S, Saunders D, Seeger K, Sharp S, Skelton J, Simmonds M, Squares R, Squares S, Stevens K, Taylor K, Taylor RG, Tivey A, Walsh S, Warren T, Whitehead S, Woodward J, Volckaert G, Aert R, Robben J, Grymonprez B, Weltjens I, Vanstreels E, Rieger M, Schäfer M, Müller-Auer S, Gabel C, Fuchs M, Düsterhöft A, Fritzc C, Holzer E, Moestl D, Hilbert H, Borzym K, Langer I, Beck A, Lehrach H, Reinhardt R, Pohl TM, Eger P, Zimmermann W, Wedler H, Wambutt R, Purnelle B, Goffeau A, Cadieu E, Dréano S, Gloux S, Lelaure V, Mottier S, Galibert F, Aves SJ, Xiang Z, Hunt C, Moore K, Hurst SM, Lucas M, Rochet M, Gaillardin C, Tallada VA, Garzon A, Thode G, Daga RR, Cruzado L, Jimenez J, Sánchez M, del Rey F, Benito J, Domínguez A, Revuelta JL, Moreno S, Armstrong J, Forsburg SL, Cerutti L, Lowe T, McCombie WR, Paulsen I, Potashkin J, Shpakovski GV, Ussery D, Barrell BG, Nurse P, Cerrutti L. The genome sequence of Schizosaccharomyces pombe. Nature 2002; 415:871-80. [PMID: 11859360 DOI: 10.1038/nature724] [Citation(s) in RCA: 1118] [Impact Index Per Article: 50.8] [Indexed: 11/09/2022]
Abstract
We have sequenced and annotated the genome of fission yeast (Schizosaccharomyces pombe), which contains the smallest number of protein-coding genes yet recorded for a eukaryote: 4,824. The centromeres are between 35 and 110 kilobases (kb) and contain related repeats including a highly conserved 1.8-kb element. Regions upstream of genes are longer than in budding yeast (Saccharomyces cerevisiae), possibly reflecting more-extended control regions. Some 43% of the genes contain introns, of which there are 4,730. Fifty genes have significant similarity with human disease genes; half of these are cancer related. We identify highly conserved genes important for eukaryotic cell organization including those required for the cytoskeleton, compartmentation, cell-cycle control, proteolysis, protein phosphorylation and RNA splicing. These genes may have originated with the appearance of eukaryotic life. Few similarly conserved genes that are important for multicellular organization were identified, suggesting that the transition from prokaryotes to eukaryotes required more new genes than did the transition from unicellular to multicellular organization.
Affiliation(s)
- V Wood
- The Wellcome Trust Sanger Institute, The Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
|