1
Guillamón JM, Barrio E. Genetic Polymorphism in Wine Yeasts: Mechanisms and Methods for Its Detection. Front Microbiol 2017; 8:806. [PMID: 28522998 PMCID: PMC5415627 DOI: 10.3389/fmicb.2017.00806] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 02/17/2017] [Accepted: 04/19/2017] [Indexed: 01/09/2023] Open
Abstract
The selection of yeasts for use as wine fermentation starters has revealed great phenotypic diversity at both the interspecific and intraspecific levels, which is explained by corresponding genetic variation among yeast isolates. Thus, the mechanisms that promote these genetic changes are the main engine generating yeast biodiversity. Currently, an important task for understanding the biodiversity, population structure, and evolutionary history of wine yeasts is the study of the molecular mechanisms involved in yeast adaptation to wine fermentation and in remodeling the genomic features of wine yeasts, which have been unconsciously selected since the advent of winemaking. Moreover, the availability of rapid and simple molecular techniques that reveal genetic polymorphisms at the species and strain levels has enabled the study of yeast diversity during wine fermentation. This review summarizes the mechanisms involved in generating genetic polymorphisms in yeasts, the molecular methods used to unveil genetic variation, and the utility of these polymorphisms for differentiating strains, populations, and species in order to infer the evolutionary history and adaptive evolution of wine yeasts and to identify their influence on biotechnological and sensorial properties.
Affiliation(s)
- José M Guillamón
  - Departamento de Biotecnología de los Alimentos, Instituto de Agroquímica y Tecnología de Alimentos - Consejo Superior de Investigaciones Científicas (CSIC), Valencia, Spain
- Eladio Barrio
  - Departamento de Biotecnología de los Alimentos, Instituto de Agroquímica y Tecnología de Alimentos - Consejo Superior de Investigaciones Científicas (CSIC), Valencia, Spain
  - Departamento de Genética, Universidad de Valencia, Valencia, Spain
2
Meena S, Kumar SR, Venkata Rao DK, Dwivedi V, Shilpashree HB, Rastogi S, Shasany AK, Nagegowda DA. De Novo Sequencing and Analysis of Lemongrass Transcriptome Provide First Insights into the Essential Oil Biosynthesis of Aromatic Grasses. Front Plant Sci 2016; 7:1129. [PMID: 27516768 PMCID: PMC4963619 DOI: 10.3389/fpls.2016.01129] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 04/01/2016] [Accepted: 07/15/2016] [Indexed: 05/09/2023]
Abstract
Aromatic grasses of the genus Cymbopogon (Poaceae family) represent a unique group of plants that produce monoterpene-rich essential oils of diverse composition, which have great value in the flavor, fragrance, cosmetic, and aromatherapy industries. Despite the commercial importance of these natural aromatic oils, their biosynthesis at the molecular level remains unexplored. As a first step toward understanding essential oil biosynthesis, we performed de novo transcriptome assembly and analysis of C. flexuosus (lemongrass) using Illumina sequencing. Mining of the transcriptome data and subsequent phylogenetic analysis led to the identification of terpene synthases, pyrophosphatases, alcohol dehydrogenases, aldo-keto reductases, carotenoid cleavage dioxygenases, alcohol acetyltransferases, and aldehyde dehydrogenases, which are potentially involved in essential oil biosynthesis. Comparative essential oil profiling and mRNA expression analysis in three Cymbopogon species with varying essential oil composition (C. flexuosus, aldehyde type; C. martinii, alcohol type; and C. winterianus, intermediate type) indicated the involvement of the identified candidate genes in the formation of alcohols, aldehydes, and acetates. Molecular modeling and docking further supported the role of the identified protein sequences in aroma formation in Cymbopogon. In addition, simple sequence repeats were found in the transcriptome, many of them linked to terpene pathway genes, including genes potentially involved in aroma biosynthesis. This work provides the first insights into the essential oil biosynthesis of aromatic grasses, and the identified candidate genes and markers can be a valuable resource for biotechnological and molecular breeding approaches to modulate essential oil composition.
Affiliation(s)
- Seema Meena
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
- Sarma R. Kumar
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
- D. K. Venkata Rao
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
- Varun Dwivedi
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
- H. B. Shilpashree
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
- Shubhra Rastogi
  - Biotechnology Division, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants, Lucknow, India
- Ajit K. Shasany
  - Biotechnology Division, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants, Lucknow, India
- Dinesh A. Nagegowda
  - Molecular Plant Biology and Biotechnology Lab, Council of Scientific and Industrial Research – Central Institute of Medicinal and Aromatic Plants Research Centre, Bangalore, India
  - *Correspondence: Dinesh A. Nagegowda
3
Richard GF, Viterbo D, Khanna V, Mosbach V, Castelain L, Dujon B. Highly specific contractions of a single CAG/CTG trinucleotide repeat by TALEN in yeast. PLoS One 2014; 9:e95611. [PMID: 24748175 PMCID: PMC3991675 DOI: 10.1371/journal.pone.0095611] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 11/13/2013] [Accepted: 03/28/2014] [Indexed: 12/22/2022] Open
Abstract
Trinucleotide repeat expansions are responsible for more than two dozen severe neurological disorders in humans. A double-strand break between two short CAG/CTG trinucleotide repeats was previously shown to induce a high frequency of repeat contractions in yeast. Here, using a dedicated TALEN, we show that induction of a double-strand break in a CAG/CTG trinucleotide repeat in heterozygous diploid yeast cells results in gene conversion of the repeat tract with nearly 100% efficacy, deleting the repeat tract. Induction of the same TALEN in homozygous yeast diploids leads to contractions of both repeats to a final length of 3–13 triplets, with 100% efficacy in cells that survived the double-strand breaks. Whole-genome sequencing of surviving yeast cells shows that the TALEN does not increase the mutation rate. No other CAG/CTG repeat of the yeast genome showed any length alteration or mutation. No large genomic rearrangement such as aneuploidy, segmental duplication, or translocation was detected. This is the first demonstration that induction of a TALEN in a eukaryotic cell leads to shortening of trinucleotide repeat tracts to lengths below pathological thresholds in humans, with 100% efficacy and very high specificity.
Affiliation(s)
- Guy-Franck Richard
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
  - * E-mail:
- David Viterbo
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
- Varun Khanna
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
- Valentine Mosbach
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
- Lauriane Castelain
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
- Bernard Dujon
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, Paris, France
  - Sorbonne Universités, UPMC Univ Paris 6, IFD, Paris, France
  - CNRS, UMR3525, Paris, France
4
Naga BLRI, Mangamoori LN, Subramanyam S. Identification and characterization of EST-SSRs in finger millet (Eleusine coracana (L.) Gaertn.). 2012. [DOI: 10.1007/s12892-011-0064-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/27/2022]
5
Tyagi S, Sharma M, Das A. Comparative genomic analysis of simple sequence repeats in three Plasmodium species. Parasitol Res 2010; 108:451-8. [PMID: 20924609 DOI: 10.1007/s00436-010-2086-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 07/07/2010] [Accepted: 09/08/2010] [Indexed: 11/24/2022]
Abstract
Simple sequence repeats (SSRs) are known to be responsible for genetic complexities and play major roles in gene and genome evolution. In this respect, malaria parasites are known to have rapidly evolving and complex genomes with complicated and differential pathogenic behaviors. Hence, by studying whole-genome comparative SSR patterns, one can understand the genomic complexities and differential evolutionary patterns of these species. We herein used the whole-genome sequence information of three Plasmodium species, Plasmodium falciparum, Plasmodium vivax, and Plasmodium knowlesi, to comparatively analyze the genome-wide distribution of SSRs. The study revealed that, despite having the smallest genome size, P. falciparum bears the highest SSR content among the three Plasmodium species. Furthermore, the distribution patterns of different SSR types (e.g., mono, di, tri, tetra, penta, and hexa) in terms of relative abundance and relative density provide evidence for greater accumulation of di-repeats and a marked decrease of mono-repeats in P. falciparum in comparison to the other two species. Overall, the types and distribution of SSRs in the P. falciparum genome were found to differ from those of P. vivax and P. knowlesi. The latter two species have quite similar SSR organizations in many aspects of the data. The results are discussed in terms of comparative SSR patterns among the three Plasmodium species, the uniqueness of P. falciparum in SSR organization, and the general pattern of evolution of SSRs in Plasmodium.
Affiliation(s)
- Suchi Tyagi
  - Evolutionary Genomics and Bioinformatics Laboratory, Division of Genomics and Bioinformatics, National Institute of Malaria Research, Sector 8, Dwarka, New Delhi, 110 077, India
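The SSR classes and measures named in the abstract above (mono- to hexanucleotide repeats, relative abundance, relative density) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the minimum repeat-unit thresholds in `MIN_UNITS`, the function names, and the per-kilobase definitions of abundance (SSR count per kb) and density (repeat bp per kb) are assumptions chosen for illustration, and only perfect repeats are detected.

```python
import re
from collections import defaultdict

# Motif-length classes named in the abstract (mono- to hexanucleotide).
CLASS_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}

# Assumed minimum number of repeat units per motif length (illustrative only).
MIN_UNITS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq):
    """Return sorted (start, motif, units) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, min_units in MIN_UNITS.items():
        # One motif of length k followed by at least (min_units - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_units - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # e.g. skip "AA", already counted as mono "A"
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def ssr_summary(seq):
    """Per class: (assumed relative abundance, assumed relative density), per kb."""
    kb = len(seq) / 1000.0
    counts = defaultdict(int)
    lengths = defaultdict(int)
    for _, motif, units in find_ssrs(seq):
        name = CLASS_NAMES[len(motif)]
        counts[name] += 1
        lengths[name] += len(motif) * units
    return {name: (counts[name] / kb, lengths[name] / kb) for name in counts}
```

For example, `find_ssrs("GC" + "A" * 12 + "GCGT" + "CAG" * 6 + "TTGACC")` returns `[(2, 'A', 12), (18, 'CAG', 6)]`. Published SSR surveys typically also handle imperfect repeats and use their own thresholds, which this sketch does not attempt.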
6
Rorick MM, Wagner GP. The origin of conserved protein domains and amino acid repeats via adaptive competition for control over amino acid residues. J Mol Evol 2010; 70:29-43. [PMID: 20024539 PMCID: PMC3368225 DOI: 10.1007/s00239-009-9305-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/15/2009] [Accepted: 11/18/2009] [Indexed: 10/20/2022]
Abstract
Some proteins, such as homeodomain transcription factors, contain highly conserved regions of sequence. It has recently been suggested that multiple functional domains overlap in the homeodomain, together explaining this high conservation. However, the question remains why so many functional domains cluster together in one relatively small and constrained region of the protein. Here we have modeled an evolutionary mechanism that can produce this kind of clustering: conserved functional domains are displaced from the parts of the molecule that are undergoing adaptive evolution because novel functions generally out-compete conserved functions for control over the identity of amino acid residues. We call this model COAA, for Competition Over Amino Acids. We also studied the evolution of amino acid repeats (a.k.a. homopeptides), which are especially prevalent in transcription factors. Repeats that are encoded by non-homogenous mixtures of synonymous codons cannot be explained by replication slippage alone. Our model provides two explanations for their origin, maintenance, and over-representation in highly conserved proteins. We demonstrate that either competition between multiple functional domains for space within a sequence, or reuse of a sequence for many functions over time, can cause the evolution of amino acid repeats. Both of these processes are characteristic of multifunctional proteins such as homeodomain transcription factors. We conclude that the COAA model can explain two widely recognized features of transcription factor proteins: conserved domains and a tendency to accumulate homopeptides.
Affiliation(s)
- Mary M Rorick
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520-8106, USA.
7
Simon M, Hancock JM. Tandem and cryptic amino acid repeats accumulate in disordered regions of proteins. Genome Biol 2009; 10:R59. [PMID: 19486509 PMCID: PMC2718493 DOI: 10.1186/gb-2009-10-6-r59] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.1] [Received: 03/19/2009] [Accepted: 06/01/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Amino acid repeats (AARs) are common features of protein sequences. They often evolve rapidly and are involved in a number of human diseases. They also show significant associations with particular Gene Ontology (GO) functional categories, particularly transcription, suggesting they play some role in protein function. It has been suggested recently that AARs play a significant role in the evolution of intrinsically unstructured regions (IURs) of proteins. We investigate the relationship between AAR frequency and evolution and their localization within proteins based on a set of 5,815 orthologous proteins from four mammalian (human, chimpanzee, mouse and rat) and a bird (chicken) genome. We consider two classes of AAR (tandem repeats and cryptic repeats: regions of proteins containing overrepresentations of short amino acid repeats).
RESULTS: Mammals show very similar repeat frequencies but chicken shows lower frequencies of many of the cryptic repeats common in mammals. Regions flanking tandem AARs evolve more rapidly than the rest of the protein containing the repeat and this phenomenon is more pronounced for non-conserved repeats than for conserved ones. GO associations are similar to those previously described for the mammals, but chicken cryptic repeats show fewer significant associations. Comparing the overlaps of AARs with IURs and protein domains showed that up to 96% of some AAR types are associated preferentially with IURs. However, no more than 15% of IURs contained an AAR.
CONCLUSIONS: Their location within IURs explains many of the evolutionary properties of AARs. Further study is needed on the types of IURs containing AARs.
Affiliation(s)
- Michelle Simon
  - Bioinformatics Group, MRC Harwell, Mammalian Genetics Unit, Harwell Science and Innovation Campus, Harwell, Oxfordshire, OX11 0RD, UK
- John M Hancock
  - Bioinformatics Group, MRC Harwell, Mammalian Genetics Unit, Harwell Science and Innovation Campus, Harwell, Oxfordshire, OX11 0RD, UK
8
Richard GF, Kerrest A, Dujon B. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol Mol Biol Rev 2008; 72:686-727. [PMID: 19052325 PMCID: PMC2593564 DOI: 10.1128/mmbr.00011-08] [Citation(s) in RCA: 335] [Impact Index Per Article: 20.9] [Indexed: 12/20/2022] Open
Abstract
Repeated elements can be widely abundant in eukaryotic genomes, composing more than 50% of the human genome, for example. It is possible to classify repeated sequences into two large families, "tandem repeats" and "dispersed repeats." Each of these two families can itself be divided into subfamilies. Dispersed repeats contain transposons, tRNA genes, and gene paralogues, whereas tandem repeats contain gene tandems, ribosomal DNA repeat arrays, and satellite DNA, itself subdivided into satellites, minisatellites, and microsatellites. Remarkably, the molecular mechanisms that create and propagate dispersed and tandem repeats are specific to each class and usually do not overlap. In the present review, we have chosen in the first section to describe the nature and distribution of dispersed and tandem repeats in eukaryotic genomes in the light of complete (or nearly complete) available genome sequences. In the second part, we focus on the molecular mechanisms responsible for the fast evolution of two specific classes of tandem repeats: minisatellites and microsatellites. Given that a growing number of human neurological disorders involve the expansion of a particular class of microsatellites, called trinucleotide repeats, a large part of the recent experimental work on microsatellites has focused on these particular repeats, and thus we also review the current knowledge in this area. Finally, we propose a unified definition for mini- and microsatellites that takes into account their biological properties and try to point out new directions that should be explored in the near future on our road to understanding the genetics of repeated sequences.
Affiliation(s)
- Guy-Franck Richard
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, CNRS, URA2171, Université Pierre et Marie Curie, UFR927, 25 rue du Dr. Roux, F-75015, Paris, France.
9
Hosseini A, Ranade SH, Ghosh I, Khandekar P. Simple sequence repeats in different genome sequences of Shigella and comparison with high GC and AT-rich genomes. 2008; 19:167-76. [PMID: 18464038 DOI: 10.1080/10425170701461730] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
Abstract
Simple sequence repeats (SSRs) are omnipresent in prokaryotes and eukaryotes and are found throughout the genome, in both protein-coding and noncoding regions. In the present study, the whole-genome sequences of seven chromosomes (Shigella flexneri 2a str. 301 and 2457T, Shigella sonnei, Escherichia coli K12, Mycobacterium tuberculosis, Mycobacterium leprae, and Staphylococcus saprophyticus) were downloaded from the GenBank database to determine the abundance, distribution, and composition of SSRs, and to compare the tandem repeats in each real genome with those in a randomized genome (generated with a sequence-shuffling tool). The data show that: (i) tandem repeats are widely distributed throughout the genomes; (ii) SSRs are differentially distributed among coding and noncoding regions in the investigated Shigella genomes; (iii) the total frequency of SSRs is higher in noncoding regions than in coding regions; (iv) in all investigated chromosomes, the ratio of trinucleotide SSRs is much higher in real genomes than in randomized genomes, whereas that of dinucleotide SSRs is lower; (v) the ratio of total and mononucleotide SSRs is higher in the real genome than in the randomized genome in E. coli K12, S. flexneri str. 301, and S. saprophyticus, lower in S. flexneri str. 2457T, S. sonnei, and M. tuberculosis, and approximately the same in M. leprae; (vi) the frequency of codon repetitions varies considerably depending on the type of encoded amino acid.
Affiliation(s)
- Ashraf Hosseini
  - Institute of Bioinformatics and Biotechnology, University of Pune, Pune, India.
10
Walczak E, Czaplińska A, Barszczewski W, Wilgosz M, Wojtatowicz M, Robak M. RAPD with microsatellite as a tool for differentiation of Candida genus yeasts isolated in brewing. Food Microbiol 2007; 24:305-12. [PMID: 17188210 DOI: 10.1016/j.fm.2006.04.012] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 12/01/2005] [Revised: 03/01/2006] [Accepted: 04/01/2006] [Indexed: 11/29/2022]
Abstract
Fifteen wild yeast strains were isolated in two factories of a lager brewing company in Poland. Identification with the API 32C system showed mainly the presence of Candida sake (7/15). To differentiate the isolates, randomly amplified polymorphic DNA (RAPD) analysis was performed with the microsatellite primers (GTG)5, (GAC)5, and (GACA)4 and the M13 core sequence (5'-GAG GGT GGC GGT TCT-3'). Pattern similarities are presented as dendrograms for each RAPD analysis and for the overall patterns. In the overall patterns, all isolates identified as C. sake, except Strain No. 1, were grouped in one cluster. Collection strain C. sake CBS 617 showed 46% similarity to the cluster of six isolates (Strain Nos. 3, 6, 8, 11, 13, 14). The second reference strain, C. sake CBS 159, and Strain No. 1 were grouped with other Candida species (collection strains), showing only 20% and 42% similarity, respectively, to the other C. sake strains. The similarity based on the overall dendrogram between isolate Nos. 3, 6, 8, 11, 13, 14 and C. sake CBS 617 was 49%. Between those strains and other Candida, the similarity was only 37%.
Affiliation(s)
- Ewa Walczak
  - Department of Biotechnology and Food Microbiology, Faculty of Food Science Agricultural University of Wrocław, Norwida 25, 50-375 Wrocław, Poland
11
Mularoni L, Veitia RA, Albà MM. Highly constrained proteins contain an unexpectedly large number of amino acid tandem repeats. Genomics 2006; 89:316-25. [PMID: 17196365 DOI: 10.1016/j.ygeno.2006.11.011] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 09/19/2006] [Revised: 10/30/2006] [Accepted: 11/22/2006] [Indexed: 11/16/2022]
Abstract
Single-amino-acid tandem repeats are very common in mammalian proteins but their function and evolution are still poorly understood. Here we investigate how the variability and prevalence of amino acid repeats are related to the evolutionary constraints operating on the proteins. We find a significant positive correlation between repeat size difference and protein nonsynonymous substitution rate in human and mouse orthologous genes. This association is observed for all the common amino acid repeat types and indicates that rapid diversification of repeat structures, involving both trinucleotide slippage and nucleotide substitutions, preferentially occurs in proteins subject to low selective constraints. However, strikingly, we also observe a significant negative correlation between the number of repeats in a protein and the gene nonsynonymous substitution rate, particularly for glutamine, glycine, and alanine repeats. This implies that proteins subject to strong selective constraints tend to contain an unexpectedly high number of repeats, which tend to be well conserved between the two species. This is consistent with a role for selection in the maintenance of a significant number of repeats. Analysis of the codon structure of the sequences encoding the repeats shows that codon purity is associated with high repeat size interspecific variability. Interestingly, polyalanine and polyglutamine repeats associated with disease show very distinctive features regarding the degree of repeat conservation and the protein sequence selective constraints.
Affiliation(s)
- Loris Mularoni
  - Research Unit on Biomedical Informatics, Institut Municipal d'Investigació Mèdica, Universitat Pompeu Fabra, Barcelona 08003, Spain
12
Zhang Z, Xue Q. Tri-nucleotide repeats and their association with genes in rice genome. Biosystems 2005; 82:248-56. [PMID: 16226835 DOI: 10.1016/j.biosystems.2005.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/12/2005] [Accepted: 08/16/2005] [Indexed: 11/27/2022]
Abstract
Tri-nucleotide repeats (TNRs) are extremely abundant in the rice genome; CCG/CGG repeats predominate, accounting for approximately half of all TNRs in the genome. Our results show that the rice genome is relatively rich in TNRs with high GC content and, at a given GC content, in TNRs containing only purines or only pyrimidines. The AAT/ATT repeats, which occur predominantly in intergenic and intronic regions, have a considerably higher average length than other repeats. The highest frequency of TNRs occurs in 5'-UTR regions, followed by coding and 5'-flanking regions. Purine-rich TNRs occur preferentially in coding regions, whereas pyrimidine-rich TNRs show a stronger bias toward upstream regions, suggesting that they may act as regulatory elements in gene expression. TNRs located predominantly near the start of coding regions do not appear to significantly influence protein function.
Affiliation(s)
- Zhonghua Zhang
  - James D. Watson Institute of Genome Science, Zhejiang University, Hangzhou 310008, China
13
Abstract
Minisatellites are DNA tandem repeats exhibiting size polymorphism among individuals of a population. This polymorphism is generated by two different mechanisms, in both human and yeast cells: "replication slippage" during S-phase DNA synthesis and "repair slippage" associated with meiotic gene conversion. The Saccharomyces cerevisiae genome contains numerous natural minisatellites. They are located on all chromosomes without any obvious distribution bias. Minisatellites found in protein-coding genes have longer repeat units and, on average, more repeat units than minisatellites in noncoding regions. They show an excess of cytosines over guanines on the coding strand (negative GC skew). Their repeat units are always multiples of three; they encode serine- and threonine-rich amino acid repeats and are found preferentially within genes encoding cell wall proteins, suggesting that they are positively selected in this particular class of genes. Genome-wide, there is no statistically significant association between minisatellites and meiotic recombination hot spots. In addition, minisatellites located in the vicinity of a meiotic hot spot are not more polymorphic than minisatellites located far from any hot spot. This suggests that minisatellites in S. cerevisiae probably evolve by strand slippage during replication or mitotic recombination. Finally, the evolution of minisatellites among hemiascomycetous yeasts shows that even though many minisatellite-containing genes are conserved, most of the time the minisatellite itself is not. The diversity of minisatellite sequences found in orthologous genes of different species suggests that minisatellites are differentially acquired and lost during the evolution of hemiascomycetous yeasts at a pace faster than the genes containing them.
Affiliation(s)
- Guy-Franck Richard
  - Unité de Génétique Moléculaire des Levures, Université Pierre et Marie Curie, Institut Pasteur, 75724 Paris Cedex 15, France.
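The "negative GC skew" mentioned in the abstract above can be made concrete with a small Python sketch, assuming the common definition skew = (G - C) / (G + C) computed on the coding strand. The abstract itself gives no formula, and the function names and the example repeat unit below are hypothetical.

```python
def gc_skew(coding_strand):
    """(G - C) / (G + C) on the given strand; negative means excess cytosine."""
    seq = coding_strand.upper()
    g, c = seq.count("G"), seq.count("C")
    if g + c == 0:
        return 0.0  # skew is undefined without G or C; 0.0 is a convention here
    return (g - c) / (g + c)


def keeps_reading_frame(repeat_unit):
    """True if the repeat unit length is a multiple of three (frame-preserving)."""
    return len(repeat_unit) % 3 == 0


# Made-up 9-bp repeat unit built from serine/threonine codons (TCT, ACC, TCC).
unit = "TCTACCTCC"
print(gc_skew(unit), keeps_reading_frame(unit))
```

A unit whose length is a multiple of three can vary in copy number without shifting the reading frame, which is consistent with the abstract's observation for minisatellites inside protein-coding genes.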
14
Talla E, Tekaia F, Brino L, Dujon B. A novel design of whole-genome microarray probes for Saccharomyces cerevisiae which minimizes cross-hybridization. BMC Genomics 2003; 4:38. [PMID: 14499002 PMCID: PMC239980 DOI: 10.1186/1471-2164-4-38] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Received: 07/09/2003] [Accepted: 09/22/2003] [Indexed: 12/19/2022] Open
Abstract
Background: Numerous DNA microarray hybridization experiments have been performed in yeast in recent years using either synthetic oligonucleotides or PCR-amplified coding sequences as probes. The design and quality of the microarray probes are of critical importance for hybridization experiments as well as for subsequent analysis of the data.
Results: We present here a novel design of Saccharomyces cerevisiae microarrays based on a refined annotation of the genome, with the aim of reducing cross-hybridization between related sequences. An effort was made to design probes of similar lengths, preferably located at the 3'-end of reading frames. The sequence of each gene was compared against the entire yeast genome, and optimal sub-segments giving no predicted cross-hybridization were selected. A total of 5660 novel probes (more than 97% of the yeast genes) were designed. For the remaining 143 genes, cross-hybridization was unavoidable. Using a set of 18 deletant strains, we have experimentally validated our cross-hybridization procedure. The sensitivity, reproducibility, and dynamic range of these new microarrays have been measured. Based on this experience, we have written a novel program to design long oligonucleotides for microarray hybridizations of complete genome sequences.
Conclusions: A validated procedure to predict cross-hybridization in microarray probe design was defined in this work. Subsequently, a novel Saccharomyces cerevisiae microarray (which minimizes cross-hybridization) was designed and constructed. Arrays are available at Eurogentec S. A. Finally, we propose a novel design program, OliD, which allows automatic oligonucleotide design for microarrays. The OliD program is available from the authors.
Affiliation(s)
- Emmanuel Talla
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Fredj Tekaia
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
- Laurent Brino
  - Eurogentec s.a., Parc Scientifique du Sart Tilman, B-4102 Seraing, Belgium
- Bernard Dujon
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures (URA 2171 CNRS, UFR 927 Université PM Curie), 25 rue du Docteur Roux, F-75724 Paris cedex 15, France
15
Fabre E, Dujon B, Richard GF. Transcription and nuclear transport of CAG/CTG trinucleotide repeats in yeast. Nucleic Acids Res 2002; 30:3540-7. [PMID: 12177295 PMCID: PMC134249 DOI: 10.1093/nar/gkf483] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Trinucleotide repeats are involved in several neurological disorders in humans. DNA sequences containing CAG/CTG repeats are prone to slippage during replication and double-strand break repair. The effects of trinucleotide repeats on transcription and on nuclear export were analyzed in vivo in yeast. Transcription of a CAG/CTG trinucleotide repeat in the 3'-untranslated region of a URA3 reporter gene leads to transcription of messenger RNAs several kilobases longer than the expected size. These long mRNAs form more readily when CAG rather than CTG repeats are transcribed. CAG- or CUG-containing transcripts show a non-homogeneous cellular localization. We propose that long mRNAs result from transcription slippage, and discuss the possible implications for human diseases.
Affiliation(s)
- Emmanuelle Fabre
  - Unité de Génétique Moléculaire des Levures (URA 2171 CNRS and UFR 927 Université Pierre et Marie Curie) Institut Pasteur, 25 rue du Dr Roux, 75724 Paris cedex 15, France.
16
Abstract
Having the complete genome sequence of Saccharomyces cerevisiae makes us aware of the ultimate goal of yeast molecular biology: the 'solution' of the cell, that is, an understanding of the function of all approximately 6000 proteins (and a few RNAs) and how they interact with each other and the environment. The recent development of 'genomic' approaches for studying gene function makes this goal seem reachable in the foreseeable future. When this is accomplished, we will have entered a Golden Age, when we will have the information necessary for designing truly incisive experiments to reveal biological function.
Affiliation(s)
- M Johnston
  - Department of Genetics, Box 8232, Washington University School of Medicine, 660 Euclid Avenue, St Louis, Missouri 63113, USA.
17
Richard GF, Hennequin C, Thierry A, Dujon B. Trinucleotide repeats and other microsatellites in yeasts. Res Microbiol 1999; 150:589-602. [PMID: 10672999 DOI: 10.1016/s0923-2508(99)00131-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.1] [Indexed: 10/17/2022]
Abstract
Microsatellites are direct tandem DNA repeats found in all genomes. A particular class of microsatellites, called trinucleotide repeats, is responsible for a number of neurological disorders in humans. We review here our current state of knowledge on trinucleotide repeat instability, and discuss the molecular mechanisms that may be involved in trinucleotide repeat expansions leading to fatal diseases in humans. We also present original data on microsatellite distribution in several microbial genomes, and on the use of microsatellites as physical markers to accurately and easily genotype yeast strains.
Affiliation(s)
- G F Richard
  - Unité de génétique moléculaire des levures, URA1300 CNRS, UFR927, université Pierre et Marie Curie, Institut Pasteur, Paris, France