1
Descorps-Declère S, Richard GF. Megasatellite formation and evolution in vertebrate genes. Cell Rep 2022; 40:111347. [PMID: 36103826 DOI: 10.1016/j.celrep.2022.111347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Revised: 04/28/2022] [Accepted: 08/23/2022] [Indexed: 11/03/2022]
Abstract
Since formation of the first proto-eukaryotes, gene repertoire and genome complexity have significantly increased. Among genetic elements responsible for this increase are tandem repeats. Here we describe a genome-wide analysis of large tandem repeats, called megasatellites, in 58 vertebrate genomes. Two bursts occurred, one after the radiation between Agnatha and Gnathostomata fishes and the second one in therian mammals. Megasatellites are enriched in subtelomeric regions and frequently encoded in genes involved in transcription regulation, intracellular trafficking, and cell membrane metabolism, reminiscent of what is observed in fungus genomes. The presence of many introns within young megasatellites suggests that an exon-intron DNA segment is first duplicated and amplified before accumulation of mutations in intronic parts partially erases the megasatellite in such a way that it becomes detectable only in exons. Our results suggest that megasatellite formation and evolution is a dynamic and still ongoing process in vertebrate genomes.
Affiliation(s)
- Stéphane Descorps-Declère
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 25 rue du Dr Roux, 75015 Paris, France.
- Guy-Franck Richard
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Natural & Synthetic Genome Instabilities, 25 rue du Dr Roux, 75015 Paris, France.
2
Saguez C, Viterbo D, Descorps-Declère S, Cormack BP, Dujon B, Richard GF. Functional variability in adhesion and flocculation of yeast megasatellite genes. Genetics 2022; 221:iyac042. [PMID: 35274698 PMCID: PMC9071537 DOI: 10.1093/genetics/iyac042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 03/07/2022] [Indexed: 11/14/2022]
Abstract
Megasatellites are large tandem repeats found in all fungal genomes but especially abundant in the opportunistic pathogen Candida glabrata. They are encoded in genes involved in cell-cell interactions, either between yeasts or between yeast and human cells. In the present work, we used an iterative genetic system to delete several Candida glabrata megasatellite-containing genes and found that 2 of them were positively involved in adhesion to epithelial cells, whereas 3 genes negatively controlled adhesion. Two of the latter, CAGL0B05061g and CAGL0A04851g, were also negative regulators of yeast-to-yeast adhesion, making them central players in controlling Candida glabrata adherence properties. Using a series of synthetic Saccharomyces cerevisiae strains in which the FLO1 megasatellite was replaced by other tandem repeats of similar length but different sequences, we showed that the capacity of a strain to flocculate in liquid culture was unrelated to its capacity to adhere to epithelial cells or to invade agar. Finally, to understand how megasatellites were initially created and subsequently expanded, an experimental evolution system was set up, in which modified yeast strains containing different megasatellite seeds were grown in bioreactors for more than 200 generations and selected for their ability to sediment at the bottom of the culture tube. Several flocculation-positive mutants were isolated. Functionally relevant mutations included general transcription factors as well as a 230-kbp segmental duplication.
Affiliation(s)
- Cyril Saguez
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Génétique des Génomes, Paris F-75015, France
  - Present address: Abolis Biotechnologies, 5 Rue Henri Desbruères, Evry 91030, France
- David Viterbo
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Génétique des Génomes, Paris F-75015, France
- Stéphane Descorps-Declère
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Génétique des Génomes, Paris F-75015, France
  - Institut Pasteur, Bioinformatics and Biostatistics Hub, Department of Computational Biology, Paris F-75015, France
- Brendan P Cormack
  - Department of Molecular Biology & Genetics, Johns Hopkins University, Baltimore, Maryland 21287, USA
- Bernard Dujon
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Génétique des Génomes, Paris F-75015, France
- Guy-Franck Richard
  - Institut Pasteur, Université Paris Cité, CNRS UMR3525, Génétique des Génomes, Paris F-75015, France
3
Tekaia F. Genome Data Exploration Using Correspondence Analysis. Bioinform Biol Insights 2016; 10:59-72. [PMID: 27279736 PMCID: PMC4898644 DOI: 10.4137/bbi.s39614] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/03/2016] [Revised: 04/12/2016] [Accepted: 04/14/2016] [Indexed: 01/14/2023]
Abstract
Recent developments of sequencing technologies that allow the production of massive amounts of genomic and genotyping data have highlighted the need for synthetic data representation and pattern recognition methods that can mine such large data sets and help discover the biologically meaningful knowledge they contain. Correspondence analysis (CA) is an exploratory descriptive method designed to analyze two-way data tables, including some measure of association between rows and columns. It constructs linear combinations of variables, known as factors. CA has been used for decades to study high-dimensional data, and remarkable inferences from large data tables were obtained by reducing the dimensionality to a few orthogonal factors that correspond to the largest amount of variability in the data. Herein, I review CA and highlight its use by considering examples in handling high-dimensional data that can be constructed from genomic and genetic studies. Examples in amino acid compositions of large sets of species (viruses, phages, yeast, and fungi) as well as an example related to pairwise shared orthologs in a set of yeast and fungal species, as obtained from their proteome comparisons, are considered. For the first time, results show striking segregations between yeasts and fungi as well as between viruses and phages. Distributions obtained from shared orthologs show clusters of yeast and fungal species corresponding to their phylogenetic relationships. A direct comparison with the principal component analysis method is discussed using a recently published example of genotyping data related to newly discovered traces of an ancient hominid that was compared to modern human populations in the search for ancestral similarities. CA offers more detailed results highlighting links between modern humans and the ancient hominid and their characterizations. Compared to the popular principal component analysis method, CA allows easier and more effective interpretation of results, particularly through the ability to relate individual patterns to their corresponding characteristic variables.
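As a rough illustration of the method this abstract reviews (not code from the article), the first CA factor of a small two-way table can be sketched in plain Python. The function name `first_ca_factor`, the example counts, and the use of power iteration in place of a full singular value decomposition are all assumptions made for this sketch; it also assumes the table has no empty rows or columns.

```python
import math


def first_ca_factor(table, iterations=200):
    """First correspondence-analysis factor of a two-way contingency table.

    Returns (total_inertia, first_axis_inertia, row_coords), where row_coords
    are the principal coordinates of the rows on the first factor.
    Assumes every row and every column has a non-zero total.
    """
    n_rows, n_cols = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    # Row and column masses (marginal proportions).
    r = [sum(row) / grand_total for row in table]
    c = [sum(table[i][j] for i in range(n_rows)) / grand_total
         for j in range(n_cols)]
    # Standardized residuals from the independence model r_i * c_j.
    s = [[(table[i][j] / grand_total - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(n_cols)] for i in range(n_rows)]
    # Total inertia equals the chi-square statistic divided by the grand total.
    total_inertia = sum(x * x for row in s for x in row)

    # Power iteration on S^T S approximates the leading right singular vector.
    v = [float(j + 1) for j in range(n_cols)]  # arbitrary start vector
    for _ in range(iterations):
        sv = [sum(s[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(s[i][j] * sv[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # exactly independent table: no factor to extract
            return total_inertia, 0.0, [0.0] * n_rows
        v = [x / norm for x in w]

    sv = [sum(s[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    first_axis_inertia = sum(x * x for x in sv)  # squared leading singular value
    row_coords = [sv[i] / math.sqrt(r[i]) for i in range(n_rows)]
    return total_inertia, first_axis_inertia, row_coords


# Hypothetical counts: rows are three species, columns three amino-acid classes.
counts = [[20, 10, 5],
          [10, 20, 10],
          [5, 10, 20]]
total, first, rows = first_ca_factor(counts)
print(f"first factor explains {100 * first / total:.1f}% of total inertia")
print("row coordinates:", [round(x, 3) for x in rows])
```

Rows with similar profiles across columns receive similar coordinates on the factor, which is the property the abstract relies on when it describes species segregating along the first few factors. A real analysis would normally compute all factors with a full singular value decomposition from a numerical library rather than power iteration.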
Affiliation(s)
- Fredj Tekaia
  - Institut Pasteur, Unit of Structural Microbiology, CNRS URA 3528 and University Paris Diderot, Sorbonne Paris Cité, Paris, France
4
Descorps-Declère S, Saguez C, Cournac A, Marbouty M, Rolland T, Ma L, Bouchier C, Moszer I, Dujon B, Koszul R, Richard GF. Genome-wide replication landscape of Candida glabrata. BMC Biol 2015; 13:69. [PMID: 26329162 PMCID: PMC4556013 DOI: 10.1186/s12915-015-0177-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/31/2015] [Accepted: 08/05/2015] [Indexed: 11/25/2022]
Abstract
Background: The opportunistic pathogen Candida glabrata is a member of the Saccharomycetaceae yeasts. Like its close relative Saccharomyces cerevisiae, it underwent a whole-genome duplication followed by an extensive loss of genes. Its genome contains a large number of very long tandem repeats, called megasatellites. In order to determine the whole replication program of the C. glabrata genome and its general chromosomal organization, we used deep-sequencing and chromosome conformation capture experiments.
Results: We identified 253 replication fork origins, genome wide. Centromeres, HML and HMR loci, and most histone genes are replicated early, whereas natural chromosomal breakpoints are located in late-replicating regions. In addition, 275 autonomously replicating sequences (ARS) were identified during ARS-capture experiments, and their relative fitness was determined during growth competition. Analysis of ARSs allowed us to identify a 17-bp consensus, similar to the S. cerevisiae ARS consensus sequence but slightly more constrained. Megasatellites are not in close proximity to replication origins or termini. Using chromosome conformation capture, we also show that early origins tend to cluster whereas non-subtelomeric megasatellites do not cluster in the yeast nucleus.
Conclusions: Despite a shorter cell cycle, the C. glabrata replication program shares unexpected striking similarities to S. cerevisiae, in spite of their large evolutionary distance and the presence of highly repetitive large tandem repeats in C. glabrata. No correlation could be found between the replication program and megasatellites, suggesting that their formation and propagation might not be directly caused by replication fork initiation or termination.
Affiliation(s)
- Stéphane Descorps-Declère
  - Institut Pasteur, Center of Bioinformatics, Biostatistics and Integrative Biology (C3BI), F-75015, Paris, France.
- Cyril Saguez
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, F-75015, Paris, France.
  - CNRS, UMR3525, F-75015, Paris, France.
  - Sorbonne Universités, UPMC Univ Paris 06, 4 Place Jussieu, 75252, Paris, Cedex 05, France.
- Axel Cournac
  - CNRS, UMR3525, F-75015, Paris, France.
  - Institut Pasteur, Groupe Régulation Spatiale des Génomes, Département Génomes & Génétique, F-75015, Paris, France.
- Martial Marbouty
  - CNRS, UMR3525, F-75015, Paris, France.
  - Institut Pasteur, Groupe Régulation Spatiale des Génomes, Département Génomes & Génétique, F-75015, Paris, France.
- Thomas Rolland
  - Present address: Institut Pasteur, Unité de Génétique Humaine et Fonctions Cognitives, Département des Neurosciences, F-75015, Paris, France.
- Laurence Ma
  - Institut Pasteur, Plate-forme Génomique, Département Génomes & Génétique, F-75015, Paris, France.
- Christiane Bouchier
  - Institut Pasteur, Plate-forme Génomique, Département Génomes & Génétique, F-75015, Paris, France.
- Ivan Moszer
  - Present address: Plate-forme Bio-informatique/Biostatistique, Institut de Neurosciences Translationnelles IHU-A-ICM, Hôpital Pitié-Salpêtrière, 47-83 bd de l'Hôpital, 75561, Paris, Cedex 13, France.
- Bernard Dujon
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, F-75015, Paris, France.
  - CNRS, UMR3525, F-75015, Paris, France.
  - Sorbonne Universités, UPMC Univ Paris 06, 4 Place Jussieu, 75252, Paris, Cedex 05, France.
- Romain Koszul
  - CNRS, UMR3525, F-75015, Paris, France.
  - Institut Pasteur, Groupe Régulation Spatiale des Génomes, Département Génomes & Génétique, F-75015, Paris, France.
- Guy-Franck Richard
  - Institut Pasteur, Unité de Génétique Moléculaire des Levures, Département Génomes & Génétique, F-75015, Paris, France.
  - CNRS, UMR3525, F-75015, Paris, France.
  - Sorbonne Universités, UPMC Univ Paris 06, 4 Place Jussieu, 75252, Paris, Cedex 05, France.
5
Willcocks S, Wren BW. Shared characteristics between Mycobacterium tuberculosis and fungi contribute to virulence. Future Microbiol 2014; 9:657-68. [PMID: 24957092 DOI: 10.2217/fmb.14.29] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Mycobacterium tuberculosis, an etiologic agent of tuberculosis, exacts a heavy toll in terms of human morbidity and mortality. Although tuberculosis is an ancient disease, new strains are emerging as human population density increases. The emergent virulent strains appear adept at steering the host immune response from a protective Th1 type response towards a Th2 bias, a feature shared with some pathogenic fungi. Other common characteristics include infection site, metabolic features, the composition and display of cell surface molecules, the range of innate immune receptors engaged during infection, and the ability to form granulomas. Literature from these two distinct fields of research is reviewed to propose that the emergent virulent strains of M. tuberculosis are in the process of convergent evolution with pathogenic fungi, and are increasing the prominence of conserved traits from environmental phylogenetic ancestors that facilitate their evasion of host defenses and dissemination.
Affiliation(s)
- Sam Willcocks
  - The London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
6
Ahmad KM, Kokošar J, Guo X, Gu Z, Ishchuk OP, Piškur J. Genome structure and dynamics of the yeast pathogen Candida glabrata. FEMS Yeast Res 2014; 14:529-35. [PMID: 24528571 PMCID: PMC4320752 DOI: 10.1111/1567-1364.12145] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 12/13/2013] [Revised: 02/07/2014] [Accepted: 02/08/2014] [Indexed: 01/09/2023]
Abstract
The yeast pathogen Candida glabrata is the second most frequent cause of Candida infections. However, from the phylogenetic point of view, C. glabrata is much closer to Saccharomyces cerevisiae than to Candida albicans. Apparently, this yeast has relatively recently changed its lifestyle and become a successful opportunistic pathogen. Recently, several C. glabrata sister species, among them clinical and environmental isolates, have had their genomes characterized. The genomes of hundreds of C. glabrata clinical isolates have also been characterized. These isolates display enormous genomic plasticity. The number and size of chromosomes vary drastically, and intra- and interchromosomal segmental duplications occur frequently. The observed genome alterations could affect phenotypic properties and thus help the yeast adapt to the highly variable and harsh habitats it finds in different human patients and their tissues. Further genome sequencing of pathogenic isolates will provide a valuable tool to understand the mechanisms behind genome dynamics and help to elucidate the genes contributing to the virulence potential.
7
DeForte S, Reddy KD, Uversky VN. Digested disorder: Quarterly intrinsic disorder digest (April-May-June, 2013). Intrinsically Disordered Proteins 2013; 1:e27454. [PMID: 28516028 PMCID: PMC5424790 DOI: 10.4161/idp.27454] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/04/2013] [Accepted: 12/06/2013] [Indexed: 01/18/2023]
Abstract
The current literature on intrinsically disordered proteins is overwhelming. To keep interested readers up to speed with this literature, we continue the "Digested Disorder" project and present a series of reader's digest type articles objectively representing the research papers and reviews on intrinsically disordered proteins. The only 2 criteria for inclusion in this digest are the publication date (a paper should be published within the covered time frame) and topic (a paper should be dedicated to any aspect of protein intrinsic disorder). The current digest issue covers papers published during April, May, and June of 2013. The papers are grouped hierarchically by the topics they cover, and for each included paper a short description of its major findings is given.
Affiliation(s)
- Shelly DeForte
  - Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA
- Krishna D Reddy
  - Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA
- Vladimir N Uversky
  - Department of Molecular Medicine; Morsani College of Medicine; University of South Florida; Tampa, FL USA
  - USF Health Byrd Alzheimer's Research Institute; Morsani College of Medicine; University of South Florida; Tampa, FL USA
  - Institute for Biological Instrumentation; Russian Academy of Sciences; Pushchino, Moscow Region, Russia