1
Horvath J, Jedlicka P, Kratka M, Kubat Z, Kejnovsky E, Lexa M. Detection and classification of long terminal repeat sequences in plant LTR-retrotransposons and their analysis using explainable machine learning. BioData Min 2024; 17:57. [PMID: 39696434 DOI: 10.1186/s13040-024-00410-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Accepted: 11/22/2024] [Indexed: 12/20/2024]
Abstract
BACKGROUND Long terminal repeats (LTRs) represent important parts of LTR retrotransposons and retroviruses found in high copy numbers in a majority of eukaryotic genomes. LTRs contain regulatory sequences essential for the life cycle of the retrotransposon. Previous experimental and sequence studies have provided only limited information about LTR structure and composition, mostly from model systems. To enhance our understanding of these key sequence modules, we focused on the contrasts between LTRs of various retrotransposon families and other genomic regions. Furthermore, this approach can be utilized for the classification and prediction of LTRs.
RESULTS We used machine learning methods suitable for DNA sequence classification and applied them to a large dataset of plant LTR retrotransposon sequences. We trained three machine learning models using (i) traditional model ensembles (Gradient Boosting), (ii) hybrid convolutional/long short-term memory (LSTM) network models, and (iii) a DNA pre-trained transformer-based model using k-mer sequence representation. All three approaches were successful in classifying and isolating LTRs in this data, as well as providing valuable insights into LTR sequence composition. The best classification (expressed as F1 score) achieved for LTR detection was 0.85 using the hybrid network model. The most accurate classification task was superfamily classification (F1=0.89) while the least accurate was family classification (F1=0.74). The trained models were subjected to explainability analysis. Positional analysis identified a mixture of interesting features, many of which had a preferred absolute position within the LTR and/or were biologically relevant, such as a centrally positioned TATA-box regulatory sequence, and TG..CA nucleotide patterns around both LTR edges.
CONCLUSIONS Our results show that the models used here recognized biologically relevant motifs, such as core promoter elements in the LTR detection task, and a development- and stress-related subclass of transcription factor binding sites in the family classification task. Explainability analysis also highlighted the importance of the 5'- and 3'-edges in LTR identity and revealed the need to analyze more than just dinucleotides at these ends. Our work shows the applicability of machine learning models to regulatory sequence analysis and classification, and demonstrates the important role of the identified motifs in LTR detection.
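The abstract above reports its results as F1 scores and mentions a k-mer sequence representation. The sketch below is only an illustration of those two ideas, not the authors' pipeline: it splits a DNA string into overlapping k-mers and computes a binary F1 score from true and predicted labels. The function names, the choice of k = 3, and the stride of 1 are assumptions made for this example.

```python
from typing import List, Sequence


def kmers(seq: str, k: int = 3) -> List[str]:
    """Return all overlapping k-mers of seq (stride 1); k=3 is an illustrative default."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 = 2 * precision * recall / (precision + recall), with 1 as the positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, `kmers("TGTTATA")` yields the five 3-mers of a seven-base sequence, and `f1_score` balances false positives against false negatives, which is why it is a common choice when LTR and non-LTR classes are of unequal size.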
Affiliation(s)
- Jakub Horvath
  - Faculty of Informatics, Masaryk University, Botanicka 68a, Brno, 60200, Czech Republic
- Pavel Jedlicka
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno, 61200, Czech Republic
- Marie Kratka
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno, 61200, Czech Republic
  - National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, Brno, 62500, Czech Republic
- Zdenek Kubat
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno, 61200, Czech Republic
- Eduard Kejnovsky
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno, 61200, Czech Republic
- Matej Lexa
  - Faculty of Informatics, Masaryk University, Botanicka 68a, Brno, 60200, Czech Republic

2
Wang B, Saleh AA, Yang N, Asare E, Chen H, Wang Q, Chen C, Song C, Gao B. High Diversity of Long Terminal Repeat Retrotransposons in Compact Vertebrate Genomes: Insights from Genomes of Tetraodontiformes. Animals (Basel) 2024; 14:1425. [PMID: 38791643 PMCID: PMC11117352 DOI: 10.3390/ani14101425] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 05/04/2024] [Accepted: 05/07/2024] [Indexed: 05/26/2024]
Abstract
This study aimed to investigate the evolutionary profile (including diversity, activity, and abundance) of retrotransposons (RTNs) with long terminal repeats (LTRs) in ten species of Tetraodontiformes. These species, Arothron firmamentum, Lagocephalus sceleratus, Pao palembangensis, Takifugu bimaculatus, Takifugu flavidus, Takifugu ocellatus, Takifugu rubripes, Tetraodon nigroviridis, Mola mola, and Thamnaconus septentrionalis, are known for having the smallest genomes among vertebrates. Data mining revealed a high diversity and wide distribution of LTR retrotransposons (LTR-RTNs) in these compact vertebrate genomes, with varying abundances among species. A total of 819 full-length LTR-RTN sequences were identified across these genomes, categorized into nine families belonging to four different superfamilies: ERV (Orthoretrovirinae and Epsilon retrovirus), Copia, BEL-PAO, and Gypsy (Gmr, Mag, V-clade, CsRN1, and Barthez). The Gypsy superfamily exhibited the highest diversity. LTR family distribution varied among species, with Takifugu bimaculatus, Takifugu flavidus, Takifugu ocellatus, and Takifugu rubripes having the highest richness of LTR families and sequences. Additionally, evidence of recent invasions was observed in specific tetraodontiform genomes, suggesting potential transposition activity. This study provides insights into the evolution of LTR retrotransposons in Tetraodontiformes, enhancing our understanding of their impact on the structure and evolution of host genomes.
Affiliation(s)
- Bingqing Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Ahmed A. Saleh
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
  - Animal and Fish Production Department, Faculty of Agriculture (Al-Shatby), Alexandria University, Alexandria 11865, Egypt
- Naisu Yang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Emmanuel Asare
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Hong Chen
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Quan Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Cai Chen
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Chengyi Song
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China
- Bo Gao
  - College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China

3
Shin W, Mun S, Han K. Human Endogenous Retrovirus-K (HML-2)-Related Genetic Variation: Human Genome Diversity and Disease. Genes (Basel) 2023; 14:2150. [PMID: 38136972 PMCID: PMC10742618 DOI: 10.3390/genes14122150] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/25/2023] [Revised: 11/23/2023] [Accepted: 11/26/2023] [Indexed: 12/24/2023]
Abstract
Human endogenous retroviruses (HERVs) make up roughly 8% of the human genome, compared with the 2-3% represented by coding sequences. Numerous studies have underscored the importance of HERVs, highlighting their diverse and extensive influence on the evolution of the human genome and establishing their complex correlation with various diseases. Among HERVs, the HERV-K (HML-2) subfamily, which integrated into the human genome after the divergence between humans and chimpanzees, has recently attracted significant attention because of its structural and functional characteristics and the time of its insertion. Originating from ancient exogenous retroviruses, these elements succeeded in infecting germ cells, enabling vertical transmission and existing as proviruses within the genome. Remarkably, these sequences have retained the capacity to form complete viral sequences, exhibiting activity in transcription and translation. The HERV-K (HML-2) subfamily is the subject of active debate about its potential positive or negative effects on human genome evolution and various pathologies. This review summarizes the variation, regulation, and diseases in human genome evolution arising from the influence of HERV-K (HML-2).
Affiliation(s)
- Wonseok Shin
  - NGS Clinical Laboratory, Division of Cancer Research, Dankook University Hospital, Cheonan 31116, Republic of Korea
  - Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
- Seyoung Mun
  - Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
  - College of Science & Technology, Dankook University, Cheonan 31116, Republic of Korea
  - Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Republic of Korea
- Kyudong Han
  - Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
  - Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Republic of Korea
  - Department of Microbiology, College of Science & Technology, Dankook University, Cheonan 31116, Republic of Korea
  - Department of Bioconvergence Engineering, Dankook University, Yongin 16890, Republic of Korea
  - R&D Center, HuNBiome Co., Ltd., Seoul 08507, Republic of Korea

4
Duhamel M, Hood ME, Rodríguez de la Vega RC, Giraud T. Dynamics of transposable element accumulation in the non-recombining regions of mating-type chromosomes in anther-smut fungi. Nat Commun 2023; 14:5692. [PMID: 37709766 PMCID: PMC10502011 DOI: 10.1038/s41467-023-41413-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/23/2022] [Accepted: 08/30/2023] [Indexed: 09/16/2023]
Abstract
In the absence of recombination, the number of transposable elements (TEs) increases due to less efficient selection, but the dynamics of such TE accumulations are not well characterized. Leveraging a dataset of 21 independent events of recombination cessation of different ages in mating-type chromosomes of Microbotryum fungi, we show that TEs rapidly accumulated in regions lacking recombination, but that TE content reached a plateau at ca. 50% of occupied base pairs by 1.5 million years following recombination suppression. The same TE superfamilies have expanded in independently evolved non-recombining regions, in particular rolling-circle replication elements (Helitrons). Long-terminal repeat (LTR) retrotransposons of the Copia and Ty3 superfamilies also expanded, through transposition bursts (distinguished from gene conversion based on LTR divergence), with both non-recombining regions and autosomes affected, suggesting that non-recombining regions constitute TE reservoirs. This study improves our knowledge of genome evolution by showing that TEs can accumulate through bursts, following non-linear decelerating dynamics.
Affiliation(s)
- Marine Duhamel
  - Ecologie Systématique Evolution, IDEEV, CNRS, Université Paris-Saclay, AgroParisTech, Bâtiment 680, 12 route RD128, 91190, Gif-sur-Yvette, France
  - Evolution der Pflanzen und Pilze, Ruhr-Universität Bochum, Universitätsstraße 150, 44780, Bochum, Germany
- Michael E Hood
  - Department of Biology, Amherst College, 01002-5000, Amherst, MA, USA
- Ricardo C Rodríguez de la Vega
  - Ecologie Systématique Evolution, IDEEV, CNRS, Université Paris-Saclay, AgroParisTech, Bâtiment 680, 12 route RD128, 91190, Gif-sur-Yvette, France
- Tatiana Giraud
  - Ecologie Systématique Evolution, IDEEV, CNRS, Université Paris-Saclay, AgroParisTech, Bâtiment 680, 12 route RD128, 91190, Gif-sur-Yvette, France

5
Wang Q, Shi Y, Bian Q, Zhang N, Wang M, Wang J, Li X, Lai L, Zhao Z, Yu H. Molecular mechanisms of syncytin-1 in tumors and placental development related diseases. Discov Oncol 2023; 14:104. [PMID: 37326913 DOI: 10.1007/s12672-023-00702-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/09/2023] [Accepted: 05/25/2023] [Indexed: 06/17/2023]
Abstract
Human endogenous retroviruses (HERVs) have evolved from exogenous retroviruses and account for approximately 8% of the human genome. A growing number of findings suggest that the abnormal expression of HERV genes is associated with schizophrenia, multiple sclerosis, endometriosis, breast cancer, bladder cancer and other diseases. HERV-W env (syncytin-1) is a membrane glycoprotein that plays an important role in placental development, including embryo implantation, fusion of syncytiotrophoblasts and of fertilized eggs, and immune response. The abnormal expression of syncytin-1 is related to placental development-related diseases such as preeclampsia, infertility, and intrauterine growth restriction, as well as tumors such as neuroblastoma, endometrial cancer, and endometriosis. This review focuses mainly on the molecular interactions of syncytin-1 in placental development-related diseases and tumors, to explore whether syncytin-1 can be an emerging biological marker and potential therapeutic target.
Affiliation(s)
- Qianqian Wang
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Ying Shi
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Qiang Bian
  - Collaborative Innovation Center, Jining Medical University, Jining, 272067, Shandong, People's Republic of China
  - Department of Pathophysiology, Weifang Medical University, Weifang, 261053, Shandong, People's Republic of China
- Naibin Zhang
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Meng Wang
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Jianing Wang
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Xuan Li
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
- Luhao Lai
  - Collaborative Innovation Center, Jining Medical University, Jining, 272067, Shandong, People's Republic of China
- Zhankui Zhao
  - The Affiliated Hospital of Jining Medical University, Jining Medical University, 89 Guhuai Road, Jining, 272029, Shandong, People's Republic of China
- Honglian Yu
  - Department of Biochemistry, Jining Medical University, 133 Hehua Road, Jining, 272067, Shandong, People's Republic of China
  - Collaborative Innovation Center, Jining Medical University, Jining, 272067, Shandong, People's Republic of China

6
Annotation of Siberian Larch (Larix sibirica Ledeb.) Nuclear Genome—One of the Most Cold-Resistant Tree Species in the Only Deciduous GENUS in Pinaceae. PLANTS 2022; 11:plants11152062. [PMID: 35956540 PMCID: PMC9370799 DOI: 10.3390/plants11152062] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/24/2022] [Revised: 07/22/2022] [Accepted: 07/26/2022] [Indexed: 11/17/2022]
Abstract
The recent release of the nuclear, chloroplast and mitochondrial genome assemblies of Siberian larch (Larix sibirica Ledeb.), one of the most cold-resistant tree species in the only deciduous genus of Pinaceae, with seasonal senescence and a rot-resistant valuable timber widely used in construction, greatly contributed to the development of genomic resources for the larch genus. Here, we present an extensive repeatome analysis and the first annotation of the draft nuclear Siberian larch genome assembly. About 66% of the larch genome consists of highly repetitive elements (REs), with a likely wave of retrotransposon insertions into the larch genome estimated to have occurred 4–5 MYA. In total, 39,370 gene models were predicted, with 87% of them having homology to the Arabidopsis-annotated proteins and 78% having at least one GO term assignment. The current state of the genome annotations allows for the exploration of gymnosperm and angiosperm species for relative gene abundance in different functional categories. Comparative analysis of functional gene categories across different angiosperm and gymnosperm species finds that the Siberian larch genome has an overabundance of genes associated with programmed cell death (PCD), autophagy, stress hormone biosynthesis and regulatory pathways; genes that may play important roles in seasonal senescence and stress response to extreme cold in larch. Despite being incomplete, the draft assemblies and annotations of the conifer genomes are at a point of development where they now represent a valuable source for further genomic, genetic and population studies.

7
Aroh O, Halanych KM. Genome-wide characterization of LTR retrotransposons in the non-model deep-sea annelid Lamellibrachia luymesi. BMC Genomics 2021; 22:466. [PMID: 34157969 PMCID: PMC8220671 DOI: 10.1186/s12864-021-07749-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/07/2021] [Accepted: 05/20/2021] [Indexed: 02/06/2023]
Abstract
Background Long Terminal Repeat retrotransposons (LTR retrotransposons) are mobile genetic elements composed of a few genes between terminal repeats and, in some cases, can comprise over half of a genome’s content. Available data on LTR retrotransposons have facilitated comparative studies and provided insight on genome evolution. However, data are biased to model systems, and marine organisms, including annelids, have been underrepresented in transposable element studies. Here, we focus on the genome of Lamellibrachia luymesi, a vestimentiferan tubeworm from deep-sea hydrocarbon seeps, to gain knowledge of LTR retrotransposons in a deep-sea annelid.
Results We characterized LTR retrotransposons present in the genome of L. luymesi using bioinformatic approaches and found that intact LTR retrotransposons make up about 0.1% of the L. luymesi genome. Previous characterization of the genome has shown that this tubeworm hosts several known LTR retrotransposons. Here we describe and classify LTR retrotransposons in L. luymesi as within the Gypsy, Copia and Bel-pao superfamilies. Although many elements fell within already recognized families (e.g., Mag, CSRN1), others formed clades distinct from previously recognized families within these superfamilies. However, approximately 19% (41) of recovered elements could not be classified. Gypsy elements were the most abundant, while only 2 Copia and 2 Bel-pao elements were present. In addition, analysis of insertion times indicated that several LTR retrotransposons were recently transposed into the genome of L. luymesi; these elements had identical LTRs, raising the possibility of recent or ongoing retrotransposon activity.
Conclusions Our analysis contributes to knowledge on the diversity of LTR retrotransposons in marine settings and also serves as an important step to assist our understanding of the potential role of retroelements in marine organisms. We find that many LTR retrotransposons, which have been inserted in the last few million years, are similar to those found in terrestrial model species. However, several new groups of LTR retrotransposons were discovered, suggesting that the representation of LTR retrotransposons may be different in marine settings. Further study would improve understanding of the diversity of retrotransposons across animal groups and environments.
Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07749-1.
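The abstract above does not say how insertion times were computed. A widely used approach, assumed here purely for illustration, exploits the fact that the two LTRs of an element are identical at insertion and then diverge: the insertion time is estimated as T = K / (2r), where K is the corrected divergence between the 5' and 3' LTRs and r is a substitution rate per site per year. The sketch below uses a Jukes-Cantor correction for K; the function names are hypothetical, and the rate must be supplied by the user because no value is given in the source.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Fraction of mismatched sites between two aligned LTRs; gap columns are skipped."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance K = -3/4 * ln(1 - 4p/3)."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("p-distance too large for Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_time(ltr5: str, ltr3: str, rate: float) -> float:
    """Estimated years since insertion, T = K / (2 * rate).

    rate is substitutions per site per year; its value is an assumption
    the caller must justify for the species in question.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return jukes_cantor(p_distance(ltr5, ltr3)) / (2.0 * rate)
```

Under this model, an element whose two LTRs are identical gets an estimated age of zero, which is consistent with the abstract reading identical LTRs as a sign of recent or ongoing activity.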
Affiliation(s)
- Oluchi Aroh
  - Department of Biological Sciences & Molette Biology Laboratory for Environmental and Climate Change Studies, College of Science and Mathematics, Auburn University, 101 Rouse Life Science Building, Auburn, AL, 36849, USA
- Kenneth M Halanych
  - Department of Biological Sciences & Molette Biology Laboratory for Environmental and Climate Change Studies, College of Science and Mathematics, Auburn University, 101 Rouse Life Science Building, Auburn, AL, 36849, USA

8
Bakuła Z, Siedlecki P, Gromadka R, Gawor J, Gromadka A, Pomorski JJ, Panagiotopoulou H, Jagielski T. A first insight into the genome of Prototheca wickerhamii, a major causative agent of human protothecosis. BMC Genomics 2021; 22:168. [PMID: 33750287 PMCID: PMC7941945 DOI: 10.1186/s12864-021-07491-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/09/2020] [Accepted: 02/26/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND Colourless microalgae of the Prototheca genus are the only known plants that have consistently been implicated in a range of clinically relevant opportunistic infections in both animals and humans. The Prototheca algae are emerging pathogens, whose incidence has increased markedly over the past two decades. Prototheca wickerhamii is a major human pathogen, responsible for at least 115 cases worldwide. Although the algae are receiving more attention nowadays, there is still a substantial knowledge gap regarding their biology, and pathogenicity in particular. Here we report, for the first time, the complete nuclear genome, organelle genomes, and transcriptome of the P. wickerhamii type strain ATCC 16529.
RESULTS The assembled genome size was 16.7 Mbp, making it the smallest and most compact genome sequenced so far among the protothecans. Key features of the genome included a high overall GC content (64.5%), a high number (6081) and proportion (45.9%) of protein-coding genes, and a low repetitive sequence content (2.2%). The vast majority (90.6%) of the predicted genes were confirmed with the corresponding transcripts upon RNA-sequencing analysis. Most (93.2%) of the genes had their putative function assigned when searched against the InterProScan database. Nearly a quarter (23.3%) of the genes were annotated with an enzymatic activity possibly associated with the adaptation to the human host environment. The P. wickerhamii genome encoded a wide array of possible virulence factors, including those already identified in two model opportunistic fungal pathogens, i.e. Candida albicans and Trichophyton rubrum, and thought to be involved in invasion of the host or elicitation of the adaptive stress response. Approximately 6% of the P. wickerhamii genes matched a Pathogen-Host Interaction Database entry and had a previously experimentally proven role in disease development. Furthermore, genes coding for proteins (e.g. ATPase, malate dehydrogenase) hitherto considered as potential virulence factors of Prototheca spp. were identified in the P. wickerhamii genome.
CONCLUSIONS Overall, this study is the first to describe the genetic make-up of P. wickerhamii and identifies proteins possibly involved in the development of protothecosis.
Affiliation(s)
- Zofia Bakuła
  - Department of Medical Microbiology, Institute of Microbiology, Faculty of Biology, University of Warsaw, I. Miecznikowa 1, 02-096, Warsaw, Poland
- Paweł Siedlecki
  - Department of Systems Biology, University of Warsaw, I. Miecznikowa 1, 02-096, Warsaw, Poland
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, A. Pawińskiego 5a, 02-106, Warsaw, Poland
- Robert Gromadka
  - DNA Sequencing and Synthesis Facility, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, A. Pawińskiego 5a, 02-106, Warsaw, Poland
- Jan Gawor
  - DNA Sequencing and Synthesis Facility, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, A. Pawińskiego 5a, 02-106, Warsaw, Poland
- Agnieszka Gromadka
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, A. Pawińskiego 5a, 02-106, Warsaw, Poland
- Jan J Pomorski
  - Museum and Institute of Zoology, Polish Academy of Sciences, Wilcza 64, 00-679, Warsaw, Poland
- Hanna Panagiotopoulou
  - Museum and Institute of Zoology, Polish Academy of Sciences, Wilcza 64, 00-679, Warsaw, Poland
- Tomasz Jagielski
  - Department of Medical Microbiology, Institute of Microbiology, Faculty of Biology, University of Warsaw, I. Miecznikowa 1, 02-096, Warsaw, Poland

9
Mair WJ, Thomas GJ, Dodhia K, Hills AL, Jayasena KW, Ellwood SR, Oliver RP, Lopez-Ruiz FJ. Parallel evolution of multiple mechanisms for demethylase inhibitor fungicide resistance in the barley pathogen Pyrenophora teres f. sp. maculata. Fungal Genet Biol 2020; 145:103475. [DOI: 10.1016/j.fgb.2020.103475] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/10/2020] [Revised: 09/11/2020] [Accepted: 09/25/2020] [Indexed: 10/23/2022]

10
Quesneville H. Twenty years of transposable element analysis in the Arabidopsis thaliana genome. Mob DNA 2020; 11:28. [PMID: 32742313 PMCID: PMC7385966 DOI: 10.1186/s13100-020-00223-x] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 02/18/2020] [Accepted: 07/16/2020] [Indexed: 11/10/2022]
Abstract
Transposable elements (TEs) are mobile repetitive DNA sequences shown to be major drivers of genome evolution. As the first plant to have its genome sequenced and analyzed at the genomic scale, Arabidopsis thaliana has largely contributed to our TE knowledge. The present report describes 20 years of accumulated TE knowledge gained through the study of the Arabidopsis genome and covers the known TE families, their relative abundance, and their genomic distribution. It presents our knowledge of the different TE family activities, mobility, population and long-term evolutionary dynamics. Finally, the role of TE as substrates for new genes and their impact on gene expression is illustrated through a few selected demonstrative cases. Promising future directions for TE studies in this species conclude the review.

11
Chromatin Profiling of the Repetitive and Nonrepetitive Genomes of the Human Fungal Pathogen Candida albicans. mBio 2019; 10:mBio.01376-19. [PMID: 31337722 PMCID: PMC6650553 DOI: 10.1128/mbio.01376-19] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/31/2022]
Abstract
The fungus Candida albicans is an opportunistic pathogen that normally lives on the human body without causing any harm. However, C. albicans is also a dangerous pathogen responsible for millions of infections annually. C. albicans is such a successful pathogen because it can adapt to and thrive in different environments. Chemical modifications of chromatin, the structure that packages DNA into cells, can allow environmental adaptation by regulating gene expression and genome organization. Surprisingly, the contribution of chromatin modification to C. albicans biology is still largely unknown. For the first time, we analyzed C. albicans chromatin modifications on a genome-wide basis. We demonstrate that specific chromatin states are associated with distinct regions of the C. albicans genome and identify the roles of the chromatin modifiers Sir2 and Set1 in shaping C. albicans chromatin and gene expression. Eukaryotic genomes are packaged into chromatin structures that play pivotal roles in regulating all DNA-associated processes. Histone posttranslational modifications modulate chromatin structure and function, leading to rapid regulation of gene expression and genome stability, key steps in environmental adaptation. Candida albicans, a prevalent fungal pathogen in humans, can rapidly adapt and thrive in diverse host niches. The contribution of chromatin to C. albicans biology is largely unexplored. Here, we generated the first comprehensive chromatin profile of histone modifications (histone H3 trimethylated on lysine 4 [H3K4me3], histone H3 acetylated on lysine 9 [H3K9Ac], acetylated lysine 16 on histone H4 [H4K16Ac], and γH2A) across the C. albicans genome and investigated its relationship to gene expression by harnessing genome-wide sequencing approaches. We demonstrated that gene-rich nonrepetitive regions are packaged into canonical euchromatin in association with histone modifications that mirror their transcriptional activity. 
In contrast, repetitive regions are assembled into distinct chromatin states; subtelomeric regions and the ribosomal DNA (rDNA) locus are assembled into heterochromatin, while major repeat sequences and transposons are packaged in chromatin that bears features of euchromatin and heterochromatin. Genome-wide mapping of γH2A, a marker of genome instability, identified potential recombination-prone genomic loci. Finally, we present the first quantitative chromatin profiling in C. albicans to delineate the role of the chromatin modifiers Sir2 and Set1 in controlling chromatin structure and gene expression. This report presents the first genome-wide chromatin profiling of histone modifications associated with the C. albicans genome. These epigenomic maps provide an invaluable resource to understand the contribution of chromatin to C. albicans biology and identify aspects of C. albicans chromatin organization that differ from that of other yeasts.

12
The Genome of the Human Pathogen Candida albicans Is Shaped by Mutation and Cryptic Sexual Recombination. mBio 2018; 9:mBio.01205-18. [PMID: 30228236 PMCID: PMC6143739 DOI: 10.1128/mbio.01205-18] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 01/08/2023]
Abstract
The opportunistic fungal pathogen Candida albicans lacks a conventional sexual program and is thought to evolve, at least primarily, through the clonal acquisition of genetic changes. Here, we performed an analysis of heterozygous diploid genomes from 21 clinical isolates to determine the natural evolutionary processes acting on the C. albicans genome. Mutation and recombination shaped the genomic landscape among the C. albicans isolates. Strain-specific single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) clustered across the genome. Additionally, loss-of-heterozygosity (LOH) events contributed substantially to genotypic variation, with most long-tract LOH events extending to the ends of the chromosomes suggestive of repair via break-induced replication. Consistent with a model of inheritance by descent, most polymorphisms were shared between closely related strains. However, some isolates contained highly mosaic genomes consistent with strains having experienced interclade recombination during their evolutionary history. A detailed examination of mitochondrial genomes also revealed clear examples of interclade recombination among sequenced strains. These analyses therefore establish that both (para)sexual recombination and mitotic mutational processes drive evolution of this important pathogen. To further facilitate the study of C. albicans genomes, we also introduce an online platform, SNPMap, to examine SNP patterns in sequenced isolates.
IMPORTANCE Mutations introduce variation into the genome upon which selection can act. Defining the nature of these changes is critical for determining species evolution, as well as for understanding the genetic changes driving important cellular processes. The heterozygous diploid fungus Candida albicans is both a frequent commensal organism and a prevalent opportunistic pathogen. A prevailing theory is that C. albicans evolves primarily through the gradual buildup of mitotic mutations, and a pressing issue is whether sexual or parasexual processes also operate within natural populations. Here, we establish that the C. albicans genome evolves by a combination of localized mutation and both short-tract and long-tract loss-of-heterozygosity (LOH) events within the sequenced isolates. Mutations are more prevalent within noncoding and heterozygous regions and LOH increases towards chromosome ends. Furthermore, we provide evidence for genetic exchange between isolates, establishing that sexual or parasexual processes have contributed to the diversity of both nuclear and mitochondrial genomes.
|
13
|
Cheung S, Manhas S, Measday V. Retrotransposon targeting to RNA polymerase III-transcribed genes. Mob DNA 2018; 9:14. [PMID: 29713390 PMCID: PMC5911963 DOI: 10.1186/s13100-018-0119-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/27/2017] [Accepted: 04/16/2018] [Indexed: 12/20/2022] Open
Abstract
Retrotransposons are genetic elements that are similar in structure and life cycle to retroviruses, replicating via an RNA intermediate and inserting into a host genome. The Saccharomyces cerevisiae (S. cerevisiae) Ty1-5 elements are long terminal repeat (LTR) retrotransposons that are members of the Ty1-copia (Pseudoviridae) or Ty3-gypsy (Metaviridae) families. Four of the five S. cerevisiae Ty elements are inserted into the genome upstream of RNA Polymerase (Pol) III-transcribed genes such as transfer RNA (tRNA) genes. This particular genomic locus provides a safe environment for Ty element insertion without disruption of the host genome and is a targeting strategy used by retrotransposons that insert into compact genomes of hosts such as S. cerevisiae and the social amoeba Dictyostelium. The mechanism by which Ty1 targeting is achieved has been recently solved due to the discovery of an interaction between Ty1 Integrase (IN) and RNA Pol III subunits. We describe the methods used to identify the Ty1-IN interaction with Pol III and the Ty1 targeting consequences if the interaction is perturbed. The details of Ty1 targeting are just beginning to emerge and many unexplored areas remain, including consideration of the three-dimensional shape of the genome. We present a variety of other retrotransposon families that insert adjacent to Pol III-transcribed genes and the mechanism by which the host machinery has been hijacked to accomplish this targeting strategy. Finally, we discuss why retrotransposons selected Pol III-transcribed genes as a target during evolution and how retrotransposons have shaped genome architecture.
Affiliation(s)
- Stephanie Cheung
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Savrina Manhas
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
- Vivien Measday
  - Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of British Columbia, Vancouver, BC V6T 1Z4 Canada
  - Department of Food Science, Wine Research Centre, Faculty of Land and Food Systems, University of British Columbia, Room 325-2205 East Mall, Vancouver, British Columbia V6T 1Z4 Canada
|
14
|
Li Q, Zhang Y, Zhang Z, Li X, Yao D, Wang Y, Ouyang X, Li Y, Song W, Xiao Y. A D-genome-originated Ty1/Copia-type retrotransposon family expanded significantly in tetraploid cottons. Mol Genet Genomics 2017; 293:33-43. [PMID: 28849273 DOI: 10.1007/s00438-017-1359-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/16/2017] [Accepted: 08/08/2017] [Indexed: 10/19/2022]
Abstract
Retrotransposons comprise a major fraction of higher plant genomes, and their proliferation and elimination have profound effects on genome evolution and gene function. Previously we found a D-genome-originated Ty1/Copia-type LTR (DOCL) retrotransposon in chromosome A08 of upland cotton. To further characterize the DOCL retrotransposon family, a total of 342 DOCL retrotransposons were identified in the sequenced cotton genomes, including 73, 157, and 112 from Gossypium raimondii, G. hirsutum, and G. barbadense, respectively. According to phylogenetic analysis, the DOCL family was divided into nine groups (G1-G9), among which five groups (G1-G4 and G9, including 292 members) proliferated after the formation of tetraploid cottons. The majority of DOCL retrotransposons (especially those in G2, G3 and G9) were inserted in non-allelic loci in G. hirsutum and G. barbadense, suggesting that their proliferation was relatively independent in the different tetraploid cottons. Furthermore, DOCL retrotransposons inserted in coding regions largely eliminated expression of the targeted genes in G. hirsutum or G. barbadense. Our data suggest that recent proliferation of retrotransposon families like DOCL was one of the important evolutionary forces driving diversification and evolution of tetraploid cottons.
Affiliation(s)
- Qian Li
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Yue Zhang
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Zhengsheng Zhang
  - College of Agronomy and Biological Science and Technology, Southwest University, Beibei, Chongqing, China
- Xianbi Li
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Dan Yao
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Yi Wang
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Xufen Ouyang
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Yaohua Li
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
- Wu Song
  - Institute of Xinjiang Naturally-Colored Cotton, China Colored Cotton (Group) Company, Urumchi, Xinjiang Uygur Autonomous Region, China
- Yuehua Xiao
  - Biotechnology Research Center, Chongqing Key Laboratory of Application and Safety Control of Genetically Modified Crops, Southwest University, Beibei, Chongqing, China
|
15
|
The landscape and structural diversity of LTR retrotransposons in Musa genome. Mol Genet Genomics 2017; 292:1051-1067. [PMID: 28601922 DOI: 10.1007/s00438-017-1333-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2016] [Accepted: 06/07/2017] [Indexed: 10/19/2022]
Abstract
Long terminal repeat retrotransposons represent a major component of plant genomes and act as drivers of genome evolution and diversity. Musa is an important fruit crop and is also used as a starchy vegetable in many countries. BAC sequence analysis by dot plot was employed to investigate the LTR retrotransposons from Musa genomes. Fifty intact LTR retrotransposons from selected Musa BACs were identified by dot plot analysis, and further BLASTN searches retrieved 153 intact copies, 61 truncated copies, and a great number of partial copies/remnants from the GenBank database. LARD-like elements were also identified, with several copies dispersed among the Musa genotypes. The predominant elements were the LTR retrotransposons Copia and Gypsy, while Caulimoviridae (pararetrovirus) were rare in the Musa genome. PCR amplification of reverse transcriptase (RT) sequences revealed their abundance in almost all tested Musa accessions and their ancient nature before the divergence of Musa species. The phylogenetic analysis based on RT sequences of Musa and other retrotransposons clustered them into Gypsy, Caulimoviridae, and Copia lineages. Most of the Musa-related elements clustered in their respective groups, while some grouped with other elements, indicating homologous sequences. The present work will be helpful to understand the LTR retrotransposon landscape, giving a complete picture of the nature of the elements, their structural features, annotation, and evolutionary dynamics in the Musa genome.
|
16
|
Wichadakul D, Kobmoo N, Ingsriswang S, Tangphatsornruang S, Chantasingh D, Luangsa-ard JJ, Eurwilaichitr L. Insights from the genome of Ophiocordyceps polyrhachis-furcata to pathogenicity and host specificity in insect fungi. BMC Genomics 2015; 16:881. [PMID: 26511477 PMCID: PMC4625970 DOI: 10.1186/s12864-015-2101-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 07/13/2015] [Accepted: 10/16/2015] [Indexed: 01/19/2023] Open
Abstract
Background: Ophiocordyceps unilateralis is an outstanding insect fungus for its biology to manipulate host ants’ behavior and for its extreme host-specificity. Through the sequencing and annotation of Ophiocordyceps polyrhachis-furcata, a species in the O. unilateralis species complex specific to the ant Polyrhachis furcata, comparative analyses on genes involved in pathogenicity and virulence between this fungus and other fungi were undertaken in order to gain insights into its biology and the emergence of host specificity. Results: O. polyrhachis-furcata possesses various genes implicated in pathogenicity and virulence common with other fungi. Overall, this fungus possesses protein-coding genes similar to those found on other insect fungi with available genomic resources (Beauveria bassiana, Metarhizium robertsii (formerly classified as M. anisopliae s.l.), Metarhizium acridum, Cordyceps militaris, Ophiocordyceps sinensis). Comparative analyses in regard of the host ranges of insect fungi showed a tendency toward contractions of various gene families for narrow host-range species, including cuticle-degrading genes (proteases, carbohydrate esterases) and some families of pathogen-host interaction (PHI) genes. For many families of genes, O. polyrhachis-furcata had the least number of genes found; some genes commonly found in other insect fungi are even absent (e.g. Class 1 hydrophobin). However, there are expansions of genes involved in 1) the production of bacterial-like toxins in O. polyrhachis-furcata, compared with other entomopathogenic fungi, and 2) retrotransposable elements. Conclusions: The gain and loss of gene families helps us understand how fungal pathogenicity in insect hosts evolved. The loss of various genes involved throughout the pathogenesis for O. unilateralis would result in a reduced capacity to exploit larger ranges of hosts and therefore in the different level of host specificity, while the expansions of other gene families suggest an adaptation to particular environments with unexpected strategies like oral toxicity, through the production of bacterial-like toxins, or sophisticated mechanisms underlying pathogenicity through retrotransposons.
Affiliation(s)
- Duangdao Wichadakul
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
  - Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Floor 17th, Building 4, Payathai Rd., Wangmai, Pathumwan, 10330, Bangkok, Thailand
- Noppol Kobmoo
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
- Supawadee Ingsriswang
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
- Sithichoke Tangphatsornruang
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
- Duriya Chantasingh
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
- Janet Jennifer Luangsa-ard
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
- Lily Eurwilaichitr
  - National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency, 113 Thailand Science Park, Phahonyothin Rd., Khlong Neung, Khlong Luang, 12120, Pathum Thani, Thailand
|
17
|
Abstract
Only a few Candida species, e.g., Candida albicans, Candida glabrata, Candida dubliniensis, and Candida parapsilosis, are successful colonizers of a human host. Under certain circumstances these species can cause infections ranging from superficial to life-threatening disseminated candidiasis. The success of C. albicans, the most prevalent and best studied Candida species, as both commensal and human pathogen depends on its genetic, biochemical, and morphological flexibility, which facilitates adaptation to a wide range of host niches. In addition, biofilm formation provides further protection from adverse environmental conditions. Furthermore, in many host niches Candida cells coexist with members of the human microbiome. The resulting fungal-bacterial interactions have a major influence on the success of C. albicans as a commensal and also influence disease development and outcome. In this chapter, we review the current knowledge of important survival strategies of Candida spp., focusing on fundamental fitness and virulence traits of C. albicans.
Affiliation(s)
- Melanie Polke
  - Research Group Microbial Immunology, Hans-Knoell-Institute, Jena, Germany; Department Microbial Pathogenicity Mechanisms, Hans-Knoell-Institute, Jena, Germany
- Bernhard Hube
  - Department Microbial Pathogenicity Mechanisms, Hans-Knoell-Institute, Jena, Germany; Friedrich-Schiller-University, Jena, Germany; Center for Sepsis Control and Care, Jena University Hospital, Jena, Germany
- Ilse D Jacobsen
  - Research Group Microbial Immunology, Hans-Knoell-Institute, Jena, Germany; Friedrich-Schiller-University, Jena, Germany
|