1
Bambil D, Costa M, Alencar Figueiredo LFD. PmiR-Select® - a computational approach to plant pre-miRNA identification in genomes. Mol Genet Genomics 2025; 300:12. [PMID: 39751956 DOI: 10.1007/s00438-024-02221-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 12/21/2024] [Indexed: 01/04/2025]
Abstract
Precursors of microRNAs (pre-miRNAs) are rarely used in silico to mine miRNAs. This study developed PmiR-Select®, based on covariance models (CMs), to identify new pre-miRNAs by detecting conserved secondary structural features across RNA sequences and eliminating redundancy. The pipeline that preceded PmiR-Select® filtered 20% of plant pre-miRNAs (from 38589 to 8677) from miRBase. The second filter reduced pre-miRNAs by 7% (from 8677 to 8045) by applying length limits to pre-miRNAs (70-300 nt) and miRNAs (20-24 nt). The 80% redundancy threshold was statistically the best, eliminating 55% of pre-miRNAs (from 8045 to 3608). Angiosperms retained the highest number of pre-miRNAs and their families (2981 and 2202), followed by gymnosperms (362 and 271), bryophytes (183 and 119), and algae (82 and 78). Thirty-seven conserved pre-miRNA families occurred among land plant clades, but none were shared with algae. PmiR-Select® was applied to the rice genome, producing 8536 pre-miRNAs from 36 families. The 80% redundancy threshold retained 3% of pre-miRNAs (n = 264) from 36 families, which are valuable resources for experimental and computational research. Of the 8536 pre-miRNAs, 14% (n = 1216) were new pre-miRNAs from 19 new families in rice. Only 16 new sequences from six families overlapped (39 to 54% identities) with pre-miRNAs of rice and five other species on miRBase. The validation against mature miRNAs identified 8086 pre-miRNAs from 13 families. Eleven had already been recorded, but two new and abundant pre-miRNAs [miR437 (n = 296) and miR1435 (n = 725)] were scattered across all 12 rice chromosomes. PmiR-Select® identified pre-miRNAs, decreased the redundancy, and discovered new miRNAs. These findings pave the way to delineating benchtop and computational experiments.
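The length filter mentioned in this abstract (pre-miRNAs of 70-300 nt, mature miRNAs of 20-24 nt) can be illustrated with a short sketch. This is not the PmiR-Select® code: the tuple layout of the records and the names `passes_length_filter` and `filter_records` are assumptions made only for the example, and the bounds are treated as inclusive.

```python
# Illustrative sketch only; not the PmiR-Select(R) implementation.
# The bounds come from the abstract; the record layout is an assumption.
PRE_MIRNA_RANGE = (70, 300)  # nt, treated as inclusive
MATURE_RANGE = (20, 24)      # nt, treated as inclusive


def passes_length_filter(pre_seq: str, mature_seq: str) -> bool:
    """Return True when both sequences fall inside the stated length limits."""
    pre_ok = PRE_MIRNA_RANGE[0] <= len(pre_seq) <= PRE_MIRNA_RANGE[1]
    mature_ok = MATURE_RANGE[0] <= len(mature_seq) <= MATURE_RANGE[1]
    return pre_ok and mature_ok


def filter_records(records):
    """Keep (name, pre_seq, mature_seq) tuples that pass the length filter."""
    return [r for r in records if passes_length_filter(r[1], r[2])]
```

A record is dropped as soon as either sequence is out of range, which matches the idea of a single filtering step applied to both the precursor and the mature sequence.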
Affiliation(s)
- Deborah Bambil
- Department of Cell Biology, Biology Institute, University of Brasília (UnB), Brasília, DF, 70910-900, Brazil.
- Federal Institute of Brasília (IFB), Brasília, DF, 70830-450, Brazil.
- Department of Botany, Biology Institute, UnB, Brasília, DF, 70910-900, Brazil.
- Mirele Costa
- Department of Computation, UnB, Brasília, DF, 70910-900, Brazil
2
Wang L, Shi P, Ping Z, Huang Q, Jiang L, Ma N, Wang Q, Xu J, Zou Y, Huang Z. The golden genome annotation of Ganoderma lingzhi reveals a more complex scenario of eukaryotic gene structure and transcription activity. BMC Biol 2024; 22:271. [PMID: 39587587 PMCID: PMC11590231 DOI: 10.1186/s12915-024-02073-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2024] [Accepted: 11/18/2024] [Indexed: 11/27/2024] Open
Abstract
BACKGROUND It is generally accepted that nuclear genes in eukaryotes are located independently on chromosomes and expressed in a monocistronic manner. However, accumulating evidence suggests a more complex landscape of gene structure and transcription. Ganoderma lingzhi, a model medicinal fungus, currently lacks high-quality genome annotation, hindering genetic studies. RESULTS Here, we reported a golden annotation of G. lingzhi, featuring 14,147 high-confidence genes derived from extensive manual corrections. Novel characteristics of gene structure and transcription were identified accordingly. Notably, non-canonical splicing sites accounted for 1.99% of the whole genome, with the predominant types being GC-AG (1.85%), GT-AC (0.05%), and GT-GG (0.04%). A total of 1165 pairs of genes were found to have overlapping transcribed regions, 92.19% of which showed opposite directions of gene transcription. A total of 5,412,158 genetic variations were identified among 13 G. lingzhi strains, and the manually corrected gene sets resulted in enhanced functional annotation of these variations. More than 60% of G. lingzhi genes were alternatively spliced. In addition, we found that two or more protein-coding genes (PCGs) can be transcribed into a single RNA molecule, referred to as polycistronic genes. In total, 1272 polycistronic genes associated with 2815 PCGs were identified. CONCLUSIONS The widespread presence of polycistronic genes in G. lingzhi strongly complements the theory that polycistron is also present in eukaryotic genomes. The extraordinary gene structure and transcriptional activity uncovered through this golden annotation provide implications for the study of genes, genomes, and related studies in G. lingzhi and other eukaryotes.
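The splice-site types named in this abstract (GC-AG, GT-AC, GT-GG, against the canonical GT-AG) are read from the first and last two nucleotides of each intron. A minimal sketch of such a tally, assuming intron sequences are already available as strings, is given below; the function names are hypothetical and this is not the annotation pipeline used in the paper.

```python
from collections import Counter


def splice_site_type(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'."""
    seq = intron.upper().replace("U", "T")  # accept RNA or DNA letters
    if len(seq) < 4:
        raise ValueError("intron too short to contain both splice sites")
    return f"{seq[:2]}-{seq[-2:]}"


def tally_splice_sites(introns):
    """Count splice-site types and return (counts, non-canonical fraction)."""
    counts = Counter(splice_site_type(i) for i in introns)
    total = sum(counts.values())
    non_canonical = total - counts.get("GT-AG", 0)
    fraction = non_canonical / total if total else 0.0
    return counts, fraction
```

Here every pair other than GT-AG is counted as non-canonical, which is a simplifying assumption for the example.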
Affiliation(s)
- Lining Wang
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China
- Peiqi Shi
- The Second Clinical College, Guangzhou University of Chinese Medicine, Guangzhou, 510120, China
- Zhaohua Ping
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China
- Qinghua Huang
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China
- Liqun Jiang
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China
- Nianfang Ma
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China
- Qingfu Wang
- Guangdong Engineering Laboratory of Biomass Value-added Utilization, Guangdong Engineering Research & Development Center for Comprehensive Utilization of Plant Fiber, Guangzhou Key Laboratory for Comprehensive Utilization of Plant Fiber, Institute of Biological and Medical Engineering, Guangdong Academy of Sciences, Guangzhou, 510316, China.
- Jiang Xu
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, 100700, China.
- Yajie Zou
- Institute of Agricultural Resources and Regional Planning, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
- Zhihai Huang
- The Second Clinical College, Guangzhou University of Chinese Medicine, Guangzhou, 510120, China.
3
Du Y, Cao L, Wang S, Guo L, Tan L, Liu H, Feng Y, Wu W. Differences in alternative splicing and their potential underlying factors between animals and plants. J Adv Res 2024; 64:83-98. [PMID: 37981087 PMCID: PMC11464654 DOI: 10.1016/j.jare.2023.11.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/07/2023] [Revised: 08/16/2023] [Accepted: 11/14/2023] [Indexed: 11/21/2023] Open
Abstract
BACKGROUND Alternative splicing (AS), a posttranscriptional process, contributes to the complexity of transcripts from a limited number of genes in a genome, and AS is considered a great source of genetic and phenotypic diversity in eukaryotes. In animals, AS is tightly regulated during the processes of cell growth and differentiation, and its dysregulation is involved in many diseases, including cancers. Likewise, in plants, AS occurs in all stages of plant growth and development, and it seems to play important roles in the rapid reprogramming of genes in response to environmental stressors. To date, the prevalence and functional roles of AS have been extensively reviewed in animals and plants. However, AS differences between animals and plants, especially their underlying molecular mechanisms and impact factors, are anecdotal and rarely reviewed. AIM OF REVIEW This review aims to broaden our understanding of AS roles in a variety of biological processes and provide insights into the underlying mechanisms and impact factors likely leading to AS differences between animals and plants. KEY SCIENTIFIC CONCEPTS OF REVIEW We briefly summarize the roles of AS regulation in physiological and biochemical activities in animals and plants. Then, we underline the differences in the process of AS between plants and animals and especially analyze the potential impact factors, such as gene exon/intron architecture, 5'/3' untranslated regions (UTRs), spliceosome components, chromatin dynamics and transcription speeds, splicing factors [serine/arginine-rich (SR) proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs)], noncoding RNAs, and environmental stimuli, which might lead to the differences. Moreover, we compare the nonsense-mediated mRNA decay (NMD)-mediated turnover of the transcripts with a premature termination codon (PTC) in animals and plants. Finally, we summarize the current AS knowledge published in animals versus plants and discuss the potential development of disease therapies and superior crops in the future.
Affiliation(s)
- Yunfei Du
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Lu Cao
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Shuo Wang
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Liangyu Guo
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Lingling Tan
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Hua Liu
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China
- Ying Feng
- Key Laboratory of Nutrition, Metabolism and Food Safety, Shanghai Institute of Nutrition and Health (SINH), Chinese Academy of Sciences (CAS), Shanghai 200032, China.
- Wenwu Wu
- State Key Laboratory of Subtropical Silviculture, Zhejiang A&F University, Lin'an, 311300, Hangzhou, China.
4
Mikina W, Hałakuc P, Milanowski R. Transposon-derived introns as an element shaping the structure of eukaryotic genomes. Mob DNA 2024; 15:15. [PMID: 39068498 PMCID: PMC11282704 DOI: 10.1186/s13100-024-00325-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 07/23/2024] [Indexed: 07/30/2024] Open
Abstract
The widely accepted hypothesis postulates that the first spliceosomal introns originated from group II self-splicing introns. However, it is evident that not all spliceosomal introns in the nuclear genes of modern eukaryotes are inherited through vertical transfer of intronic sequences. Several phenomena contribute to the formation of new introns but their most common origin seems to be the insertion of transposable elements. Recent analyses have highlighted instances of mass gains of new introns from transposable elements. These events often coincide with an increase or change in the spliceosome's tolerance to splicing signals, including the acceptance of noncanonical borders. Widespread acquisitions of transposon-derived introns occur across diverse evolutionary lineages, indicating convergent processes. These events, though independent, likely require a similar set of conditions. These conditions include the presence of transposon elements with features enabling their removal at the RNA level as introns and/or the existence of a splicing mechanism capable of excising unusual sequences that would otherwise not be recognized as introns by standard splicing machinery. Herein we summarize those mechanisms across different eukaryotic lineages.
Affiliation(s)
- Weronika Mikina
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland
- Paweł Hałakuc
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland
- Rafał Milanowski
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland.
5
Alhusayni S, Roswanjaya YP, Rutten L, Huisman R, Bertram S, Sharma T, Schon M, Kohlen W, Klein J, Geurts R. A rare non-canonical splice site in Trema orientalis SYMRK does not affect its dual symbiotic functioning in endomycorrhiza and rhizobium nodulation. BMC Plant Biology 2023; 23:587. [PMID: 37996841 PMCID: PMC10668435 DOI: 10.1186/s12870-023-04594-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Accepted: 11/08/2023] [Indexed: 11/25/2023]
Abstract
BACKGROUND Nitrogen-fixing nodules occur in ten related taxonomic lineages interspersed with lineages of non-nodulating plant species. Nodules result from an endosymbiosis between plants and diazotrophic bacteria; rhizobia in the case of legumes and Parasponia and Frankia in the case of actinorhizal species. Nodulating plants share a conserved set of symbiosis genes, whereas related non-nodulating sister species show pseudogenization of several key nodulation-specific genes. Signalling and cellular mechanisms critical for nodulation have been co-opted from the more ancient plant-fungal arbuscular endomycorrhizal symbiosis. Studies in legumes and actinorhizal plants uncovered a key component in symbiotic signalling, the LRR-type SYMBIOSIS RECEPTOR KINASE (SYMRK). SYMRK is essential for nodulation and arbuscular endomycorrhizal symbiosis. To our surprise, however, despite its arbuscular endomycorrhizal symbiosis capacities, we observed a seemingly critical mutation in a donor splice site in the SYMRK gene of Trema orientalis, the non-nodulating sister species of Parasponia. This led us to investigate the symbiotic functioning of SYMRK in the Trema-Parasponia lineage and to address the question of to what extent a single nucleotide polymorphism in a donor splice site affects the symbiotic functioning of SYMRK. RESULTS We show that SYMRK is essential for nodulation and endomycorrhization in Parasponia andersonii. Subsequently, it is revealed that the 5'-intron donor splice site of SYMRK intron 12 is variable and, in most dicotyledon species, doesn't contain the canonical dinucleotide 'GT' signature but the much less common motif 'GC'. Strikingly, in T. orientalis, this motif is converted into a rare non-canonical 5'-intron donor splice site 'GA'. This SYMRK allele, however, is fully functional and spreads in the T. orientalis population of Malaysian Borneo. A further investigation into the occurrence of the non-canonical GA-AG splice sites confirmed that these are extremely rare. CONCLUSION SYMRK functioning is highly conserved in legumes, actinorhizal plants, and Parasponia. The gene possesses a non-common 5'-intron GC donor splice site in intron 12, which is converted into a GA in T. orientalis accessions of Malaysian Borneo. The discovery of this functional GA-AG splice site in SYMRK highlights a gap in our understanding of splice donor sites.
Affiliation(s)
- Sultan Alhusayni
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Biological Sciences Department, College of Science, King Faisal University, 31982, Al-Ahsa, Saudi Arabia
- Yuda Purwana Roswanjaya
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Research Centre for Applied Microbiology, National Research and Innovation Agency (BRIN), Cibinong, 16911, Indonesia
- Luuk Rutten
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Rik Huisman
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Simon Bertram
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Trupti Sharma
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Michael Schon
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Wouter Kohlen
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Joël Klein
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.
- Rene Geurts
- Laboratory of Molecular Biology, Cluster of Plant Development, Plant Science Group, Wageningen University, Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands.
6
Larue GE, Roy SW. Where the minor things are: a pan-eukaryotic survey suggests neutral processes may explain much of minor intron evolution. Nucleic Acids Res 2023; 51:10884-10908. [PMID: 37819006 PMCID: PMC10639083 DOI: 10.1093/nar/gkad797] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/24/2023] [Revised: 09/12/2023] [Accepted: 09/19/2023] [Indexed: 10/13/2023] Open
Abstract
Spliceosomal introns are gene segments removed from RNA transcripts by ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second 'minor' spliceosome is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying minor introns in over 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers in some fungi and green algae, repeated loss, as well as general biases in their positional and genic distributions. We estimate that ancestral minor intron densities were comparable to those of vertebrates, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, highly similar patterns of minor and major intron evolution contrast with both functionalist and deleterious model predictions. Second, observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, no association of intron splicing with cell proliferation in a minor intron-rich fungus suggests that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view of minor introns and their evolutionary history to date, and provide a foundation for future studies of these remarkable genetic elements.
Affiliation(s)
- Graham E Larue
- Quantitative and Systems Biology Graduate Program, University of California Merced, Merced, CA 95343, USA
- Scott W Roy
- Department of Molecular and Cell Biology, University of California Merced, Merced, CA 95343, USA
- Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
7
Le AV, Větrovský T, Barucic D, Saraiva JP, Dobbler PT, Kohout P, Pospíšek M, da Rocha UN, Kléma J, Baldrian P. Improved recovery and annotation of genes in metagenomes through the prediction of fungal introns. Mol Ecol Resour 2023; 23:1800-1811. [PMID: 37561110 DOI: 10.1111/1755-0998.13852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 06/27/2023] [Accepted: 07/31/2023] [Indexed: 08/11/2023]
Abstract
Metagenomics provides a tool to assess the functional potential of environmental and host-associated microbiomes based on the analysis of environmental DNA: assembly, gene prediction and annotation. While gene prediction is straightforward for most bacterial and archaeal taxa, it has limited applicability in the majority of eukaryotic organisms, including fungi that contain introns in gene coding sequences. As a consequence, eukaryotic genes are underrepresented in metagenomics datasets and our understanding of the contribution of fungi and other eukaryotes to microbiome functioning is limited. Here, we developed a machine intelligence-based algorithm that predicts fungal introns in environmental DNA with reasonable precision and used it to improve the annotation of environmental metagenomes. Intron removal increased the number of predicted genes by up to 9.1% and improved the annotation of several others. The proportion of newly predicted genes increased with the share of eukaryotic genes in the metagenome and, within fungal taxa, increased with the number of introns per gene. Our approach provides a tool named SVMmycointron for improved metagenome annotation, especially of microbiomes with a high proportion of eukaryotes. The scripts described in the paper are made publicly available and can be readily utilized by microbiome researchers analysing metagenomics data.
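The intron-removal step described here can be pictured as cutting predicted intron intervals out of a contig before gene prediction. The sketch below assumes the intron coordinates have already been predicted and are given as 0-based, half-open `(start, end)` pairs; the function name `remove_introns` is hypothetical and the code is not taken from SVMmycointron.

```python
def remove_introns(sequence: str, introns):
    """Return the sequence with the given [start, end) intervals cut out.

    Intervals are 0-based and half-open; they must not overlap.
    """
    kept = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or end > len(sequence) or start >= end:
            raise ValueError(f"invalid or overlapping interval: ({start}, {end})")
        kept.append(sequence[cursor:start])  # exon segment before this intron
        cursor = end
    kept.append(sequence[cursor:])  # trailing exon segment
    return "".join(kept)
```

The intron-free sequence could then be passed to a gene predictor designed for intronless genomes, which is the general idea the abstract describes.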
Affiliation(s)
- Anh Vu Le
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
- Tomáš Větrovský
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
- Denis Barucic
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
- Joao Pedro Saraiva
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
- Priscila Thiago Dobbler
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
- Petr Kohout
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
- Martin Pospíšek
- Department of Genetics and Microbiology, Charles University, Praha, Czech Republic
- Ulisses Nunes da Rocha
- Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany
- Jiří Kléma
- Department of Computer Science, Czech Technical University in Prague, Praha, Czech Republic
- Petr Baldrian
- Laboratory of Environmental Microbiology, Institute of Microbiology of the Czech Academy of Sciences, Praha, Czech Republic
8
Cheng W, Hong C, Zeng F, Liu N, Gao H. Sequence variations affect the 5' splice site selection of plant introns. Plant Physiology 2023; 193:1281-1296. [PMID: 37394939 DOI: 10.1093/plphys/kiad375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 05/31/2023] [Accepted: 06/04/2023] [Indexed: 07/04/2023]
Abstract
Introns are noncoding sequences spliced out of pre-mRNAs by the spliceosome to produce mature mRNAs. The 5' ends of introns mostly begin with GU and have a conserved sequence motif of AG/GUAAGU that could base-pair with the core sequence of U1 snRNA of the spliceosome. Intriguingly, ∼ 1% of introns in various eukaryotic species begin with GC. This occurrence could cause misannotation of genes; however, the underlying splicing mechanism is unclear. We analyzed the sequences around the intron 5' splice site (ss) in Arabidopsis (Arabidopsis thaliana) and found sequences at the GC intron ss are much more stringent than those of GT introns. Mutational analysis at various positions of the intron 5' ss revealed that although mutations impair base pairing, different mutations at the same site can have different effects, suggesting that steric hindrance also affects splicing. Moreover, mutations of 5' ss often activate a hidden ss nearby. Our data suggest that the 5' ss is selected via a competition between the major ss and the nearby minor ss. This work not only provides insights into the splicing mechanism of intron 5' ss but also improves the accuracy of gene annotation and the study of the evolution of intron 5' ss.
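The AG/GUAAGU motif mentioned in this abstract spans the last two exonic nucleotides and the first six intronic nucleotides of a 5' splice site. As a rough illustration, the sketch below counts how many of those eight positions match the consensus and reports whether the intron starts with GU or GC. It is a plain match count under that assumed 8-nt window, not a base-pairing energy model and not the analysis performed by the authors; both function names are hypothetical.

```python
# Consensus from the abstract: exon positions -2..-1, then intron positions +1..+6.
CONSENSUS = "AGGUAAGU"


def consensus_matches(five_prime_ss: str) -> int:
    """Count positions of an 8-nt 5' splice site window that match the consensus."""
    seq = five_prime_ss.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(seq) != len(CONSENSUS):
        raise ValueError("expected 8 nt: 2 exonic followed by 6 intronic")
    return sum(a == b for a, b in zip(seq, CONSENSUS))


def donor_class(five_prime_ss: str) -> str:
    """Return the first two intronic nucleotides, e.g. 'GU' or 'GC'."""
    return five_prime_ss.upper().replace("T", "U")[2:4]
```

Comparing match counts between GU and GC introns is one simple way to see the kind of stringency difference the abstract reports, although a real analysis would weight positions rather than count them equally.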
Affiliation(s)
- Wenzhen Cheng
- National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Conghao Hong
- National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Fang Zeng
- National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Nan Liu
- National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Hongbo Gao
- National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
9
Baker L, David C, Jacobs DJ. Ab initio gene prediction for protein-coding regions. Bioinformatics Advances 2023; 3:vbad105. [PMID: 37638212 PMCID: PMC10448985 DOI: 10.1093/bioadv/vbad105] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2023] [Revised: 07/04/2023] [Accepted: 08/08/2023] [Indexed: 08/29/2023]
Abstract
MOTIVATION Ab initio gene prediction in nonmodel organisms is a difficult task. While many ab initio methods have been developed, their average accuracy over long segments of a genome, and especially when assessed over a wide range of species, generally yields results with sensitivity and specificity levels in the low 60% range. A common weakness of most methods is the tendency to learn patterns that are species-specific to varying degrees. The need exists for methods to extract genetic features that can distinguish coding and noncoding regions that are not sensitive to specific organism characteristics. RESULTS A new method based on a neural network (NN) that uses a collection of sensors to create input features is presented. It is shown that accurate predictions are achieved even when trained on organisms that are significantly different phylogenetically than test organisms. A consensus prediction algorithm for a CoDing Sequence (CDS) is subsequently applied to the first nucleotide level of NN predictions that boosts accuracy through a data-driven procedure that optimizes a CDS/non-CDS threshold. An aggregate accuracy benchmark at the nucleotide level shows that this new approach performs better than existing ab initio methods, while requiring significantly less training data. AVAILABILITY AND IMPLEMENTATION https://github.com/BioMolecularPhysicsGroup-UNCC/MachineLearning.
Affiliation(s)
- Lonnie Baker
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, NC 28223, United States
- Charles David
- Department of Bioinformatics, The New Zealand Institute for Plant and Food Research, Lincoln 7608, New Zealand
- Donald J Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, NC 28223, United States
- UNC Charlotte School of Data Science, University of North Carolina at Charlotte, NC 28223, United States
10
Langlands-Perry C, Pitarch A, Lapalu N, Cuenin M, Bergez C, Noly A, Amezrou R, Gélisse S, Barrachina C, Parrinello H, Suffert F, Valade R, Marcel TC. Quantitative and qualitative plant-pathogen interactions call upon similar pathogenicity genes with a spectrum of effects. Frontiers in Plant Science 2023; 14:1128546. [PMID: 37235026 PMCID: PMC10206311 DOI: 10.3389/fpls.2023.1128546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/20/2022] [Accepted: 04/19/2023] [Indexed: 05/28/2023]
Abstract
Septoria leaf blotch is a foliar wheat disease controlled by a combination of plant genetic resistances and fungicide use. R-gene-based qualitative resistance durability is limited due to gene-for-gene interactions with fungal avirulence (Avr) genes. Quantitative resistance is considered more durable but the mechanisms involved are not well documented. We hypothesize that genes involved in quantitative and qualitative plant-pathogen interactions are similar. A bi-parental population of Zymoseptoria tritici was inoculated on wheat cultivar 'Renan' and a linkage analysis was performed to map QTL. Three pathogenicity QTL, Qzt-I05-1, Qzt-I05-6 and Qzt-I07-13, were mapped on chromosomes 1, 6 and 13 in Z. tritici, and a candidate pathogenicity gene on chromosome 6 was selected based on its effector-like characteristics. The candidate gene was cloned by Agrobacterium tumefaciens-mediated transformation, and a pathology test assessed the effect of the mutant strains on 'Renan'. This gene was demonstrated to be involved in quantitative pathogenicity. By cloning a newly annotated quantitative-effect gene in Z. tritici that is effector-like, we demonstrated that genes underlying pathogenicity QTL can be similar to Avr genes. This opens up the previously probed possibility that 'gene-for-gene' underlies not only qualitative but also quantitative plant-pathogen interactions in this pathosystem.
Affiliation(s)
- Camilla Langlands-Perry
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
- ARVALIS Institut du Végétal, Boigneville, France
- Anaïs Pitarch
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
- Nicolas Lapalu
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
| | - Murielle Cuenin
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
| | | | - Alicia Noly
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
| | - Reda Amezrou
- Université Paris-Saclay, INRAE, UR BIOGER, Palaiseau, France
| | | | - Célia Barrachina
- MGX-Montpellier GenomiX, Univ. Montpellier, CNRS, INSERM, Montpellier, France
| | - Hugues Parrinello
- MGX-Montpellier GenomiX, Univ. Montpellier, CNRS, INSERM, Montpellier, France
|
11
|
Valach M, Moreira S, Petitjean C, Benz C, Butenko A, Flegontova O, Nenarokova A, Prokopchuk G, Batstone T, Lapébie P, Lemogo L, Sarrasin M, Stretenowich P, Tripathi P, Yazaki E, Nara T, Henrissat B, Lang BF, Gray MW, Williams TA, Lukeš J, Burger G. Recent expansion of metabolic versatility in Diplonema papillatum, the model species of a highly speciose group of marine eukaryotes. BMC Biol 2023; 21:99. [PMID: 37143068 PMCID: PMC10161547 DOI: 10.1186/s12915-023-01563-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2022] [Accepted: 03/10/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND Diplonemid flagellates are among the most abundant and species-rich of known marine microeukaryotes, colonizing all habitats, depths, and geographic regions of the world ocean. However, little is known about their genomes, biology, and ecological role. RESULTS We present the first nuclear genome sequence from a diplonemid, the type species Diplonema papillatum. The ~ 280-Mb genome assembly contains about 32,000 protein-coding genes, likely co-transcribed in groups of up to 100. Gene clusters are separated by long repetitive regions that include numerous transposable elements, which also reside within introns. Analysis of gene-family evolution reveals that the last common diplonemid ancestor underwent considerable metabolic expansion. D. papillatum-specific gains of carbohydrate-degradation capability were apparently acquired via horizontal gene transfer. The predicted breakdown of polysaccharides including pectin and xylan is at odds with reports of peptides being the predominant carbon source of this organism. Secretome analysis together with feeding experiments suggest that D. papillatum is predatory, able to degrade cell walls of live microeukaryotes, macroalgae, and water plants, not only for protoplast feeding but also for metabolizing cell-wall carbohydrates as an energy source. The analysis of environmental barcode samples shows that D. papillatum is confined to temperate coastal waters, presumably acting in bioremediation of eutrophication. CONCLUSIONS Nuclear genome information will allow systematic functional and cell-biology studies in D. papillatum. It will also serve as a reference for the highly diverse diplonemids and provide a point of comparison for studying gene complement evolution in the sister group of Kinetoplastida, including human-pathogenic taxa.
Affiliation(s)
- Matus Valach
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada.
| | - Sandrine Moreira
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada
| | - Celine Petitjean
- School of Biological Sciences, University of Bristol, Bristol, UK
| | - Corinna Benz
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
| | - Anzhelika Butenko
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
- Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Olga Flegontova
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of Ostrava, Ostrava, Czech Republic
| | - Anna Nenarokova
- School of Biological Sciences, University of Bristol, Bristol, UK
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
| | - Galina Prokopchuk
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Tom Batstone
- School of Biological Sciences, University of Bristol, Bristol, UK
- Present address: High Performance Computing Centre, Bristol, UK
| | - Pascal Lapébie
- Architecture et Fonction des Macromolécules Biologiques (AFMB), CNRS, Aix Marseille Université, Marseille, France
| | - Lionnel Lemogo
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada
- Present address: Environment Climate Change Canada, Dorval, QC, Canada
| | - Matt Sarrasin
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada
| | - Paul Stretenowich
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada
- Present address: Canadian Centre for Computational Genomics; McGill Genome Centre, McGill University, Montreal, QC, Canada
| | - Pragya Tripathi
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Euki Yazaki
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Hirosawa, Wako, Saitama, Japan
| | - Takeshi Nara
- Laboratory of Molecular Parasitology, Graduate School of Life Science and Technology, Iryo Sosei University, Iwaki City, Fukushima, Japan
| | - Bernard Henrissat
- Architecture et Fonction des Macromolécules Biologiques (AFMB), CNRS, Aix Marseille Université, Marseille, France
- Present address: DTU Bioengineering, Technical University of Denmark, Lyngby, Denmark
- Department of Biological Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
| | - B Franz Lang
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada
| | - Michael W Gray
- Department of Biochemistry and Molecular Biology, Institute for Comparative Genomics, Dalhousie University, Halifax, NS, Canada
| | - Tom A Williams
- School of Biological Sciences, University of Bristol, Bristol, UK
| | - Julius Lukeš
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Gertraud Burger
- Department of Biochemistry, Robert-Cedergren Centre for Bioinformatics and Genomics, Université de Montréal, Montreal, QC, Canada.
|
12
|
Chu G, Li P, Zhao Q, He R, Zhao Y. Mutation spectrum of Kallmann syndrome: identification of five novel mutations across ANOS1 and FGFR1. Reprod Biol Endocrinol 2023; 21:23. [PMID: 36859276 PMCID: PMC9976430 DOI: 10.1186/s12958-023-01074-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/25/2022] [Accepted: 02/14/2023] [Indexed: 03/03/2023] Open
Abstract
BACKGROUND Kallmann syndrome (KS) is a common type of idiopathic hypogonadotropic hypogonadism. To date, more than 30 genes including ANOS1 and FGFR1 have been identified in different genetic models of KS without affirmatory genotype-phenotype correlation, and novel mutations have been found. METHODS A total of 35 unrelated patients with clinical features of disorder of sex development were recruited. Custom-panel sequencing or whole-exome sequencing was performed to detect the pathogenic mutations. Sanger sequencing was performed to verify single-nucleotide variants. Copy number variation-sequencing (CNV-seq) was performed to determine CNVs. The pathogenicity of the identified variant was predicted in silico. mRNA transcript analysis and minigene reporter assay were performed to test the effect of the mutation on splicing. RESULTS ANOS1 gene c.709T>A and c.711G>T were evaluated as pathogenic by several commonly used software tools, and c.1063-2A>T was verified by transcriptional splicing assay. The c.1063-2A>T mutation activated a cryptic splice acceptor site downstream of the original splice acceptor site and resulted in aberrant splicing of the 24 base pairs at the 5' end of exon 8, yielding a new transcript with a c.1063-1086 deletion. FGFR1 gene c.1835delA was assessed as pathogenic according to the ACMG guideline. The CNV of del(8)(p12p11.22)chr8:g.36140000_38460000del was judged as pathogenic according to the ACMG & ClinGen technical standards. CONCLUSIONS Herein, we identified three novel ANOS1 mutations and two novel FGFR1 variations in Chinese KS families. In silico prediction and functional experiment evaluated the pathogenesis of ANOS1 mutations. FGFR1 c.1835delA mutation and del(8)(p12p11.22)chr8:g.36140000_38460000del were assessed as pathogenic variations. Therefore, our study expands the spectrum of mutations associated with KS and provides diagnostic evidence for patients who carry the same mutation in the future.
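As a quick check on the coordinates quoted in this abstract, the sizes of the two deletions follow from treating each range as inclusive. The Python sketch below is only illustrative arithmetic; the helper name `inclusive_span` is hypothetical and does not come from the paper.

```python
def inclusive_span(start: int, end: int) -> int:
    """Return the number of positions in the inclusive range start..end."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Deletion reported in the aberrant ANOS1 transcript (c.1063 through c.1086).
exon8_deleted = inclusive_span(1063, 1086)

# Reported CNV: chr8:g.36140000_38460000del.
cnv_deleted = inclusive_span(36_140_000, 38_460_000)

print(exon8_deleted)        # 24, matching the 24 base pairs stated in the abstract
print(exon8_deleted % 3)    # 0, i.e. a whole number of codons
print(cnv_deleted)          # 2320001 positions, roughly 2.32 Mb
```

Because 24 is a multiple of three, the transcript-level deletion would be expected to remove whole codons rather than shift the reading frame, although the abstract itself does not state the protein-level consequence.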
Affiliation(s)
- Guoming Chu
- Department of Clinical Genetics, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
| | - Pingping Li
- Center of Reproductive Medicine, Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
| | - Qian Zhao
- Department of Pediatric Urology, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
| | - Rong He
- Department of Clinical Genetics, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China
| | - Yanyan Zhao
- Department of Clinical Genetics, Shengjing Hospital of China Medical University, Shenyang, 110004, Liaoning, China.
|
13
|
Kunz L, Sotiropoulos AG, Graf J, Razavi M, Keller B, Müller MC. The broad use of the Pm8 resistance gene in wheat resulted in hypermutation of the AvrPm8 gene in the powdery mildew pathogen. BMC Biol 2023; 21:29. [PMID: 36755285 PMCID: PMC9909948 DOI: 10.1186/s12915-023-01513-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2022] [Accepted: 01/11/2023] [Indexed: 02/10/2023] Open
Abstract
BACKGROUND Worldwide wheat production is under constant threat by fast-evolving fungal pathogens. In the last decades, wheat breeding for disease resistance heavily relied on the introgression of chromosomal segments from related species as genetic sources of new resistance. The Pm8 resistance gene against the powdery mildew disease has been introgressed from rye into wheat as part of a large 1BL.1RS chromosomal translocation encompassing multiple disease resistance genes and yield components. Due to its high agronomic value, this translocation has seen continuous global use since the 1960s on large growth areas, even after Pm8 resistance was overcome by the powdery mildew pathogen. The long-term use of Pm8 at a global scale provided the unique opportunity to study the consequences of such extensive resistance gene application on pathogen evolution. RESULTS Using genome-wide association studies in a population of wheat mildew isolates, we identified the avirulence effector AvrPm8 specifically recognized by Pm8. Haplovariant mining in a global mildew population covering all major wheat growing areas of the world revealed 17 virulent haplotypes of the AvrPm8 gene that grouped into two functional categories. The first one comprised amino acid polymorphisms at a single position along the AvrPm8 protein, which we confirmed to be crucial for the recognition by Pm8. The second category consisted of numerous destructive mutations to the AvrPm8 open reading frame such as disruptions of the start codon, gene truncations, gene deletions, and interference with mRNA splicing. With the exception of a single, likely ancient, gain-of-virulence mutation found in mildew isolates around the world, all AvrPm8 virulence haplotypes were found in geographically restricted regions, indicating that they occurred recently as a consequence of the frequent Pm8 use. 
CONCLUSIONS In this study, we show that the broad and prolonged use of the Pm8 gene in wheat production worldwide resulted in a multitude of gain-of-virulence mechanisms affecting the AvrPm8 gene in the wheat powdery mildew pathogen. Based on our findings, we conclude that both standing genetic variation as well as locally occurring new mutations contributed to the global breakdown of the Pm8 resistance gene introgression.
Affiliation(s)
- Lukas Kunz
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Alexandros G. Sotiropoulos
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Johannes Graf
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
| | - Mohammad Razavi
- Iranian Research Institute of Plant Protection, Agricultural Research, Education and Extension Organization, Tehran, Iran
| | - Beat Keller
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland.
| | - Marion C. Müller
- Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
|
14
|
Parakkunnel R, Naik K B, Vanishree G, C S, Purru S, Bhaskar K U, Bhat KV, Kumar S. Gene fusions, micro-exons and splice variants define stress signaling by AP2/ERF and WRKY transcription factors in the sesame pan-genome. FRONTIERS IN PLANT SCIENCE 2022; 13:1076229. [PMID: 36618639 PMCID: PMC9817154 DOI: 10.3389/fpls.2022.1076229] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/21/2022] [Accepted: 12/02/2022] [Indexed: 06/17/2023]
Abstract
Evolutionary dynamics of AP2/ERF and WRKY genes, the major components of defense response were studied extensively in the sesame pan-genome. Massive variation was observed for gene copy numbers, genome location, domain structure, exon-intron structure and protein parameters. In the pan-genome, 63% of AP2/ERF members were devoid of introns whereas >99% of WRKY genes contained multiple introns. AP2 subfamily was found to be micro-exon rich with the adjoining intronic sequences sharing sequence similarity to many stress-responsive and fatty acid metabolism genes. WRKY family included extensive multi-domain gene fusions where the additional domains significantly enhanced gene and exonic sizes as well as gene copy numbers. The fusion genes were found to have roles in acquired immunity, stress response, cell and membrane integrity as well as ROS signaling. The individual genomes shared extensive synteny and collinearity although ecological adaptation was evident among the Chinese and Indian accessions. Significant positive selection effects were noticed for both micro-exon and multi-domain genes. Splice variants with changes in acceptor, donor and branch sites were common and 6-7 splice variants were detected per gene. The study ascertained vital roles of lipid metabolism and chlorophyll biosynthesis in the defense response and stress signaling pathways. 60% of the studied genes localized in the nucleus while 20% preferred chloroplast. Unique cis-element distribution was noticed in the upstream promoter region with MYB and STRE in WRKY genes while MYC was present in the AP2/ERF genes. Intron-less genes exhibited great diversity in the promoter sequences wherein the predominance of dosage effect indicated variable gene expression levels. 
Mimicking the NBS-LRR genes, a chloroplast localized WRKY gene, Swetha_24868, with additional domains of chorismate mutase, cAMP and voltage-dependent potassium channel was found to act as a master regulator of defense signaling, triggering immunity and reducing ROS levels.
Affiliation(s)
- Ramya Parakkunnel
- ICAR- Indian Institute of Seed Science, Regional Station, Gandhi Krishi Vigyana Kendra (GKVK) Campus, Bengaluru, India
| | - Bhojaraja Naik K
- ICAR- Indian Institute of Seed Science, Regional Station, Gandhi Krishi Vigyana Kendra (GKVK) Campus, Bengaluru, India
| | - Girimalla Vanishree
- ICAR- Indian Institute of Seed Science, Regional Station, Gandhi Krishi Vigyana Kendra (GKVK) Campus, Bengaluru, India
| | - Susmita C
- ICAR- Indian Institute of Seed Science, Mau, Uttar Pradesh, India
| | - Supriya Purru
- ICAR- National Academy of Agricultural Research Management, Hyderabad, Telengana, India
| | - Udaya Bhaskar K
- ICAR- Indian Institute of Seed Science, Regional Station, Gandhi Krishi Vigyana Kendra (GKVK) Campus, Bengaluru, India
| | - KV. Bhat
- Division of Genomic Resources, ICAR- National Bureau of Plant Genetic Resources, New Delhi, India
| | - Sanjay Kumar
- ICAR- Indian Institute of Seed Science, Mau, Uttar Pradesh, India
|
15
|
Lin Z, Xu K, Cai G, Liu Y, Li Y, Zhang Z, Nielsen J, Shi S, Liu Z. Characterization of cross-species transcription and splicing from Penicillium to Saccharomyces cerevisiae. J Ind Microbiol Biotechnol 2021; 48:kuab054. [PMID: 34387324 PMCID: PMC8788760 DOI: 10.1093/jimb/kuab054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2021] [Accepted: 08/04/2021] [Indexed: 11/14/2022]
Abstract
Heterologous expression of eukaryotic gene clusters in yeast has been widely used for producing high-value chemicals and bioactive secondary metabolites. However, eukaryotic transcription cis-elements are still undercharacterized, and the cross-species expression mechanism remains poorly understood. Here we used the whole expression unit (including original promoter, terminator, and open reading frame with introns) of orotidine 5'-monophosphate decarboxylases from 14 Penicillium species as a showcase, and analyzed their cross-species expression in Saccharomyces cerevisiae. We found that pyrG promoters from the Penicillium species could drive URA3 expression in yeast, and that inefficient cross-species splicing of Penicillium introns might result in weak cross-species expression. Thus, this study demonstrates cross-species expression from Penicillium to yeast, and sheds light on the opportunities and challenges of cross-species expression of fungal expression units and gene clusters in yeast without refactoring for novel natural product discovery.
Affiliation(s)
- Zhenquan Lin
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Kang Xu
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Guang Cai
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Yangqingxue Liu
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Yi Li
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Zhihao Zhang
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Jens Nielsen
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
- Department of Biology and Biological Engineering, Chalmers University of Technology, SE-412 96 Gothenburg, Sweden
- BioInnovation Institute, Ole Maaløes Vej 3, DK 2200 Copenhagen N, Denmark
| | - Shuobo Shi
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
| | - Zihe Liu
- College of Life Science and Technology, Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, 100029 Beijing, China
|
16
|
Scalzitti N, Kress A, Orhand R, Weber T, Moulinier L, Jeannin-Girardon A, Collet P, Poch O, Thompson JD. Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics 2021; 22:561. [PMID: 34814826 PMCID: PMC8609763 DOI: 10.1186/s12859-021-04471-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2021] [Accepted: 11/09/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Ab initio prediction of splice sites is an essential step in eukaryotic genome annotation. Recent predictors have exploited Deep Learning algorithms and reliable gene structures from model organisms. However, Deep Learning methods for non-model organisms are lacking. RESULTS We developed Spliceator to predict splice sites in a wide range of species, including model and non-model organisms. Spliceator uses a convolutional neural network and is trained on carefully validated data from over 100 organisms. We show that Spliceator achieves consistently high accuracy (89–92%) compared to existing methods on independent benchmarks from human, fish, fly, worm, plant and protist organisms. CONCLUSIONS Spliceator is a new Deep Learning method trained on high-quality data, which can be used to predict splice sites in diverse organisms, ranging from human to protists, with consistently high accuracy. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1186/s12859-021-04471-3.
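To make the idea of feeding sequence to a convolutional network more concrete, here is a minimal Python sketch of a typical preprocessing step: locating canonical GT (donor) and AG (acceptor) dinucleotides and one-hot encoding a window around each. This is not Spliceator's code; the function names, the flank size, and the encoding are all assumptions chosen for illustration, and no network is trained here.

```python
ALPHABET = "ACGT"


def one_hot(seq: str):
    """Encode DNA as rows of [A, C, G, T]; unknown bases become all zeros."""
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def candidate_windows(seq: str, flank: int = 5):
    """Yield (site_type, position, window) for canonical GT/AG dinucleotides.

    Only sites with `flank` bases available on both sides are reported.
    """
    seq = seq.upper()
    for i in range(flank, len(seq) - flank - 1):
        dinuc = seq[i:i + 2]
        if dinuc == "GT":
            yield "donor", i, seq[i - flank:i + 2 + flank]
        elif dinuc == "AG":
            yield "acceptor", i, seq[i - flank:i + 2 + flank]


# Toy sequence (hypothetical, not from any benchmark).
toy = "AAAAAGTAAAAA"
sites = list(candidate_windows(toy, flank=3))
for site_type, pos, window in sites:
    print(site_type, pos, window, len(one_hot(window)))
```

Each encoded window is a fixed-size matrix (window length by four channels), which is the kind of input a one-dimensional convolutional classifier could score as a true or false splice site.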
Affiliation(s)
- Nicolas Scalzitti
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Arnaud Kress
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- BiGEst-ICube Platform, ICube Laboratory, UMR7357, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Romain Orhand
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Thomas Weber
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Luc Moulinier
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
- BiGEst-ICube Platform, ICube Laboratory, UMR7357, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Anne Jeannin-Girardon
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Pierre Collet
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Olivier Poch
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France
| | - Julie D Thompson
- Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France.
|
17
|
In Silico Study of the RSH (RelA/SpoT Homologs) Gene Family and Expression Analysis in Response to PGPR Bacteria and Salinity in Brassica napus. Int J Mol Sci 2021; 22:ijms221910666. [PMID: 34639007 PMCID: PMC8509286 DOI: 10.3390/ijms221910666] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2021] [Revised: 09/21/2021] [Accepted: 09/28/2021] [Indexed: 12/21/2022] Open
Abstract
Among several mechanisms involved in the plant stress response, synthesis of guanosine tetra- and pentaphosphates (alarmones), homologous to the bacterial stringent response, is of crucial importance. Plant alarmones affect, among others, photosynthetic activity, metabolite accumulation, and nutrient remobilization, and thus regulate plant growth and development. The plant RSH (RelA/SpoT homolog) genes, which encode synthetases and/or hydrolases of alarmones, have been characterized in a limited number of plant species, e.g., Arabidopsis thaliana, Oryza sativa, and Ipomoea nil. Here, we used dry-to-wet laboratory research approaches to characterize RSH family genes in the polyploid plant Brassica napus. There are 12 RSH genes in the genome of rapeseed that belong to four types of RSH genes: 6 RSH1, 2 RSH2, 3 RSH3, and 1 CRSH. BnRSH genes contain 13-24 introns in RSH1, 2-6 introns in RSH2, 1-6 introns in RSH3, and 2-3 introns in the CRSH genes. In the promoter regions of the RSH genes, we showed the presence of regulatory elements of the response to light, plant hormones, plant development, and abiotic and biotic stresses. The wet-lab analysis showed that expression of BnRSH genes is generally not significantly affected by salt stress, but that the presence of PGPR bacteria, mostly of Serratia sp., increased the expression of BnRSH significantly. The obtained results show that BnRSH genes are differently affected by biotic and abiotic factors, which indicates their different functions in plants.
|
18
|
G-Quadruplex in Gene Encoding Large Subunit of Plant RNA Polymerase II: A Billion-Year-Old Story. Int J Mol Sci 2021; 22:ijms22147381. [PMID: 34299001 PMCID: PMC8306923 DOI: 10.3390/ijms22147381] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Revised: 06/24/2021] [Accepted: 07/05/2021] [Indexed: 12/12/2022] Open
Abstract
G-quadruplexes have long been perceived as rare and physiologically unimportant nucleic acid structures. However, several studies have revealed their importance in molecular processes, suggesting their possible role in replication and gene expression regulation. Pathways involving G-quadruplexes are intensively studied, especially in the context of human diseases, while their involvement in gene expression regulation in plants remains largely unexplored. Here, we conducted a bioinformatic study and performed a complex circular dichroism measurement to identify a stable G-quadruplex in the gene RPB1, coding for the RNA polymerase II large subunit. We found that this G-quadruplex-forming locus is highly evolutionarily conserved amongst plants sensu lato (Archaeplastida) that share a common ancestor more than one billion years old. Finally, we discussed a new hypothesis regarding G-quadruplexes interacting with UV light in plants to potentially form an additional layer of the regulatory network.
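The abstract does not say how candidate loci were located, so the following is only a sketch of one widely used heuristic: scanning for four runs of at least three guanines separated by loops of one to seven nucleotides. The pattern, the function name `find_g4_candidates`, and the toy sequence are assumptions for illustration, not the authors' pipeline; a match marks a sequence that could form a G-quadruplex, and stability would still need experimental confirmation such as the circular dichroism measurements mentioned above.

```python
import re

# Canonical motif: G-run, then three repeats of (loop of 1-7 nt + G-run).
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")


def find_g4_candidates(seq: str):
    """Return (start, end, motif) for non-overlapping matches on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


# Toy sequence (hypothetical, not taken from RPB1).
example = "ATGGGTAGGGCTGGGAAGGGTC"
hits = find_g4_candidates(example)
print(hits)
```

Only the given strand is scanned; a fuller search would also examine the reverse complement (equivalently, C-rich runs) and might allow longer loops.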
|
19
|
Dikaya V, El Arbi N, Rojas-Murcia N, Nardeli SM, Goretti D, Schmid M. Insights into the role of alternative splicing in plant temperature response. JOURNAL OF EXPERIMENTAL BOTANY 2021:erab234. [PMID: 34105719 DOI: 10.1093/jxb/erab234] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Indexed: 05/21/2023]
Abstract
Alternative splicing occurs in all eukaryotic organisms. Since the first description of multiexon genes and the splicing machinery, the field has expanded rapidly, especially in animals and yeast. However, our knowledge about splicing in plants is still quite fragmented. Though eukaryotes show some similarity in the composition and dynamics of the splicing machinery, observations of unique plant traits are only starting to emerge. For instance, plant alternative splicing is closely linked to their ability to perceive various environmental stimuli. Due to their sessile lifestyle, temperature is a central source of information allowing plants to adjust their development to match current growth conditions. Hence, seasonal temperature fluctuations and day-night cycles can strongly influence plant morphology across developmental stages. Here we discuss the available data about temperature-dependent alternative splicing in plants. Given its fragmented state it is not always possible to fit specific observations into a coherent picture, yet it is sufficient to estimate the complexity of this field and the need for further research. Better understanding of alternative splicing as a part of plant temperature response and adaptation may also prove to be a powerful tool for both fundamental and applied sciences.
Affiliation(s)
- Varvara Dikaya
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Nabila El Arbi
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Nelson Rojas-Murcia
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Sarah Muniz Nardeli
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Daniela Goretti
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Markus Schmid
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
- Beijing Advanced Innovation Centre for Tree Breeding by Molecular Design, Beijing Forestry University, Beijing, People's Republic of China
|
20
|
Mahadani P, Hazra A. Expression and splicing dynamics of WRKY family genes along physiological exigencies of tea plant (Camellia sinensis). Biologia (Bratisl) 2021. [DOI: 10.1007/s11756-021-00784-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
|
21
|
Farhat S, Le P, Kayal E, Noel B, Bigeard E, Corre E, Maumus F, Florent I, Alberti A, Aury JM, Barbeyron T, Cai R, Da Silva C, Istace B, Labadie K, Marie D, Mercier J, Rukwavu T, Szymczak J, Tonon T, Alves-de-Souza C, Rouzé P, Van de Peer Y, Wincker P, Rombauts S, Porcel BM, Guillou L. Rapid protein evolution, organellar reductions, and invasive intronic elements in the marine aerobic parasite dinoflagellate Amoebophrya spp. BMC Biol 2021; 19:1. [PMID: 33407428 PMCID: PMC7789003 DOI: 10.1186/s12915-020-00927-9] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2020] [Accepted: 11/12/2020] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Dinoflagellates are aquatic protists particularly widespread in the oceans worldwide. Some are responsible for toxic blooms while others live in symbiotic relationships, either as mutualistic symbionts in corals or as parasites infecting other protists and animals. Dinoflagellates harbor atypically large genomes (~ 3 to 250 Gb), with gene organization and gene expression patterns very different from those of closely related apicomplexan parasites. Here we sequenced and analyzed the genomes of two early-diverging and co-occurring parasitic dinoflagellate Amoebophrya strains, to shed light on the emergence of such atypical genomic features, dinoflagellate evolution, and host specialization. RESULTS We sequenced, assembled, and annotated high-quality genomes for two Amoebophrya strains (A25 and A120), using a combination of Illumina paired-end short-read and Oxford Nanopore Technology (ONT) MinION long-read sequencing approaches. We found that a small number of transposable elements, along with short introns and intergenic regions, and a limited number of gene families, together contribute to the compactness of the Amoebophrya genomes, a feature potentially linked with parasitism. While the majority of Amoebophrya proteins (63.7% of A25 and 59.3% of A120) had no functional assignment, we found many orthologs shared with Dinophyceae. Our analyses revealed a strong tendency for genes to be encoded in unidirectional clusters and high levels of synteny conservation between the two genomes despite low interspecific protein sequence similarity, suggesting rapid protein evolution. Most strikingly, we identified a large portion of non-canonical introns, including repeated introns, displaying a broad variability of associated splicing motifs never observed among eukaryotes. Those introner elements appear to have the capacity to spread over their respective genomes in a manner similar to transposable elements. Finally, we confirmed the reduction of organelles observed in Amoebophrya spp., i.e., loss of the plastid, potential loss of a mitochondrial genome and functions. CONCLUSION These results expand the range of atypical genome features found in basal dinoflagellates and raise questions regarding speciation and the evolutionary mechanisms at play while parasitism was selected for in this particular unicellular lineage.
Affiliation(s)
- Sarah Farhat
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
- School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, New York, 11794, USA
| | - Phuong Le
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Ehsan Kayal
- Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
| | - Benjamin Noel
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Estelle Bigeard
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Erwan Corre
- Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
| | - Florian Maumus
- URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
| | - Isabelle Florent
- Unité Molécules de Communication et Adaptation des Microorganismes (MCAM, UMR7245), Muséum national d'Histoire naturelle, CNRS, CP 52, 57 rue Cuvier, 75005, Paris, France
| | - Adriana Alberti
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Jean-Marc Aury
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Tristan Barbeyron
- Sorbonne Université, CNRS, UMR 8227, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
| | - Ruibo Cai
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Corinne Da Silva
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Benjamin Istace
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Karine Labadie
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Dominique Marie
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Jonathan Mercier
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Tsinda Rukwavu
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Jeremy Szymczak
- Sorbonne Université, CNRS, FR2424, Station Biologique de Roscoff, Place Georges Teissier, 29680, Roscoff, France
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| | - Thierry Tonon
- Centre for Novel Agricultural Products, Department of Biology, University of York, Heslington, York, YO10 5DD, UK
| | - Catharina Alves-de-Souza
- Algal Resources Collection, MARBIONC, Center for Marine Sciences, University of North Carolina Wilmington, 5600 Marvin K. Moss Lane, Wilmington, NC, 28409, USA
| | - Pierre Rouzé
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Stephane Rombauts
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- VIB Center for Plant Systems Biology, Ghent, Belgium
| | - Betina M Porcel
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, 91057, Evry, France
| | - Laure Guillou
- Sorbonne Université, CNRS, UMR7144 Adaptation et Diversité en Milieu Marin, Ecology of Marine Plankton (ECOMAP), Station Biologique de Roscoff SBR, 29680, Roscoff, France
| |
|
22
|
Fang S, Hou X, Qiu K, He R, Feng X, Liang X. The occurrence and function of alternative splicing in fungi. FUNGAL BIOL REV 2020. [DOI: 10.1016/j.fbr.2020.10.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
23
|
Sielemann K, Hafner A, Pucker B. The reuse of public datasets in the life sciences: potential risks and rewards. PeerJ 2020; 8:e9954. [PMID: 33024631 PMCID: PMC7518187 DOI: 10.7717/peerj.9954] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2020] [Accepted: 08/25/2020] [Indexed: 12/13/2022] Open
Abstract
The 'big data' revolution has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the associated challenges, limitations and risks. Possible solutions to issues and research integrity considerations are also discussed. Due to the prominence, abundance and wide distribution of sequencing data, we focus on the reuse of publicly available sequence datasets. We define 'successful reuse' as the use of previously published data to enable novel scientific findings. By using selected examples of successful reuse from different disciplines, we illustrate the enormous potential of the practice, while acknowledging the respective limitations and risks. A checklist to determine the reuse value and potential of a particular dataset is also provided. The open discussion of data reuse and the establishment of this practice as a norm has the potential to benefit all stakeholders in the life sciences.
Affiliation(s)
- Katharina Sielemann
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, Bielefeld, Germany
| | - Alenka Hafner
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Current Affiliation: Intercollege Graduate Degree Program in Plant Biology, Penn State University, University Park, State College, PA, United States of America
| | - Boas Pucker
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
| |
|
24
|
Frey K, Pucker B. Animal, Fungi, and Plant Genome Sequences Harbor Different Non-Canonical Splice Sites. Cells 2020; 9:E458. [PMID: 32085510 PMCID: PMC7072748 DOI: 10.3390/cells9020458] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Revised: 02/11/2020] [Accepted: 02/14/2020] [Indexed: 11/17/2022] Open
Abstract
Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. Introns need to be removed from initial transcripts in order to generate the final messenger RNA (mRNA), which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5' end and AG at the 3' end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. Recently, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here, we expand systematic investigations of non-canonical splice site combinations in plants across eukaryotes by analyzing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as an apparently increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts are a likely explanation for this observation, thus indicating annotation errors. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 small nuclear RNA (snRNA) isoform might allow the recognition of GA as a 5' splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3' splice site compared to the 5' splice site across animals, fungi, and plants.
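The link between apparent CT-AC introns and antisense GT-AG introns follows from reverse complementation: an intron that reads GT...AG on one strand reads CT...AC when annotated on the opposite strand. The following is a minimal sketch of that relationship, not the authors' analysis code; the function names and the toy intron sequence are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: function names and the toy sequence are assumptions,
# not taken from the cited study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splice_site_combination(intron: str) -> str:
    """Return the terminal dinucleotides of an intron, e.g. 'GT-AG'."""
    intron = intron.upper()
    return f"{intron[:2]}-{intron[-2:]}"


def combination_on_opposite_strand(intron: str) -> str:
    """Splice site combination seen if the intron is annotated on the wrong strand."""
    return splice_site_combination(reverse_complement(intron))


# Hypothetical toy intron with canonical GT...AG boundaries.
toy_intron = "GTAAGTTTCCATTTTGCAG"
print(splice_site_combination(toy_intron))         # combination on the true strand
print(combination_on_opposite_strand(toy_intron))  # combination on the opposite strand
```

Under this reading, an excess of CT-AC combinations in an annotation would be consistent with canonical introns assigned to the wrong strand, which is the explanation the abstract proposes.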
Affiliation(s)
- Katharina Frey
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany
- Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, 33615 Bielefeld, Germany
| | - Boas Pucker
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
| |
|