1
|
Miller JR, Adjeroh DA. Machine learning on alignment features for parent-of-origin classification of simulated hybrid RNA-seq. BMC Bioinformatics 2024; 25:109. [PMID: 38475727 DOI: 10.1186/s12859-024-05728-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 03/01/2024] [Indexed: 03/14/2024] Open
Abstract
BACKGROUND Parent-of-origin allele-specific gene expression (ASE) can be detected in interspecies hybrids by virtue of RNA sequence variants between the parental haplotypes. ASE is detectable by differential expression analysis (DEA) applied to the counts of RNA-seq read pairs aligned to parental references, but aligners do not always choose the correct parental reference. RESULTS We used public data for species that are known to hybridize. We measured our ability to assign RNA-seq read pairs to their proper transcriptome or genome references. We tested software packages that assign each read pair to a reference position and found that they often favored the incorrect species reference. To address this problem, we introduce a post process that extracts alignment features and trains a random forest classifier to choose the better alignment. On each simulated hybrid dataset tested, our machine-learning post-processor achieved higher accuracy than the aligner by itself at choosing the correct parent-of-origin per RNA-seq read pair. CONCLUSIONS For the parent-of-origin classification of RNA-seq, machine learning can improve the accuracy of alignment-based methods. This approach could be useful for enhancing ASE detection in interspecies hybrids, though RNA-seq from real hybrids may present challenges not captured by our simulations. We believe this is the first application of machine learning to this problem domain.
Affiliation(s)
- Jason R Miller
- Department of Computer Science, Mathematics, Engineering, Shepherd University, Shepherdstown, WV, USA.
- EVOGENE, Department of Biosciences, University of Oslo, Oslo, Norway.
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, USA.
| | - Donald A Adjeroh
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV, USA
| |
|
2
|
Huang KY, Kan SL, Shen TT, Gong P, Feng YY, Du H, Zhao YP, Wan T, Wang XQ, Ran JH. A Comprehensive Evolutionary Study of Chloroplast RNA Editing in Gymnosperms: A Novel Type of G-to-A RNA Editing Is Common in Gymnosperms. Int J Mol Sci 2022; 23:ijms231810844. [PMID: 36142757 PMCID: PMC9505161 DOI: 10.3390/ijms231810844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 09/09/2022] [Accepted: 09/13/2022] [Indexed: 12/05/2022] Open
Abstract
Although more than 9100 plant plastomes have been sequenced, RNA editing sites of the whole plastome have been experimentally verified in only approximately 21 species, which seriously hampers the comprehensive evolutionary study of chloroplast RNA editing. We investigated the evolutionary pattern of chloroplast RNA editing sites in 19 species from all 13 families of gymnosperms based on a combination of genomic and transcriptomic data. We found that the chloroplast C-to-U RNA editing sites of gymnosperms shared many common characteristics with those of other land plants, but also exhibited many unique characteristics. In contrast to that noted in angiosperms, the density of RNA editing sites in ndh genes was not the highest in the sampled gymnosperms, and both loss and gain events at editing sites occurred frequently during the evolution of gymnosperms. In addition, GC content and plastomic size were positively correlated with the number of chloroplast RNA editing sites in gymnosperms, suggesting that the increase in GC content could provide more materials for RNA editing and facilitate the evolution of RNA editing in land plants or vice versa. Interestingly, novel G-to-A RNA editing events were commonly found in all sampled gymnosperm species, and G-to-A RNA editing exhibits many different characteristics from C-to-U RNA editing in gymnosperms. This study revealed a comprehensive evolutionary scenario for chloroplast RNA editing sites in gymnosperms, and reported that a novel type of G-to-A RNA editing is prevalent in gymnosperms.
Affiliation(s)
- Kai-Yuan Huang
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Sheng-Long Kan
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Ting-Ting Shen
- School of Earth Sciences, East China University of Technology, Nanchang 330013, China
| | - Pin Gong
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Yuan-Yuan Feng
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hong Du
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
| | - Yun-Peng Zhao
- Laboratory of Systematic & Evolutionary Botany and Biodiversity, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
| | - Tao Wan
- Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
| | - Xiao-Quan Wang
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jin-Hua Ran
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
3
|
Huang Y, Li J, Yang Z, An W, Xie C, Liu S, Zheng X. Comprehensive analysis of complete chloroplast genome and phylogenetic aspects of ten Ficus species. BMC PLANT BIOLOGY 2022; 22:253. [PMID: 35606691 PMCID: PMC9125854 DOI: 10.1186/s12870-022-03643-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/30/2020] [Accepted: 05/12/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND The large genus Ficus comprises approximately 800 species, most of which possess high ornamental and ecological value. However, its evolutionary history remains largely unknown. Plastome (chloroplast genome) analysis has become an essential tool for species identification and for unveiling evolutionary relationships among species, genera, and other taxonomic ranks. In this work we present the plastomes of ten Ficus species. RESULTS The complete chloroplast (CP) genomes of eleven Ficus specimens belonging to ten species were determined and analysed. The full length of the Ficus plastome was nearly 160 kbp, with a similar overall GC content ranging from 35.88 to 36.02%. A total of 114 unique genes, comprising 80 protein-coding genes, 30 tRNAs, and 4 rRNAs, were annotated in each of the Ficus CP genomes. In addition, these CP genomes showed variation in their inverted repeat (IR) regions. Tandem repeats and mononucleotide simple sequence repeats (SSRs) are widely distributed across the Ficus CP genome. Comparative genome analysis showed low sequence variability. In addition, eight variable regions were proposed as potential molecular markers for future Ficus species identification. According to the phylogenetic analysis, these ten Ficus species clustered together and were further divided into three clades based on different subgenera. The analysis also showed the relatedness between Ficus and Morus. CONCLUSION The chloroplast genome structure of the 10 Ficus species was similar to that of other angiosperms, with a typical four-part structure. Chloroplast genome sizes vary slightly due to expansion and contraction of the IR region, and the variation in noncoding regions of the chloroplast genome is larger than that in coding regions. Phylogenetic analysis showed that the eleven sampled CP genomes were divided into three clades, clustered with species from subgenus Urostigma, Sycomorus, and Ficus, respectively. These results support the Berg classification system, in which the subgenus Ficus was further decomposed into the subgenus Sycomorus. In general, the sequencing and analysis of Ficus plastomes, especially those of species with no or limited sequences available yet, contribute to the study of genetic diversity and species evolution of Ficus, while providing useful information for taxonomic and phylogenetic studies of Ficus.
Affiliation(s)
- Yuying Huang
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China
| | - Jing Li
- Traditional Chinese Medicine Gynecology Laboratory in Lingnan Medical Research Center, Guangzhou University of Chinese Medicine, Guangzhou, 510410, China
| | - Zerui Yang
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China
| | - Wenli An
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China
| | - Chunzhu Xie
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China
| | - Shanshan Liu
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China
| | - Xiasheng Zheng
- Institute of Medicinal Plant Physiology and Ecology, School of Pharmaceutical Sciences, Guangzhou University of Chinese Medicine, 232th Waihuangdong Road, Higher Education Mega Center, Panyu District, Guangzhou, Guangdong, China.
| |
|
4
|
Fauskee BD, Sigel EM, Pryer KM, Grusz AL. Variation in frequency of plastid RNA editing within Adiantum implies rapid evolution in fern plastomes. AMERICAN JOURNAL OF BOTANY 2021; 108:820-827. [PMID: 33969475 DOI: 10.1002/ajb2.1649] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/22/2020] [Accepted: 12/18/2020] [Indexed: 06/12/2023]
Abstract
PREMISE Recent studies of plant RNA editing have demonstrated that the number of editing sites can vary widely among large taxonomic groups (orders, families). Yet, very little is known about intrageneric variation in frequency of plant RNA editing, and no study has been conducted in ferns. METHODS We determined plastid RNA-editing counts for two species of Adiantum (Pteridaceae), A. shastense and A. aleuticum, by implementing a pipeline that integrated read-mapping and SNP-calling software to identify RNA-editing sites. We then compared the edits found in A. aleuticum and A. shastense with previously published edits from A. capillus-veneris by generating alignments for each plastid gene. RESULTS We found direct evidence for 505 plastid RNA-editing sites in A. aleuticum and 509 in A. shastense, compared with 350 sites in A. capillus-veneris. We observed striking variation in the number and location of the RNA-editing sites among the three species, with reverse (U-to-C) editing sites showing a higher degree of conservation than forward (C-to-U) sites. Additionally, sites involving start and stop codons were highly conserved. CONCLUSIONS Variation in the frequency of RNA editing within Adiantum implies that RNA-editing sites can be rapidly gained or lost throughout evolution. However, varying degrees of conservation between both C-to-U and U-to-C sites and sites in start or stop codons, versus other codons, hints at the likely independent origin of both types of edits and a potential selective advantage conferred by RNA editing.
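The read-mapping plus variant-calling idea in the METHODS above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `call_editing_sites`, the thresholds `min_depth` and `min_fraction`, and the toy pileups are all hypothetical, and a real analysis would also have to handle strand, genomic SNPs, and mapping artifacts. The sketch assumes a same-strand DNA reference and, for each position, the bases observed in mapped RNA-seq reads (U appears as T in sequencing data).

```python
from collections import Counter

def call_editing_sites(ref_bases, rna_pileups, min_depth=10, min_fraction=0.1):
    """Flag candidate C-to-U and U-to-C editing sites.

    ref_bases    -- DNA reference sequence (string of A/C/G/T)
    rna_pileups  -- one string per reference position holding the bases
                    seen in RNA-seq reads covering that position
    min_depth, min_fraction -- illustrative (hypothetical) thresholds
    """
    sites = []
    for pos, (ref, reads) in enumerate(zip(ref_bases, rna_pileups)):
        counts = Counter(reads)
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to judge this position
        if ref == "C":
            alt, kind = "T", "C-to-U"   # forward edit
        elif ref == "T":
            alt, kind = "C", "U-to-C"   # reverse edit
        else:
            continue  # only C and T reference positions are considered
        fraction = counts[alt] / depth
        if fraction >= min_fraction:
            sites.append((pos, kind, round(fraction, 3)))
    return sites

# Toy data: position 1 is mostly edited, position 2 is partially
# reverse-edited, position 3 has too little coverage.
reference = "ACTG"
pileups = ["A" * 12, "C" * 3 + "T" * 9, "T" * 10 + "C" * 2, "G" * 5]
candidates = call_editing_sites(reference, pileups)
```

Reporting the edited fraction alongside each site makes it possible to compare the same site across species or tissues, which is the kind of comparison the abstract describes.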
Affiliation(s)
- Blake D Fauskee
- Department of Biology, Duke University, Durham, NC, 27708, USA
| | - Erin M Sigel
- Department of Biological Sciences, University of New Hampshire, Durham, NH, 03824, USA
| | | | - Amanda L Grusz
- Department of Biology, University of Minnesota Duluth, Duluth, MN, 55812, USA
| |
|
5
|
Krüger M, Abeyawardana OAJ, Juříček M, Krüger C, Štorchová H. Variation in plastid genomes in the gynodioecious species Silene vulgaris. BMC PLANT BIOLOGY 2019; 19:568. [PMID: 31856730 PMCID: PMC6921581 DOI: 10.1186/s12870-019-2193-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/02/2019] [Accepted: 12/10/2019] [Indexed: 05/10/2023]
Abstract
BACKGROUND Gynodioecious species exist in two sexes: male-sterile females and hermaphrodites. Male sterility in higher plants often results from mitonuclear interaction between the CMS (cytoplasmic male sterility) gene(s) encoded by the mitochondrial genome and nuclear-encoded restorer genes. Mitochondrial and nuclear-encoded transcriptomes in females and hermaphrodites are intensively studied, but little is known about sex-specific gene expression in plastids. We have compared plastid transcriptomes between females and hermaphrodites in two haplotypes of the gynodioecious species Silene vulgaris with known CMS candidate genes. RESULTS We generated complete plastid genome sequences from five haplotypes of S. vulgaris, including the haplotypes KRA and KOV, for which complete mitochondrial genome sequences were already published. We constructed a phylogenetic tree based on plastid sequences of S. vulgaris. Whereas lowland S. vulgaris haplotypes including KRA and KOV clustered together, the accessions from high European mountains diverged early in the phylogram. S. vulgaris belongs among Silene species with slowly evolving plastid genomes, but we still detected 212 substitutions and 112 indels between two accessions of this species. We estimated elevated Ka/Ks in the ndhF gene, which may reflect the adaptation of S. vulgaris to high altitudes, or relaxed selection. We compared depth of coverage and editing rates between female and hermaphrodite plastid transcriptomes and found no significant differences between the two sexes. We identified 51 unique C to U editing sites in the plastid genomes of S. vulgaris, 38 of them in protein coding regions, 2 in introns, and 11 in intergenic regions. The editing site in the psbZ gene was edited only in one of two plastid genomes under study. CONCLUSIONS We revealed no significant differences between the sexes in plastid transcriptomes of two haplotypes of S. vulgaris. This suggests that expression of plastid genes is not affected by CMS in flower buds of S. vulgaris, although both sexes may still differ in plastid gene expression in specific tissues. We revealed differences between the plastid transcriptomes of two S. vulgaris haplotypes in editing rate and in the coverage of several antisense transcripts. Our results document the variation in plastid genomes and transcriptomes in S. vulgaris.
Affiliation(s)
- Manuela Krüger
- Plant Reproduction Laboratory, Institute of Experimental Botany v.v.i, Czech Academy of Sciences, Rozvojová 263, 16502 Prague, Czech Republic
| | - Oushadee A. J. Abeyawardana
- Plant Reproduction Laboratory, Institute of Experimental Botany v.v.i, Czech Academy of Sciences, Rozvojová 263, 16502 Prague, Czech Republic
| | - Miloslav Juříček
- Plant Reproduction Laboratory, Institute of Experimental Botany v.v.i, Czech Academy of Sciences, Rozvojová 263, 16502 Prague, Czech Republic
| | | | - Helena Štorchová
- Plant Reproduction Laboratory, Institute of Experimental Botany v.v.i, Czech Academy of Sciences, Rozvojová 263, 16502 Prague, Czech Republic
| |
|
6
|
MORF9 Functions in Plastid RNA Editing with Tissue Specificity. Int J Mol Sci 2019; 20:ijms20184635. [PMID: 31546885 PMCID: PMC6769653 DOI: 10.3390/ijms20184635] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/23/2019] [Revised: 09/11/2019] [Accepted: 09/13/2019] [Indexed: 11/17/2022] Open
Abstract
RNA editing in plant mitochondria and plastids converts specific nucleotides from cytidine (C) to uridine (U). These editing events differ among plant species and are relevant to developmental stages or are impacted by environmental conditions. Proteins of the MORF family are essential components of plant editosomes. One of the members, MORF9, is considered the core protein of the editing complex and is involved in the editing of most sites in chloroplasts. In this study, the phenotypes of a T-DNA insertion line with loss of MORF9 and of the genetic complementation line of Arabidopsis were analyzed, and the editing efficiencies of plastid RNAs in roots, rosette leaves, and flowers from the morf9 mutant and the wild-type (WT) control were compared by bulk-cDNA sequencing. The results showed that most of the known MORF9-associated plastid RNA editing events in rosette leaves and flowers were similarly reduced by morf9 mutation, with the exception that the editing rate of the sites ndhB-872 and psbF-65 declined in the leaves and that of ndhB-586 decreased only in the flowers. In the roots, however, the loss of MORF9 had a much lower effect on overall plastid RNA editing, with nine sites showing no significant editing efficiency change, including accD-794, ndhD-383, psbZ-50, ndhF-290, ndhD-878, matK-706, clpP1-559, rpoA-200, and ndhD-674, which were reduced in the other tissues. Furthermore, we found that during plant aging, MORF9 mRNA level, but not the protein level, was downregulated in senescent leaves. On the basis of these observations, we suggest that MORF9-mediated RNA editing is tissue-dependent and the resultant organelle proteomes are pertinent to the specific tissue functions.
|
7
|
Chu D, Wei L. The chloroplast and mitochondrial C-to-U RNA editing in Arabidopsis thaliana shows signals of adaptation. PLANT DIRECT 2019; 3:e00169. [PMID: 31517178 PMCID: PMC6732656 DOI: 10.1002/pld3.169] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/21/2019] [Revised: 08/18/2019] [Accepted: 08/23/2019] [Indexed: 05/20/2023]
Abstract
C-to-U RNA editing is the conversion of cytidine to uridine at the RNA level. In plants, the genes that undergo C-to-U RNA modification are mainly chloroplast and mitochondrial genes. Case studies have identified the roles of C-to-U editing in various biological processes, but the functional consequence of the majority of C-to-U editing events is still undiscovered. We retrieved deep-sequenced transcriptome data from roots and shoots of Arabidopsis thaliana and profiled their C-to-U RNA editomes and gene expression patterns. We investigated the editing level and conservation pattern of these C-to-U editing sites. The levels of nonsynonymous C-to-U editing events are higher than the levels of synonymous events. The fraction of nonsynonymous editing sites is higher than the neutral expectation. Highly edited cytidines are more conserved at the DNA level, and gene expression levels are correlated with C-to-U editing levels. Our results demonstrate that the global C-to-U editome is shaped by natural selection and that many nonsynonymous C-to-U editing events are adaptive. The editing mechanism might be positively selected and maintained and could have profound effects on the modified RNAs.
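The distinction between synonymous and nonsynonymous C-to-U events that this abstract relies on can be made concrete with a short sketch. It is an illustration, not the authors' code: the helper `classify_c_to_u` is hypothetical, codons are written in DNA letters (T standing for U), and the standard genetic code is assumed for codon-to-amino-acid assignments.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (third base varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_c_to_u(codon, offset):
    """Classify a C-to-U edit at `offset` (0, 1, or 2) within `codon`.

    Returns (amino acid before, amino acid after, label), where label is
    'synonymous' if the encoded residue is unchanged and 'nonsynonymous'
    otherwise. Hypothetical helper for illustration only.
    """
    codon = codon.upper()
    if codon[offset] != "C":
        raise ValueError("no cytidine at the given codon position")
    edited = codon[:offset] + "T" + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    label = "synonymous" if before == after else "nonsynonymous"
    return before, after, label

# CCA (Pro) -> CUA (Leu) changes the residue; UUC (Phe) -> UUU (Phe) does not.
pro_to_leu = classify_c_to_u("CCA", 1)
phe_silent = classify_c_to_u("TTC", 2)
```

Tallying these labels over all detected sites gives the observed nonsynonymous fraction, which is the quantity the abstract compares against a neutral expectation.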
Affiliation(s)
- Duan Chu
- College of Life Sciences, Beijing Normal University, Beijing, China
| | - Lai Wei
- College of Life Sciences, Beijing Normal University, Beijing, China
| |
|