1. Xiao M, Hu X, Li Y, Liu Q, Shen S, Jiang T, Zhang L, Zhou Y, Li Y, Luo X, Bai L, Yan W. Comparative analysis of codon usage patterns in the chloroplast genomes of nine forage legumes. Physiology and Molecular Biology of Plants: An International Journal of Functional Plant Biology 2024; 30:153-166. [PMID: 38623162 PMCID: PMC11016040 DOI: 10.1007/s12298-024-01421-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 02/19/2024] [Accepted: 02/23/2024] [Indexed: 04/17/2024]
Abstract
Leguminosae is the third-largest family of angiosperms, after Compositae and Orchidaceae. Legumes are widely distributed and grow in a variety of environments, including plains, mountains, deserts, forests, grasslands, and even aquatic habitats. The family is one of the most important sources of starch, protein, and oil in the human diet and an important source of high-quality forage for animals, which gives it considerable economic significance. In this study, the codon usage patterns and sources of variation in the chloroplast genomes of nine important forage legumes were systematically analyzed. We also constructed phylogenetic trees based on the whole chloroplast genomes and the protein-coding sequences of these nine forage legumes. Our results showed that codons in the chloroplast genomes of the nine forage legumes tend to end with A/T bases, and seven identical high-frequency (HF) codons were detected among the nine species. ENC-GC3s mapping, PR2 analysis, and neutrality analysis showed that the codon bias of the nine forage legumes was influenced by many factors, among which natural selection was the main one. Comparison of codon usage frequencies indicated that Nicotiana tabacum and Saccharomyces cerevisiae can be considered as receptors for the exogenous expression of chloroplast genes of these nine forage legumes. The phylogenetic relationships inferred from the chloroplast genomes and from the protein-coding genes were highly similar, and the nine forage legumes were divided into three major clades. Within the clades, Melilotus officinalis was most closely related to Medicago sativa, and Galega officinalis was most closely related to Galega orientalis. This study provides a scientific basis for molecular marker research, species identification, and phylogenetic studies of forage legumes. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-024-01421-0.
Affiliation(s)
- Mingkun Xiao
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Xiang Hu
- Tropical Eco-agricultural Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan China
- Yaqi Li
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Qian Liu
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Shaobin Shen
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Tailing Jiang
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Linhui Zhang
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Yingchun Zhou
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Yuexian Li
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Xin Luo
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Lina Bai
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China
- Wei Yan
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan China

2. Zhang K, Wang Y, Zhang Y, Shan X. Codon usage characterization and phylogenetic analysis of the mitochondrial genome in Hemerocallis citrina. BMC Genom Data 2024; 25:6. [PMID: 38218810 PMCID: PMC10788020 DOI: 10.1186/s12863-024-01191-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 01/04/2024] [Indexed: 01/15/2024] Open
Abstract
BACKGROUND Hemerocallis citrina Baroni is a traditional vegetable crop widely cultivated in eastern Asia for its high edible, medicinal, and ornamental value. The phenomenon of codon usage bias (CUB) is prevalent in various genomes and provides excellent clues for gaining insight into organism evolution and phylogeny. Comprehensive analysis of the CUB of mitochondrial (mt) genes can provide rich genetic information for improving the expression efficiency of exogenous genes and optimizing molecular-assisted breeding programmes in H. citrina. RESULTS Here, the CUB patterns in the mt genome of H. citrina were systematically analyzed, and the possible factors shaping CUB were further evaluated. Composition analysis of codons revealed that the overall GC (GCall) and GC at the third codon position (GC3) contents of mt genes were lower than 50%, presenting a preference for A/T-rich nucleotides and A/T-ending codons in H. citrina. The high values of the effective number of codons (ENC) are indicative of fairly weak CUB. Significant correlations of ENC with the GC3 and codon counts were observed, suggesting that not only compositional constraints but also gene length contributed greatly to CUB. Combined ENC-plot, neutrality plot, and Parity rule 2 (PR2)-plot analyses augmented the inference that the CUB patterns of the H. citrina mitogenome can be attributed to multiple factors. Natural selection, mutation pressure, and other factors might play a major role in shaping the CUB of mt genes, although natural selection is the decisive factor. Moreover, we identified a total of 29 high-frequency codons and 22 optimal codons, which exhibited a consistent preference for ending in A/T. Subsequent relative synonymous codon usage (RSCU)-based cluster and mt protein coding gene (PCG)-based phylogenetic analyses suggested that H. citrina is close to Asparagus officinalis, Chlorophytum comosum, Allium cepa, and Allium fistulosum in evolutionary terms, reflecting a certain correlation between CUB and evolutionary relationships. CONCLUSIONS There is weak CUB in the H. citrina mitogenome that is subject to the combined effects of multiple factors, especially natural selection. H. citrina was found to be closely related to Asparagus officinalis, Chlorophytum comosum, Allium cepa, and Allium fistulosum in terms of their evolutionary relationships as well as the CUB patterns of their mitogenomes. Our findings provide a fundamental reference for further studies on genetic modification and phylogenetic evolution in H. citrina.
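The relative synonymous codon usage (RSCU) named in this abstract is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch of that calculation, not the authors' pipeline; it assumes the standard genetic code, and the function name `rscu` and the toy sequence in the usage note are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumption: no organelle-specific reassignments).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU for each sense codon of every amino acid present in `cds`."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by amino acid, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the sequence; RSCU undefined
        for c in codons:
            # observed / (total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values
```

For the toy sequence `GCTGCTGCTGCAAAAAAG`, `rscu(...)["GCT"]` is 3.0 because three of the four alanine codons are GCT. Codons with RSCU above 1 are used more often than expected under equal usage, which is the usual basis for calling a codon high-frequency.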
Affiliation(s)
- Kun Zhang
- College of Agriculture and Life Sciences, Shanxi Datong University, Datong, Shanxi, China.
- Key Laboratory of Organic Dry Farming for Special Crops in Datong City, Datong, Shanxi, China.
- Yiheng Wang
- State Key Laboratory of Vegetable Biobreeding, Tianjin Academy of Agricultural Sciences, Tianjin, China
- Yue Zhang
- College of Agriculture and Life Sciences, Shanxi Datong University, Datong, Shanxi, China
- Xiaofei Shan
- College of Agriculture and Life Sciences, Shanxi Datong University, Datong, Shanxi, China

3. Cao Y, Yin D, Pang B, Li H, Liu Q, Zhai Y, Ma N, Shen H, Jia Q, Wang D. Assembly and phylogenetic analysis of the mitochondrial genome of endangered medicinal plant Huperzia crispata. Funct Integr Genomics 2023; 23:295. [PMID: 37691055 DOI: 10.1007/s10142-023-01223-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2023] [Revised: 08/08/2023] [Accepted: 08/28/2023] [Indexed: 09/12/2023]
Abstract
Huperzia crispata is a traditional Chinese medicinal herb that has attracted special attention in recent years because its product Hup A can serve as an acetylcholinesterase inhibitor (AChEI). Although the chloroplast (cp) genome of H. crispata has been studied, there are no reports on a Huperzia mitochondrial (mt) genome, since the previously reported H. squarrosa has been revised as Phlegmariurus squarrosus. The mt genome of H. crispata was sequenced using a combination of long-read nanopore and Illumina sequencing platforms. The entire H. crispata mt genome was assembled as a circular molecule with a length of 412,594 bp and a total of 91 genes, including 45 tRNAs, 6 rRNAs, 37 protein-coding genes (PCGs), and 3 pseudogenes. Notably, the rps8 gene was present in P. squarrosus and an rps8 pseudogene was present in H. crispata, whereas rps8 was lacking in most Pteridophyta and Gymnospermae. Intron-encoded maturase (mat-atp9i85 and mat-cobi787) genes were present in H. crispata and P. squarrosus but lost in other examined lycophytes, ferns, and Gymnospermae. Collinearity analysis showed that the mt genomes of H. crispata and P. squarrosus are highly conserved compared to those of other ferns. Relative synonymous codon usage (RSCU) analysis showed that the amino acids most frequently found were phenylalanine (Phe) (4.77%), isoleucine (Ile) (4.71%), and lysine (Lys) (4.26%), while arginine (Arg) (0.32%) and histidine (His) (0.42%) were rarely found. Simple sequence repeat (SSR) analysis identified a total of 114 SSRs in the mt genome of H. crispata, accounting for 0.35% of the whole mt genome. Monomer repeats were the predominant type of SSR, representing 91.89% of the total SSRs. In addition, a total of 1948 interspersed repeats (158 forward, 147 palindromic, and 5 reverse repeats) with lengths ranging from 30 bp to 14,945 bp were identified in the H. crispata mt genome, and the 30-39-bp repeats were the most abundant type. Gene transfer analysis indicated that a total of 12 homologous fragments were shared between the cp and mt genomes of H. crispata, accounting for 0.93% and 2.48% of the total cp and mt genomes, respectively. The phylogenetic trees revealed that H. crispata was sister to P. squarrosus. The Ka/Ks analysis suggested that most PCGs, except the atp6 gene, were subject to purifying selection during evolution. Our study provides extensive information on the features of the H. crispata mt genome and will help unravel evolutionary relationships and support molecular identification within lycophytes.
Affiliation(s)
- Yu Cao
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Dengpan Yin
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Bo Pang
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Haibo Li
- Yuyao Seedling Management Station, Ningbo, Zhejiang, 315400, China
- Qiao Liu
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Yufeng Zhai
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Nan Ma
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Hongjun Shen
- Ningbo Delai Medicinal Material Planting Co, Ltd, 315444, Ningbo, Zhejiang, 315444, China
- Qiaojun Jia
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China
- Dekai Wang
- Key Laboratory of Plant Secondary Metabolism Regulation in Zhejiang Province, College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Zhejiang, 310018, Hangzhou, China.

4. Frazão A, Thode VA, Lohmann LG. Comparative chloroplast genomics and insights into the molecular evolution of Tanaecium (Bignonieae, Bignoniaceae). Sci Rep 2023; 13:12469. [PMID: 37528152 PMCID: PMC10394017 DOI: 10.1038/s41598-023-39403-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 07/25/2023] [Indexed: 08/03/2023] Open
Abstract
Species of Tanaecium (Bignonieae, Bignoniaceae) are lianas distributed in the Neotropics and centered in the Amazon. Members of the genus exhibit exceptionally diverse flower morphology and pollination systems. Here, we sequenced, assembled, and annotated 12 complete and four partial chloroplast genomes representing 15 Tanaecium species and more than 70% of the known diversity in the genus. Gene content and order were similar in all species of Tanaecium studied, with genome sizes ranging between 158,470 and 160,935 bp. Tanaecium chloroplast genomes have 137 genes, including 80-81 protein-coding genes, 37 tRNA genes, and four rRNA genes. No rearrangements were found in Tanaecium plastomes, but two different patterns of boundaries between regions were recovered. Tanaecium plastomes show nucleotide variability, although only rpoA was hypervariable. Multiple SSRs and repeat regions were detected, and eight genes were found to have signatures of positive selection. Phylogeny reconstruction using 15 Tanaecium plastomes resulted in a strongly supported topology, elucidating several relationships not recovered previously and bringing new insights into the evolution of the genus.
Affiliation(s)
- Annelise Frazão
- Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, Brazil.
- Departamento de Biodiversidade e Bioestatística, Instituto de Biociências, Universidade Estadual Paulista, Botucatu, SP, Brazil.
- Verônica A Thode
- Programa de Pós-Graduação em Botânica, Departamento de Botânica, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
- Lúcia G Lohmann
- Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, Brazil.
- Department of Integrative Biology, University and Jepson Herbaria, University of California, Berkeley, Berkeley, CA, USA.

5. Wang Y, Jiang D, Guo K, Zhao L, Meng F, Xiao J, Niu Y, Sun Y. Comparative analysis of codon usage patterns in chloroplast genomes of ten Epimedium species. BMC Genom Data 2023; 24:3. [PMID: 36624369 PMCID: PMC9830715 DOI: 10.1186/s12863-023-01104-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/23/2022] [Accepted: 01/05/2023] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND The phenomenon of codon usage bias exists in the genomes of prokaryotes and eukaryotes. The codon usage pattern is affected by environmental factors, base mutation, gene flow, and gene expression level, among which natural selection and mutation pressure are the main factors. The study of codon preference is an effective method for analyzing the source of evolutionary driving forces in organisms. Epimedium species are perennial herbs with ornamental and medicinal value distributed worldwide. The chloroplast genome is self-replicating and maternally inherited, and it is commonly used to study species evolution, gene expression, and genetic transformation. RESULTS The results suggested that the chloroplast genomes of Epimedium species prefer codons ending with A/U. Seventeen common high-frequency codons and 2-6 optimal codons were found in the chloroplast genomes of Epimedium species. According to the ENc-plot, PR2-plot, and neutrality-plot, the formation of codon preference in Epimedium was affected by multiple factors, and natural selection was the dominant factor. By comparing the codon usage frequency with that of 4 common model organisms, it was found that Arabidopsis thaliana, Populus trichocarpa, and Saccharomyces cerevisiae were suitable exogenous expression receptors. CONCLUSION The evolutionary driving force in the chloroplast genomes of 10 Epimedium species probably comes from mutation pressure. Our results provide an important theoretical basis for evolutionary analysis and transgenic research of chloroplast genes.
Affiliation(s)
- Yingzhe Wang
- College of Pharmacy, Jining Medical University, Rizhao, Shandong, China
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Dacheng Jiang
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Kun Guo
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Lei Zhao
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Fangfang Meng
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Jinglei Xiao
- School of Pharmaceutical Sciences, Changchun University of Chinese Medicine, Changchun, Jilin, China
- Yuan Niu
- Lanzhou Agro-Technical Research and Popularization Center, Lanzhou, Gansu, China
- Yunlong Sun
- College of Pharmacy, Jining Medical University, Rizhao, Shandong, China

6. Comparison of Boraginales Plastomes: Insights into Codon Usage Bias, Adaptive Evolution, and Phylogenetic Relationships. Diversity 2022. [DOI: 10.3390/d14121104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Abstract
The Boraginales (Boraginaceae s.l.) comprise more than 2450 species worldwide. However, little is known about the characteristics of their complete plastid (pt) genomes. In this study, three new sequences representing the first pt genomes of Heliotropiaceae and Cordiaceae were assembled and compared with those of other Boraginales species. The pt genome sizes of Cordia dichotoma, Heliotropium arborescens, and Tournefortia montana were 151,990 bp, 156,243 bp, and 155,891 bp, respectively. Multiple optimal codons were identified, which may provide meaningful information for enhancing gene expression in Boraginales species. Furthermore, codon usage bias analyses revealed that natural selection and other factors may dominate codon usage patterns in the Boraginales species. The boundaries of the IR/LSC and IR/SSC regions were significantly different, and we also found a signal of obvious IR region expansion in the pt genomes of Nonea vesicaria and Arnebia euchroma. Genes with high nucleotide diversity (pi) values were also identified, which may be used as potential DNA barcodes to investigate the phylogenetic relationships in Boraginales. psaI, rpl33, rpl36, and rps19 were found to be under positive selection, and these genes play an important role in our understanding of the adaptive evolution of the Boraginales species. Phylogenetic analyses implied that Boraginales can be divided into two groups. The existence of two subfamilies (Lithospermeae and Boragineae) in Boraginaceae is also strongly supported. Our study provides valuable information on pt genome evolution and phylogenetic relationships in the Boraginales species.

7. Li L, Li M, Wu J, Yin H, Dunwell JM, Zhang S. Genome-wide identification and comparative evolutionary analysis of sorbitol metabolism pathway genes in four Rosaceae species and three model plants. BMC Plant Biology 2022; 22:341. [PMID: 35836134 PMCID: PMC9284748 DOI: 10.1186/s12870-022-03729-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2022] [Accepted: 06/29/2022] [Indexed: 06/15/2023]
Abstract
In contrast to most land plant species, sorbitol, instead of sucrose, is the major photosynthetic product in many Rosaceae species. It has been well established that three key functional genes, encoding sorbitol-6-phosphate dehydrogenase (S6PDH), sorbitol dehydrogenase (SDH), and sorbitol transporter (SOT), are mainly responsible for the synthesis, degradation, and transport of sorbitol. In this study, the genome-wide identification of S6PDH, SDH, and SOT genes was conducted in four Rosaceae species (peach, mei, apple, and pear) in which the sorbitol bio-pathway is dominant (named the sorbitol present group, SPG) and in three other species (tomato, poplar, and Arabidopsis) with a non-sorbitol bio-pathway (named the sorbitol absent group, SAG). To understand the evolutionary differences of the three important gene families between SAG and SPG, their gene duplication, evolutionary rate, codon bias, and positive selection patterns were analyzed and compared. The sorbitol pathway genes in SPG were found to have expanded through dispersed and tandem gene duplications. Branch-specific model analyses revealed that SDH and S6PDH clade A were under stronger purifying selection in SPG. A higher frequency of optimal codons was found in S6PDH and SDH than in SOT in SPG, consistent with the purifying selection acting on them. In addition, branch-site model analyses revealed that SOT genes were under positive selection in SPG. Expression analyses showed diverse expression patterns of sorbitol-related genes. Overall, these findings provide new insights into the evolutionary characteristics of the three key sorbitol metabolism-related gene families in Rosaceae and in other species without a dominant sorbitol pathway.
Affiliation(s)
- Leiting Li
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu China
- Shanghai Center for Plant Stress Biology and CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
- Meng Li
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu China
- Juyou Wu
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu China
- Hao Yin
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu China
- Jim M. Dunwell
- School of Agriculture, Policy and Development, University of Reading, Earley Gate, Reading, UK
- Shaoling Zhang
- College of Horticulture, Nanjing Agricultural University, Nanjing, Jiangsu China

8.
Abstract
Codon usage bias is the preferential or non-random use of synonymous codons, a ubiquitous phenomenon observed in bacteria, plants, and animals. Different species have consistent and characteristic codon biases. Codon bias varies not only with species, family, or group within a kingdom, but also between genes within an organism. Codon usage bias has evolved through mutation, natural selection, and genetic drift in various organisms. Genome composition, GC content, expression level and length of genes, position and context of codons in the genes, recombination rates, mRNA folding, and tRNA abundance and interactions are some of the factors influencing codon bias. The factors shaping codon bias may also be involved in the evolution of the universal genetic code. Codon usage bias is a critical factor determining gene expression and cellular function, as it influences diverse processes such as RNA processing, protein translation, and protein folding. Codon usage bias reflects the origin, mutation patterns, and evolution of species or genes. Investigations of codon bias patterns in genomes can reveal phylogenetic relationships between organisms, horizontal gene transfers, and the molecular evolution of genes, and can identify the selective forces that drive their evolution. The most important application of codon bias analysis is in the design of transgenes, where codon optimization is used to increase gene expression levels for the development of transgenic crops. The review gives an overview of deviations from the genetic code, factors influencing codon usage bias, codon usage bias of nuclear and organellar genes, computational methods to determine codon usage, and the significance and applications of codon usage analysis in biological research, with emphasis on plants.
Affiliation(s)
- Varatharajalu Udayasuriyan
- Department of Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, 641003, India
- Vijaipal Bhadana
- ICAR-Indian Institute of Agricultural Biotechnology, Ranchi, Jharkhand, 834010, India

9. Wang P, Mao Y, Su Y, Wang J. Comparative analysis of transcriptomic data shows the effects of multiple evolutionary selection processes on codon usage in Marsupenaeus japonicus and Marsupenaeus pulchricaudatus. BMC Genomics 2021; 22:781. [PMID: 34717552 PMCID: PMC8557549 DOI: 10.1186/s12864-021-08106-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/27/2020] [Accepted: 10/19/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Kuruma shrimp, a major commercial shrimp species worldwide, comprises two cryptic or sibling species, Marsupenaeus japonicus and Marsupenaeus pulchricaudatus. Codon usage analysis would contribute to our understanding of the genetic and evolutionary characteristics of the two Marsupenaeus species. In this study, we analyzed codon usage and related indices using coding sequences (CDSs) from RNA-seq data. RESULTS Using CodonW 1.4.2 software, we performed codon bias analysis of transcriptomes obtained from hepatopancreas tissues, which indicated weak codon bias. Almost all parameters had similar correlations in both species. The gene expression level (FPKM) was negatively correlated with A/T3s. We determined 12 and 14 optimal codons for M. japonicus and M. pulchricaudatus, respectively, and all optimal codons were C/G-ending. The two Marsupenaeus species had different usage frequencies of codon pairs, which contributed to further analysis of transcriptional differences between them. Orthologous genes that underwent positive selection (ω > 1) had a higher correlation coefficient than those that experienced purifying selection (ω < 1). Parity Rule 2 (PR2) and effective number of codons (ENc) plot analyses showed that the codon usage patterns of both species were influenced by both mutation and selection. Moreover, the average observed ENc value was lower than the expected value for both species, suggesting that factors other than GC content may play roles in these phenomena. The results of multispecies clustering based on codon preference were consistent with traditional classification. CONCLUSIONS This study provides a relatively comprehensive understanding of the correlations among codon usage bias, gene expression, and selection pressures on CDSs for M. japonicus and M. pulchricaudatus. Genetic evolution was driven by mutation and selection pressure. Moreover, the results provide new insights into the specificities and evolutionary characteristics of the two Marsupenaeus species.
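The comparison of observed and expected ENc values mentioned in this abstract is usually made against Wright's null curve, which gives the ENc expected when codon usage is shaped by GC3s composition alone. The sketch below shows that curve under this assumption; it is an illustration, not the authors' CodonW workflow, and the function names `expected_enc` and `enc_deficit` are hypothetical.

```python
def expected_enc(gc3s: float) -> float:
    """Wright's expected ENc for a given GC3s (fraction between 0 and 1).

    ENc_exp = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3s.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_deficit(observed_enc: float, gc3s: float) -> float:
    """Relative distance below the expected curve: (expected - observed) / expected.

    Positive values mean the gene lies below the curve, which is commonly read
    as a sign that factors other than base composition affect codon usage.
    """
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected
```

A gene with GC3s of 0.5 has an expected ENc of 60.5, so an observed ENc of 50 at that composition would sit well below the curve.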
Affiliation(s)
- Panpan Wang
- Jiangsu Key Laboratory of Marine Bioresources and Environment/ Jiangsu Key Laboratory of Marine Biotechnology, Jiangsu Ocean University, Lianyungang, 222005, China
- Co-Innovation Center of Jiangsu Marine Bio-Industry Technology, Jiangsu Ocean University, Lianyungang, 222005, China
- The Jiangsu Provincial Infrastructure for Conservation and Utilization of Agricultural Germplasm, Nanjing, 210014, China
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Yong Mao
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China.
- Fujian Key Laboratory of Genetics and Breeding of Marine Organisms, Xiamen University, Xiamen, 361102, China.
- Yongquan Su
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Jun Wang
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, Fujian, China

10. Anwar AM, Aljabri M, El-Soda M. Patterns of genome-wide codon usage bias in tobacco, tomato and potato. BIOTECHNOL BIOTEC EQ 2021. [DOI: 10.1080/13102818.2021.1911684] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/18/2022] Open
Affiliation(s)
- Ali Mostafa Anwar
- Department of Genetics, Faculty of Agriculture, Cairo University, Giza, Egypt
- Maha Aljabri
- Department of Biology, Faculty of Applied Sciences, Umm Al-Qura University, Makkah, Saudi Arabia
- Research Laboratories Centre, Faculty of Applied Sciences, Umm Al-Qura University, Makkah, Saudi Arabia
- Mohamed El-Soda
- Department of Genetics, Faculty of Agriculture, Cairo University, Giza, Egypt

11. Duan H, Zhang Q, Wang C, Li F, Tian F, Lu Y, Hu Y, Yang H, Cui G. Analysis of codon usage patterns of the chloroplast genome in Delphinium grandiflorum L. reveals a preference for AT-ending codons as a result of major selection constraints. PeerJ 2021; 9:e10787. [PMID: 33552742 PMCID: PMC7819120 DOI: 10.7717/peerj.10787] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/16/2020] [Accepted: 12/24/2020] [Indexed: 01/28/2023] Open
Abstract
Background Codon usage bias analysis is a suitable strategy for identifying the principal evolutionary driving forces in different organisms. Delphinium grandiflorum L. is a perennial herb with high economic value and typical biological characteristics. Evolutionary analysis of D. grandiflorum can provide a rich resource of genetic information for developing hybridization resources of the genus Delphinium. Methods Synonymous codon usage (SCU) and related indices of 51 coding sequences from the D. grandiflorum chloroplast (cp) genome were calculated using CodonW, CUSP of EMBOSS, SPSS, and Microsoft Excel. Multivariate statistical analysis combining principal component analysis (PCA), correspondence analysis (COA), PR2-plot mapping analysis, and ENC plot analysis was then conducted to explore the factors affecting the usage of synonymous codons. Results The SCU bias of D. grandiflorum was weak, and A/T-ending codons were preferred. PR2-plot mapping analysis revealed an SCU imbalance between A/T and G/C at the third base position. A total of eight codons were identified as optimal codons. The PCA and COA results indicated that base composition (GC content, GC3 content) and gene expression were important for SCU bias. A majority of genes were distributed below the expected curve in the ENC plot analysis and above the standard curve in the neutrality plot analysis. Our results showed that, although mutation pressure had notable effects, the majority of genetic evolution in the D. grandiflorum cp genome might be driven by natural selection. Discussions Our results provide a theoretical foundation for elucidating the genetic architecture and mechanisms of D. grandiflorum, and contribute to enriching D. grandiflorum genetic resources.
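A PR2 plot of the kind described in this abstract conventionally places each gene at G3/(G3+C3) on the x-axis and A3/(A3+T3) on the y-axis, counted at the third position of fourfold degenerate codons; the point (0.5, 0.5) corresponds to no bias between complementary bases. The following is a minimal sketch under those assumptions, not the authors' procedure. The function name `pr2_point`, the choice of fourfold codon families, and the toy sequence are illustrative.

```python
from collections import Counter
from typing import Optional, Tuple

# First two bases of codon families that are fourfold degenerate in the
# standard genetic code: Ala, Gly, Pro, Thr, Val, and the fourfold blocks
# of Leu (CTN), Arg (CGN), and Ser (TCN).
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CT", "CG", "TC"}


def pr2_point(cds: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (G3/(G3+C3), A3/(A3+T3)) for one coding sequence.

    A coordinate is None when the sequence has no informative third positions
    for that pair of bases.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third = Counter()
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            third[codon[2]] += 1

    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    x = third["G"] / gc if gc else None
    y = third["A"] / at if at else None
    return x, y
```

For the toy sequence `GCAGCAGCTGGGGGC`, the third positions are A, A, T, G, C, so the point is (0.5, 2/3): balanced G versus C, and A favored over T.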
Affiliation(s)
- Huirong Duan: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Qian Zhang: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Chunmei Wang: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Fang Li: Institute of Grassland Science, Chinese Academy of Agricultural Sciences, Hohhot, China
- Fuping Tian: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Yuan Lu: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Yu Hu: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Hongshan Yang: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
- Guangxin Cui: Chinese Academy of Agricultural Sciences Lanzhou Institute of Husbandry and Pharmaceutical Science, Lanzhou, China
|
12
|
Dong S, Zhang L, Pang W, Zhang Y, Wang C, Li Z, Ma L, Tang W, Yang G, Song H. Comprehensive analysis of coding sequence architecture features and gene expression in Arachis duranensis. PHYSIOLOGY AND MOLECULAR BIOLOGY OF PLANTS : AN INTERNATIONAL JOURNAL OF FUNCTIONAL PLANT BIOLOGY 2021; 27:213-222. [PMID: 33707864 PMCID: PMC7907404 DOI: 10.1007/s12298-021-00938-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2020] [Revised: 01/04/2021] [Accepted: 01/20/2021] [Indexed: 06/09/2023]
Abstract
Coding sequence (CDS) architecture affects gene expression levels in organisms. Codon optimization can increase the gene expression level. Therefore, understanding codon usage patterns has important implications for research on genetic engineering and exogenous gene expression. To date, the codon usage patterns of many model plants have been analyzed. However, the relationship between CDS architecture and gene expression in Arachis duranensis remains poorly understood. According to the results of genome sequencing, A. duranensis has many resistance genes that can be used to improve the cultivated peanut. In this study, bioinformatic approaches were used to estimate A. duranensis CDS architectures, including frequency of the optimal codon (Fop), polypeptide length and GC contents at the first (GC1), second (GC2) and third (GC3) codon positions. In addition, Arachis RNA-seq datasets were downloaded from PeanutBase. The relationships between gene expression and CDS architecture were assessed under normal growth as well as under nematode and drought stress conditions. A total of 26 codons with high frequency were identified, which preferentially ended with A or T in A. duranensis CDSs under the above-mentioned three conditions. A similar CDS architecture was found in differentially expressed genes (DEGs) under nematode and drought stresses. The GC1 content differed between DEGs and non-differentially expressed genes (NDEGs) under both drought and nematode stresses. The expression levels of DEGs were affected by different CDS architectures compared with NDEGs under drought stress. In addition, no correlation was found between differential gene expression and CDS architecture under either nematode or drought stress. These results aid the understanding of gene expression in A. duranensis.
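The position-specific GC contents named in this abstract (GC1, GC2, GC3) can be computed directly from a coding sequence. The sketch below is a minimal illustration with a hypothetical function name; it assumes an in-frame CDS, ignores any trailing partial codon, and does not exclude the stop codon, which a real pipeline might handle differently.

```python
def gc_by_position(cds: str) -> tuple:
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon, if any
    if usable == 0:
        raise ValueError("sequence must contain at least one complete codon")
    fractions = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every base at this codon position
        gc_count = sum(base in "GC" for base in bases)
        fractions.append(gc_count / len(bases))
    return tuple(fractions)


if __name__ == "__main__":
    # Toy sequence with codons ATG GCC AAA TAA.
    print(gc_by_position("ATGGCCAAATAA"))
```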
Affiliation(s)
- Shuwei Dong: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Long Zhang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Wenhui Pang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Yongli Zhang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Chang Wang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Zhenyi Li: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Lichao Ma: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Wei Tang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Guofeng Yang: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Hui Song: Grassland Agri-Husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
|
13
|
Demographic history and adaptive synonymous and nonsynonymous variants of nuclear genes in Rhododendron oldhamii (Ericaceae). Sci Rep 2020; 10:16658. [PMID: 33028947 PMCID: PMC7542430 DOI: 10.1038/s41598-020-73748-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2020] [Accepted: 09/22/2020] [Indexed: 11/23/2022]
Abstract
Demographic events are important in shaping population genetic structure, and exon variation can play roles in adaptive divergence. Twelve nuclear genes were used to investigate the species-level phylogeography of Rhododendron oldhamii, to test whether the average GC content of coding sites and of third codon positions differs from that of surrounding non-coding regions, and to test for exon variants associated with environmental variables. Spatial expansion was suggested by the R2 index of the aligned intron sequences of all genes of the regional samples and by the sum of squared deviations statistic of the aligned intron sequences of all genes individually and of all genes of the regional and pooled samples. The level of genetic differentiation was significantly different between regional samples. Across 94 sequences of the 12 genes, average GC contents at third codon positions of coding sequences were found to be both significantly lower and significantly higher than those of surrounding non-coding regions. We found seven exon variants strongly associated with environmental variables. Our results demonstrated spatial expansion of R. oldhamii in the late Pleistocene, and the optimal third codon position could end in A or T rather than G or C as frequent alleles and could have been important for adaptive divergence in R. oldhamii.
|
14
|
Da Ines O, Michard R, Fayos I, Bastianelli G, Nicolas A, Guiderdoni E, White C, Sourdille P. Bread wheat TaSPO11-1 exhibits evolutionarily conserved function in meiotic recombination across distant plant species. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 103:2052-2068. [PMID: 32559326 DOI: 10.1111/tpj.14882] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/12/2019] [Accepted: 05/29/2020] [Indexed: 05/24/2023]
Abstract
The manipulation of meiotic recombination in crops is essential to develop new plant varieties rapidly, helping to produce more cultivars in a sustainable manner. One option is to control the formation and repair of the meiosis-specific DNA double-strand breaks (DSBs) that initiate recombination between the homologous chromosomes and ultimately lead to crossovers. These DSBs are introduced by the evolutionarily conserved topoisomerase-like protein SPO11 and associated proteins. Here, we characterized the homoeologous copies of the SPO11-1 protein in hexaploid bread wheat (Triticum aestivum). The genome contains three SPO11-1 gene copies that exhibit 93-95% identity at the nucleotide level, and clearly the A and D copies originated from the diploid ancestors Triticum urartu and Aegilops tauschii, respectively. Furthermore, phylogenetic analysis of 105 plant genomes revealed a clear partitioning between monocots and dicots, with the seven main motifs being almost fully conserved, even between clades. The functional similarity of the proteins among monocots was confirmed through complementation analysis of the Oryza sativa (rice) spo11-1 mutant by the wheat TaSPO11-1-5D coding sequence. Also, remarkably, although the wheat and Arabidopsis SPO11-1 proteins share only 55% identity and the partner proteins also differ, the TaSPO11-1-5D cDNA significantly restored the fertility of the Arabidopsis spo11-1 mutant, indicating a robust functional conservation of the SPO11-1 protein activity across distant plants. These successful heterologous complementation assays, using both Arabidopsis and rice hosts, are good surrogates to validate the functionality of candidate genes and cDNA, as well as variant constructs, given that transformation and mutant production in wheat are much longer and more tedious.
Affiliation(s)
- Olivier Da Ines: Université Clermont Auvergne, CNRS, Inserm, GReD, Clermont-Ferrand, F-63000, France
- Robin Michard: Université Clermont-Auvergne (UCA), INRAE, UMR1095 - Genetics, Diversity & Ecophysiology of Cereals, Clermont-Ferrand, 63000, France; Meiogenix, 27 rue du Chemin Vert, Paris, 75011, France
- Ian Fayos: Meiogenix, 27 rue du Chemin Vert, Paris, 75011, France; UMR AGAP, CIRAD, Montpellier Cedex 5, 34398, France; Université de Montpellier, CIRAD, INRAE, Montpellier SupAgro, Montpellier, 34398, France
- Alain Nicolas: Meiogenix, 27 rue du Chemin Vert, Paris, 75011, France; Institut Curie, Centre de recherche, CNRS UMR 3244, PSL University, 26 rue d'Ulm, Paris Cedex 05, 75248, France
- Emmanuel Guiderdoni: UMR AGAP, CIRAD, Montpellier Cedex 5, 34398, France; Université de Montpellier, CIRAD, INRAE, Montpellier SupAgro, Montpellier, 34398, France
- Charles White: Université Clermont Auvergne, CNRS, Inserm, GReD, Clermont-Ferrand, F-63000, France
- Pierre Sourdille: Université Clermont-Auvergne (UCA), INRAE, UMR1095 - Genetics, Diversity & Ecophysiology of Cereals, Clermont-Ferrand, 63000, France
|
15
|
Chu D, Wei L. Genome-wide analysis on the maize genome reveals weak selection on synonymous mutations. BMC Genomics 2020; 21:333. [PMID: 32349669 PMCID: PMC7190201 DOI: 10.1186/s12864-020-6745-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/19/2019] [Accepted: 04/21/2020] [Indexed: 02/06/2023]
Abstract
Background: Synonymous mutations are able to change the tAI (tRNA adaptation index) of a codon and consequently affect the local translation rate. Intuitively, one may hypothesize that those synonymous mutations which increase the tAI values are favored by natural selection. Results: We use the maize (Zea mays) genome to test our assumption. The first supporting evidence is that the tAI-increasing synonymous mutations have higher fixed-to-polymorphic ratios than the tAI-decreasing ones. Next, the DAF (derived allele frequency) or MAF (minor allele frequency) of the former is significantly higher than the latter. Moreover, similar results are obtained when we investigate CAI (codon adaptation index) instead of tAI. Conclusion: The synonymous mutations in the maize genome are not strictly neutral. The tAI-increasing mutations are positively selected while those tAI-decreasing ones undergo purifying selection. This selection force might be weak but should not be automatically ignored.
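The fixed-to-polymorphic comparison described in the Results can be illustrated with a small calculation. The sketch below is only an assumed reading of that comparison: the function names are hypothetical and the counts are invented placeholders, not data from the study. A ratio-of-ratios above 1 would be consistent with tAI-increasing mutations reaching fixation relatively more often.

```python
def fixed_to_polymorphic(fixed: int, polymorphic: int) -> float:
    """Ratio of fixed differences to polymorphic sites for one mutation class."""
    if polymorphic <= 0:
        raise ValueError("polymorphic count must be positive")
    return fixed / polymorphic


def ratio_of_ratios(inc_fixed: int, inc_poly: int,
                    dec_fixed: int, dec_poly: int) -> float:
    """Compare tAI-increasing against tAI-decreasing synonymous mutations."""
    increasing = fixed_to_polymorphic(inc_fixed, inc_poly)
    decreasing = fixed_to_polymorphic(dec_fixed, dec_poly)
    if decreasing == 0:
        raise ValueError("tAI-decreasing class has no fixed mutations")
    return increasing / decreasing


if __name__ == "__main__":
    # Placeholder counts chosen only to show the arithmetic.
    print(ratio_of_ratios(30, 100, 20, 100))
```

In practice a significance test on the underlying 2x2 table would accompany this ratio; that step is omitted here.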
Affiliation(s)
- Duan Chu: College of Life Sciences, Beijing Normal University, No. 19 Xinjiekouwai Street, Haidian District, Beijing, China
- Lai Wei: College of Life Sciences, Beijing Normal University, No. 19 Xinjiekouwai Street, Haidian District, Beijing, China
|
16
|
Mass-spectrometry-based draft of the Arabidopsis proteome. Nature 2020; 579:409-414. [PMID: 32188942 DOI: 10.1038/s41586-020-2094-2] [Citation(s) in RCA: 261] [Impact Index Per Article: 65.3] [Received: 06/04/2019] [Accepted: 01/17/2020] [Indexed: 01/05/2023]
Abstract
Plants are essential for life and are extremely diverse organisms with unique molecular capabilities. Here we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. Our analysis provides initial answers to how many genes exist as proteins (more than 18,000), where they are expressed, in which approximate quantities (a dynamic range of more than six orders of magnitude) and to what extent they are phosphorylated (over 43,000 sites). We present examples of how the data may be used, such as to discover proteins that are translated from short open-reading frames, to uncover sequence motifs that are involved in the regulation of protein production, and to identify tissue-specific protein complexes or phosphorylation-mediated signalling events. Interactive access to this resource for the plant community is provided by the ProteomicsDB and ATHENA databases, which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interactions.
|
17
|
Walsh JR, Woodhouse MR, Andorf CM, Sen TZ. Tissue-specific gene expression and protein abundance patterns are associated with fractionation bias in maize. BMC PLANT BIOLOGY 2020; 20:4. [PMID: 31900107 PMCID: PMC6942271 DOI: 10.1186/s12870-019-2218-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/09/2019] [Accepted: 12/24/2019] [Indexed: 05/26/2023]
Abstract
BACKGROUND: Maize experienced a whole-genome duplication event approximately 5 to 12 million years ago. Because this event occurred after speciation from sorghum, the pre-duplication subgenomes can be partially reconstructed by mapping syntenic regions to the sorghum chromosomes. During evolution, maize has had uneven gene loss between each ancient subgenome. Fractionation and divergence between these genomes continue today, constantly changing genetic make-up and phenotypes and influencing agronomic traits. RESULTS: Here we regenerate the subgenome reconstructions for the most recent maize reference genome assembly. Based on both expression and abundance data for homeologous gene pairs across multiple tissues, we observed functional divergence of genes across subgenomes. Although the genes in the larger maize subgenome are often expressed more highly than their homeologs in the smaller subgenome, we observed cases where homeolog expression dominance switches in different tissues. We demonstrate for the first time that protein abundances are higher in the larger subgenome, but they also show tissue-specific dominance, a pattern similar to RNA expression dominance. We also find that pollen expression is uniquely decoupled from protein abundance. CONCLUSION: Our study shows that the larger subgenome has a greater range of functional assignments and that there is less overlap between the subgenomes in terms of gene functions than would be suggested by similar patterns of gene expression and protein abundance. Our study also revealed that some reactions are catalyzed uniquely by the larger and smaller subgenomes. The tissue-specific, nonequivalent expression-level dominance pattern observed here implies a change in regulatory control which favors differentiated selective pressure on the retained duplicates, leading to eventual change in gene functions.
Affiliation(s)
- Jesse R Walsh: U.S. Department of Agriculture, Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA
- Margaret R Woodhouse: Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA; U.S. Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA, 94710, USA
- Carson M Andorf: U.S. Department of Agriculture, Agricultural Research Service, Corn Insects and Crop Genetics Research Unit, Ames, IA, 50011, USA; Department of Computer Science, Iowa State University, Ames, IA, 50011, USA
- Taner Z Sen: U.S. Department of Agriculture, Agricultural Research Service, Western Regional Research Center, Crop Improvement and Genetics Research Unit, Albany, CA, 94710, USA; Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
|
18
|
Miller JB, Pickett BD, Ridge PG. JustOrthologs: a fast, accurate and user-friendly ortholog identification algorithm. Bioinformatics 2019; 35:546-552. [PMID: 30084941 PMCID: PMC6378933 DOI: 10.1093/bioinformatics/bty669] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/23/2018] [Revised: 07/11/2018] [Accepted: 07/31/2018] [Indexed: 11/13/2022]
Abstract
Motivation: Orthologous gene identification is fundamental to all aspects of biology. For example, ortholog identification between species can provide functional insights for genes of unknown function and is a necessary step in phylogenetic inference. Currently, most ortholog identification algorithms require all-versus-all BLAST comparisons, which are time-consuming and memory intensive. Results: In contrast to existing approaches, JustOrthologs exploits the conservation of gene structure by using the lengths of coding sequence regions and dinucleotide percentages to identify orthologs. In comparison to OrthoMCL, OMA and OrthoFinder, JustOrthologs decreases ortholog identification runtime by more than 96% and achieves comparable precision and recall scores. The computational speedup allowed us to conduct pairwise comparisons of 1197 complete genomes (780 eukaryotes and 417 archaea). We confirmed gene annotations for 384 120 genes, grouped 1 675 415 genes in previously unreported ortholog groups, and identified 51 429 potentially mislabeled genes across 622 843 ortholog groups. Availability and implementation: JustOrthologs is an open source collaborative software package available in the GitHub repository: https://github.com/ridgelab/JustOrthologs/. All test FASTA files used for comparisons are freely available at https://github.com/ridgelab/JustOrthologs/comparisonFastaFiles/. Reference genomes used in this work are available for download from the NCBI repository: ftp://ftp.ncbi.nih.gov/genomes/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Justin B Miller: Department of Biology, Brigham Young University, Provo, UT, USA
- Perry G Ridge: Department of Biology, Brigham Young University, Provo, UT, USA
|
19
|
Synonymous Codon Usages as an Evolutionary Dynamic for Chlamydiaceae. Int J Mol Sci 2018; 19:ijms19124010. [PMID: 30545112 PMCID: PMC6321445 DOI: 10.3390/ijms19124010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/04/2018] [Revised: 12/06/2018] [Accepted: 12/10/2018] [Indexed: 01/08/2023]
Abstract
The family of Chlamydiaceae contains a group of obligate intracellular bacteria that can infect a wide range of hosts. The evolutionary trend of members in this family is a hot topic, which benefits our understanding of the cross-infection of these pathogens. In this study, 14 whole genomes of 12 Chlamydia species were used to investigate the nucleotide, codon, and amino acid usage bias by synonymous codon usage value and information entropy method. The results showed that all the studied Chlamydia spp. had A/T rich genes with over-represented A or T at the third positions and G or C under-represented at these positions, suggesting that nucleotide usages influenced synonymous codon usages. The overall codon usage trend from synonymous codon usage variations divides the Chlamydia spp. into four separate clusters, while amino acid usage divides the Chlamydia spp. into two clusters with some exceptions, which reflected the genetic diversity of the Chlamydiaceae family members. The overall codon usage pattern represented by the effective number of codons (ENC) was significantly positively correlated to gene GC3 content. A negative correlation exists between ENC and the codon adaptation index for some Chlamydia species. These results suggested that mutation pressure caused by nucleotide composition constraint played an important role in shaping synonymous codon usage patterns. Furthermore, codon usage of T3ss and Pmps gene families adapted to that of the corresponding genome. Taken together, these analyses help our understanding of evolutionary interactions between nucleotide, synonymous codon, and amino acid usages in genes of Chlamydiaceae family members.
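Synonymous codon usage values of the kind discussed here are commonly summarised as relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is a minimal illustration under that standard definition; it is not the study's own pipeline, the function name is hypothetical, and it assumes the standard codon assignments.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue  # stop codons, Met and Trp carry no synonymous choice
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    print(rscu(["GCTGCTGCTGCA"]))  # alanine only: GCT over-used, GCC/GCG unused
```

Values above 1 indicate a codon used more often than expected under equal usage, and values below 1 indicate under-representation.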
|
20
|
Camiolo S, Toome-Heller M, Aime MC, Haridas S, Grigoriev IV, Porceddu A, Mannazzu I. An analysis of codon bias in six red yeast species. Yeast 2018; 36:53-64. [PMID: 30264407 DOI: 10.1002/yea.3359] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/08/2018] [Revised: 09/10/2018] [Accepted: 09/23/2018] [Indexed: 11/11/2022]
Abstract
Red yeasts, primarily species of Rhodotorula, Sporobolomyces, and other genera of Pucciniomycotina, are traditionally considered proficient systems for lipid and terpene production, and only recently have also gained consideration for the production of a wider range of molecules of biotechnological potential. Improvements of transgene delivery protocols and regulated gene expression systems have been proposed, but a dearth of information on compositional and/or structural features of genes has prevented transgene sequence optimization efforts for high expression levels. Here, the codon compositional features of genes in six red yeast species were characterized, and the impact that evolutionary forces may have played in shaping this compositional bias was dissected by using several computational approaches. Results obtained are compatible with the hypothesis that mutational bias, although playing a significant role, cannot alone explain synonymous codon usage bias of genes. Nevertheless, several lines of evidence indicated a role for translational selection in driving the synonymous codons that allow high expression efficiency. These optimal synonymous codons are identified for each of the six species analyzed. Moreover, the presence of intragenic patterns of codon usage, which are thought to facilitate polyribosome formation, was highlighted. The information presented should be taken into consideration for transgene design for optimal expression in red yeast species.
Affiliation(s)
- Salvatore Camiolo: Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
- Merje Toome-Heller: Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana, USA
- M Catherine Aime: Department of Botany and Plant Pathology, Purdue University, West Lafayette, Indiana, USA
- Sajeet Haridas: US Department of Energy Joint Genome Institute, Walnut Creek, California, USA
- Igor V Grigoriev: US Department of Energy Joint Genome Institute, Walnut Creek, California, USA
- Andrea Porceddu: Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
- Ilaria Mannazzu: Dipartimento di Agraria, Università degli Studi di Sassari, Sassari, Italy
|
21
|
Abstract
Genome and transcript sequences are composed of long strings of nucleotide monomers (A, C, G, and T/U) that require different quantities of nitrogen atoms for biosynthesis. Here, it is shown that the strength of selection acting on transcript nitrogen content is influenced by the amount of nitrogen plants require to conduct photosynthesis. Specifically, plants that require more nitrogen to conduct photosynthesis experience stronger selection on transcript sequences to use synonymous codons that cost less nitrogen to biosynthesize. It is further shown that the strength of selection acting on transcript nitrogen cost constrains molecular sequence evolution such that genes experiencing stronger selection evolve at a slower rate. Together these findings reveal that the plant molecular clock is set by photosynthetic efficiency, and provide a mechanistic explanation for changes in plant speciation rates that occur concomitant with improvements in photosynthetic efficiency and changes in the environment such as light, temperature, and atmospheric CO2 concentration.
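The premise that synonymous codons differ in nitrogen cost follows from the number of nitrogen atoms in each base: adenine and guanine contain five, cytosine three, and thymine or uracil two. The sketch below simply sums those atoms over a sequence; the function name is hypothetical, and the study's own cost model may account for more than the bases of a single strand.

```python
# Nitrogen atoms per nucleobase (A and G: 5, C: 3, T and U: 2).
NITROGEN_ATOMS = {"A": 5, "C": 3, "G": 5, "T": 2, "U": 2}


def nitrogen_cost(sequence: str) -> int:
    """Total nitrogen atoms in the bases of a single-stranded sequence."""
    try:
        return sum(NITROGEN_ATOMS[base] for base in sequence.upper())
    except KeyError as err:
        raise ValueError(f"unsupported base: {err.args[0]}") from None


if __name__ == "__main__":
    # The four glycine codons encode the same residue at different nitrogen cost.
    for codon in ("GGU", "GGC", "GGA", "GGG"):
        print(codon, nitrogen_cost(codon))
```

For glycine, GGU costs 12 nitrogen atoms while GGG costs 15, so a choice among synonymous codons changes the nitrogen content of a transcript without changing the protein.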
Affiliation(s)
- Steven Kelly: Department of Plant Sciences, University of Oxford, Oxford, United Kingdom
|
22
|
Paul P, Malakar AK, Chakraborty S. Compositional bias coupled with selection and mutation pressure drives codon usage in Brassica campestris genes. Food Sci Biotechnol 2017; 27:725-733. [PMID: 30263798 DOI: 10.1007/s10068-017-0285-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 07/29/2017] [Revised: 11/28/2017] [Accepted: 12/03/2017] [Indexed: 11/25/2022]
Abstract
The plant Brassica campestris includes the vegetables turnip and Chinese cabbage, plants of economic importance. Here, we have analysed the codon usage bias of B. campestris for 116 protein coding genes. Neutrality analysis showed that B. campestris had a wide range of GC3s, and a significant correlation was observed between GC12 and GC3. The Nc versus GC3s plot showed a few genes on or proximate to the expected curve, but the majority of points were found to be scattered distantly from the expected curve. Correspondence analysis on codon usage revealed that the position preference of codons on multidimensional space totally depends on the presence of A and T at the synonymous third codon position. These results altogether suggest that composition bias along with selection (major) and mutation pressure (minor) affects the codon usage pattern of the protein coding genes in Brassica campestris.
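The neutrality analysis mentioned here is usually carried out by regressing GC12 (mean GC content of first and second codon positions) on GC3 across genes. The sketch below fits that line by ordinary least squares; the function name and the example values are illustrative assumptions, not data from the study. By the usual reading, a slope near 1 points to mutation pressure acting alike on all positions, whereas a slope near 0 suggests that selection constrains the first two positions.

```python
def neutrality_fit(gc3, gc12):
    """Least-squares slope and intercept of GC12 regressed on GC3."""
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equally long lists with at least two genes")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Invented per-gene values used only to show the calculation.
    slope, intercept = neutrality_fit([0.2, 0.4, 0.6], [0.40, 0.42, 0.44])
    print(round(slope, 3), round(intercept, 3))
```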
Affiliation(s)
- Prosenjit Paul: Department of Biotechnology, Assam University, Silchar, Assam 788011 India
- Arup Kumar Malakar: Department of Biotechnology, Assam University, Silchar, Assam 788011 India
|
23
|
Mazumdar P, Binti Othman R, Mebus K, Ramakrishnan N, Ann Harikrishna J. Codon usage and codon pair patterns in non-grass monocot genomes. ANNALS OF BOTANY 2017; 120:893-909. [PMID: 29155926 PMCID: PMC5710610 DOI: 10.1093/aob/mcx112] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 01/17/2017] [Accepted: 09/19/2017] [Indexed: 05/19/2023]
Abstract
BACKGROUND AND AIMS: Studies on codon usage in monocots have focused on grasses, and observed patterns of this taxon were generalized to all monocot species. Here, non-grass monocot species were analysed to investigate the differences between grass and non-grass monocots. METHODS: First, studies of codon usage in monocots were reviewed. The current information was then extended regarding codon usage, as well as codon-pair context bias, using four completely sequenced non-grass monocot genomes (Musa acuminata, Musa balbisiana, Phoenix dactylifera and Spirodela polyrhiza) for which comparable transcriptome datasets are available. Measurements were taken regarding relative synonymous codon usage, effective number of codons, derived optimal codon and GC content and then the relationships investigated to infer the underlying evolutionary forces. KEY RESULTS: The research identified optimal codons, rare codons and preferred codon-pair context in the non-grass monocot species studied. In contrast to the bimodal distribution of GC3 (GC content in third codon position) in grasses, non-grass monocots showed a unimodal distribution. Disproportionate use of G and C (and of A and T) in two- and four-codon amino acids detected in the analysis rules out the mutational bias hypothesis as an explanation of genomic variation in GC content. There was found to be a positive relationship between CAI (codon adaptation index; predicts the level of expression of a gene) and GC3. In addition, a strong correlation was observed between coding and genomic GC content and negative correlation of GC3 with gene length, indicating a strong impact of GC-biased gene conversion (gBGC) in shaping codon usage and nucleotide composition in non-grass monocots. CONCLUSION: Optimal codons in these non-grass monocots show a preference for G/C in the third codon position. These results support the concept that codon usage and nucleotide composition in non-grass monocots are mainly driven by gBGC.
Affiliation(s)
- Purabi Mazumdar: Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- RofinaYasmin Binti Othman: Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia; Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
- Katharina Mebus: Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia
- N Ramakrishnan: Electrical and Computer System Engineering, School of Engineering, Monash University Malaysia, Bandar Sunway, Malaysia
- Jennifer Ann Harikrishna (for correspondence): Centre for Research in Biotechnology for Agriculture, University of Malaya, Kuala Lumpur, Malaysia; Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur, Malaysia
|
24
|
Song H, Gao H, Liu J, Tian P, Nan Z. Comprehensive analysis of correlations among codon usage bias, gene expression, and substitution rate in Arachis duranensis and Arachis ipaënsis orthologs. Sci Rep 2017; 7:14853. [PMID: 29093502 PMCID: PMC5665869 DOI: 10.1038/s41598-017-13981-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/08/2017] [Accepted: 10/04/2017] [Indexed: 11/22/2022]
Abstract
The relationship between evolutionary rates and gene expression in model plant orthologs is well documented. However, little is known about the relationships between gene expression and evolutionary trends in Arachis orthologs. We identified 7,435 one-to-one orthologs, including 925 single-copy and 6,510 multiple-copy sequences in Arachis duranensis and Arachis ipaënsis. Codon usage was stronger for shorter polypeptides, which were encoded by codons with higher GC contents. Highly expressed coding sequences had higher codon usage bias, GC content, and expression breadth. Additionally, expression breadth was positively correlated with polypeptide length, but there was no correlation between gene expression and polypeptide length. Inferred selective pressure was also negatively correlated with both gene expression and expression breadth in all one-to-one orthologs, while positively but non-significantly correlated with gene expression in sequences with signatures of positive selection. Gene expression levels and expression breadth were significantly higher for single-copy genes than for multiple-copy genes. Similarly, the gene expression and expression breadth in sequences with signatures of purifying selection were higher than those of sequences with positive selective signatures. These results indicated that gene expression differed between single-copy and multiple-copy genes as well as sequences with signatures of positive and purifying selection.
Affiliation(s)
- Hui Song
  - State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
- Hongjuan Gao
  - State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
- Jing Liu
  - State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
- Pei Tian
  - State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
- Zhibiao Nan
  - State Key Laboratory of Grassland Agro-ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730000, China
25
Miller JB, Hippen AA, Belyeu JR, Whiting MF, Ridge PG. Missing something? Codon aversion as a new character system in phylogenetics. Cladistics 2017; 33:545-556. [PMID: 34706488 DOI: 10.1111/cla.12183] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Accepted: 10/24/2016] [Indexed: 01/02/2023]
Abstract
Although many studies have documented codon usage bias in different species, the importance of codon usage in a phylogenetic framework remains largely unknown. We demonstrate that a phylogenetic signal is present in the codon usage and non-usage biases of 17 717 orthologues evaluated across 72 tetrapod species using a simple parsimony analysis of a binary matrix of codon characters. Phylogenies estimated using stop codons were more congruent with previous hypotheses than phylogenies based on any other single codon or a combination of codons. Although each codon is present in every species, specific genes have different codon preferences and may or may not use every possible codon. This observation allowed us to map the pattern of codon usage and non-usage across the topology. These results suggest that codon usage is phylogenetically conserved across shallow and deep levels within tetrapods.
Affiliation(s)
- Justin B Miller
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Ariel A Hippen
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Jonathon R Belyeu
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Michael F Whiting
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
  - M.L. Bean Museum, Brigham Young University, Provo, UT, 84602, USA
- Perry G Ridge
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
26
Abstract
Mistranslation errors compromise fitness by wasting resources on nonfunctional proteins. In order to reduce the cost of mistranslations, natural selection chooses the most accurately translated codons at sites that are particularly important for protein structure and function. We investigated the determinants underlying selection for translational accuracy in several species of plants belonging to three clades: Brassicaceae, Fabidae, and Poaceae. Although signatures of translational selection were found in genes from a wide range of species, the underlying factors varied in nature and intensity. Indeed, the degree of synonymous codon bias at evolutionarily conserved sites varied among plant clades while remaining uniform within each clade. This is unlikely to solely reflect the diversity of tRNA pools because there is little correlation between synonymous codon bias and tRNA abundance, so other factors must affect codon choice and translational accuracy in plant genes. Accordingly, synonymous codon choice at a given site was affected not only by the selection pressure at that site, but also its participation in protein domains or mRNA secondary structures. Although these effects were detected in all the species we analyzed, their impact on translation accuracy was distinct in evolutionarily distant plant clades. The domain effect was found to enhance translational accuracy in dicot and monocot genes with a high GC content, but to oppose the selection of more accurate codons in monocot genes with a low GC content.
27
Szövényi P, Ullrich KK, Rensing SA, Lang D, van Gessel N, Stenøien HK, Conti E, Reski R. Selfing in Haploid Plants and Efficacy of Selection: Codon Usage Bias in the Model Moss Physcomitrella patens. Genome Biol Evol 2017; 9:1528-1546. [PMID: 28549175 PMCID: PMC5507605 DOI: 10.1093/gbe/evx098] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Accepted: 05/25/2017] [Indexed: 12/15/2022]
Abstract
A long-term reduction in effective population size will lead to a major shift in genome evolution. In particular, when effective population size is small, genetic drift becomes dominant over natural selection. The onset of self-fertilization is one evolutionary event considerably reducing the effective size of populations. Theory predicts that this reduction should be more dramatic in organisms capable of haploid than of diploid selfing. Although theoretically well-grounded, this assertion has received mixed experimental support. Here, we test this hypothesis by analyzing synonymous codon usage bias of genes in the model moss Physcomitrella patens, which frequently undergoes haploid selfing. In line with population genetic theory, we found that the effect of natural selection on synonymous codon usage bias is very weak. Our conclusion is supported by four independent lines of evidence: 1) very weak or nonsignificant correlation between gene expression and codon usage bias, 2) no increased codon usage bias in more broadly expressed genes, 3) no evidence that codon usage bias would constrain synonymous and nonsynonymous divergence, and 4) a predominant role of genetic drift in synonymous codon usage predicted by a model-based analysis. These findings show striking similarity to those observed in AT-rich genomes with weak selection for optimal codon usage and GC content overall. Our finding is in contrast to a previous study reporting adaptive codon usage bias in the moss P. patens.
Affiliation(s)
- Péter Szövényi
  - Department of Systematic and Evolutionary Botany, University of Zurich, Switzerland
- Kristian K. Ullrich
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Germany
  - Present address: Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
- Stefan A. Rensing
  - Plant Cell Biology, Faculty of Biology, University of Marburg, Germany
  - BIOSS—Centre for Biological Signalling Studies, University of Freiburg, Germany
- Daniel Lang
  - Plant Genome and Systems Biology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
- Nico van Gessel
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Germany
- Elena Conti
  - Department of Systematic and Evolutionary Botany, University of Zurich, Switzerland
- Ralf Reski
  - BIOSS—Centre for Biological Signalling Studies, University of Freiburg, Germany
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Germany