1
Hu X, Li Y, Meng F, Duan Y, Sun M, Yang S, Liu H. Analysis of chloroplast genome characteristics and codon usage bias in 14 species of Annonaceae. Funct Integr Genomics 2024; 24:109. [PMID: 38797780 DOI: 10.1007/s10142-024-01389-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 05/18/2024] [Accepted: 05/21/2024] [Indexed: 05/29/2024]
Abstract
The chloroplast genome is an invaluable resource for studying species evolution, chloroplast gene expression, and transformation. Codon usage bias (CUB) analysis is used to improve gene expression in genetic transformation and to investigate evolutionary relationships. In this study, we analysed chloroplast genome differences, codon usage patterns and the sources of variation in CUB in 14 Annonaceae species using bioinformatics tools. Gene sizes and numbers varied significantly among the 14 species, but overall conservation was maintained. Notably, the IR/SC boundaries and the types of SSRs differed among the 14 species. Mononucleotide repeats were the most common SSR type, with A/T repeats more prevalent than G/C repeats. Among the different types of repeats, forward and palindromic repeats were the most abundant, followed by reverse repeats, and complement repeats were relatively rare. Codon composition analysis revealed that GC content was below 50% in all 14 species. In addition, the protein-coding sequences of chloroplast genes tended to end with A/T at the third codon position. Among these species, 21 codons exhibited bias (RSCU > 1), and 8 high-frequency (HF) codons and 5 optimal codons were identical across the species. According to the ENC-plot and neutrality plot analyses, natural selection had less impact on the CUB of A. muricata and A. reticulata. The PR2-plot showed that G was more frequent than C, and T was more frequent than A. Correspondence analysis (COA) revealed that codon usage patterns differed among the Annonaceae species.
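To make the RSCU > 1 criterion in the abstract above concrete, here is a minimal sketch of a relative synonymous codon usage calculation. It assumes the standard genetic code and the usual definition (observed codon count divided by the mean count over that amino acid's synonymous codons). The function name `rscu` and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode; stop codons are skipped.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    result = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(codons)  # count per codon under uniform usage
        for c in codons:
            result[c] = counts[c] / expected
    return result


# Toy sequence (an assumption for illustration): four alanine codons.
values = rscu("GCTGCTGCTGCA")
biased = sorted(c for c, v in values.items() if v > 1)
print(biased)
```

In this toy case only GCT exceeds 1, because it accounts for three of the four alanine codons while uniform usage would give each synonymous codon one. Real analyses would run the same calculation over concatenated protein-coding sequences of each genome.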
Affiliation(s)
- Xiang Hu
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
- Yaqi Li
- Tropical and Subtropical Cash Crops Research Institute, Yunnan Academy of Agricultural Sciences, Baoshan, Yunnan, 678000, China
- Fuxuan Meng
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
- Yuanjie Duan
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
- Manying Sun
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
- Shiying Yang
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
- Haigang Liu
- Tropical Eco-agriculture Research Institute, Yunnan Academy of Agricultural Sciences, Yuanmou, Yunnan, 651300, China
2
Khandia R, Pandey MK, Garg R, Khan AA, Baklanov I, Alanazi AM, Nepali P, Gurjar P, Choudhary OP. Molecular insights into codon usage analysis of mitochondrial fission and fusion gene: relevance to neurodegenerative diseases. Ann Med Surg (Lond) 2024; 86:1416-1425. [PMID: 38463054 PMCID: PMC10923317 DOI: 10.1097/ms9.0000000000001725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 01/05/2024] [Indexed: 03/12/2024] Open
Abstract
Mitochondrial dysfunction is the leading cause of neurodegenerative disorders like Alzheimer's disease and Parkinson's disease. Mitochondria are highly dynamic organelles that continuously undergo fission and fusion to distribute components evenly and to maintain proper shape, number, and bioenergetic functionality. A set of genes governs the process of fission and fusion. OPA1, Mfn1, and Mfn2 govern fusion, while Drp1, Fis1, MIEF1, and MIEF2 genes control fission. Determination of specific molecular patterns of transcripts of these genes revealed the impact of compositional constraints on selecting optimal codons. AGA and CCA codons were over-represented, and CCC, GTC, TTC, GGG, and ACG were under-represented in the fusion gene set. In contrast, CTG was over-represented, and GCG, CCG, and TCG were under-represented in the fission gene set. Hydropathicity analysis revealed non-polar protein products of both fission and fusion gene set transcripts. AGA codon repeats are an integral part of translational regulation machinery and present a distinct pattern of over-representation and under-representation in different transcripts within the gene sets, suggestive of a selective translational force precisely controlling the occurrence of the codon. Five of the six synonymous codons encoding leucine were used differently in the two gene sets. Hence, the forces regulating the occurrence of AGA and the five synonymous leucine-encoding codons suggest translational selection. A correlation of mutational bias with gene expression, codon bias, GRAVY, and AROMA signifies the selection pressure in both gene sets, while the correlation of compositional bias with gene expression, codon bias, protein properties, and minimum free energy signifies the presence of compositional constraints. More than 25% of codons of both gene sets showed a significant difference in codon usage. The overall analysis sheds light on the molecular features of the gene sets involved in fission and fusion.
Affiliation(s)
- Megha Katare Pandey
- Translational Medicine Center, All India Institute of Medical Sciences, Bhopal
- Azmat Ali Khan
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Igor Baklanov
- Department of Philosophy, North Caucasus Federal University, Stavropol, Russia
- Amer M. Alanazi
- Pharmaceutical Biotechnology Laboratory, Department of Pharmaceutical Chemistry, College of Pharmacy, King Saud University, Riyadh, Saudi Arabia
- Prakash Nepali
- Government Medical Officer, Bhimad Primary Health Care Center, Government of Nepal, Tanahun, Nepal
- Pankaj Gurjar
- Centre for Global Health Research, Saveetha Medical College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
- Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
- Om Prakash Choudhary
- Department of Veterinary Anatomy, College of Veterinary Science, Guru Angad Dev Veterinary and Animal Sciences University (GADVASU), Rampura Phul, Bathinda, Punjab, India
3
Wang ZK, Liu Y, Zheng HY, Tang MQ, Xie SQ. Comparative Analysis of Codon Usage Patterns in Nuclear and Chloroplast Genome of Dalbergia (Fabaceae). Genes (Basel) 2023; 14:genes14051110. [PMID: 37239470 DOI: 10.3390/genes14051110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Revised: 05/04/2023] [Accepted: 05/16/2023] [Indexed: 05/28/2023] Open
Abstract
Dalbergia plants are widely distributed across more than 130 tropical and subtropical countries and have significant economic and medicinal value. Codon usage bias (CUB) is a critical feature for studying gene function and evolution, and it can provide a better understanding of biological gene regulation. In this study, we comprehensively analyzed the CUB patterns of the nuclear genome, chloroplast genome, and gene expression, as well as the systematic evolution of Dalbergia species. Our results showed that the synonymous and optimal codons in the coding regions of both the nuclear and chloroplast genomes of Dalbergia preferred ending with A/U at the third codon base. Natural selection was the primary factor affecting the CUB features. Furthermore, in highly expressed genes of Dalbergia odorifera, we found that genes with stronger CUB exhibited higher expression levels, and these highly expressed genes tended to favor the use of G/C-ending codons. In addition, the branching patterns of the protein-coding sequences and the chloroplast genome sequences were very similar in the systematic tree, and differed from the clustering based on the CUB of the chloroplast genome. This study highlights the CUB patterns and features of Dalbergia species in different genomes, explores the correlation between CUB preferences and gene expression, and further investigates the systematic evolution of Dalbergia, providing new insights into codon biology and the evolution of Dalbergia plants.
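The preference for A/U-ending or G/C-ending codons described in the abstract above is usually quantified from the third codon position. The following is a minimal sketch, assuming an in-frame coding sequence. The function name `third_position_stats` and the toy sequence are hypothetical, and the sketch counts all codons as a simplification (PR2 plots are commonly restricted to fourfold-degenerate codon families).

```python
from collections import Counter


def third_position_stats(cds):
    """Summarise base usage at the third codon position (hypothetical helper)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    thirds = Counter(cds[i + 2] for i in range(0, usable, 3))
    a, t, g, c = (thirds[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are left out

    def ratio(numerator, denominator):
        # Return None rather than dividing by zero on very short inputs.
        return numerator / denominator if denominator else None

    return {
        "GC3": ratio(g + c, total),         # GC content at position 3
        "G3/(G3+C3)": ratio(g, g + c),      # one PR2-plot axis
        "A3/(A3+T3)": ratio(a, a + t),      # the other PR2-plot axis
    }


# Toy sequence (an assumption for illustration): codons GCT, GCA, GCG, AAT.
print(third_position_stats("GCTGCAGCGAAT"))
```

A GC3 value below 0.5 indicates that A/T(U)-ending codons dominate. In a PR2 plot, values away from 0.5 on either axis indicate unequal use of G versus C or A versus T at the third position.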
Affiliation(s)
- Zu-Kai Wang
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, School of Forestry, Hainan University, Haikou 570228, China
- Yi Liu
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, School of Forestry, Hainan University, Haikou 570228, China
- Hao-Yue Zheng
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, School of Forestry, Hainan University, Haikou 570228, China
- Min-Qiang Tang
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, School of Forestry, Hainan University, Haikou 570228, China
- Shang-Qian Xie
- Key Laboratory of Genetics and Germplasm Innovation of Tropical Special Forest Trees and Ornamental Plants (Ministry of Education), Hainan Key Laboratory for Biology of Tropical Ornamental Plant Germplasm, School of Forestry, Hainan University, Haikou 570228, China
4
Yang S, Li G, Li H. Molecular characterizations of genes in chloroplast genomes of the genus Arachis L. (Fabaceae) based on the codon usage divergence. PLoS One 2023; 18:e0281843. [PMID: 36917565 PMCID: PMC10013919 DOI: 10.1371/journal.pone.0281843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/18/2022] [Accepted: 02/01/2023] [Indexed: 03/16/2023] Open
Abstract
Studies on the molecular characteristics of chloroplast genomes are generally important for clarifying the evolutionary processes of plant species. The base composition, the effective number of codons, the relative synonymous codon usage, the codon bias index, and their correlation coefficients were investigated for a total of 41 genes in 21 chloroplast genomes of the genus Arachis. Correspondence and clustering analyses based on these parameters revealed significantly higher variation in the genomes of wild species than in those of the cultivated taxa. The codon usage patterns of all 41 genes in the genus Arachis were AT-rich, suggesting that natural selection was the main factor affecting the evolutionary history of these genomes. Five genes (i.e., ndhC, petD, atpF, rpl14, and rps11) showed higher base usage divergences, and five genes (i.e., atpE, psbD, psaB, ycf2, and rps12) showed lower base usage divergences. This study provides novel insights into the molecular evolution of chloroplast genomes in the genus Arachis.
Affiliation(s)
- Shuwei Yang
- School of Intelligent Science and Information Engineering, Xi’an Peihua University, Xi’An, Shaanxi, China
- Gun Li
- Department of Biomedical Engineering, Laboratory for Biodiversity Science, School of Electronic Information Engineering, Xi’An Technological University, Xi’An, Shaanxi, China
- * E-mail: (GL); (HL)
- Hao Li
- College of Food Engineering, Jilin Engineering Normal University, Changchun, Jilin, China
- * E-mail: (GL); (HL)
5
Li G, Zhang L, Xue P. Codon usage divergence of important functional genes in Mycobacterium tuberculosis. Int J Biol Macromol 2022; 209:1197-1204. [PMID: 35460756 DOI: 10.1016/j.ijbiomac.2022.04.112] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/19/2021] [Revised: 04/13/2022] [Accepted: 04/15/2022] [Indexed: 12/31/2022]
Abstract
Sequence characteristics are often used to explain the host adaptation, metabolism, genetic diversity, drug resistance, and infectivity of Mycobacterium tuberculosis. Exploring the codon usage pattern of coding sequences in Mycobacterium tuberculosis is therefore of great significance. In the present study, two hundred randomly selected complete genomes of Mycobacterium tuberculosis were downloaded from the National Center for Biotechnology Information database. Important codon usage parameters, such as the codon bias index, the effective number of codons, the relative synonymous codon usage, and the base composition, were counted or calculated for twenty-one specific functional genes. The differences in the relative synonymous codon usage values among those functional genes, and the sum of the standard deviations of the codon usage parameters, were used to evaluate the degree of divergence of the genes concerned. The results show that, among these genes, 1) all genes are high-GC sequences, and the codon usage frequency for each amino acid is significantly biased; 2) genes with a high effective number of codons, such as the coding sequences of the Mycobacterial membrane protein large family, usually have higher divergences; and 3) genes with lower divergences, such as ag85A and sigH, are usually highly conserved and are often used as drug target genes. The findings of the present work should improve understanding of the evolution of Mycobacterium tuberculosis and of genetic engineering measures to prevent and control tuberculosis.
Affiliation(s)
- Gun Li
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'An Technological University, Xi'An, China
- Liang Zhang
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'An Technological University, Xi'An, China
- Pei Xue
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'An Technological University, Xi'An, China
6
Amerifar S, Norouzi M, Ghandi M. A tool for feature extraction from biological sequences. Brief Bioinform 2022; 23:6563937. [PMID: 35383372 DOI: 10.1093/bib/bbac108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 03/01/2022] [Accepted: 03/03/2022] [Indexed: 11/12/2022] Open
Abstract
With the advances in sequencing technologies, a huge amount of biological data is generated nowadays. Analyzing this amount of data is beyond the ability of human beings, creating a splendid opportunity for machine learning methods to grow. These methods, however, are practical only when the sequences are converted into feature vectors. Many tools target this task, including iLearnPlus, a Python-based tool which supports a rich set of features. In this paper, we propose a holistic tool that extracts features from biological sequences (i.e. DNA, RNA and protein). These features are the inputs to machine learning models that predict properties, structures or functions of the input sequences. Our tool supports not only all features in iLearnPlus but also 30 additional features that exist in the literature. Moreover, our tool is based on the R language, which offers bioinformaticians an alternative for transforming sequences into feature vectors. We have compared the conversion time of our tool with that of iLearnPlus: we transform the sequences much faster. We convert small nucleotide sequences a median of 2.8X faster, and we outperform iLearnPlus by a median of 6.3X for large sequences. Finally, for amino acid sequences, our tool achieves a median speedup of 23.9X.
Affiliation(s)
- Sare Amerifar
- Bioinformatics, Tarbiat Modares University, Jalal Al Ahmad, 14115-111, Tehran, Iran
- Mahammad Norouzi
- Computer Science, Technical University of Darmstadt, Hochschulstr. 1, 64293, Hesse, Germany
- Mahmoud Ghandi
- Bioinformatics, Monte Rosa Therapeutics, Summer Street, 02210, Boston, United States
7
Genetic parameters estimation for some wild wheat species and their F1 hybrids grown in different regions of Saudi Arabia. Saudi J Biol Sci 2022; 29:521-525. [PMID: 35002448 PMCID: PMC8717142 DOI: 10.1016/j.sjbs.2021.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Revised: 09/07/2021] [Accepted: 09/08/2021] [Indexed: 11/25/2022] Open
Abstract
Wheat (Triticum aestivum L.) is the most important crop for human nutrition and underpins the food safety of Saudi Arabia. This investigation aimed to determine heterosis effects using different genetic measures (heterosis over the better parent and the mid-parent, genetic advance, and genotypic and phenotypic coefficients of variation) for several traits among six wheat landraces and their F1 hybrids. In 2019, these landraces were sown by hand and, after 100 days, crosses were made among the six landraces following hand emasculation of the anthers. In 2020, seeds of these genotypes (the six wheat landraces and their F1 hybrids) were sown under normal irrigation, as in 2019. The results showed that the most important parent was Mabia, which had the highest values for number of tillers per plant, 1,000-grain weight, and fresh shoot weight. Among the six parents, Naqra had the greatest plant height, while among the F1 hybrids the greatest plant height was found in P3XP6. The heterosis estimates showed that, out of 15 crosses, one cross (P1XP5) performed significantly better than all other crosses for these four traits. The genotypic coefficient of variation (GCV) ranged from 8.7% to 12.5%, while the phenotypic coefficient of variation ranged from 11.3% to 17.7%. Correlations were found between fresh shoot weight and number of tillers, and between plant height and number of tillers. Wild wheat still serves as a source of useful germplasm with proven adaptation and productivity, and thus assembling wild wheat collections is the initial step of a breeding program.
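The heterosis and coefficient-of-variation measures named in the abstract above follow standard textbook formulas, sketched below. The function names and the numeric inputs are hypothetical and do not come from the paper. The sketch also assumes that a larger trait value is the "better" one, which holds for yield-type traits but not for every trait.

```python
import math


def mid_parent_heterosis(f1, parent1, parent2):
    """Percent deviation of the F1 mean from the mean of both parents."""
    mid_parent = (parent1 + parent2) / 2
    return (f1 - mid_parent) / mid_parent * 100


def better_parent_heterosis(f1, parent1, parent2):
    """Percent deviation of the F1 mean from the higher-valued parent."""
    better_parent = max(parent1, parent2)  # assumes higher is better
    return (f1 - better_parent) / better_parent * 100


def coefficient_of_variation(variance, grand_mean):
    """GCV or PCV in percent, given a genotypic or phenotypic variance."""
    return math.sqrt(variance) / grand_mean * 100


# Hypothetical trait means for two parents and their F1 hybrid.
print(mid_parent_heterosis(15, 10, 14))
print(better_parent_heterosis(15, 10, 14))
# Hypothetical genotypic variance of 4 around a grand mean of 20.
print(coefficient_of_variation(4, 20))
```

The same `coefficient_of_variation` function yields GCV when given the genotypic variance and PCV when given the phenotypic variance. Since phenotypic variance includes the environmental component, PCV is expected to be at least as large as GCV, consistent with the ranges reported in the abstract.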