1
Fakhar AZ, Liu J, Pajerowska-Mukhtar KM, Mukhtar MS. The ORFans' tale: new insights in plant biology. TRENDS IN PLANT SCIENCE 2023; 28:1379-1390. [PMID: 37453923 DOI: 10.1016/j.tplants.2023.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2022] [Revised: 05/17/2023] [Accepted: 06/19/2023] [Indexed: 07/18/2023]
Abstract
Orphan genes (OGs) are protein-coding genes without a significant sequence similarity in closely related species. Despite their functional importance, very little is known about the underlying molecular mechanisms by which OGs participate in diverse biological processes. Here, we discuss the evolutionary mechanisms of OGs' emergence with relevance to species-specific adaptations. We also provide a mechanistic view of the involvement of OGs in multiple processes, including growth, development, reproduction, and carbon-metabolism-mediated immunity. We highlight the interconnection between OGs and the sucrose nonfermenting 1 (SNF1)-related protein kinases (SnRKs)-target of rapamycin (TOR) signaling axis for phytohormone signaling, nutrient metabolism, and stress responses. Finally, we propose a high-throughput pipeline for OGs' interspecies and intraspecies gene transfer through a transgenic approach for future biotechnological advances.
Affiliation(s)
- Ali Zeeshan Fakhar
  - Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL 35294, USA
- Jinbao Liu
  - Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL 35294, USA
- M Shahid Mukhtar
  - Department of Biology, University of Alabama at Birmingham, 1300 University Blvd., Birmingham, AL 35294, USA
2
Zhao Y, Huang S, Zhang Y, Tan C, Feng H. Role of Brassica orphan gene BrLFM on leafy head formation in Chinese cabbage (Brassica rapa). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2023; 136:170. [PMID: 37420138 DOI: 10.1007/s00122-023-04411-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/20/2023] [Accepted: 06/22/2023] [Indexed: 07/09/2023]
Abstract
The Brassica orphan gene BrLFM, identified by two allelic mutants, is involved in leafy head formation in Chinese cabbage. Leafy head formation is a unique agronomic trait of Chinese cabbage that determines its yield and quality. In our previous study, an EMS mutagenesis Chinese cabbage mutant library was constructed using the heading Chinese cabbage doubled haploid (DH) line FT as the wild type. Here, we screened two extremely similar leafy head deficiency mutants, lfm-1 and lfm-2, with geotropic growth leaves from the library to investigate the gene(s) related to leafy head formation. Reciprocal crossing results showed that these two mutants were allelic. We utilized lfm-1 to identify the mutant gene(s). Genetic analysis showed that the mutated trait was controlled by a single nuclear gene, Brlfm. MutMap analysis showed that Brlfm was located on chromosome A05, and BraA05g012440.3C and BraA05g021450.3C were the candidate genes. Kompetitive allele-specific PCR analysis eliminated BraA05g012440.3C from the candidates. Sanger sequencing identified a G-to-A SNP at the 271st nucleotide of BraA05g021450.3C. Sequencing of lfm-2 detected another non-synonymous SNP (G to A) at the 266th nucleotide of BraA05g021450.3C, which verified its function in leafy head formation. A BLAST search of BraA05g021450.3C showed that it is a Brassica orphan gene encoding an unknown 13.74 kDa protein, named BrLFM. Subcellular localization showed that BrLFM was located in the nucleus. These findings reveal that BrLFM is involved in leafy head formation in Chinese cabbage.
Affiliation(s)
- Yonghui Zhao
  - College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
- Shengnan Huang
  - College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
- Yun Zhang
  - College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
- Chong Tan
  - College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
- Hui Feng
  - College of Horticulture, Shenyang Agricultural University, 120 Dongling Road, Shenhe District, Shenyang, 110866, People's Republic of China
3
Jiang M, Li X, Dong X, Zu Y, Zhan Z, Piao Z, Lang H. Research Advances and Prospects of Orphan Genes in Plants. FRONTIERS IN PLANT SCIENCE 2022; 13:947129. [PMID: 35874010 PMCID: PMC9305701 DOI: 10.3389/fpls.2022.947129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Accepted: 06/23/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are defined as genes having no sequence similarity with genes present in other lineages. OGs are thought to play a key role in the development of lineage-specific adaptations and can also serve as a constant source of evolutionary novelty. These genes are often associated with stress responses, species-specific traits, and special expression regulation, and they also participate in primary substance metabolism. Advances in sequencing tools and genome analysis methods have made the identification and characterization of OGs comparatively easier, and significant progress has been made in the study of OG functions in plants. We review recent advances in the fast-evolving characteristics, expression modulation, and functional analysis of OGs, with a focus on their role in plant biology. We also highlight current challenges and adoptable strategies, and discuss possible future directions for the functional study of OGs.
Affiliation(s)
- Mingliang Jiang
  - School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
- Xiaonan Li
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Xiangshu Dong
  - School of Agriculture, Yunnan University, Kunming, China
- Ye Zu
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Zongxiang Zhan
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Zhongyun Piao
  - College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Hong Lang
  - School of Agriculture, Jilin Agricultural Science and Technology College, Jilin, China
4
Cardoso-Silva CB, Aono AH, Mancini MC, Sforça DA, da Silva CC, Pinto LR, Adams KL, de Souza AP. Taxonomically Restricted Genes Are Associated With Responses to Biotic and Abiotic Stresses in Sugarcane (Saccharum spp.). FRONTIERS IN PLANT SCIENCE 2022; 13:923069. [PMID: 35845637 PMCID: PMC9280035 DOI: 10.3389/fpls.2022.923069] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/18/2022] [Accepted: 06/13/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are protein-coding genes that are restricted to particular clades or species and lack homology with genes from other organisms, making their biological functions difficult to predict. OGs can rapidly originate and become functional; consequently, they may support rapid adaptation to environmental changes. Extensive spread of mobile elements and whole-genome duplication occurred in the Saccharum group, which may have contributed to the origin and diversification of OGs in the sugarcane genome. Here, we identified and characterized OGs in sugarcane, examined their expression profiles across tissues and genotypes, and investigated their regulation under varying conditions. We identified 319 OGs in the Saccharum spontaneum genome without detected homology to protein-coding genes in green plants, except those belonging to Saccharinae. Transcriptomic analysis revealed 288 sugarcane OGs with detectable expression levels in at least one tissue or genotype. We observed similar expression patterns of OGs in sugarcane genotypes originating from the closest geographical locations. We also observed tissue-specific expression of some OGs, possibly indicating a complex regulatory process for maintaining diverse functional activity of these genes across sugarcane tissues and genotypes. Sixty-six OGs were differentially expressed under stress conditions, especially cold and osmotic stresses. Gene co-expression network and functional enrichment analyses suggested that sugarcane OGs are involved in several biological mechanisms, including stimulus response and defence mechanisms. These findings provide a valuable genomic resource for sugarcane researchers, especially those interested in selecting stress-responsive genes.
Affiliation(s)
- Cláudio Benício Cardoso-Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Alexandre Hild Aono
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Melina Cristina Mancini
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Danilo Augusto Sforça
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Carla Cristina da Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Agronomy Department, Federal University of Viçosa (UFV), Viçosa, Brazil
- Luciana Rossini Pinto
  - Sugarcane Research Advanced Centre, Agronomic Institute of Campinas (IAC/APTA), Ribeirão Preto, Brazil
- Keith L. Adams
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Anete Pereira de Souza
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
5
Ma D, Lai Z, Ding Q, Zhang K, Chang K, Li S, Zhao Z, Zhong F. Identification, Characterization and Function of Orphan Genes Among the Current Cucurbitaceae Genomes. FRONTIERS IN PLANT SCIENCE 2022; 13:872137. [PMID: 35599909 PMCID: PMC9114813 DOI: 10.3389/fpls.2022.872137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 03/28/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs), which lack identifiable homologs in other lineages, may contribute to a variety of biological functions. The Cucurbitaceae family consists of a wide range of fruit crops of worldwide or local economic significance. To date, very few functional mechanisms of OGs in Cucurbitaceae are known. In this study, we systematically identified the OGs of eight Cucurbitaceae species using a comparative genomics approach. The content of OGs varied widely among the eight Cucurbitaceae species, ranging from 1.63% in chayote to 16.55% in wax gourd. Genetic structure analysis showed that OGs have significantly shorter protein lengths and fewer exons in Cucurbitaceae. The subcellular localizations of OGs were basically the same, with only subtle differences. Apart from clustering in some chromosomal regions, OGs were denser near the telomeres and otherwise relatively evenly distributed along the chromosomes. Gene expression analysis revealed that OGs had less abundant and highly tissue-specific expression. Interestingly, the largest proportion of these OGs showed significantly more tissue-specific expression in the flower than in other tissues, and more detectable expression was found in the male flower. Functional prediction of OGs showed that (1) 18 OGs are associated with male sterility in watermelon; (2) 182 OGs are associated with flower development in cucumber; (3) 51 OGs are associated with environmental adaptation in watermelon; and (4) 520 OGs may contribute to the large fruit size in wax gourd. Our results provide the molecular basis and research direction for some important mechanisms in Cucurbitaceae species and domesticated crops.
Affiliation(s)
- Dongna Ma
  - College of Horticulture, Fujian Agriculture and Forestry University, Fujian, China
  - College of the Environment and Ecology, Xiamen University, Fujian, China
- Zhengfeng Lai
  - Subtropical Agricultural Research Institute, Fujian Academy of Agriculture Sciences, Fujian, China
- Qiansu Ding
  - College of the Environment and Ecology, Xiamen University, Fujian, China
- Kun Zhang
  - College of Horticulture, Fujian Agriculture and Forestry University, Fujian, China
- Kaizhen Chang
  - College of Horticulture, Fujian Agriculture and Forestry University, Fujian, China
- Shuhao Li
  - College of Horticulture, Fujian Agriculture and Forestry University, Fujian, China
- Zhizhu Zhao
  - College of the Environment and Ecology, Xiamen University, Fujian, China
- Fenglin Zhong
  - College of Horticulture, Fujian Agriculture and Forestry University, Fujian, China
6
Zhao Z, Ma D. Genome-Wide Identification, Characterization and Function Analysis of Lineage-Specific Genes in the Tea Plant Camellia sinensis. Front Genet 2021; 12:770570. [PMID: 34858483 PMCID: PMC8631334 DOI: 10.3389/fgene.2021.770570] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/04/2021] [Accepted: 10/14/2021] [Indexed: 11/22/2022]
Abstract
Genes that have no homologous sequences in other species are called lineage-specific genes (LSGs). They are common in living organisms and play an important role in the generation of new functions, adaptive evolution and phenotypic alteration of species. Camellia sinensis var. sinensis (CSS) is one of the most widely distributed cultivars for quality green tea production. The rich catechins in tea have antioxidant, free radical elimination, fat loss and cancer prevention potential. To further understand the evolution and utilize the function of LSGs in tea, we used a comparative genomics approach to identify Camellia-specific genes (CSGs). We identified 1701 CSGs specific to CSS, accounting for 3.37% of all protein-coding genes. The majority of CSGs (57.08%) were generated by gene duplication, and the timing of duplication coincides with that of two whole-genome duplication (WGD) events in the CSS genome. Gene structure analysis revealed that CSGs have shorter gene lengths, fewer exons, higher GC content and higher isoelectric points. Gene expression analysis showed that CSGs had more tissue-specific expression than evolutionarily conserved genes (ECs). Weighted gene co-expression network analysis (WGCNA) showed that 18 CSGs are mainly associated with catechin synthesis-related pathways, including phenylalanine biosynthesis, biosynthesis of amino acids, pentose phosphate pathway, photosynthesis and carbon metabolism. In addition, the expression of three CSGs (CSS0030246, CSS0002298, and CSS0030939) was significantly down-regulated in response to both salt and drought stress. Our study is the first to systematically identify LSGs in CSS and comprehensively analyze the features and potential functions of CSGs. We also identified key candidate genes, which will provide valuable assistance for further studies on catechin synthesis and provide a molecular basis for the excavation of excellent germplasm resources.
Affiliation(s)
- Zhizhu Zhao
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
- Dongna Ma
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
7
Ma D, Ding Q, Guo Z, Zhao Z, Wei L, Li Y, Song S, Zheng HL. Identification, characterization and expression analysis of lineage-specific genes within mangrove species Aegiceras corniculatum. Mol Genet Genomics 2021; 296:1235-1247. [PMID: 34363105 DOI: 10.1007/s00438-021-01810-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/06/2021] [Accepted: 07/22/2021] [Indexed: 11/25/2022]
Abstract
Lineage-specific genes (LSGs) are genes that have no recognizable homology to any sequences in other species; they are important drivers of new functions, phenotypic changes, and species adaptation to the environment. Aegiceras corniculatum is one of the major mangrove plant species adapted to waterlogging and saline conditions, and the exploration of Aegiceras-specific genes (ASGs) is important to reveal its adaptation to the harsh environment. Here, we performed a systematic analysis of ASGs, focusing on their sequence characterization, origination and expression patterns. Our results reveal that there are 4823 ASGs in the genome, approximately 11.84% of all protein-coding genes. A high proportion (45.78%) of ASGs originated from gene duplication, and the timing of these duplications is consistent with two whole-genome duplication (WGD) events that occurred in A. corniculatum and also coincides with a short period of global warming during the Paleocene-Eocene Thermal Maximum (PETM, 55.5 million years ago). Gene structure analysis showed that ASGs have shorter protein lengths, fewer exons, and higher isoelectric points. Expression pattern analysis showed that ASGs had low expression levels and more tissue-specific expression. Weighted gene co-expression network analysis (WGCNA) revealed that the co-expressed gene modules of 86 ASGs were primarily involved in pathways related to adversity stress, including plant hormone signal transduction, phenylpropanoid biosynthesis, photosynthesis, peroxisome and pentose phosphate pathway. This study provides a comprehensive analysis of the characteristics and potential functions of ASGs and identifies key candidate genes, which will contribute to further investigation of the adaptation of A. corniculatum to intertidal coastal wetland habitats.
Affiliation(s)
- Dongna Ma
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Qiansu Ding
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zejun Guo
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Zhizhu Zhao
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Liufeng Wei
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Yiying Li
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Institute of Applied Ecology, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Shiwei Song
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
- Hai-Lei Zheng
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, 361005, China
8
Jiang M, Zhan Z, Li H, Dong X, Cheng F, Piao Z. Brassica rapa orphan genes largely affect soluble sugar metabolism. HORTICULTURE RESEARCH 2020; 7:181. [PMID: 33328469 PMCID: PMC7603504 DOI: 10.1038/s41438-020-00403-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/20/2020] [Revised: 07/31/2020] [Accepted: 09/01/2020] [Indexed: 05/04/2023]
Abstract
Orphan genes (OGs), which are genes unique to a specific taxon, play a vital role in primary metabolism. However, little is known about the functional significance of Brassica rapa OGs (BrOGs) that were identified in our previous study. To study their biological functions, we developed a BrOG overexpression (BrOGOE) mutant library of 43 genes in Arabidopsis thaliana and assessed the phenotypic variation of the plants. We found that 19 of the 43 BrOGOE mutants displayed a mutant phenotype and 42 showed a variable soluble sugar content. One mutant, BrOG1OE, with significantly elevated fructose, glucose, and total sugar contents but a reduced sucrose content, was selected for in-depth analysis. BrOG1OE showed reduced expression and activity of the Arabidopsis sucrose synthase gene (AtSUS); however, the activity of invertase was unchanged. In contrast, silencing of two copies of BrOG1 in B. rapa, BraA08002322 (BrOG1A) and BraSca000221 (BrOG1B), by the use of an efficient CRISPR/Cas9 system of Chinese cabbage (B. rapa ssp. campestris) resulted in decreased fructose, glucose, and total soluble sugar contents because of the upregulation of BrSUS1b, BrSUS3, and, specifically, the BrSUS5 gene in the edited BrOG1 transgenic line. In addition, we observed increased sucrose content and SUS activity in the BrOG1 mutants, with the activity of invertase remaining unchanged. Thus, BrOG1 probably affected soluble sugar metabolism in a SUS-dependent manner. This is the first report investigating the function of BrOGs with respect to soluble sugar metabolism and reinforced the idea that OGs are a valuable resource for nutrient metabolism.
Affiliation(s)
- Mingliang Jiang
  - Molecular Biology of Vegetable Laboratory, College of Horticulture, Shenyang Agricultural University, Shenyang, 110866, China
- Zongxiang Zhan
  - Molecular Biology of Vegetable Laboratory, College of Horticulture, Shenyang Agricultural University, Shenyang, 110866, China
- Haiyan Li
  - Molecular Biology of Vegetable Laboratory, College of Horticulture, Shenyang Agricultural University, Shenyang, 110866, China
- Xiangshu Dong
  - School of Agriculture, Yunnan University, Kunming, 650504, China
- Feng Cheng
  - Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
- Zhongyun Piao
  - Molecular Biology of Vegetable Laboratory, College of Horticulture, Shenyang Agricultural University, Shenyang, 110866, China
9
Gao Q, Jin X, Xia E, Wu X, Gu L, Yan H, Xia Y, Li S. Identification of Orphan Genes in Unbalanced Datasets Based on Ensemble Learning. Front Genet 2020; 11:820. [PMID: 33133122 PMCID: PMC7567012 DOI: 10.3389/fgene.2020.00820] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/09/2020] [Accepted: 07/08/2020] [Indexed: 11/13/2022]
Abstract
Orphan genes are associated with regulatory patterns, but experimental methods for identifying orphan genes are both time-consuming and expensive. Designing an accurate and robust classification model to distinguish orphan from non-orphan genes in unbalanced datasets is particularly challenging. Synthetic minority over-sampling technique (SMOTE) algorithms were selected in a preliminary step to deal with unbalanced gene datasets. To identify orphan genes in balanced and unbalanced Arabidopsis thaliana gene datasets, SMOTE algorithms were then combined with traditional and advanced ensemble classification algorithms respectively, using Support Vector Machine, Random Forest (RF), AdaBoost (adaptive boosting), GBDT (gradient boosting decision tree), and XGBoost (extreme gradient boosting). After comparing the performance of these ensemble models, SMOTE algorithms with XGBoost achieved an F1 score of 0.94 with the balanced A. thaliana gene datasets, but a lower score with the unbalanced datasets. The proposed ensemble method combines different data-balancing algorithms, including Borderline SMOTE (BSMOTE), Adaptive Synthetic Sampling (ADASYN), SMOTE-Tomek, and SMOTE-ENN, with the XGBoost model separately. The SMOTE-ENN-XGBoost model, which combined over-sampling and under-sampling algorithms with XGBoost, achieved higher predictive accuracy than the other balancing algorithms combined with XGBoost. Thus, SMOTE-ENN-XGBoost provides a theoretical basis for developing evaluation criteria for identifying orphan genes in unbalanced and biological datasets.
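The over-sampling idea behind SMOTE, on which the abstract above builds, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the authors' pipeline: the function name `smote_oversample`, its parameters, and the toy feature vectors are hypothetical, and the ENN cleaning step and the XGBoost classifier are left out.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up feature vectors: 3 "orphan" genes vs 9 "non-orphan" genes.
orphan = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4)]
non_orphan = [(5.0 + i * 0.1, 6.0) for i in range(9)]

# Generate enough synthetic orphan samples to match the majority class size.
new_points = smote_oversample(orphan, n_new=len(non_orphan) - len(orphan))
balanced_orphan = orphan + new_points
```

Because each synthetic point is a convex combination of two real minority samples, it stays inside the region the minority class already occupies. Under-sampling variants such as SMOTE-ENN would then remove samples whose neighbours disagree with their label before a classifier is trained.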
Affiliation(s)
- Qijuan Gao
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agriculture University, Hefei, China
- Xiu Jin
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agriculture University, Hefei, China
- Enhua Xia
  - State Key Laboratory of Tea Plant Biology and Utilization, Anhui Agricultural University, Hefei, China
- Xiangwei Wu
  - School of Resources and Environment, Anhui Agricultural University, Hefei, China
- Lichuan Gu
  - School of Information and Computer Science, Anhui Agricultural University, Hefei, China
- Hanwei Yan
  - Key Laboratory of Crop Biology of Anhui Province, Anhui Agricultural University, Hefei, China
- Yingchun Xia
  - School of Information and Computer Science, Anhui Agricultural University, Hefei, China
- Shaowen Li
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agriculture University, Hefei, China
10
Zhang J, Lei Y, Wang B, Li S, Yu S, Wang Y, Li H, Liu Y, Ma Y, Dai H, Wang J, Zhang Z. The high-quality genome of diploid strawberry (Fragaria nilgerrensis) provides new insights into anthocyanin accumulation. PLANT BIOTECHNOLOGY JOURNAL 2020; 18:1908-1924. [PMID: 32003918 PMCID: PMC7415782 DOI: 10.1111/pbi.13351] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 12/05/2019] [Revised: 01/15/2020] [Accepted: 01/20/2020] [Indexed: 05/11/2023]
Abstract
Fragaria nilgerrensis is a wild diploid strawberry species endemic to East and Southeast Asia that provides a rich source of genetic variation for strawberry improvement. Here, we present a chromosome-scale assembly of F. nilgerrensis using single-molecule real-time (SMRT) Pacific Biosciences sequencing and chromosome conformation capture (Hi-C) genome scaffolding. The genome assembly size was 270.3 Mb, with a contig N50 of ∼8.5 Mb. A total of 28 780 genes and 117.2 Mb of transposable elements were annotated for this genome. Next, detailed comparative genomics with the high-quality F. vesca reference genome was conducted to characterize differences in transposable elements, SNPs, indels, and other variants. The genome of F. nilgerrensis is around 50 Mb larger than that of F. vesca, mainly due to expansion of transposable elements. In comparison with the F. vesca genome, we identified 4 561 825 SNPs, 846 301 indels, 4243 inversions, 35 498 translocations and 10 099 relocations. We also found a marked expansion of genes involved in phenylpropanoid biosynthesis, starch and sucrose metabolism, cyanoamino acid metabolism, plant-pathogen interaction, brassinosteroid biosynthesis and plant hormone signal transduction in F. nilgerrensis, which may account for its specific phenotypes and considerable environmental adaptability. Interestingly, we found that sequence variations in the upstream regulatory region of FnMYB10, a core transcriptional activator of anthocyanin biosynthesis, resulted in a low expression level of the FnMYB10 gene, which is likely responsible for the white fruit phenotype of F. nilgerrensis. The high-quality F. nilgerrensis genome will be a valuable resource for biological research and comparative genomics research.
Affiliation(s)
- Junxiang Zhang
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Yingying Lei
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Baotian Wang
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Song Li
  - Biomarker Technologies Corporation, Beijing, China
- Shuang Yu
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Yan Wang
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- He Li
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Yuexue Liu
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Yue Ma
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Hongyan Dai
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
- Zhihong Zhang
  - Liaoning Key Laboratory of Strawberry Breeding and Cultivation, College of Horticulture, Shenyang Agricultural University, Shenyang, China
11
Chen K, Tian Z, Chen P, He H, Jiang F, Long CA. Genome-wide identification, characterization and expression analysis of lineage-specific genes within Hanseniaspora yeasts. FEMS Microbiol Lett 2020; 367:5837084. [PMID: 32407480 DOI: 10.1093/femsle/fnaa077] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2019] [Accepted: 05/12/2020] [Indexed: 12/13/2022]
Abstract
Lineage-specific genes (LSGs) are defined as genes with sequences that are not significantly similar to those in any other lineage. LSGs have been proposed, and sometimes shown, to have significant effects in the evolution of biological function. In this study, two sets of Hanseniaspora spp. LSGs were identified by comparing the sequences of the Kloeckera apiculata genome and of 80 other yeast genomes. This study identified 344 Hanseniaspora-specific genes (HSGs) and 109 genes ('orphan genes') specific to K. apiculata. Three thousand three hundred thirty-one K. apiculata genes that showed significant similarity to at least one sequence outside the Hanseniaspora were classified into evolutionarily conserved genes. We analyzed their sequence features, functional categories, gene origin, gene structure and gene expression. We also investigated the predicted cellular roles and Gene Ontology categories of the LSGs using functional inference. The patterns of the functions of LSGs do not deviate significantly from genome-wide average. The results showed that a few LSGs were formed by gene duplication, followed by rapid sequence divergence. Many of the HSGs and orphan genes exhibited altered expression in response to abiotic stress. Studying these LSGs might be helpful for understanding the molecular mechanism of yeast adaption.
Affiliation(s)
- Kai Chen
  - School of Biological Engineering and Food, Hubei University of Technology, Wuhan 430068, China
- Zhonghuan Tian
  - Key Laboratory of Horticultural Plant Biology of the Ministry of Education, National Centre of Citrus Breeding, Huazhong Agricultural University, Wuhan 430070, China
- Ping Chen
  - Department of Pediatric Hematology, Tongji Hospital Affiliated to Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430000, China
- Hua He
  - School of Landscape Architecture and Horticulture, Wuhan Institute of Bioengineering, Wuhan 430415, China
- Fatang Jiang
  - School of Biological Engineering and Food, Hubei University of Technology, Wuhan 430068, China
- Chao-An Long
  - Key Laboratory of Horticultural Plant Biology of the Ministry of Education, National Centre of Citrus Breeding, Huazhong Agricultural University, Wuhan 430070, China
|
12
|
Identification, characterization and expression analysis of lineage-specific genes within Triticeae. Genomics 2019; 112:1343-1350. [PMID: 31401233 DOI: 10.1016/j.ygeno.2019.08.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/05/2019] [Revised: 08/04/2019] [Accepted: 08/07/2019] [Indexed: 12/11/2022]
Abstract
Lineage-specific genes (LSGs) are a set of functional genes in a given taxon without significant sequence similarity to genes and intergenic sequences of other taxa. The tribe Triticeae mainly includes species of different ploidy levels, such as the staple food crops wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.). This study is aimed at mining and characterizing the Triticeae-specific genes (TSGs) using expressed sequence data of wheat. A total of 3812 TSGs were identified and they were generally characterized by smaller size, fewer exons, shorter open reading frames and lower expression levels. Most TSGs were expressed with tissue preference and many of them were predominantly expressed in reproduction-related tissues, especially in young stamen. Nearly one third of the TSGs were stress-responsive and inducible under abiotic and/or biotic stresses. A co-expression-based annotation supported the relevance of some TSGs to reproduction and stress responses, indicating their potential economic importance.
|
13
|
Li G, Wu X, Hu Y, Muñoz-Amatriaín M, Luo J, Zhou W, Wang B, Wang Y, Wu X, Huang L, Lu Z, Xu P. Orphan genes are involved in drought adaptations and ecoclimatic-oriented selections in domesticated cowpea. JOURNAL OF EXPERIMENTAL BOTANY 2019; 70:3101-3110. [PMID: 30949664 DOI: 10.1093/jxb/erz145] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/13/2018] [Accepted: 03/20/2019] [Indexed: 05/19/2023]
Abstract
Orphan genes (OGs) are genes that are restricted to a single species or a particular taxonomic group. To date, little is known about the functions of OGs in domesticated crops. Here, we report our findings on the relationships between OGs and environmental adaptation in cowpea (Vigna unguiculata). We identified 578 expressed OGs, of which 73.2% were predicted to be non-coding. Transcriptomic analyses revealed a high rate of OGs that were drought inducible in roots when compared with conserved genes. Co-expression analysis further revealed the possible involvement of OGs in stress response pathways. Overexpression of UP12_8740, a drought-inducible OG, conferred enhanced tolerance to osmotic stresses and soil drought. By combining Capture-Seq and fluorescence-based Kompetitive allele-specific PCR (KASP), we efficiently genotyped single nucleotide polymorphisms (SNPs) on OGs across a 223-accession cowpea germplasm collection. Population genomic parameters, including polymorphism information content (PIC), expected heterozygosity (He), nucleotide diversity (π), and Tajima's D statistics, calculated from these SNPs, showed distinct signatures between the grain- and vegetable-type subpopulations of cowpea. This study reinforces the idea that OGs are a valuable resource for identifying new genes related to species-specific environmental adaptations and offers the new insight that artificial selection on OGs might have contributed to balancing the adaptive and agronomic traits in domesticated crops in various ecoclimatic conditions.
Affiliation(s)
- Guojing Li
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
  - State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xinyi Wu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Yaowen Hu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Maria Muñoz-Amatriaín
  - Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA, USA
- Jie Luo
  - Central Laboratory of Zhejiang Academy of Agricultural Sciences, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Wen Zhou
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Baogen Wang
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Ying Wang
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xiaohua Wu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Lijuan Huang
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
  - College of Horticulture, Northwest Agriculture and Forestry University, Yangling, China
- Zhongfu Lu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Pei Xu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
  - State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
|
14
|
Jiang M, Dong X, Lang H, Pang W, Zhan Z, Li X, Piao Z. Mining of Brassica-Specific Genes (BSGs) and Their Induction in Different Developmental Stages and under Plasmodiophora brassicae Stress in Brassica rapa. Int J Mol Sci 2018; 19:ijms19072064. [PMID: 30012965 PMCID: PMC6073354 DOI: 10.3390/ijms19072064] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/30/2018] [Revised: 06/29/2018] [Accepted: 07/13/2018] [Indexed: 11/16/2022] Open
Abstract
Orphan genes, also called lineage-specific genes (LSGs), are important for responses to biotic and abiotic stresses, and are associated with lineage-specific structures and biological functions. To date, there have been no studies investigating gene number, gene features, or gene expression patterns of orphan genes in Brassica rapa. In this study, 1540 Brassica-specific genes (BSGs) and 1824 Cruciferae-specific genes (CSGs) were identified based on the genome of Brassica rapa. The genic features analysis indicated that BSGs and CSGs possessed a lower percentage of multi-exon genes, higher GC content, and shorter gene length than evolutionarily conserved genes (ECGs). In addition, five types of BSGs were obtained and 145 out of 529 real A subgenome-specific BSGs were verified by PCR in 51 species. In silico and semi-qPCR gene expression analyses of BSGs suggested that BSGs are expressed in various tissues and can be induced by Plasmodiophora brassicae. Moreover, an A/C subgenome-specific BSG, BSGs1, was specifically expressed during the heading stage, indicating that the gene might be associated with leafy head formation. Our results provide valuable biological information for studying the molecular function of BSGs for Brassica-specific phenotypes and biotic stress in B. rapa.
Affiliation(s)
- Mingliang Jiang
  - College of Horticulture, Shenyang Agricultural University, #120 Dongling Road, Shenyang 110866, China.
- Xiangshu Dong
  - School of Agriculture, Yunnan University, Kunming 650504, China.
- Hong Lang
  - Key Laboratory of Northeast Rice Biology and Breeding, Ministry of Agriculture, Rice Research Institute, Shenyang Agricultural University, Shenyang 110866, China.
- Wenxing Pang
  - College of Horticulture, Shenyang Agricultural University, #120 Dongling Road, Shenyang 110866, China.
- Zongxiang Zhan
  - College of Horticulture, Shenyang Agricultural University, #120 Dongling Road, Shenyang 110866, China.
- Xiaonan Li
  - College of Horticulture, Shenyang Agricultural University, #120 Dongling Road, Shenyang 110866, China.
- Zhongyun Piao
  - College of Horticulture, Shenyang Agricultural University, #120 Dongling Road, Shenyang 110866, China.
|
15
|
Feau N, Beauseigle S, Bergeron MJ, Bilodeau GJ, Birol I, Cervantes-Arango S, Dhillon B, Dale AL, Herath P, Jones SJ, Lamarche J, Ojeda DI, Sakalidis ML, Taylor G, Tsui CK, Uzunovic A, Yueh H, Tanguay P, Hamelin RC. Genome-Enhanced Detection and Identification (GEDI) of plant pathogens. PeerJ 2018; 6:e4392. [PMID: 29492338 PMCID: PMC5825881 DOI: 10.7717/peerj.4392] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/07/2017] [Accepted: 01/29/2018] [Indexed: 12/17/2022] Open
Abstract
Plant diseases caused by fungi and Oomycetes represent worldwide threats to crops and forest ecosystems. Effective prevention and appropriate management of emerging diseases rely on rapid detection and identification of the causal pathogens. The increase in genomic resources makes it possible to generate novel genome-enhanced DNA detection assays that can exploit whole genomes to discover candidate genes for pathogen detection. A pipeline was developed to identify genome regions that discriminate taxa or groups of taxa and can be converted into PCR assays. The modular pipeline comprises four components: (1) selection and genome sequencing of phylogenetically related taxa, (2) identification of clusters of orthologous genes, (3) elimination of false positives by filtering, and (4) assay design. This pipeline was applied to some of the most important plant pathogens across three broad taxonomic groups: Phytophthoras (Stramenopiles, Oomycota), Dothideomycetes (Fungi, Ascomycota) and Pucciniales (Fungi, Basidiomycota). Comparison of 73 fungal and Oomycete genomes led to the discovery of 5,939 gene clusters that were unique to the targeted taxa and an additional 535 that were common at higher taxonomic levels. Approximately 28% of the 299 tested were converted into qPCR assays that met our set of specificity criteria. This work demonstrates that a genome-wide approach can efficiently identify multiple taxon-specific genome regions that can be converted into highly specific PCR assays. The possibility to easily obtain multiple alternative regions to design highly specific qPCR assays should be of great help in tackling challenging cases for which higher taxon-resolution is needed.
Affiliation(s)
- Nicolas Feau
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
- Inanc Birol
  - BC Cancer agency, Genome Sciences Centre, Vancouver, BC, Canada
- Sandra Cervantes-Arango
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
- Braham Dhillon
  - Department of Plant Pathology, University of Arkansas at Fayetteville, Fayetteville, AR, United States of America
- Angela L. Dale
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
  - FPInnovations, Vancouver, BC, Canada
- Padmini Herath
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
- Steven J.M. Jones
  - BC Cancer agency, Genome Sciences Centre, Vancouver, BC, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Josyanne Lamarche
  - Canadian Forest Service, Natural Resources Canada, Quebec city, Quebec, Canada
- Dario I. Ojeda
  - Department of Biology Unit of Ecology and Genetics, University of Oulu, Oulu, Finland
- Monique L. Sakalidis
  - Department of Plant, Soil & Microbial Sciences and Department of Forestry, Michigan State University, East Lansing, MI, United States of America
- Greg Taylor
  - BC Cancer agency, Genome Sciences Centre, Vancouver, BC, Canada
- Clement K.M. Tsui
  - Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
- Hesther Yueh
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
- Philippe Tanguay
  - Canadian Forest Service, Natural Resources Canada, Quebec city, Quebec, Canada
- Richard C. Hamelin
  - Department of Forest and Conservation Sciences, Forest Sciences Centre, University of British Columbia, Vancouver, BC, Canada
  - Foresterie et géomatique, Institut de Biologie Intégrative des Systèmes, Laval University, Quebec city, Quebec, Canada
|
16
|
Prabh N, Rödelsperger C. Are orphan genes protein-coding, prediction artifacts, or non-coding RNAs? BMC Bioinformatics 2016; 17:226. [PMID: 27245157 PMCID: PMC4888513 DOI: 10.1186/s12859-016-1102-x] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 02/24/2016] [Accepted: 05/24/2016] [Indexed: 12/26/2022] Open
Abstract
Background: Current genome sequencing projects reveal substantial numbers of taxonomically restricted, so-called orphan genes that lack homology with genes from other evolutionary lineages. However, it is not clear to what extent orphan genes are real, genomic artifacts, or represent non-coding RNAs. Results: Here, we use a simple set of assumptions to test the nature of orphan genes. First, a sequence that is transcribed is considered a real biological entity. Second, every sequence that is supported by proteome data or shows a depletion of non-synonymous substitutions is a protein-coding gene. Using genomic, transcriptomic and proteomic data for the nematode Pristionchus pacificus, we show that between 4129–7997 (42–81%) of predicted orphan genes are expressed and 3818–7545 (39–76%) of orphan genes are under negative selection. In three cases that exhibited strong evolutionary constraint but lacked expression evidence in 14 RNA-seq samples, we could experimentally validate the predicted gene structures. Comparing different data sets to infer selection on orphan gene clusters, we find that the presence of a closely related genome provides the most powerful resource to robustly identify evidence of negative selection. However, even in the absence of other genomic data, the availability of paralogous sequences was enough to show negative selection in 8–10% of orphan genes. Conclusions: Our study shows that the great majority of previously identified orphan genes in P. pacificus are indeed protein-coding genes. Even though this work represents a case study on a single species, our approach can be transferred to genomic data of other non-model organisms in order to ascertain the protein-coding nature of orphan genes.
Affiliation(s)
- Neel Prabh
  - Department for Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
- Christian Rödelsperger
  - Department for Evolutionary Biology, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany.
|