1. Guo Y, Wang T, Lu X, Li W, Lv X, Peng Q, Zhang J, Gao J, Hu M. Comparative genome-wide analysis of circular RNAs in Brassica napus L.: target-site versus non-target-site resistance to herbicide stress. Theor Appl Genet 2024; 137:176. [PMID: 38969812 DOI: 10.1007/s00122-024-04678-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Accepted: 06/15/2024] [Indexed: 07/07/2024]
Abstract
Circular RNAs (circRNAs), a class of non-coding RNA molecules, are recognized for their unique functions; however, their responses to herbicide stress in Brassica napus remain unclear. In this study, the role of circRNAs in response to herbicide treatment was investigated in two rapeseed cultivars: MH33, which exhibits non-target-site resistance (NTSR), and EM28, which exhibits target-site resistance (TSR). The genome-wide circRNA profiles of herbicide-stressed and non-stressed seedlings were analyzed. The findings indicate that NTSR seedlings exhibited a greater abundance of circRNAs, shorter lengths of circRNAs and their parent genes, and more diverse functions of parent genes compared with TSR seedlings. Compared to normal-growth plants, the herbicide-stressed group exhibited similar trends in the number of circRNAs, functions of parent genes, and differentially expressed circRNAs as observed in NTSR seedlings. In addition, a greater number of circRNAs that function as competing microRNA (miRNA) sponges were identified in the herbicide stress and NTSR groups compared to the normal-growth and TSR groups, respectively. The differentially expressed circRNAs were validated by qPCR. The differentially expressed circRNA-miRNA networks were predicted, and the mRNAs targeted by these miRNAs were annotated. Our results suggest that circRNAs play a crucial role in responding to herbicide stress, exhibiting distinct responses between NTSR and TSR in rapeseed. These findings offer valuable insights into the mechanisms underlying herbicide resistance in rapeseed.
Affiliation(s)
- Yue Guo
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Ting Wang
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Xinyu Lu
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Weilong Li
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Xinlei Lv
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Qi Peng
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Jiefu Zhang
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Jianqin Gao
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
- Maolong Hu
  - Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, China
2. Dvorak P, Hlavac V, Hanicinec V, Rao BH, Soucek P. Genes divided according to the relative position of the longest intron show increased representation in different KEGG pathways. BMC Genomics 2024; 25:649. [PMID: 38943073 PMCID: PMC11214234 DOI: 10.1186/s12864-024-10558-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 06/24/2024] [Indexed: 07/01/2024] [Open access]
Abstract
Despite the fact that introns mean an energy and time burden for eukaryotic cells, they play an irreplaceable role in the diversification and regulation of protein production. As a common feature of eukaryotic genomes, it has been reported that in protein-coding genes, the longest intron is usually one of the first introns. The goal of our work was to find a possible difference in the biological function of genes that fulfill this common feature compared to genes that do not. Data on the lengths of all introns in genes were extracted from the genomes of six vertebrates (human, mouse, koala, chicken, zebrafish and fugu) and two other model organisms (nematode worm and arabidopsis). We showed that more than 40% of protein-coding genes have the relative position of the longest intron located in the second or third tertile of all introns. Genes divided according to the relative position of the longest intron were found to be significantly increased in different KEGG pathways. Genes with the longest intron in the first tertile predominate in a range of pathways for amino acid and lipid metabolism, various signaling, cell junctions or ABC transporters. Genes with the longest intron in the second or third tertile show increased representation in pathways associated with the formation and function of the spliceosome and ribosomes. In the two groups of genes defined in this way, we further demonstrated the difference in the length of the longest introns and the distribution of their absolute positions. We also pointed out other characteristics, namely the positive correlation between the length of the longest intron and the sum of the lengths of all other introns in the gene and the preservation of the exact same absolute and relative position of the longest intron between orthologous genes.
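The grouping described in this abstract can be illustrated with a minimal sketch. It assumes that "relative position" means the ordinal rank of the longest intron divided by the number of introns in the gene, that ties go to the earliest intron, and that genes with fewer than three introns are excluded; none of these details is stated in the abstract, and the function name `longest_intron_tertile` is hypothetical.

```python
def longest_intron_tertile(intron_lengths):
    """Return 1, 2, or 3: the tertile (by intron order) containing the longest intron.

    Assumptions (not taken from the paper): introns are listed in 5'-to-3' order,
    ties resolve to the earliest intron, and genes need at least three introns.
    """
    n = len(intron_lengths)
    if n < 3:
        raise ValueError("need at least three introns to define tertiles")
    # Index of the longest intron; the -i term makes ties favor the earliest one.
    idx = max(range(n), key=lambda i: (intron_lengths[i], -i))
    # Integer ceiling of 3 * rank / n, which avoids floating-point edge cases.
    rank = idx + 1
    return (3 * rank + n - 1) // n


# Example: a gene whose first intron is by far the longest.
print(longest_intron_tertile([5000, 200, 300]))
```

Under these assumptions, a gene lands in the "first tertile" group when the function returns 1 and in the "second or third tertile" group otherwise.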
Affiliation(s)
- Pavel Dvorak
  - Department of Biology, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
  - Institute of Medical Genetics, University Hospital Pilsen, Dr. Edvarda Benese 13, 30599, Pilsen, Czech Republic
- Viktor Hlavac
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
  - Toxicogenomics Unit, National Institute of Public Health, Srobarova 48, 10042, Prague, Czech Republic
- Vojtech Hanicinec
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
- Bhavana Hemantha Rao
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
- Pavel Soucek
  - Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300, Pilsen, Czech Republic
  - Toxicogenomics Unit, National Institute of Public Health, Srobarova 48, 10042, Prague, Czech Republic
3. Hu Z, Chen J, Olatoye MO, Zhang H, Lin Z. Transcriptome-wide expression landscape and starch synthesis pathway co-expression network in sorghum. Plant Genome 2024; 17:e20448. [PMID: 38602082 DOI: 10.1002/tpg2.20448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2024]
Abstract
The gene expression landscape across different tissues and developmental stages reflects their biological functions and evolutionary patterns. Integrative and comprehensive analyses of all transcriptomic data in an organism are instrumental to obtaining a comprehensive picture of gene expression landscape. Such studies are still very limited in sorghum, which limits the discovery of the genetic basis underlying complex agricultural traits in sorghum. We characterized the genome-wide expression landscape for sorghum using 873 RNA-sequencing (RNA-seq) datasets representing 19 tissues. Our integrative analysis of these RNA-seq data provides the most comprehensive transcriptomic atlas for sorghum, which will be valuable for the sorghum research community for functional characterizations of sorghum genes. Based on the transcriptome atlas, we identified 595 housekeeping genes (HKGs) and 2080 tissue-specific expression genes (TEGs) for the 19 tissues. We identified different gene features between HKGs and TEGs, and we found that HKGs have experienced stronger selective constraints than TEGs. Furthermore, we built a transcriptome-wide co-expression network (TW-CEN) comprising 35 modules with each module enriched in specific Gene Ontology terms. High-connectivity genes in TW-CEN tend to express at high levels while undergoing intensive selective pressure. We also built global and seed-preferential co-expression networks of starch synthesis pathways, which indicated that photosynthesis and microtubule-based movement play important roles in starch synthesis. The global transcriptome atlas of sorghum generated by this study provides an important functional genomics resource for trait discovery and insight into starch synthesis regulation in sorghum.
Affiliation(s)
- Zhenbin Hu
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Junhao Chen
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
- Marcus O Olatoye
  - USDA-ARS, Forage Seed and Cereal Research Unit, Prosser, Washington, USA
- Hengyou Zhang
  - State Key Laboratory of Black Soils Conservation and Utilization, Key Laboratory of Soybean Molecular Design and Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Harbin, China
- Zhenguo Lin
  - Department of Biology, Saint Louis University, Saint Louis, Missouri, USA
4. McCoy MJ, Fire AZ. Parallel gene size and isoform expansion of ancient neuronal genes. Curr Biol 2024; 34:1635-1645.e3. [PMID: 38460513 PMCID: PMC11043017 DOI: 10.1016/j.cub.2024.02.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/16/2023] [Accepted: 02/11/2024] [Indexed: 03/11/2024]
Abstract
How nervous systems evolved is a central question in biology. A diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning hundreds of thousands of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. Although many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and, in many cases, the emergence of neurons as dedicated cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes potentially driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection, as demonstrated by low dN/dS ratios, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J McCoy
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA
- Andrew Z Fire
  - Department of Pathology, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University School of Medicine, 300 Pasteur Drive, Stanford, CA 94305, USA
5. Sun J, Okada M, Tameshige T, Shimizu-Inatsugi R, Akiyama R, Nagano A, Sese J, Shimizu K. A low-coverage 3' RNA-seq to detect homeolog expression in polyploid wheat. NAR Genom Bioinform 2023; 5:lqad067. [PMID: 37448590 PMCID: PMC10336777 DOI: 10.1093/nargab/lqad067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 06/12/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023] [Open access]
Abstract
Although allopolyploid species are common among natural and crop species, it is not easy to distinguish duplicated genes, known as homeologs, during their genomic analysis. Yet, cost-efficient RNA sequencing (RNA-seq) is to be developed for large-scale transcriptomic studies such as time-series analysis and genome-wide association studies in allopolyploids. In this study, we employed a 3' RNA-seq utilizing 3' untranslated regions (UTRs) containing frequent mutations among homeologous genes, compared to coding sequence. Among the 3' RNA-seq protocols, we examined a low-cost method Lasy-Seq using an allohexaploid bread wheat, Triticum aestivum. HISAT2 showed the best performance for 3' RNA-seq with the least mapping errors and quick computational time. The number of detected homeologs was further improved by extending 1 kb of the 3' UTR annotation. Differentially expressed genes in response to mild cold treatment detected by the 3' RNA-seq were verified with high-coverage conventional RNA-seq, although the latter detected more differentially expressed genes. Finally, downsampling showed that even a 2 million sequencing depth can still detect more than half of expressed homeologs identifiable by the conventional 32 million reads. These data demonstrate that this low-cost 3' RNA-seq facilitates large-scale transcriptomic studies of allohexaploid wheat and indicate the potential application to other allopolyploid species.
Affiliation(s)
- Jianqiang Sun
  - Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, 3-1-1 Kannondai, Tsukuba, Ibaraki 305-8517, Japan
- Moeko Okada
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
  - Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka, Totsuka-ward, Yokohama, Kanagawa 244-0813, Japan
- Toshiaki Tameshige
  - Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka, Totsuka-ward, Yokohama, Kanagawa 244-0813, Japan
  - Division of Biological Sciences, Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5, Takayama-cho, Ikoma, Nara 630-0192, Japan
- Rie Shimizu-Inatsugi
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Reiko Akiyama
  - Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Atsushi J Nagano
  - Faculty of Agriculture, Ryukoku University, Yokotani 1-5, Seta Ohe-cho, Otsu, Shiga 520-2194, Japan
  - Institute for Advanced Biosciences, Keio University, 403-1 Nipponkoku, Daihouji, Tsuruoka, Yamagata 997-0017, Japan
- Jun Sese
  - Humanome Lab, Inc., 2-4-10, Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
6. McCoy MJ, Fire AZ. Ancient origins of complex neuronal genes. bioRxiv 2023:2023.03.28.534655. [PMID: 37034725 PMCID: PMC10081198 DOI: 10.1101/2023.03.28.534655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
How nervous systems evolved is a central question in biology. An increasing diversity of synaptic proteins is thought to play a central role in the formation of specific synapses leading to nervous system complexity. The largest animal genes, often spanning millions of base pairs, are known to be enriched for expression in neurons at synapses and are frequently mutated or misregulated in neurological disorders and diseases. While many of these genes have been studied independently in the context of nervous system evolution and disease, general principles underlying their parallel evolution remain unknown. To investigate this, we directly compared orthologous gene sizes across eukaryotes. By comparing relative gene sizes within organisms, we identified a distinct class of large genes with origins predating the diversification of animals and in many cases the emergence of dedicated neuronal cell types. We traced this class of ancient large genes through evolution and found orthologs of the large synaptic genes driving the immense complexity of metazoan nervous systems, including in humans and cephalopods. Moreover, we found that while these genes are evolving under strong purifying selection as demonstrated by low dN/dS scores, they have simultaneously grown larger and gained the most isoforms in animals. This work provides a new lens through which to view this distinctive class of large and multi-isoform genes and demonstrates how intrinsic genomic properties, such as gene length, can provide flexibility in molecular evolution and allow groups of genes and their host organisms to evolve toward complexity.
Affiliation(s)
- Matthew J. McCoy
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Whitman Center, Marine Biological Laboratory, Woods Hole, MA 02543, USA
- Andrew Z. Fire
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
7. Tao W, Bian J, Tang M, Zeng Y, Luo R, Ke Q, Li T, Li Y, Cui L. Genomic insights into positive selection during barley domestication. BMC Plant Biol 2022; 22:267. [PMID: 35641942 PMCID: PMC9158214 DOI: 10.1186/s12870-022-03655-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/19/2022] [Accepted: 05/23/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND: Cultivated barley (Hordeum vulgare) is widely used in animal feed, beverages, and foods and has become a model crop for molecular evolutionary studies. Few studies have examined the evolutionary fates of different types of genes in barley during the domestication process. RESULTS: The ratios of nonsynonymous (Ka) to synonymous (Ks) substitution rates were calculated by comparing orthologous genes in different barley groups (wild vs. landrace and landrace vs. improved cultivar). The rates of evolution, properties, expression patterns, and diversity of positively selected genes (PSGs) and negatively selected genes (NSGs) were compared. PSGs evolved more rapidly, possessed fewer exons, and had lower GC content than NSGs; they were also shorter and had shorter intron, exon, and first exon lengths. Expression levels were lower, the tissue specificity of expression was higher, and codon usage bias was weaker for PSGs than for NSGs. Nucleotide diversity analysis revealed that PSGs have undergone a more severe genetic bottleneck than NSGs. Several candidate PSGs were involved in plant growth and development, which might make them excellent targets for the molecular breeding of barley. CONCLUSIONS: Our comprehensive analysis of the evolutionary, structural, and functional divergence between PSGs and NSGs in barley provides new insight into the evolutionary trajectory of barley during domestication. Our findings also aid future functional studies of PSGs in barley.
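As a rough illustration of how a Ka/Ks ratio is conventionally read, the sketch below labels a gene pair by comparing the ratio with 1. The cutoff of exactly 1 is the textbook convention and is an assumption here; the abstract does not state the criteria the authors used, and the function name `classify_selection` is hypothetical.

```python
def classify_selection(ka, ks):
    """Label an ortholog pair from its Ka and Ks estimates.

    Textbook convention (an assumption, not the paper's stated criteria):
    Ka/Ks > 1 suggests positive selection, Ka/Ks < 1 suggests negative
    (purifying) selection, and Ka/Ks == 1 is consistent with neutrality.
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous change is observed.
        return "undefined"
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "negative"
    return "neutral"


# Example: twice as many nonsynonymous as synonymous substitutions per site.
print(classify_selection(0.02, 0.01))
```

In practice such labels are usually backed by a statistical test on the ratio, which this sketch omits.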
Affiliation(s)
- Wenjing Tao
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Jianxin Bian
  - Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong, 261325 China
- Minqiang Tang
  - College of Forestry, Hainan University, Haikou, Hainan, 570228 China
- Yan Zeng
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Ruihan Luo
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Qinglin Ke
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Tingting Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Yihan Li
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
- Licao Cui
  - College of Bioscience and Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi, 330045 China
8. Extended intergenic DNA contributes to neuron-specific expression of neighboring genes in the mammalian nervous system. Nat Commun 2022; 13:2733. [PMID: 35585070 PMCID: PMC9117226 DOI: 10.1038/s41467-022-30192-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 04/20/2022] [Indexed: 11/08/2022] [Open access]
Abstract
Mammalian genomes comprise largely intergenic noncoding DNA with numerous cis-regulatory elements. Whether and how the size of intergenic DNA affects gene expression in a tissue-specific manner remain unknown. Here we show that genes with extended intergenic regions are preferentially expressed in neural tissues but repressed in other tissues in mice and humans. Extended intergenic regions contain twice as many active enhancers in neural tissues compared to other tissues. Neural genes with extended intergenic regions are globally co-expressed with neighboring neural genes controlled by distinct enhancers in the shared intergenic regions. Moreover, generic neural genes expressed in multiple tissues have significantly longer intergenic regions than neural genes expressed in fewer tissues. The intergenic regions of the generic neural genes have many tissue-specific active enhancers containing distinct transcription factor binding sites specific to each neural tissue. We also show that genes with extended intergenic regions are enriched for neural genes only in vertebrates. The expansion of intergenic regions may reflect the regulatory complexity of tissue-type-specific gene expression in the nervous system.
9. Li Z, Zhang Y, Li W, Irwin AJ, Finkel ZV. Conservation and architecture of housekeeping genes in the model marine diatom Thalassiosira pseudonana. New Phytol 2022; 234:1363-1376. [PMID: 35179783 DOI: 10.1111/nph.18039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 02/06/2022] [Indexed: 06/14/2023]
Abstract
Housekeeping genes (HKGs) are constitutively expressed with low variation across tissues/conditions. They are thought to be highly conserved and fundamental to cellular maintenance, with distinctive genomic features. Here, we identify 1505 HKGs in the unicellular marine diatom Thalassiosira pseudonana based on an RNA-seq analysis of 232 samples taken under 12 experimental conditions over 0-72 h. We identify promising internal reference genes (IRGs) for T. pseudonana from the most stably expressed HKGs. A comparative analysis indicates < 18% of HKGs in T. pseudonana have orthologs in other eukaryotes, including other diatom species. Contrary to work on human tissues, T. pseudonana HKGs are longer than non-HKGs, due to elongated introns. More ancient HKGs tend to be shorter than more recent HKGs, and expression levels of HKGs decrease more rapidly with gene length relative to non-HKGs. Our results indicate that HKGs are highly variable across the tree of life and thus unlikely to be universally fundamental for cellular maintenance. We hypothesize that the distinct genomic features of HKGs of T. pseudonana may be a consequence of selection pressures associated with high expression and low variance across conditions.
Affiliation(s)
- Zhengke Li
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Weiyang University Park, Xi'an, Shaanxi, 710021, China
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
- Yong Zhang
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
  - College of Environmental Science and Engineering, Fujian Key Laboratory of Pollution Control and Resource Recycling, Fujian Normal University, No. 8 Shangsan Road, Fuzhou, Fujian, 350007, China
- Wei Li
  - College of Life and Environmental Sciences, Huangshan University, 39 Xihai Road, Huangshan, Anhui, 245041, China
- Andrew J Irwin
  - Department of Mathematics & Statistics, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
- Zoe V Finkel
  - Department of Oceanography, Dalhousie University, 1355 Oxford St, Halifax, NS, B3H 4R2, Canada
10. Mukherjee D, Saha D, Acharya D, Mukherjee A, Ghosh TC. Interplay between gene expression and gene architecture as a consequence of gene and genome duplications: evidence from metabolic genes of Arabidopsis thaliana. Physiol Mol Biol Plants 2022; 28:1091-1108. [PMID: 35722515 PMCID: PMC9203644 DOI: 10.1007/s12298-022-01188-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 05/16/2022] [Accepted: 05/18/2022] [Indexed: 05/03/2023]
Abstract
Gene and genome duplications have been widespread during the evolution of flowering plants, increasing biological complexity and creating genomic plasticity that helps species adapt to changing environments. Duplicated genes with higher evolutionary rates can act as a mechanism for generating novel functions in secondary metabolism. In this study, we explored duplication as a potential factor governing the expression heterogeneity and gene architecture of Primary Metabolic Genes (PMGs) and Secondary Metabolic Genes (SMGs) of Arabidopsis thaliana. Remarkably, different types of duplication processes controlled gene expression and tissue specificity differently in PMGs and SMGs. A complex relationship exists between gene architecture and expression patterns of primary and secondary metabolic genes. Our study shows that the expression heterogeneity and gene structure variation of primary and secondary metabolism in Arabidopsis thaliana are partly the result of duplication events of different origins. Our study suggests that duplication has a differential effect on PMGs and SMGs regarding expression pattern by controlling gene structure, epigenetic modifications, multifunctionality, and subcellular compartmentalization. This study provides insight into the evolution of metabolism in plants in the light of gene- and genome-scale duplication.
Affiliation(s)
- Dola Mukherjee
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Deeya Saha
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Debarun Acharya
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
- Ashutosh Mukherjee
  - Department of Botany, Vivekananda College, 269, Diamond Harbour Road, Thakurpukur, Kolkata, West Bengal 700063 India
- Tapash Chandra Ghosh
  - Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata, 700 054 India
11. Costly circRNAs, Effective Population Size, and the Origins of Molecular Complexity. J Mol Evol 2021; 89:598-600. [PMID: 34698879 DOI: 10.1007/s00239-021-10033-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2021] [Accepted: 10/16/2021] [Indexed: 10/20/2022]
Abstract
While much excitement has attended the discovery and study of circular RNAs, a new study in Cell Reports suggests that most mammalian circRNAs are not only functionless, but in fact costly. Comparison across three species is also consistent with the influential but rarely tested Drift-Barrier Hypothesis of molecular complexity. According to this hypothesis, nonessential genomic elements are slightly deleterious elements that fix by genetic drift and, thus, are generally more abundant in species with small effective population sizes. I discuss the implications of these new results for the Drift-Barrier Hypothesis. In particular, I note the distinction between two classes of genomic elements, based on whether they are created by 'standard' small-scale mutations (basepair substitutions, indels, etc.) or larger, more idiosyncratic mutations (segmental duplications, transposable element propagation, etc.). I suggest that the Drift-Barrier Hypothesis is likely to apply to the former class, but perhaps not the latter class.
12. Xu W, Li Y, Li Y, Liu C, Wang Y, Xia G, Wang M. Asymmetric Somatic Hybridization Affects Synonymous Codon Usage Bias in Wheat. Front Genet 2021; 12:682324. [PMID: 34178040 PMCID: PMC8226224 DOI: 10.3389/fgene.2021.682324] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/18/2021] [Accepted: 05/07/2021] [Indexed: 11/24/2022] [Open access]
Abstract
Asymmetric somatic hybridization is an efficient strategy for crop breeding by introducing exogenous chromatin fragments, which leads to whole genomic shock and local chromosomal shock that induce genome-wide genetic variation, including indels (insertions and deletions) and nucleotide substitutions. Nucleotide substitution causes synonymous codon usage bias (SCUB), an indicator of genomic mutation and natural selection. However, how asymmetric somatic hybridization affects SCUB has not been addressed. Here, we explored this issue by comparing expressed sequence tags of a common wheat cultivar and its asymmetric somatic hybrid line. Asymmetric somatic hybridization affected SCUB and promoted the bias toward A- and T-ending synonymous codons (SCs). SCUB frequencies in chromosomes introgressed with exogenous fragments were comparable to those in chromosomes without exogenous fragments, showing that exogenous fragments had no local chromosomal effect. Asymmetric somatic hybridization affected SCUB frequencies in indel-flanking sequences more strongly than in non-flanking sequences, and this stronger effect was present in chromosomes both with and without exogenous fragments. The DNA methylation-driven SCUB shift was more pronounced than that of other SC pairs. The SCUB shift was similar among seven groups of allelic chromosomes as well as three sub-genomes. Our work demonstrates that the SCUB shift induced by asymmetric somatic hybridization is attributable to whole genomic shock, and that DNA methylation is a putative force behind the SCUB shift during asymmetric somatic hybridization. Asymmetric somatic hybridization provides a method for deepening our understanding of the nature of the SCUB shift and the genetic variation induced by genomic shock.
Affiliation(s)
- Wenjing Xu
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
- Yingchun Li
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
- Yajing Li
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
- Chun Liu
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
- Yanxia Wang
  - Shijiazhuang Academy of Agriculture and Forestry Sciences, Shijiazhuang, China
- Guangmin Xia
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
- Mengcheng Wang
  - The Key Laboratory of Plant Development and Environmental Adaption, Ministry of Education, School of Life Science, Shandong University, Jinan, China
|
13
|
Role of Gene Length in Control of Human Gene Expression: Chromosome-Specific and Tissue-Specific Effects. Int J Genomics 2021; 2021:8902428. [PMID: 33688492 PMCID: PMC7911607 DOI: 10.1155/2021/8902428] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/05/2020] [Revised: 01/12/2021] [Accepted: 02/03/2021] [Indexed: 11/19/2022]
Abstract
This study was carried out to pursue the observation that the level of gene expression is affected by gene length in the human genome. As transcription is a time-dependent process, it is expected that gene expression will be inversely related to gene length, and this is found to be the case. Here, I describe the results of studies performed to test whether the gene length/gene expression linkage is affected by two factors, the chromosome where the gene is located and the tissue where it is expressed. Studies were performed with a database of 3538 human genes that were divided into short, midlength, and long groups. Chromosome groups were then compared in the expression level of genes with the same length. A similar analysis was performed with 19 human tissues. Tissue-specific groups were compared in the expression level of genes with the same length. Both chromosome and tissue studies revealed new information about the role of gene length in control of gene expression. Chromosome studies led to the identification of two chromosome populations that differ in the expression level of short genes. A high level of expression was observed in chromosomes 2-10, 12-15, and 18 and a low level in 1, 11, 16-17, 19-20, 22, and 24. Studies with tissue-specific genes led to the identification of two tissues, brain and liver, which differ in the expression level of short genes. The results are interpreted to support the view that the level of a gene's expression can be affected by the chromosome and the tissue where the gene is transcribed.
|
14
|
Vihinen M. Functional effects of protein variants. Biochimie 2020; 180:104-120. [PMID: 33164889 DOI: 10.1016/j.biochi.2020.10.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/30/2020] [Revised: 10/15/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
Abstract
Genetic and other variations frequently affect protein functions. Scientific articles can contain confusing descriptions about which function or property is affected, and in many cases the statements are pure speculation without any experimental evidence. To clarify functional effects of protein variations of genetic or non-genetic origin, a systematic conceptualisation and framework are introduced. This framework describes protein functional effects on abundance, activity, specificity and affinity, along with countermeasures, which allow cells, tissues and organisms to tolerate, avoid, repair, attenuate or resist (TARAR) the effects. Effects on abundance discussed include gene dosage, restricted expression, mis-localisation and degradation. Enzymopathies, effects on kinetics, allostery and regulation of protein activity are subtopics for the effects of variants on activity. Variation outcomes on specificity and affinity comprise promiscuity, specificity, affinity and moonlighting. TARAR mechanisms redress variations with active and passive processes including chaperones, redundancy, robustness, canalisation and metabolic and signalling rewiring. A framework for pragmatic protein function analysis and presentation is introduced. All of the mechanisms and effects are described along with representative examples, most often in relation to diseases. In addition, protein function is discussed from an evolutionary point of view. Application of the presented framework facilitates unambiguous, detailed and specific description of functional effects and their systematic study.
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, BMC B13, Lund University, SE-22 184, Lund, Sweden.
|
15
|
Cao Y, Jiang L, Wang L, Cai Y. Evolutionary Rate Heterogeneity and Functional Divergence of Orthologous Genes in Pyrus. Biomolecules 2019; 9:490. [PMID: 31527450 PMCID: PMC6770726 DOI: 10.3390/biom9090490] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/23/2019] [Revised: 09/09/2019] [Accepted: 09/12/2019] [Indexed: 11/21/2022]
Abstract
Most nuclear protein-coding genes in organisms fall into two types: negatively selected genes (NSGs) and positively selected genes (PSGs). However, the evolutionary rates and characteristics of these different types of genes remain poorly understood. In the present study, we investigated the rates of synonymous substitution (Ks) and the rates of non-synonymous substitution (Ka) by comparing the orthologous genes of two sequenced Pyrus species, Pyrus bretschneideri and Pyrus communis. Subsequently, we compared the evolutionary rates, gene structures, and expression profiles during fruit development between PSGs and NSGs. Compared with the NSGs, the PSGs have fewer exons, shorter gene length, lower synonymous substitution rates, and higher evolutionary rates. Remarkably, gene expression patterns in the fruit of the two Pyrus species indicated functional divergence for most of the orthologous genes derived from a common ancestor, and subfunctionalization for some of them. Overall, the present study shows that PSGs differ from NSGs not only in selective pressure (Ka/Ks), but also in their structural, functional, and evolutionary properties. Additionally, our data provide important insights into the evolution, and highlight the diversification, of orthologous genes in two Pyrus species.
Affiliation(s)
- Yunpeng Cao
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, China.
  - School of Life Sciences, Anhui Agricultural University, Hefei 230036, China.
- Lan Jiang
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, China.
- Lihu Wang
  - College of Landscape and Ecological Engineering, Hebei University of Engineering, Handan 056038, China.
- Yongping Cai
  - School of Life Sciences, Anhui Agricultural University, Hefei 230036, China.
|
16
|
Memon D, Bi J, Miller CJ. In silico prediction of housekeeping long intergenic non-coding RNAs reveals HKlincR1 as an essential player in lung cancer cell survival. Sci Rep 2019; 9:7372. [PMID: 31089191 PMCID: PMC6517443 DOI: 10.1038/s41598-019-43758-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/28/2019] [Accepted: 04/29/2019] [Indexed: 12/27/2022]
Abstract
Prioritising long intergenic noncoding RNAs (lincRNAs) for functional characterisation is a significant challenge. Here we applied computational approaches to discover lincRNAs expected to play a critical housekeeping (HK) role within the cell. Using the Illumina Human BodyMap RNA sequencing dataset as a starting point, we first identified lincRNAs ubiquitously expressed across a panel of human tissues. This list was then further refined by reference to conservation score, secondary structure and promoter DNA methylation status. Finally, we used tumour expression and copy number data to identify lincRNAs rarely downregulated or deleted in multiple tumour types. The resulting list of candidate essential lincRNAs was then subjected to co-expression analyses using independent data from ENCODE and The Cancer Genome Atlas (TCGA). This identified a substantial subset with a predicted role in DNA replication and cell cycle regulation. One of these, HKlincR1, was selected for further characterisation. Depletion of HKlincR1 affected cell growth in multiple lung cancer cell lines, and led to disruption of genes involved in cell growth and viability. In addition, HKlincR1 expression was correlated with overall survival in lung adenocarcinoma patients. Our in silico studies therefore reveal a set of housekeeping noncoding RNAs of interest both in terms of their role in normal homeostasis, and their relevance in tumour growth and maintenance.
Affiliation(s)
- Danish Memon
  - RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK
  - European Bioinformatics Institute (EMBL-EBI)/Cancer Research UK Cambridge Institute, The University of Cambridge, Cambridge, UK
- Jing Bi
  - RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK
- Crispin J Miller
  - RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK.
|
17
|
Huang X, Li S, Zhan A. Genome-Wide Identification and Evaluation of New Reference Genes for Gene Expression Analysis Under Temperature and Salinity Stresses in Ciona savignyi. Front Genet 2019; 10:71. [PMID: 30809246 PMCID: PMC6380166 DOI: 10.3389/fgene.2019.00071] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/26/2018] [Accepted: 01/28/2019] [Indexed: 01/14/2023]
Abstract
Rapid adaptation/accommodation to changing environments largely contributes to maximal survival of invaders during biological invasions, usually leading to success in crossing multiple barriers and finally in varied environments in recipient habitats. Changes in gene expression are one of the most important and rapid responses to environmental stresses. Selection of proper reference genes is the crucial prerequisite for gene expression analysis using the common approach, real-time quantitative PCR (RT-qPCR). Here we identified eight candidate novel reference genes from RNA-Seq data in an invasive model ascidian, Ciona savignyi, under temperature and salinity stresses. Subsequently, the expression stability of these eight novel reference genes, as well as six other traditionally used reference genes, was evaluated using RT-qPCR and the comprehensive tool RefFinder. Under temperature stress, two traditional reference genes, ribosomal proteins S15 and L17 (RPS15, RPL17), and one novel gene, Ras homolog A (RhoA), were recommended as the top three stable genes, which can be used to normalize target genes with a high and moderate expression level, respectively. Under salinity stress, transmembrane 9 superfamily member (TMN), MOB kinase activator 1A-like gene (MOB) and ubiquitin-conjugating enzyme (UBQ2) were suggested as the top three stable genes. On the other hand, several commonly used reference genes such as α-tubulin (TubA), β-tubulin (TubB) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH) showed unstable expression; thus, these genes should not be used as internal controls for gene expression analysis. We also tested the expression level of an important stress response gene, large proline-rich protein bag6-like gene (BAG), using different reference genes. As expected, we observed different results and conclusions when using different normalization methods, suggesting the importance of selecting proper reference genes and associated normalization methods. Our results provide a valuable reference gene resource for the normalization of gene expression in the study of environmental adaptation/accommodation during biological invasions using C. savignyi as a model.
Affiliation(s)
- Xuena Huang
  - Key Laboratory of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
- Shiguo Li
  - Key Laboratory of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
- Aibin Zhan
  - Key Laboratory of Environmental Biotechnology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing, China
|
18
|
Rigau M, Juan D, Valencia A, Rico D. Intronic CNVs and gene expression variation in human populations. PLoS Genet 2019; 15:e1007902. [PMID: 30677042 PMCID: PMC6345438 DOI: 10.1371/journal.pgen.1007902] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 05/15/2018] [Accepted: 12/17/2018] [Indexed: 11/19/2022]
Abstract
Introns can be extraordinarily large and they account for the majority of the DNA sequence in human genes. However, little is known about their population patterns of structural variation and their functional implication. By combining the most extensive maps of CNVs in human populations, we have found that intronic losses are the most frequent copy number variants (CNVs) in protein-coding genes in humans, with 12,986 intronic deletions affecting 4,147 genes (including 1,154 essential genes and 1,638 disease-related genes). This intronic length variation results in dozens of genes showing extreme population variability in size, with 40 genes with 10 or more different sizes and up to 150 allelic sizes. Intronic losses are frequent in evolutionarily ancient genes that are highly conserved at the protein sequence level. This result contrasts with losses overlapping exons, which are observed less often than expected by chance and almost exclusively affect primate-specific genes. An integrated analysis of CNVs and RNA-seq data showed that intronic loss can be associated with significant differences in gene expression levels in the population (CNV-eQTLs). These intronic CNV-eQTL regions are enriched for intronic enhancers and can be associated with expression differences of other genes showing long-distance intron-promoter 3D interactions. Our data suggest that intronic structural variation of protein-coding genes makes an important contribution to the variability of gene expression and splicing in human populations.
Affiliation(s)
- Maria Rigau
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
- David Juan
  - Institut de Biologia Evolutiva, Consejo Superior de Investigaciones Científicas–Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Daniel Rico
  - Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
|
19
|
Wang M, Ji Y, Feng S, Liu C, Xiao Z, Wang X, Wang Y, Xia G. The non-random patterns of genetic variation induced by asymmetric somatic hybridization in wheat. BMC Plant Biol 2018; 18:244. [PMID: 30332989 PMCID: PMC6192298 DOI: 10.1186/s12870-018-1474-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/30/2018] [Accepted: 10/05/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND: Asymmetric somatic hybridization is an efficient crop breeding approach that introduces several exogenous chromatin fragments, which leads to genomic shock and therefore induces genome-wide genetic variation. However, fundamental questions concerning this genetic variation, such as whether it occurs randomly and is subject to selection pressure, remain unanswered. RESULTS: Here, we explored this issue by comparing expressed sequence tags of a common wheat cultivar and its asymmetric somatic hybrid line. Both nucleotide substitutions and indels (insertions and deletions) had lower frequencies in coding sequences than in untranslated regions. The frequencies of nucleotide substitutions and indels were both comparable between chromosomes with and without introgressed fragments. Nucleotide substitutions were distributed unevenly and occurred preferentially in indel-flanking sequences, and the frequency of nucleotide substitutions in the 5'-flanking sequences of indels was markedly higher in chromosomes with introgressed fragments than in those without exogenous fragments. The frequencies of nucleotide substitutions and indels both varied among the seven groups of allelic chromosomes, and the frequencies of nucleotide substitutions were strongly negatively correlated with those of indels. Among the three sets of genomes, the frequencies of nucleotide substitutions and indels were both heterogeneous, and the frequencies of nucleotide substitutions were strongly positively correlated with those of indels. CONCLUSIONS: Our work demonstrates that the genetic variation induced by asymmetric somatic hybridization is attributable to both whole genomic shock and local chromosomal shock, and is a predetermined, non-random genetic event closely associated with selection pressure. Asymmetric somatic hybrids provide a worthwhile model for further investigating the nature of genetic variation induced by genomic shock.
Affiliation(s)
- Mengcheng Wang
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
- Yujie Ji
  - College of Veterinary Medicine, Nanjing Agricultural University, Nanjing, 210095 China
- Shiting Feng
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
- Chun Liu
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
- Zhen Xiao
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
- Xiaoping Wang
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
- Yanxia Wang
  - Shijiazhuang Academy of Agriculture and Forestry Sciences, Shijiazhuang, 050041 China
- Guangmin Xia
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, Ministry of Education, School of Life Science, Shandong University, 27 Shandanan Road, Jinan, Shandong 250100 People’s Republic of China
|
20
|
Mukherjee D, Saha D, Acharya D, Mukherjee A, Chakraborty S, Ghosh TC. The role of introns in the conservation of the metabolic genes of Arabidopsis thaliana. Genomics 2018; 110:310-317. [DOI: 10.1016/j.ygeno.2017.12.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/26/2017] [Revised: 12/06/2017] [Accepted: 12/08/2017] [Indexed: 10/18/2022]
|
21
|
Wei K, Zhang T, Ma L. Divergent and convergent evolution of housekeeping genes in human-pig lineage. PeerJ 2018; 6:e4840. [PMID: 29844985 PMCID: PMC5971102 DOI: 10.7717/peerj.4840] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/04/2017] [Accepted: 05/03/2018] [Indexed: 11/27/2022]
Abstract
Housekeeping genes are ubiquitously expressed and maintain basic cellular functions across tissue/cell type conditions. The present study aimed to develop a set of pig housekeeping genes and compare the structure, evolution and function of housekeeping genes in the human–pig lineage. By using RNA sequencing data, we identified 3,136 pig housekeeping genes. Compared with human housekeeping genes, we found that pig housekeeping genes were longer and subjected to slightly weaker purifying selection pressure and faster neutral evolution. Common housekeeping genes, shared by the two species, achieve stronger purifying selection than species-specific genes. However, pig- and human-specific housekeeping genes have similar functions. Some species-specific housekeeping genes have evolved independently to form similar protein active sites or structure, such as the classical catalytic serine–histidine–aspartate triad, implying that they have converged for maintaining the basic cellular function, which allows them to adapt to the environment. Human and pig housekeeping genes have varied structures and gene lists, but they have converged to maintain basic cellular functions essential for the existence of a cell, regardless of its specific role in the species. The results of our study shed light on the evolutionary dynamics of housekeeping genes.
Affiliation(s)
- Kai Wei
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
- Tingting Zhang
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
- Lei Ma
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
|
22
|
Chang YC, Ding Y, Dong L, Zhu LJ, Jensen RV, Hsiao LL. Differential expression patterns of housekeeping genes increase diagnostic and prognostic value in lung cancer. PeerJ 2018; 6:e4719. [PMID: 29761043 PMCID: PMC5949062 DOI: 10.7717/peerj.4719] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/08/2018] [Accepted: 04/16/2018] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Using DNA microarrays, we previously identified 451 genes expressed in 19 different human tissues. Although ubiquitously expressed, the variable expression patterns of these "housekeeping genes" (HKGs) could separate one normal human tissue type from another. Current focus on identifying "specific disease markers" is problematic as single gene expression in a given sample represents the specific cellular states of the sample at the time of collection. In this study, we examine the diagnostic and prognostic potential of the variable expression of HKGs in lung cancers. METHODS: Microarray and RNA-seq data for normal lungs, lung adenocarcinomas (AD), squamous cell carcinomas of the lung (SQCLC), and small cell carcinomas of the lung (SCLC) were collected from online databases. Using 374 of 451 HKGs, differentially expressed genes between pairs of sample types were determined via a two-sided, homoscedastic t-test. Principal component analysis and hierarchical clustering classified normal lung and lung cancer subtypes according to relative gene expression variations. We used uni- and multivariate Cox regressions to identify significant predictors of overall survival in AD patients. Classifying genes were selected using a set of training samples and then validated using an independent test set. Gene Ontology was examined by PANTHER. RESULTS: This study showed that the differential expression patterns of 242, 245, and 99 HKGs were able to distinguish normal lung from AD, SCLC, and SQCLC, respectively. From these, 70 HKGs were common across the three lung cancer subtypes. These HKGs have low expression variation compared to current lung cancer markers (e.g., EGFR, KRAS) and were involved in the most common biological processes (e.g., metabolism, stress response). In addition, the expression pattern of 106 HKGs alone was a significant classifier of AD versus SQCLC. We further highlighted that a panel of 13 HKGs was an independent predictor of overall survival and cumulative risk in AD patients. DISCUSSION: Here we report that HKG expression patterns may be an effective tool for evaluation of lung cancer states. For example, the differential expression pattern of 70 HKGs alone can separate normal lung tissue from various lung cancers, while a panel of 106 HKGs was a capable class predictor of subtypes of non-small cell carcinomas. We also reported that HKGs have significantly lower variance compared to traditional cancer markers across samples, highlighting the robustness of a panel of genes over any one specific biomarker. Using RNA-seq data, we showed that the expression pattern of 13 HKGs is a significant, independent predictor of overall survival for AD patients. This reinforces the predictive power of an HKG panel across different gene expression measurement platforms. Thus, we propose that the expression patterns of HKGs alone may be sufficient for the diagnosis and prognosis of individuals with lung cancer.
Affiliation(s)
- Yu-Chun Chang
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
- Yan Ding
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
- Lingsheng Dong
  - Research Computing, Harvard Medical School, Boston, MA, United States of America
- Lang-Jing Zhu
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
  - Department of Nephrology, The Eighth Affiliated Hospital, Sun Yat-sen University, Shenzhen, China
- Roderick V. Jensen
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University (Virginia Tech), Blacksburg, United States of America
- Li-Li Hsiao
  - Division of Renal Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States of America
|
23
|
Pellicer J, Hidalgo O, Dodsworth S, Leitch IJ. Genome Size Diversity and Its Impact on the Evolution of Land Plants. Genes (Basel) 2018; 9:E88. [PMID: 29443885 PMCID: PMC5852584 DOI: 10.3390/genes9020088] [Citation(s) in RCA: 162] [Impact Index Per Article: 27.0] [Received: 01/10/2018] [Revised: 02/02/2018] [Accepted: 02/05/2018] [Indexed: 01/09/2023]
Abstract
Genome size is a biodiversity trait that shows staggering diversity across eukaryotes, varying over 64,000-fold. Of all major taxonomic groups, land plants stand out due to their staggering genome size diversity, ranging ca. 2400-fold. As our understanding of the implications and significance of this remarkable genome size diversity in land plants grows, it is becoming increasingly evident that this trait plays not only an important role in shaping the evolution of plant genomes, but also in influencing plant community assemblages at the ecosystem level. Recent advances and improvements in novel sequencing technologies, as well as analytical tools, make it possible to gain critical insights into the genomic and epigenetic mechanisms underpinning genome size changes. In this review we provide an overview of our current understanding of genome size diversity across the different land plant groups, its implications on the biology of the genome and what future directions need to be addressed to fill key knowledge gaps.
Affiliation(s)
- Jaume Pellicer
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
- Oriane Hidalgo
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
- Steven Dodsworth
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
- Ilia J Leitch
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew TW9 3DS, UK.
|
24
|
Nikolaou C. Invisible cities: segregated domains in the yeast genome with distinct structural and functional attributes. Curr Genet 2017; 64:247-258. [PMID: 28780612 DOI: 10.1007/s00294-017-0731-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/19/2017] [Revised: 07/31/2017] [Accepted: 08/02/2017] [Indexed: 02/07/2023]
Abstract
Recent advances in our understanding of the three-dimensional organization of the eukaryotic nucleus have rendered the spatial distribution of genes increasingly relevant. In a recent work (Tsochatzidou et al., Nucleic Acids Res 45:5818-5828, 2017), we proposed the existence of a functional compartmentalization of the yeast genome, according to which genes occupying the chromosomal regions at the nuclear periphery have distinct structural, functional and evolutionary characteristics compared to their centromere-proximal counterparts. Around the same time, it was also shown that the genome of Saccharomyces cerevisiae is organized in topologically associated domains (TADs), which are largely associated with replication timing. In this work, we proceed to investigate whether such units of three-dimensional genomic organization can be linked to transcriptional activity as a driving force for the shaping of genomic architecture. Through the application of a simple boundary-calling criterion to genome-wide 3C data, we define ~100 TAD-like domains, which can be clustered in six different classes with radically different nucleosomal organizations, significant variations in transcription factor binding and uneven chromosomal distribution. Approximately 20% of the genome is found to be confined in regions with "closed" chromatin structure around gene promoters. Most interestingly, we find both "open" and "closed" regions to be segregated, in the sense that they tend to avoid inter-chromosomal interactions. Our data further reinforce the notion of a marked compartmentalization of the yeast genome in isolated territories, with implications for its function and evolution.
Collapse
Affiliation(s)
- Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 70013, Herakleion, Greece.
25
Guo Y, Liu J, Zhang J, Liu S, Du J. Selective modes determine evolutionary rates, gene compactness and expression patterns in Brassica. Plant J 2017; 91:34-44. [PMID: 28332757 DOI: 10.1111/tpj.13541] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/28/2016] [Revised: 02/28/2017] [Accepted: 03/15/2017] [Indexed: 05/18/2023]
Abstract
It has been well documented that most nuclear protein-coding genes in organisms can be classified into two categories: positively selected genes (PSGs) and negatively selected genes (NSGs). The characteristics and evolutionary fates of different types of genes, however, have been poorly understood. In this study, the rates of nonsynonymous substitution (Ka) and the rates of synonymous substitution (Ks) were investigated by comparing the orthologs between the two sequenced Brassica species, Brassica rapa and Brassica oleracea, and the evolutionary rates, gene structures, expression patterns, and codon bias were compared between PSGs and NSGs. The resulting data show that PSGs have higher protein evolutionary rates, lower synonymous substitution rates, shorter gene length, fewer exons, higher functional specificity, lower expression level, higher tissue-specific expression and stronger codon bias than NSGs. Although the quantities and values are different, the relative features of PSGs and NSGs have been largely verified in the model species Arabidopsis. These data suggest that PSGs and NSGs differ not only in selective pressure (Ka/Ks), but also in their evolutionary, structural and functional properties, indicating that selective modes may serve as a determinant factor for measuring evolutionary rates, gene compactness and expression patterns in Brassica.
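As a rough illustration of how ortholog pairs might be sorted by selective mode, the sketch below labels each pair from its Ka/Ks ratio. The cutoff of 1 is the conventional one (Ka/Ks > 1 for positive selection, Ka/Ks < 1 for negative selection) and is assumed here, since the abstract does not state the criterion the authors applied; the gene names and substitution rates are hypothetical.

```python
def classify_by_selection(ka, ks, threshold=1.0):
    """Label an ortholog pair from its Ka and Ks values.

    The cutoff of 1.0 is the conventional choice and an assumption here;
    the abstract does not state the criterion the authors applied.
    """
    if ks <= 0:
        return "undefined"  # Ka/Ks cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio > threshold:
        return "PSG"  # positively selected
    if ratio < threshold:
        return "NSG"  # negatively (purifying) selected
    return "neutral"


# Hypothetical ortholog pairs: (name, Ka, Ks)
orthologs = [
    ("geneA", 0.30, 0.20),
    ("geneB", 0.05, 0.40),
    ("geneC", 0.10, 0.00),
]
labels = {name: classify_by_selection(ka, ks) for name, ka, ks in orthologs}
print(labels)
```

In practice Ka and Ks would come from a codon-aware alignment of each ortholog pair; that step is outside the scope of this sketch.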
Affiliation(s)
- Yue Guo
- Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Jing Liu
- Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Jiefu Zhang
- Key Laboratory of Cotton and Rapeseed, Ministry of Agriculture of People's Republic of China, Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Shengyi Liu
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, 430062, China
- Jianchang Du
- Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Key Laboratory of Cotton and Rapeseed, Ministry of Agriculture of People's Republic of China, Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, 430062, China
26
Corrales M, Rosado A, Cortini R, van Arensbergen J, van Steensel B, Filion GJ. Clustering of Drosophila housekeeping promoters facilitates their expression. Genome Res 2017; 27:1153-1161. [PMID: 28420691 PMCID: PMC5495067 DOI: 10.1101/gr.211433.116] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 06/15/2016] [Accepted: 04/12/2017] [Indexed: 11/25/2022]
Abstract
Housekeeping genes of animal genomes cluster in the same chromosomal regions. It has long been suggested that this organization contributes to their steady expression across all the tissues of the organism. Here, we show that the activity of Drosophila housekeeping gene promoters depends on the expression of their neighbors. By measuring the expression of ∼85,000 reporters integrated in Kc167 cells, we identified the best predictors of expression as chromosomal contacts with the promoters and terminators of active genes. Surprisingly, the chromatin composition at the insertion site and the contacts with enhancers were less informative. These results are substantiated by the existence of genomic “paradoxical” domains, rich in euchromatic features and enhancers, but where the reporters are expressed at low level, concomitant with a deficit of interactions with promoters and terminators. This indicates that the proper function of housekeeping genes relies not on contacts with long distance enhancers but on spatial clustering. Overall, our results suggest that spatial proximity between genes increases their expression and that the linear architecture of the Drosophila genome contributes to this effect.
Affiliation(s)
- Marc Corrales
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Aránzazu Rosado
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Ruggero Cortini
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Joris van Arensbergen
- Division of Gene Regulation, Netherlands Cancer Institute (NKI), 1066CX Amsterdam, The Netherlands
- Bas van Steensel
- Division of Gene Regulation, Netherlands Cancer Institute (NKI), 1066CX Amsterdam, The Netherlands
- Guillaume J Filion
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
27
Tarallo A, Gambi MC, D'Onofrio G. Lifestyle and DNA base composition in polychaetes. Physiol Genomics 2016; 48:883-888. [PMID: 27764763 DOI: 10.1152/physiolgenomics.00018.2016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/17/2016] [Accepted: 09/27/2016] [Indexed: 11/22/2022]
Abstract
A comparative analysis of polychaete species, classified as motile and low-motile forms, highlighted that the former were characterized not only by a higher metabolic rate (MR), but also by a higher genomic GC content. The fluctuation of both variables was not affected by the phylogenetic relationship of the species. Thus, the present results further support that a very active lifestyle affects MR and GC at the same time, showing an unexpected similarity between invertebrates and vertebrates. In teleosts, indeed, a similar pattern has also been observed in comparisons of migratory and nonmigratory species. A cause-effect link between MR and GC has not yet been proved, but the fact that the two variables are significantly linked in all the organisms so far analyzed is, most probably, of relevant biological and evolutionary meaning. The present results fit very well within the frame of the metabolic rate hypothesis proposed to explain the DNA base composition variability among organisms. On the contrary, the thermostability hypothesis was not supported. At present, no data about the recombination rate in polychaetes are available to test the biased gene conversion (BGC) hypothesis.
Affiliation(s)
- Andrea Tarallo
- Stazione Zoologica Anton Dohrn, Department of Biology and Evolution of Marine Organisms, Naples, Italy
- Maria Cristina Gambi
- Stazione Zoologica Anton Dohrn, Department of Integrative Marine Ecology (Villa Dohrn-Benthic Ecology Center), Ischia, Naples, Italy
- Giuseppe D'Onofrio
- Stazione Zoologica Anton Dohrn, Department of Biology and Evolution of Marine Organisms, Naples, Italy
28
Rivera-Casas C, González-Romero R, Vizoso-Vazquez Á, Cheema MS, Cerdán ME, Méndez J, Ausió J, Eirin-Lopez JM. Characterization of mussel H2A.Z.2: a new H2A.Z variant preferentially expressed in germinal tissues from Mytilus. Biochem Cell Biol 2016; 94:480-490. [DOI: 10.1139/bcb-2016-0056] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/04/2023]
Abstract
Histones are the fundamental constituents of the eukaryotic chromatin, facilitating the physical organization of DNA in chromosomes and participating in the regulation of its metabolism. The H2A family displays the largest number of variants among core histones, including the renowned H2A.X, macroH2A, H2A.B (Bbd), and H2A.Z. This latter variant is especially interesting because of its regulatory role and its differentiation into 2 functionally divergent variants (H2A.Z.1 and H2A.Z.2), further specializing the structure and function of vertebrate chromatin. In the present work we describe, for the first time, the presence of a second H2A.Z variant (H2A.Z.2) in the genome of a non-vertebrate animal, the mussel Mytilus. The molecular and evolutionary characterization of mussel H2A.Z.1 and H2A.Z.2 histones is consistent with their functional specialization, supported by sequence divergence at promoter and coding regions as well as by varying gene expression patterns. More precisely, the expression of H2A.Z.2 transcripts in gonadal tissue and its potential upregulation in response to genotoxic stress might mirror the specialization of this variant in DNA repair. Overall, the findings presented in this work complement recent reports describing the widespread presence of other histone variants across eukaryotes, supporting an ancestral origin and conserved role for histone variants in chromatin.
Affiliation(s)
- Ciro Rivera-Casas
- Chromatin Structure and Evolution (Chromevol) Group, Department of Biological Sciences, Florida International University, North Miami, FL 33181, USA
- Rodrigo González-Romero
- Chromatin Structure and Evolution (Chromevol) Group, Department of Biological Sciences, Florida International University, North Miami, FL 33181, USA
- Ángel Vizoso-Vazquez
- Exprela Group, Department of Cellular and Molecular Biology, University of A Coruña, A Coruña E15071, Spain
- Manjinder S. Cheema
- Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC V8W 3P6, Canada
- M. Esperanza Cerdán
- Exprela Group, Department of Cellular and Molecular Biology, University of A Coruña, A Coruña E15071, Spain
- Josefina Méndez
- Xenomar Group, Department of Cellular and Molecular Biology, University of A Coruña, A Coruña E15071, Spain
- Juan Ausió
- Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC V8W 3P6, Canada
- Jose M. Eirin-Lopez
- Chromatin Structure and Evolution (Chromevol) Group, Department of Biological Sciences, Florida International University, North Miami, FL 33181, USA
29
Biswas K, Chakraborty S, Podder S, Ghosh TC. Insights into the dN/dS ratio heterogeneity between brain specific genes and widely expressed genes in species of different complexity. Genomics 2016; 108:11-17. [DOI: 10.1016/j.ygeno.2016.04.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/04/2015] [Revised: 04/22/2016] [Accepted: 04/23/2016] [Indexed: 01/07/2023]
30
Yang L, Wang S, Zhou M, Chen X, Zuo Y, Sun D, Lv Y. Comparative analysis of housekeeping and tissue-selective genes in human based on network topologies and biological properties. Mol Genet Genomics 2016; 291:1227-1241. [PMID: 26897376 DOI: 10.1007/s00438-016-1178-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 10/26/2015] [Accepted: 01/26/2016] [Indexed: 01/14/2023]
Abstract
Housekeeping genes are genes that are turned on most of the time in almost every tissue to maintain cellular functions. Tissue-selective genes are predominantly expressed in one or a few biologically relevant tissue types. Benefitting from the massive gene expression microarray data obtained over the past decades, the properties of housekeeping and tissue-selective genes can now be investigated on a large scale. In this study, we analyzed the topological properties of housekeeping and tissue-selective genes in the protein-protein interaction (PPI) network. Furthermore, we compared the biological properties and amino acid usage between these two gene groups. The results indicated that there were significant differences in topological properties between housekeeping and tissue-selective genes in the PPI network, and that housekeeping genes had higher centrality properties and may play important roles in the complex biological network environment. We also found that there were significant differences in multiple biological properties and many amino acid compositions. Functional gene enrichment and subcellular localization analyses were also performed to characterize housekeeping and tissue-selective genes. The results indicated that the two gene groups showed significantly different enrichment in drug targets, disease genes and toxin targets, and were located in different subcellular localizations. Finally, the discrimination between the properties of the two gene groups was measured by the F-score, and expression stage was the most discriminative of all properties. These findings may elucidate the biological mechanisms underlying housekeeping and tissue-selective genes and may contribute to better annotation of housekeeping and tissue-selective genes in other organisms.
Affiliation(s)
- Lei Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Shiyuan Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Meng Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Xiaowen Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yongchun Zuo
- The National Research Center for Animal Transgenic Biotechnology, Inner Mongolia University, Hohhot, 010021, China
- Dianjun Sun
- Center for Endemic Disease Control, Chinese Center for Disease Control and Prevention, Harbin Medical University, Harbin, 150081, China.
- Yingli Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
31
Kang M, Wang J, Huang H. Nitrogen limitation as a driver of genome size evolution in a group of karst plants. Sci Rep 2015; 5:11636. [PMID: 26109237 PMCID: PMC4479984 DOI: 10.1038/srep11636] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 02/26/2015] [Accepted: 05/29/2015] [Indexed: 01/22/2023]
Abstract
Genome size is of fundamental biological importance with significance in predicting structural and functional attributes of organisms. Although abundant evidence has shown that the genome size can be largely explained by differential proliferation and removal of non-coding DNA of the genome, the evolutionary and ecological basis of genome size variation remains poorly understood. Nitrogen (N) and phosphorus (P) are essential elements of DNA and protein building blocks, yet often subject to environmental limitation in natural ecosystems. Using phylogenetic comparative methods, we test this hypothesis by determining whether leaf N and P availability affects genome sizes in 99 species of Primulina (Gesneriaceae), a group of soil specialists adapted to the limestone karst environment in south China. We find that genome sizes in Primulina are strongly positively correlated with plant N content, but the correlation with plant P content is not significant when phylogenetic history is taken into account. This study shows for the first time that N limitation might have been a plausible driver of genome size variation in a group of plants. We propose that competition for nitrogen between DNA synthesis and cellular functions is a possible mechanism for genome size evolution in Primulina under N-limitation.
Affiliation(s)
- Ming Kang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Jing Wang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Hongwen Huang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
32
Abstract
The search for human housekeeping (HK) genes has been a long quest since the emergence of transcriptomics, and is instrumental for understanding the structure of the genome and the fundamentals of biological processes. The resolved genes are frequently used in evolution studies and as normalization standards in quantitative gene-expression analysis. Within the past 20 years, more than a dozen HK-gene studies have been conducted, yet none of them sampled human tissues completely. We believe an integration of these results will help remove false positive genes owing to the inadequate sampling. Surprisingly, we find only one common gene across 15 examined HK-gene datasets comprising 187 different tissue and cell types. Our subsequent analyses suggest that it might not be appropriate to rigidly define HK genes as expressed in all tissue types that have diverse developmental, physiological, and pathological states. It might be beneficial to use more robustly identified HK functions as filtering criteria, in which the representing genes can be a subset of the genome. These genes are not necessarily the same, and perhaps need not be the same, everywhere in our body.
Affiliation(s)
- Yijuan Zhang
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Ding Li
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
- Bingyun Sun
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
33
Pingault L, Choulet F, Alberti A, Glover N, Wincker P, Feuillet C, Paux E. Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome. Genome Biol 2015; 16:29. [PMID: 25853487 PMCID: PMC4355351 DOI: 10.1186/s13059-015-0601-9] [Citation(s) in RCA: 86] [Impact Index Per Article: 9.6] [Received: 09/24/2014] [Accepted: 01/28/2015] [Indexed: 12/19/2022]
Abstract
Background: Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before.
Results: By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level.
Conclusions: Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
34
Chaurasia A, Tarallo A, Bernà L, Yagi M, Agnisola C, D’Onofrio G. Length and GC content variability of introns among teleostean genomes in the light of the metabolic rate hypothesis. PLoS One 2014; 9:e103889. [PMID: 25093416 PMCID: PMC4122358 DOI: 10.1371/journal.pone.0103889] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/21/2013] [Accepted: 07/07/2014] [Indexed: 01/30/2023]
Abstract
A comparative analysis of five teleostean genomes, namely zebrafish, medaka, three-spine stickleback, fugu and pufferfish, was performed with the aim of highlighting the nature of the forces driving both length and base composition of introns (i.e., bpi and GCi). An inter-genome approach using orthologous intronic sequences was carried out, analyzing independently both variables in pairwise comparisons. An average length shortening of introns was observed at increasing average GCi values. The result was not affected by masking transposable and repetitive elements harbored in the intronic sequences. The routine metabolic rate (mass-specific, temperature-corrected using the Boltzmann factor) was measured for each species. A significant correlation held between average differences of metabolic rate, length and GC content, while environmental temperature of fish habitat was not correlated with bpi and GCi. Analyzing the concomitant effect of both variables, i.e., bpi and GCi, at increasing genomic GC content, a decrease of bpi and an increase of GCi was observed for the significant majority of the intronic sequences (from ∼40% to ∼90%, in each pairwise comparison). The opposite event, concomitant increase of bpi and decrease of GCi, was counter-selected (from <1% to ∼10%, in each pairwise comparison). The results further support the hypothesis that the metabolic rate plays a key role in shaping genome architecture and evolution of vertebrate genomes.
Affiliation(s)
- Ankita Chaurasia
- Genome Evolution and Organization – Dept. Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, Italy
- Campus UAB - CRAG Bellaterra - Cerdanyola del Vallès, Barcelona, Spain
- Andrea Tarallo
- Genome Evolution and Organization – Dept. Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, Italy
- Luisa Bernà
- Genome Evolution and Organization – Dept. Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, Italy
- Molecular Biology Unit, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Mitsuharu Yagi
- Faculty of Fisheries, Nagasaki University, Bunkyo, Nagasaki, Japan
- Claudio Agnisola
- Department of Biological Sciences, University of Naples Federico II, Napoli, Italy
- Giuseppe D’Onofrio
- Genome Evolution and Organization – Dept. Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, Italy
35
Sachkova MY, Slavokhotova AA, Grishin EV, Vassilevski AA. Structure of the yellow sac spider Cheiracanthium punctorium genes provides clues to evolution of insecticidal two-domain knottin toxins. Insect Mol Biol 2014; 23:527-538. [PMID: 24717175 DOI: 10.1111/imb.12097] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 06/03/2023]
Abstract
Yellow sac spiders (Cheiracanthium punctorium, family Miturgidae) are unique in terms of venom composition, because, as we show here, two-domain toxins have replaced the usual one-domain peptides as the major constituents. We report the structure of the two-domain Che. punctorium toxins (CpTx), along with the corresponding cDNA and genomic DNA sequences. At least three groups of insecticidal CpTx were identified, each consisting of several members. Unlike many cone snail and snake toxins, accelerated evolution is not typical of cptx genes, which instead appear to be under the pressure of purifying selection. Both CpTx modules present the inhibitor cystine knot (ICK), or knottin signature; however, the sequence similarity between the domains is low. Conversely, notable similarity was found between separate domains of CpTx and one-domain toxins from spiders of the Lycosidae family. The observed chimerism is a landmark of exon shuffling events, but in contrast to many families of multidomain protein genes no introns were found in the cptx genes. Considering the possible scenarios, we suggest that an early transcription-mediated fusion event between two related one-domain toxin genes led to the emergence of a primordial cptx-like sequence. We conclude that evolution of toxin variability in spiders appears to be quite different from other venomous animals.
Affiliation(s)
- M Y Sachkova
- M.M. Shemyakin and Yu.A. Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russian Federation
36
Breugelmans B, Jex AR, Korhonen PK, Mangiola S, Young ND, Sternberg PW, Boag PR, Hofmann A, Gasser RB. Bioinformatic exploration of RIO protein kinases of parasitic and free-living nematodes. Int J Parasitol 2014; 44:827-836. [PMID: 25038443 DOI: 10.1016/j.ijpara.2014.06.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 05/01/2014] [Revised: 06/17/2014] [Accepted: 06/18/2014] [Indexed: 01/07/2023]
Abstract
Despite right open reading frame kinases (RIOKs) being essential for life, their functions, substrates and cellular pathways remain enigmatic. In the present study, gene structures were characterised for 26 RIOKs from draft genomes of parasitic and free-living nematodes. RNA-seq transcription profiles of riok genes were investigated for selected parasitic nematodes and showed that these kinases are transcribed in developmental stages that infect their mammalian host. Three-dimensional structural models of Caenorhabditis elegans RIOKs were predicted, elucidating functional domains and conserved regions in nematode homologs. These findings provide prospects for functional studies of riok genes in C. elegans, and an opportunity for the design and validation of nematode-specific inhibitors of these enzymes in socioeconomically important parasitic worms.
Affiliation(s)
- Bert Breugelmans
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Aaron R Jex
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Pasi K Korhonen
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Stefano Mangiola
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Neil D Young
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia
- Paul W Sternberg
- Howard Hughes Medical Institute (HHMI), Division of Biology, California Institute of Technology, Pasadena, CA, USA
- Peter R Boag
- Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
- Andreas Hofmann
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia; Structural Chemistry Program, Eskitis Institute, Griffith University, Brisbane, Australia
- Robin B Gasser
- Faculty of Veterinary Science, The University of Melbourne, Parkville, Victoria, Australia.
37
Kang M, Tao J, Wang J, Ren C, Qi Q, Xiang QY, Huang H. Adaptive and nonadaptive genome size evolution in Karst endemic flora of China. New Phytol 2014; 202:1371-1381. [PMID: 24533910 DOI: 10.1111/nph.12726] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 11/11/2013] [Accepted: 01/16/2014] [Indexed: 05/03/2023]
Abstract
Genome size variation is of fundamental biological importance and has been a longstanding puzzle in evolutionary biology. Several hypotheses for genome size evolution including neutral, maladaptive, and adaptive models have been proposed, but the relative importance of these models remains controversial. Primulina is a genus that is highly diversified in the Karst region of southern China, where genome size variation and the underlying evolutionary mechanisms are poorly understood. We reconstructed the phylogeny of Primulina using DNA sequences for 104 species and determined the genome sizes of 101 species. We examined the phylogenetic signal in genome size variation, and tested the fit to different evolutionary models and for correlations with variation in latitude and specific leaf area (SLA). The results showed that genome size, SLA and latitudinal variation all displayed strong phylogenetic signals, but were best explained by different evolutionary models. Furthermore, significant positive relationships were detected between genome size and SLA and between genome size and latitude. Our study is the first to investigate genome size evolution on such a comprehensive scale and in the Karst region flora. We conclude that genome size in Primulina is phylogenetically conserved but its variation among species is a combined outcome of both neutral and adaptive evolution.
Affiliation(s)
- Ming Kang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Junjie Tao
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- University of Chinese Academy of Sciences, Beijing, China
- Jing Wang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Chen Ren
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Qingwen Qi
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- University of Chinese Academy of Sciences, Beijing, China
- Qiu-Yun Xiang
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, 27695-7612, USA
- Hongwen Huang
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
38
Chiang AWT, Shaw GTW, Hwang MJ. Partitioning the human transcriptome using HKera, a novel classifier of housekeeping and tissue-specific genes. PLoS One 2013; 8:e83040. [PMID: 24376628 PMCID: PMC3869736 DOI: 10.1371/journal.pone.0083040] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/13/2013] [Accepted: 10/30/2013] [Indexed: 01/12/2023]
Abstract
High-throughput transcriptomic experiments have made it possible to classify genes that are ubiquitously expressed as housekeeping (HK) genes and those expressed only in selective tissues as tissue-specific (TS) genes. Although partitioning a transcriptome into HK and TS genes is conceptually problematic owing to the lack of precise definitions and gene expression profile criteria for the two, information on whether a gene is an HK or a TS gene can provide an initial clue to its cellular and/or functional role. Consequently, the development of novel HK (TS) classification methods has been a topic of considerable interest in post-genomics research. Here, we report such a development. Our method, called HKera, differs from the others by utilizing a novel property of HK genes that we have previously uncovered, namely that the ranking order of their expression levels, as opposed to the expression levels themselves, tends to be preserved from one tissue to another. Evaluated against multiple benchmark sets of human HK genes, including one recently derived from second generation sequencing data, HKera was shown to perform significantly better than five other classifiers that use different methodologies. An enrichment analysis of pathway and gene ontology annotations showed that HKera-predicted HK and TS genes have distinct functional roles and, together, cover most of the ontology categories. These results show that HKera is a good transcriptome partitioner that can be used to search for, and obtain useful expression and functional information for, novel HK (TS) genes.
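The rank-preservation property described in this abstract can be illustrated with a small sketch. This is not the HKera algorithm, whose details the abstract does not give; it only assumes that "preserved ranking order" between two tissues can be summarized by a Spearman rank correlation. The function names and the toy expression values are hypothetical.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical expression levels of four genes in two tissues.
tissue_a = [10, 50, 200, 800]
same_order = [3, 20, 90, 100]    # levels differ, ranking order preserved
shuffled_order = [700, 5, 60, 1]  # ranking order largely reversed

print(spearman(tissue_a, same_order))
print(spearman(tissue_a, shuffled_order))
```

In this toy case the first gene set keeps its ordering across tissues even though the absolute levels change, which is the kind of signal the abstract attributes to HK genes; the second set does not.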
Affiliation(s)
- Austin W. T. Chiang
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Grace T. W. Shaw
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Ming-Jing Hwang
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
|
39
|
Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet 2013; 29:569-74. [PMID: 23810203 DOI: 10.1016/j.tig.2013.05.010] [Citation(s) in RCA: 803] [Impact Index Per Article: 73.0] [Received: 04/15/2013] [Revised: 05/06/2013] [Accepted: 05/30/2013] [Indexed: 10/26/2022]
Abstract
Housekeeping genes are involved in basic cell maintenance and, therefore, are expected to maintain constant expression levels in all cells and conditions. Identification of these genes facilitates exposure of the underlying cellular infrastructure and increases understanding of various structural genomic features. In addition, housekeeping genes are instrumental for calibration in many biotechnological applications and genomic studies. Advances in our ability to measure RNA expression have resulted in a gradual increase in the number of identified housekeeping genes. Here, we describe housekeeping gene detection in the era of massive parallel sequencing and RNA-seq. We emphasize the importance of expression at a constant level and provide a list of 3804 human genes that are expressed uniformly across a panel of tissues. Several exceptionally uniform genes are singled out for future experimental use, such as RT-PCR control genes. Finally, we discuss both ways in which current technology can meet some of the past obstacles encountered, and several as yet unmet challenges.
Affiliation(s)
- Eli Eisenberg
- Raymond and Beverly Sackler School of Physics and Astronomy, Tel-Aviv University, Tel Aviv 69978, Israel.
|
40
|
Catania F, Lynch M. A simple model to explain evolutionary trends of eukaryotic gene architecture and expression: how competition between splicing and cleavage/polyadenylation factors may affect gene expression and splice-site recognition in eukaryotes. Bioessays 2013; 35:561-70. [PMID: 23568225 DOI: 10.1002/bies.201200127] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/18/2022]
Abstract
Enormous phylogenetic variation exists in the number and sizes of introns in protein-coding genes. Although some consideration has been given to the underlying role of the population-genetic environment in defining such patterns, the influence of the intracellular environment remains virtually unexplored. Drawing from observations on interactions between co-transcriptional processes involved in splicing and mRNA 3'-end formation, a mechanistic model is proposed for splice-site recognition that challenges the commonly accepted intron- and exon-definition models. Under the suggested model, splicing factors that outcompete 3'-end processing factors for access to intronic binding sites concurrently favor the recruitment of 3'-end processing factors at the pre-mRNA tail. This hypothesis sheds new light on observations such as the intron-mediated enhancement of gene expression and the negative correlation between intron length and levels of gene expression.
Affiliation(s)
- Francesco Catania
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
|
41
|
Zhang Q, Edwards SV. The evolution of intron size in amniotes: a role for powered flight? Genome Biol Evol 2013; 4:1033-43. [PMID: 22930760 PMCID: PMC3490418 DOI: 10.1093/gbe/evs070] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 12/28/2022]
Abstract
Intronic DNA is a major component of eukaryotic genes and genomes and can be subject to selective constraint and have functions in gene regulation. Intron size is of particular interest given that it is thought to be the target of a variety of evolutionary forces and has been suggested to be linked ultimately to various phenotypic traits, such as powered flight. Using whole-genome analyses and comparative approaches that account for phylogenetic nonindependence, we examined interspecific variation in intron size in three data sets encompassing from 12 to 30 amniote genomes and allowing for different levels of genome coverage. In addition to confirming that intron size is negatively associated with intron position and correlates with genome size, we found that on average mammals have longer introns than birds and nonavian reptiles, a trend that is correlated with the proliferation of repetitive elements in mammals. Two independent comparisons between flying and nonflying sister groups both showed a reduction of intron size in volant species, supporting an association between powered flight, or possibly the high metabolic rates associated with flight, and reduced intron/genome size. Small intron size in volant lineages is less easily explained as a neutral consequence of large effective population size. In conclusion, we found that the evolution of intron size in amniotes appears to be non-neutral, is correlated with genome size, and is likely influenced by powered flight and associated high metabolic rates.
Affiliation(s)
- Qu Zhang
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
|
42
|
Choi SS, Hannenhalli S. Three independent determinants of protein evolutionary rate. J Mol Evol 2013; 76:98-111. [PMID: 23400388 DOI: 10.1007/s00239-013-9543-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 11/09/2012] [Accepted: 01/16/2013] [Indexed: 12/15/2022]
Abstract
One of the most widely accepted ideas related to the evolutionary rates of proteins is that functionally important residues or regions evolve slower than other regions, a reasonable outcome of which should be a slower evolutionary rate of the proteins with a higher density of functionally important sites. Oddly, the role of functional importance, mainly measured by essentiality, in determining evolutionary rate has been challenged in recent studies. Several variables other than protein essentiality, such as expression level, gene compactness, protein-protein interactions, etc., have been suggested to affect protein evolutionary rate. In the present review, we try to refine the concept of functional importance of a gene, and consider three factors (functional importance, expression level, and gene compactness) as independent determinants of the evolutionary rate of a protein, based not only on their known correlation with evolutionary rate but also on a reasonable mechanistic model. We suggest a framework based on these mechanistic models to correctly interpret the correlations between evolutionary rates and the various variables as well as the interrelationships among the variables.
Affiliation(s)
- Sun Shim Choi
- Department of Medical Biotechnology, College of Biomedical Science, and Institute of Bioscience & Biotechnology, Kangwon National University, Chuncheon, South Korea.
|
43
|
Vinogradov AE. Large scale of human duplicate genes divergence. J Mol Evol 2012; 75:25-33. [PMID: 22922908 DOI: 10.1007/s00239-012-9516-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/19/2011] [Accepted: 08/03/2012] [Indexed: 01/25/2023]
Abstract
Proteome complexity increases during evolution mostly by means of gene duplication followed by divergence. In this genome-scale study of the human genome, I show that the density distribution of duplicate gene pairs along the axis of protein divergence between pair members forms two main peaks, with a small peak and plateau before the first main peak. This picture indicates the existence of three evolutionary stages of duplicate gene evolution. The analysis of various functional parameters (gene expression level and breadth, transcription factor targets, protein interaction networks) suggests that subfunctionalization (partition of function) is the predominant mode of divergence in the first main peak, whereas neofunctionalization (acquisition of novel functions) prevails in the second main peak. The young duplicate pairs show a much higher expression level compared with singleton genes and more diverged duplicates, which indicates that the requirement for high gene dosage is important for retention of duplicates just after the duplication event. Thus, a prevailing route of duplicate evolution seems to be high gene dosage, then subfunctionalization, then neofunctionalization. This adaptationist model suggests that an organism is evolving in the direction of its most intensively used functions.
|
44
|
Tippmann SC, Ivanek R, Gaidatzis D, Schöler A, Hoerner L, van Nimwegen E, Stadler PF, Stadler MB, Schübeler D. Chromatin measurements reveal contributions of synthesis and decay to steady-state mRNA levels. Mol Syst Biol 2012; 8:593. [PMID: 22806141 PMCID: PMC3421439 DOI: 10.1038/msb.2012.23] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.7] [Received: 11/02/2011] [Accepted: 05/22/2012] [Indexed: 12/31/2022]
Abstract
Histone modification, polymerase binding, mRNA half-life, and miRNA abundance measurements in mouse cells are used to dissect the relative contribution of each to mRNA levels, revealing control primarily at the level of transcription, with minor contributions from post-transcriptional processes.
A linear model of three histone modifications and RNAP II occupancy can predict >80% of the variance in mRNA levels. mRNA half-life explains an additional 1.4% variance in mRNA levels. miRNA-mediated silencing does not explain any variance on a genome-wide scale. H3K36me3 has different predictive power in dividing and non-dividing cells.
Messenger RNA levels in eukaryotes are controlled by multiple consecutive regulatory processes, which can be classified into two layers: primary transcriptional regulation at the chromosomal level and secondary, co- and post-transcriptional regulation of the mRNA. To identify the individual contribution of these layers to steady-state RNA levels requires separate quantification. Using mouse as a model organism, we show that chromatin features are sufficient to model RNA levels but with different sensitivities in dividing versus postmitotic cells. In both cases, chromatin-derived transcription rates explain over 80% of the observed variance in measured RNA levels. Further inclusion of measurements of mRNA half-life and microRNA expression data enabled the identification of a low quantitative contribution of RNA decay by either microRNA or general differential turnover to final mRNA levels. Together, this establishes a chromatin-based quantitative model for the contribution of transcriptional and post-transcriptional processes to steady-state levels of messenger RNA.
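The idea of "additional variance explained" in this abstract can be sketched as a comparison of R² between two nested linear models. The following is a minimal illustration under stated assumptions, not the authors' model or data: it uses ordinary least squares via the normal equations, a single hypothetical chromatin-derived predictor, and made-up values for `chromatin`, `half_life`, and `mrna`.

```python
from typing import List, Sequence


def fit_ols(columns: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Augmented system [X^T X | X^T y].
    aug = [
        [sum(design[r][i] * design[r][j] for r in range(n)) for j in range(p)]
        + [sum(design[r][i] * y[r] for r in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(p):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def r_squared(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """Fraction of the variance in y explained by a linear model on columns."""
    beta = fit_ols(columns, y)
    n = len(y)
    predicted = [
        beta[0] + sum(b * col[i] for b, col in zip(beta[1:], columns))
        for i in range(n)
    ]
    mean_y = sum(y) / n
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, predicted))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-gene measurements (arbitrary units).
chromatin = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # chromatin-derived transcription rate
half_life = [3.0, 1.0, 3.0, 1.0, 3.0, 1.0]   # mRNA half-life
mrna = [2.0 * c + 0.5 * h for c, h in zip(chromatin, half_life)]

r2_chromatin = r_squared([chromatin], mrna)
r2_full = r_squared([chromatin, half_life], mrna)
print(r2_chromatin, r2_full, r2_full - r2_chromatin)
```

The difference `r2_full - r2_chromatin` plays the role of the "additional variance" attributed to half-life; in this toy data it is small because the chromatin term dominates by construction.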
Affiliation(s)
- Sylvia C Tippmann
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
|
45
|
Nam K, Ellegren H. Recombination drives vertebrate genome contraction. PLoS Genet 2012; 8:e1002680. [PMID: 22570634 PMCID: PMC3342960 DOI: 10.1371/journal.pgen.1002680] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Received: 10/21/2011] [Accepted: 03/15/2012] [Indexed: 11/19/2022]
Abstract
Selective and/or neutral processes may govern variation in DNA content and, ultimately, genome size. The observation in several organisms of a negative correlation between recombination rate and intron size could be compatible with a neutral model in which recombination is mutagenic for length changes. We used whole-genome data on small insertions and deletions within transposable elements from chicken and zebra finch to demonstrate clear links between recombination rate and a number of attributes of reduced DNA content. Recombination rate was negatively correlated with the length of introns, transposable elements, and intergenic spacer and with the rate of short insertions. Importantly, it was positively correlated with gene density, the rate of short deletions, the deletion bias, and the net change in sequence length. All these observations point at a pattern of more condensed genome structure in regions of high recombination. Based on the observed rates of small insertions and deletions and assuming that these rates are representative for the whole genome, we estimate that the genome of the most recent common ancestor of birds and lizards has lost nearly 20% of its DNA content up until the present. Expansion of transposable elements can counteract the effect of deletions in an equilibrium mutation model; however, since the activity of transposable elements has been low in the avian lineage, the deletion bias is likely to have had a significant effect on genome size evolution in dinosaurs and birds, contributing to the maintenance of a small genome. We also demonstrate that most of the observed correlations between recombination rate and genome contraction parameters are seen in the human genome, including for segregating indel polymorphisms. Our data are compatible with a neutral model in which recombination drives vertebrate genome size evolution and give no direct support for a role of natural selection in this process.
One major implication from genetic work done several decades ago is that the genome contains a lot of sequences that do not constitute genes or other functional elements. The total amount of DNA—the genome size—is thus not necessarily an indicator of DNA complexity or organismal complexity, an observation often referred to as the C-value paradox (C-value being a measure of DNA content). What then is it that determines genome size? One model posits that the evolution of genome size is not a consequence of natural selection but is instead governed by the incidence and character of naturally occurring mutations that affect the length of DNA, a process that is not affected by selection. Here we present the results of an analysis of how recombination affects the size of avian and human genomes. We find strong evidence that the rate of recombination is a driving force of genome size evolution. In regions of the genome where recombination occurs frequently, the loss of DNA caused by small deletions is particularly pronounced. Our simulations show that the effect of such recombination-driven genome contraction can be profound over evolutionary time scales. These observations lead to a model in which recombination is mutagenic for length changes and that the incidence of deletions increases with increasing recombination rate. Although we cannot formally exclude that natural selection contributes to the observed relationship between recombination and genome contraction, we find no evidence to support such a scenario.
Affiliation(s)
- Hans Ellegren
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
46
|
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 217] [Impact Index Per Article: 18.1] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022]
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’, held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes.
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg, 38A, Bethesda, MD 20894, USA
|
47
|
Abstract
The intron–exon architecture of many eukaryotic genes raises the intriguing question of whether this unique organization serves any function, or is it simply a result of the spread of functionless introns in eukaryotic genomes. In this review, we show that introns in contemporary species fulfill a broad spectrum of functions, and are involved in virtually every step of mRNA processing. We propose that this great diversity of intronic functions supports the notion that introns were indeed selfish elements in early eukaryotes, but then independently gained numerous functions in different eukaryotic lineages. We suggest a novel criterion of evolutionary conservation, dubbed intron positional conservation, which can identify functional introns.
Affiliation(s)
- Michal Chorev
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem, Israel
|
48
|
Yang D, Zhong F, Li D, Liu Z, Wei H, Jiang Y, He F. General trends in the utilization of structural factors contributing to biological complexity. Mol Biol Evol 2012; 29:1957-68. [PMID: 22328715 DOI: 10.1093/molbev/mss064] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
During evolution, proteins containing newly emerged domains and the increasing proportion of multidomain proteins in the full genome-encoded proteome (GEP) have substantially contributed to increasing biological complexity. However, it is not known how these two potential structural factors are preferentially utilized at given physiological states. Here, we classified proteins according to domain number and domain age and explored the general trends across species for the utilization of proteins from GEP to various certain-state proteomes (CSPs, i.e., all the proteins expressed at certain physiological states). We found that multidomain proteins or only older domain-containing proteins are significantly overrepresented in CSPs compared with GEP, which is a trend that is stronger in multicellular organisms than in unicellular organisms. Interestingly, the strengths of overrepresentation decreased during evolution of multicellular eukaryotes. When comparing across CSPs, we found that multidomain proteins are more overrepresented in complex tissues than in simpler ones, whereas no difference among proteins with domains of different ages is evident between complex and simple tissues. Thus, biological complexity under certain conditions is more significantly realized by diverse domain organization than by the emergence of new types of domain. In addition, we found that multidomain or only older domain-containing proteins tend to evolve slowly and generally are under stronger purifying selection, which may partly result from their general overrepresentation trends in CSPs.
Affiliation(s)
- Dong Yang
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, Beijing, P R China
|
49
|
Sarda S, Zeng J, Hunt BG, Yi SV. The evolution of invertebrate gene body methylation. Mol Biol Evol 2012; 29:1907-16. [PMID: 22328716 DOI: 10.1093/molbev/mss062] [Citation(s) in RCA: 148] [Impact Index Per Article: 12.3] [Indexed: 01/06/2023]
Abstract
DNA methylation of transcription units (gene bodies) occurs in the genomes of many animal and plant species. Phylogenetic persistence of gene body methylation implies biological significance; yet, the functional roles of gene body methylation remain elusive. In this study, we analyzed methylation levels of orthologs from four distantly related invertebrate species, including the honeybee, silkworm, sea squirt, and sea anemone. We demonstrate that in all four species, gene bodies distinctively cluster to two groups, which correspond to high and low methylation levels. This pattern resembles that of sequence composition arising from the mutagenetic effect of DNA methylation. In spite of this effect, our results show that protein sequences of genes targeted by high levels of methylation are conserved relative to genes lacking methylation. Our investigation identified many genes that either gained or lost methylation during the course of invertebrate evolution. Most of these genes appear to have lost methylation in the insect lineages we investigated, particularly in the honeybee. We found that genes that are methylated in all four invertebrate taxa are enriched for housekeeping functions related to transcription and translation, whereas the loss of DNA methylation occurred in genes whose functions include cellular signaling and reproductive processes. Overall, our study helps to illuminate the functional significance of gene body methylation and its impacts on genome evolution in diverse invertebrate taxa.
Affiliation(s)
- Shrutii Sarda
- School of Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
|
50
|
Comparative analysis of the structural and expressional parameters of microRNA target genes. Gene 2012; 497:103-9. [PMID: 22305979 DOI: 10.1016/j.gene.2012.01.033] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/05/2012] [Accepted: 01/18/2012] [Indexed: 02/02/2023]
Abstract
MicroRNAs (miRNAs) generally pair with the 3'UTRs of their target mRNAs to repress gene expression. It has been reported that miRNA targets (TGs) are longer and evolve more slowly than non-targets (NTGs). We confirmed this observation and also found novel structural and expressional characteristics of TGs. The length difference between TGs and NTGs was greatest for the 3'UTRs, although a difference was also observed for CDSs and introns. Widely expressed genes were shorter for both TGs and NTGs; however, TGs were significantly longer than NTGs in all ranges of expression. TGs were more likely than NTGs to be widely expressed, which might explain why TGs evolve more slowly than NTGs. Finally, we found that TG mRNAs have faster decay rates. In addition, the decay rate of a TG mRNA transcript was found to be positively correlated with the number or density of target sites located in that TG's mRNA transcript.
|