1
Capitanchik C, Wilkins OG, Wagner N, Gagneur J, Ule J. From computational models of the splicing code to regulatory mechanisms and therapeutic implications. Nat Rev Genet 2024:10.1038/s41576-024-00774-2. [PMID: 39358547 DOI: 10.1038/s41576-024-00774-2] [Accepted: 08/27/2024] [Indexed: 10/04/2024]
Abstract
Since the discovery of RNA splicing and its role in gene expression, researchers have sought a set of rules, an algorithm or a computational model that could predict the splice isoforms, and their frequencies, produced from any transcribed gene in a specific cellular context. Over the past 30 years, these models have evolved from simple position weight matrices to deep-learning models capable of integrating sequence data across vast genomic distances. Most recently, new model architectures are moving the field closer to context-specific alternative splicing predictions, and advances in sequencing technologies are expanding the type of data that can be used to inform and interpret such models. Together, these developments are driving improved understanding of splicing regulatory mechanisms and emerging applications of the splicing code to the rational design of RNA- and splicing-based therapeutics.
Affiliation(s)
- Charlotte Capitanchik
- The Francis Crick Institute, London, UK
- UK Dementia Research Institute at King's College London, London, UK
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King's College London, London, UK
- Oscar G Wilkins
- The Francis Crick Institute, London, UK
- UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- Nils Wagner
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Helmholtz Association - Munich School for Data Science (MUDS), Munich, Germany
- Julien Gagneur
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany.
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany.
- Jernej Ule
- The Francis Crick Institute, London, UK.
- UK Dementia Research Institute at King's College London, London, UK.
- Department of Basic and Clinical Neuroscience, Institute of Psychiatry Psychology & Neuroscience, King's College London, London, UK.
- National Institute of Chemistry, Ljubljana, Slovenia.
2
Marsh JI, Johri P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection. Mol Biol Evol 2024; 41:msae118. [PMID: 38874402 PMCID: PMC11245712 DOI: 10.1093/molbev/msae118] [Received: 04/15/2024] [Revised: 06/05/2024] [Accepted: 06/11/2024] [Indexed: 06/15/2024]
Abstract
Inferring the demographic history of populations provides fundamental insights into species dynamics and is essential for developing a null model to accurately study selective processes. However, background selection and selective sweeps can produce genomic signatures at linked sites that mimic or mask signals associated with historical population size change. While the theoretical biases introduced by the linked effects of selection have been well established, it is unclear whether ancestral recombination graph (ARG)-based approaches to demographic inference in typical empirical analyses are susceptible to misinference due to these effects. To address this, we developed highly realistic forward simulations of human and Drosophila melanogaster populations, including empirically estimated variability of gene density, mutation rates, recombination rates, purifying, and positive selection, across different historical demographic scenarios, to broadly assess the impact of selection on demographic inference using a genealogy-based approach. Our results indicate that the linked effects of selection minimally impact demographic inference for human populations, although it could cause misinference in populations with similar genome architecture and population parameters experiencing more frequent recurrent sweeps. We found that accurate demographic inference of D. melanogaster populations by ARG-based methods is compromised by the presence of pervasive background selection alone, leading to spurious inferences of recent population expansion, which may be further worsened by recurrent sweeps, depending on the proportion and strength of beneficial mutations. Caution and additional testing with species-specific simulations are needed when inferring population history with non-human populations using ARG-based approaches to avoid misinference due to the linked effects of selection.
Affiliation(s)
- Jacob I Marsh
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Parul Johri
- Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Integrative Program for Biological and Genome Sciences, University of North Carolina, Chapel Hill, NC 27599, USA
3
Fingerhut JM, Lannes R, Whitfield TW, Thiru P, Yamashita YM. Co-transcriptional splicing facilitates transcription of gigantic genes. PLoS Genet 2024; 20:e1011241. [PMID: 38870220 PMCID: PMC11207136 DOI: 10.1371/journal.pgen.1011241] [Received: 04/02/2024] [Revised: 06/26/2024] [Accepted: 05/31/2024] [Indexed: 06/15/2024]
Abstract
Although introns are typically tens to thousands of nucleotides, there are notable exceptions. In flies as well as humans, a small number of genes contain introns that are more than 1000 times larger than typical introns, exceeding hundreds of kilobases (kb) to megabases (Mb). It remains unknown why gigantic introns exist and how cells overcome the challenges associated with their transcription and RNA processing. The Drosophila Y chromosome contains some of the largest genes identified to date: multiple genes exceed 4Mb, with introns accounting for over 99% of the gene span. Here we demonstrate that co-transcriptional splicing of these gigantic Y-linked genes is important to ensure successful transcription: perturbation of splicing led to the attenuation of transcription, leading to a failure to produce mature mRNA. Cytologically, defective splicing of the Y-linked gigantic genes resulted in disorganization of transcripts within the nucleus suggestive of entanglement of transcripts, likely resulting from unspliced long RNAs. We propose that co-transcriptional splicing maintains the length of nascent transcripts of gigantic genes under a critical threshold, preventing their entanglement and ensuring proper gene expression. Our study reveals a novel biological significance of co-transcriptional splicing.
Affiliation(s)
- Jaclyn M. Fingerhut
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Cambridge, Massachusetts, United States of America
- Romain Lannes
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Troy W. Whitfield
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Prathapan Thiru
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Yukiko M. Yamashita
- Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Cambridge, Massachusetts, United States of America
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
4
Speakman E, Gunaratne GH. On a kneading theory for gene-splicing. Chaos 2024; 34:043125. [PMID: 38579148 DOI: 10.1063/5.0199364] [Received: 01/22/2024] [Accepted: 03/05/2024] [Indexed: 04/07/2024]
Abstract
Two well-known facets in protein synthesis in eukaryotic cells are transcription of DNA to pre-RNA in the nucleus and the translation of messenger-RNA (mRNA) to proteins in the cytoplasm. A critical intermediate step is the removal of segments (introns) containing ∼97% of the nucleic-acid sites in pre-RNA and sequential alignment of the retained segments (exons) to form mRNA through a process referred to as splicing. Alternative forms of splicing enrich the proteome while abnormal splicing can enhance the likelihood of a cell developing cancer or other diseases. Mechanisms for splicing and origins of splicing errors are only partially deciphered. Our goal is to determine if rules on splicing can be inferred from data analytics on nucleic-acid sequences. Toward that end, we represent a nucleic-acid site as a point in a plane defined in terms of the anterior and posterior sub-sequences of the site. The "point-set" representation expands analytical approaches, including the use of statistical tools, to characterize genome sequences. It is found that point-sets for exons and introns are visually different, and that the differences can be quantified using a family of generalized moments. We design a machine-learning algorithm that can recognize individual exons or introns with 91% accuracy. Point-set distributions and generalized moments are found to differ between organisms.
Affiliation(s)
- Ethan Speakman
- Department of Physics, University of Houston, Houston, Texas 77204, USA
5
Jo TS, Matsuda N, Hirohara T, Yamanaka H. Comparative evaluation for the performance of environmental DNA and RNA analyses targeting mitochondrial and nuclear genes from ayu (Plecoglossus altivelis). Environ Monit Assess 2024; 196:374. [PMID: 38491297 DOI: 10.1007/s10661-024-12535-z] [Received: 11/16/2023] [Accepted: 03/05/2024] [Indexed: 03/18/2024]
Abstract
Environmental DNA and RNA (eDNA and eRNA; collectively eNA) analyses have the potential for non-invasive and cost-efficient biomonitoring compared with traditional capture-based surveys. Although various types of eNA particles, including not only mitochondrial eDNA but also nuclear eDNA and their transcripts, are present in the water, the performance of eNA detection and quantification has not yet been evaluated sufficiently across multiple mitochondrial and nuclear genes. We conducted a tank experiment with ayu (Plecoglossus altivelis) to compare the detection sensitivity, yields per water sample, and quantification variability between replicates of each type of eNA. The assay targeting the multi-copy nuclear gene exhibited a higher sensitivity than the assay targeting the mitochondrial gene, and both the target eDNA and eRNA concentrations per water sample were higher for the nuclear gene. In contrast, variation in eRNA quantifications per sample did not necessarily correspond to that in eDNA, and the intra-sample quantification variability (represented as the CVs between PCR replicates) tended to be larger for eRNA than for eDNA. Our results suggest that, even if suitable for the sensitive detection of species occurrence, the use of eRNA, particularly that derived from multi-copy nuclear genes, may not necessarily be appropriate for the reliable assessment of species abundance. The findings of this study should help optimize eNA analyses, making biomonitoring and stock assessment in aquatic environments more efficient and reliable.
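The abstract measures intra-sample variability as the coefficient of variation (CV) between PCR replicates. As a minimal sketch of that statistic, the snippet below computes CV as the sample standard deviation divided by the mean. The replicate values and the function name `coefficient_of_variation` are hypothetical illustrations, not data or code from the study, and the authors may have used a different estimator.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical copy numbers from three PCR replicates of one water sample
# (illustrative values only, not taken from the study).
edna_replicates = [100.0, 110.0, 90.0]
erna_replicates = [100.0, 150.0, 50.0]

cv_edna = coefficient_of_variation(edna_replicates)
cv_erna = coefficient_of_variation(erna_replicates)

print(f"eDNA CV: {cv_edna:.2f}")
print(f"eRNA CV: {cv_erna:.2f}")
```

In this made-up example both samples have the same mean, but the eRNA replicates are more widely spread and so give the larger CV, which is the kind of pattern the abstract reports.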
Affiliation(s)
- Toshiaki S Jo
- Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo, 102-0083, Japan.
- Ryukoku Center for Biodiversity Science, 1-5, Yokotani, Oe-cho, Seta, Otsu City, Shiga, 520-2194, Japan.
- Faculty of Advanced Science and Technology, Ryukoku University, 1-5, Yokotani, Oe-cho, Seta, Otsu City, Shiga, 520-2194, Japan.
- Nao Matsuda
- Shiga Prefectural Fisheries Experiment Station, 2138-3, Hassaka-cho, Hikone City, Shiga, 522-0057, Japan
- Takaya Hirohara
- Graduate School of Science and Technology, Ryukoku University, 1-5, Yokotani, Oe-cho, Seta, Otsu City, Shiga, 520-2194, Japan
- KANSO TECHNOS CO., LTD., Azuchimachi 1-3-5, Chuo-ku, Osaka, 541-0052, Japan
- Hiroki Yamanaka
- Ryukoku Center for Biodiversity Science, 1-5, Yokotani, Oe-cho, Seta, Otsu City, Shiga, 520-2194, Japan
- Faculty of Advanced Science and Technology, Ryukoku University, 1-5, Yokotani, Oe-cho, Seta, Otsu City, Shiga, 520-2194, Japan
6
Wang T, Long C, Chang M, Wu Y, Su S, Wei J, Jiang S, Wang X, He J, Xing D, He Y, Ran Y, Li W. Genome-wide identification of the B3 transcription factor family in pepper (Capsicum annuum) and expression patterns during fruit ripening. Sci Rep 2024; 14:2226. [PMID: 38278802 PMCID: PMC10817905 DOI: 10.1038/s41598-023-51080-6] [Received: 09/20/2023] [Accepted: 12/30/2023] [Indexed: 01/28/2024]
Abstract
In plants, B3 transcription factors play important roles in many aspects of growth and development. While B3 transcription factors have been extensively identified and studied in numerous species, little is known about the B3 superfamily in pepper. Using genome-wide sequence analysis, we identified a total of 106 B3 genes in pepper (Capsicum annuum), which are categorized into four subfamilies: RAV, ARF, LAV, and REM. Chromosome distribution, genetic structure, motifs, and cis-acting elements of the pepper B3 proteins were analyzed. Conserved gene structure and motifs outside the B3 domain provided strong evidence for phylogenetic relationships, allowing potential functions to be deduced by comparison with homologous genes from Arabidopsis. According to high-throughput transcriptome sequencing analysis, expression patterns differ between phases of fruit development for the majority of the 106 pepper B3 genes. qRT-PCR analysis revealed similar expression patterns in fruits from various time periods. In addition, further analysis of the CaRAV4 gene showed that its expression level decreased with fruit ripening and that it was located in the nucleus. B3 transcription factors have been characterized genome-wide in a variety of crops, but the present study is the first genome-wide analysis of the B3 superfamily in pepper. More importantly, although B3 transcription factors play key regulatory roles in fruit development, it is uncertain whether B3 transcription factors are involved in the regulation of fruit development and ripening in pepper, and what their specific regulatory mechanisms are, because the molecular mechanisms of the process have not been fully explained. The results of the study provide a foundation and new insights into the potential regulatory functions and molecular mechanisms of B3 genes in the development and ripening of pepper fruits, and provide a solid theoretical foundation for improving pepper quality and for selecting and breeding high-yield varieties.
Affiliation(s)
- Tao Wang
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Vegetable Research Institute, Guizhou University, Guiyang, 550025, China
- Engineering Research Center for Protected Vegetable Crops in Higher Learning Institutions of Guizhou Province, Guiyang, 550025, China
- Cha Long
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Vegetable Research Institute, Guizhou University, Guiyang, 550025, China
- Engineering Research Center for Protected Vegetable Crops in Higher Learning Institutions of Guizhou Province, Guiyang, 550025, China
- Meixia Chang
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Yuan Wu
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Shixian Su
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Jingjiang Wei
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Vegetable Research Institute, Guizhou University, Guiyang, 550025, China
- Suyan Jiang
- College of Agriculture, Guizhou University, Guiyang, 550025, China
- Xiujun Wang
- College of Brewing and Food Engineering, Guizhou University, Guiyang, 550025, China
- Jianwen He
- Pepper Research Institute of Guizhou Province, Guiyang, 550006, China
- Dan Xing
- Pepper Research Institute of Guizhou Province, Guiyang, 550006, China
- Yangbo He
- Agriculture Development and Research Institute of Guizhou Province, Guiyang, 550006, China
- Yaoqi Ran
- Agriculture Development and Research Institute of Guizhou Province, Guiyang, 550006, China
- Wei Li
- College of Agriculture, Guizhou University, Guiyang, 550025, China.
- Vegetable Research Institute, Guizhou University, Guiyang, 550025, China.
- Engineering Research Center for Protected Vegetable Crops in Higher Learning Institutions of Guizhou Province, Guiyang, 550025, China.
7
Baker L, David C, Jacobs DJ. Ab initio gene prediction for protein-coding regions. Bioinform Adv 2023; 3:vbad105. [PMID: 37638212 PMCID: PMC10448985 DOI: 10.1093/bioadv/vbad105] [Received: 02/15/2023] [Revised: 07/04/2023] [Accepted: 08/08/2023] [Indexed: 08/29/2023]
Abstract
Motivation: Ab initio gene prediction in nonmodel organisms is a difficult task. While many ab initio methods have been developed, their average accuracy over long segments of a genome, and especially when assessed over a wide range of species, generally yields results with sensitivity and specificity levels in the low 60% range. A common weakness of most methods is the tendency to learn patterns that are species-specific to varying degrees. The need exists for methods to extract genetic features that can distinguish coding and noncoding regions that are not sensitive to specific organism characteristics. Results: A new method based on a neural network (NN) that uses a collection of sensors to create input features is presented. It is shown that accurate predictions are achieved even when trained on organisms that are significantly different phylogenetically than test organisms. A consensus prediction algorithm for a CoDing Sequence (CDS) is subsequently applied to the first nucleotide level of NN predictions that boosts accuracy through a data-driven procedure that optimizes a CDS/non-CDS threshold. An aggregate accuracy benchmark at the nucleotide level shows that this new approach performs better than existing ab initio methods, while requiring significantly less training data. Availability and implementation: https://github.com/BioMolecularPhysicsGroup-UNCC/MachineLearning.
Affiliation(s)
- Lonnie Baker
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, NC 28223, United States
- Charles David
- Department of Bioinformatics, The New Zealand Institute for Plant and Food Research, Lincoln 7608, New Zealand
- Donald J Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, NC 28223, United States
- UNC Charlotte School of Data Science, University of North Carolina at Charlotte, NC 28223, United States
8
Won SY, Soundararajan P, Irulappan V, Kim JS. In-silico, evolutionary, and functional analysis of CHUP1 and its related proteins in Bienertia sinuspersici: a comparative study across C3, C4, CAM, and SCC4 model plants. PeerJ 2023; 11:e15696. [PMID: 37456874 PMCID: PMC10348308 DOI: 10.7717/peerj.15696] [Received: 09/08/2022] [Accepted: 06/14/2023] [Indexed: 07/18/2023]
Abstract
Single-cell C4 (SCC4) plants with bienertioid anatomy carry out photosynthesis in a single cell. Chloroplast movement is the underlying phenomenon, in which chloroplast unusual positioning 1 (CHUP1) plays a key role. This study aimed to characterize CHUP1 and CHUP1-like proteins in an SCC4 photosynthetic plant, Bienertia sinuspersici. In addition, SCC4 CHUP1 was compared with that of C3, C4, and CAM model plants, including an extant basal angiosperm, Amborella. The CHUP1 gene exists as a single copy from the basal angiosperms to SCC4 plants. Our analysis identified that Chenopodium quinoa, a recently duplicated allotetraploid, has two copies of CHUP1. In addition, the numbers of CHUP1-like and associated proteins such as CHUP1-like_a, CHUP1-like_b, HPR, TPR, and ABP varied between the species. Hidden Markov Model analysis showed that the gene sizes of CHUP1-like_a and CHUP1-like_b in the SCC4 species Bienertia and Suaeda were larger than in other plants. We also identified that CHUP1-like_a and CHUP1-like_b are absent in Arabidopsis and Amborella, respectively. Motif analysis identified several conserved and variable motifs based on the orders (monocot and dicot) as well as photosynthetic pathways. For instance, CAM plants such as pineapple and cactus shared certain motifs of CHUP1-like_a irrespective of their distant phylogenetic relationship. The free ratio model showed that CHUP1 maintained purifying selection, whereas CHUP1-like_a and CHUP1-like_b have adaptive functions between SCC4 plants and quinoa. Similarly, rice and maize branches displayed functional diversification on CHUP1-like_b. Relative gene expression data showed that during the subcellular compartmentalization process of Bienertia, CHUP1 and actin-binding protein (ABP) genes showed a similar pattern of expression. Altogether, the results of this study provide insight into the evolutionary and functional details of CHUP1 and its associated proteins in the development of the SCC4 system in comparison with other C3, C4, and CAM model plants.
Affiliation(s)
- So Youn Won
- Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Prabhakaran Soundararajan
- Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Vadivelmurugan Irulappan
- Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
- Jung Sun Kim
- Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju-si, Jeollabuk-do, South Korea
9
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. bioRxiv 2023:2023.05.17.541250. [PMID: 37292934 PMCID: PMC10245677 DOI: 10.1101/2023.05.17.541250] [Indexed: 06/10/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125
- John J. Vastola
- Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
10
Marquardt S, Petrillo E, Manavella PA. Cotranscriptional RNA processing and modification in plants. Plant Cell 2023; 35:1654-1670. [PMID: 36259932 PMCID: PMC10226594 DOI: 10.1093/plcell/koac309] [Received: 07/25/2022] [Accepted: 10/14/2022] [Indexed: 05/30/2023]
Abstract
The activities of RNA polymerases shape the epigenetic landscape of genomes with profound consequences for genome integrity and gene expression. A fundamental event during the regulation of eukaryotic gene expression is the coordination between transcription and RNA processing. Most primary RNAs mature through various RNA processing and modification events to become fully functional. While pioneering results positioned RNA maturation steps after transcription ends, the coupling between the maturation of diverse RNA species and their transcription is becoming increasingly evident in plants. In this review, we discuss recent advances in our understanding of the crosstalk between RNA Polymerase II, IV, and V transcription and nascent RNA processing of both coding and noncoding RNAs.
Affiliation(s)
- Sebastian Marquardt
- Department of Plant and Environmental Sciences, Copenhagen Plant Science Centre, University of Copenhagen, Frederiksberg, Denmark
- Ezequiel Petrillo
- Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Instituto de Fisiología, Biología Molecular y Neurociencias (IFIBYNE-CONICET-UBA), Buenos Aires, C1428EHA, Argentina
- Pablo A Manavella
- Instituto de Agrobiotecnología del Litoral (CONICET-UNL), Cátedra de Biología Celular y Molecular, Facultad de Bioquímica y Ciencias Biológicas, Universidad Nacional del Litoral, Santa Fe 3000, Argentina
11
García-Ruiz S, Zhang D, Gustavsson EK, Rocamora-Perez G, Grant-Peters M, Fairbrother-Browne A, Reynolds RH, Brenton JW, Gil-Martínez AL, Chen Z, Rio DC, Botia JA, Guelfi S, Collado-Torres L, Ryten M. Splicing accuracy varies across human introns, tissues and age. bioRxiv 2023:2023.03.29.534370. [PMID: 37034741 PMCID: PMC10081249 DOI: 10.1101/2023.03.29.534370] [Indexed: 06/19/2023]
Abstract
Alternative splicing impacts most multi-exonic human genes. Inaccuracies during this process may have an important role in ageing and disease. Here, we investigated mis-splicing using RNA-sequencing data from ~14K control samples and 42 human body sites, focusing on split reads partially mapping to known transcripts in annotation. We show that mis-splicing occurs at different rates across introns and tissues and that these splicing inaccuracies are primarily affected by the abundance of core components of the spliceosome assembly and its regulators. Using publicly available data on short-hairpin RNA-knockdowns of numerous spliceosomal components and related regulators, we found support for the importance of RNA-binding proteins in mis-splicing. We also demonstrated that age is positively correlated with mis-splicing, and it affects genes implicated in neurodegenerative diseases. This in-depth characterisation of mis-splicing can have important implications for our understanding of the role of splicing inaccuracies in human disease and the interpretation of long-read RNA-sequencing data.
Affiliation(s)
- S García-Ruiz
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- D Zhang
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- E K Gustavsson
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- G Rocamora-Perez
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- M Grant-Peters
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- A Fairbrother-Browne
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King's College London, London, UK
- Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL, London, UK
- R H Reynolds
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- J W Brenton
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- A L Gil-Martínez
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL, London, UK
- Z Chen
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- Department of Neurodegenerative Disease, Queen Square Institute of Neurology, UCL, London, UK
- D C Rio
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- California Institute for Quantitative Biosciences, University of California, Berkeley, CA 94720, USA
| | - J A Botia
- Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
| | - S Guelfi
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- Verge Genomics, South San Francisco, CA, 94080, USA
| | - L Collado-Torres
- Lieber Institute for Brain Development, Baltimore, MD, 21205, USA
| | - M Ryten
- Department of Genetics and Genomic Medicine Research & Teaching, UCL GOS Institute of Child Health, London, UK
- NIHR Great Ormond Street Hospital Biomedical Research Centre, University College London, London, UK
- Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815
|
12
|
Wang HL, Yin W, Xia X, Li Z. Orthologs of Human-Disease-Associated Genes in Plants Are Involved in Regulating Leaf Senescence. Life (Basel) 2023; 13:559. [PMID: 36836919 PMCID: PMC9965218 DOI: 10.3390/life13020559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2023] [Revised: 02/10/2023] [Accepted: 02/14/2023] [Indexed: 02/19/2023] Open
Abstract
As eukaryotes, plants and animals have many commonalities on the genetic level, although they differ greatly in appearance and physiological habits. The primary goal of current plant research is to improve crop yield and quality. However, plant research has a wider aim: exploiting the evolutionarily conserved similarities between plants and animals, and applying discoveries in the field of botany to promote zoological research that will ultimately serve human health, although very few studies have addressed this aspect. Here, we analyzed 35 human-disease-related gene orthologs in plants and characterized the genes in depth. Thirty-four homologous genes were found to be present in the herbaceous annual plant Arabidopsis thaliana and the woody perennial plant Populus trichocarpa, with most of the genes having more than two exons, including the ATM gene with 78 exons. More surprisingly, 27 (79.4%) of the 34 homologous genes in Arabidopsis were found to be senescence-associated genes (SAGs), further suggesting a close relationship between human diseases and cellular senescence. Protein-protein interaction network analysis revealed that the 34 genes formed two main subnetworks, and genes in the first subnetwork interacted with 15 SAGs. In conclusion, our results show that most of the 34 homologs of human-disease-associated genes in plants are involved in the leaf senescence process, suggesting that leaf senescence may offer a means to study the pathogenesis of human diseases and to screen drugs for the treatment of diseases.
Affiliation(s)
- Xinli Xia
- National Engineering Research Center for Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
| | - Zhonghai Li
- National Engineering Research Center for Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
|
13
|
Bhat P, Chow A, Emert B, Ettlin O, Quinodoz SA, Takei Y, Huang W, Blanco MR, Guttman M. 3D genome organization around nuclear speckles drives mRNA splicing efficiency. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.04.522632. [PMID: 36711853 PMCID: PMC9881923 DOI: 10.1101/2023.01.04.522632] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 01/06/2023]
Abstract
The nucleus is highly organized such that factors involved in transcription and processing of distinct classes of RNA are organized within specific nuclear bodies. One such nuclear body is the nuclear speckle, which is defined by high concentrations of protein and non-coding RNA regulators of pre-mRNA splicing. What functional role, if any, speckles might play in the process of mRNA splicing remains unknown. Here we show that genes localized near nuclear speckles display higher spliceosome concentrations, increased spliceosome binding to their pre-mRNAs, and higher co-transcriptional splicing levels relative to genes that are located farther from nuclear speckles. We show that directed recruitment of a pre-mRNA to nuclear speckles is sufficient to drive increased mRNA splicing levels. Finally, we show that gene organization around nuclear speckles is highly dynamic with differential localization between cell types corresponding to differences in Pol II occupancy. Together, our results integrate the longstanding observations of nuclear speckles with the biochemistry of mRNA splicing and demonstrate a critical role for dynamic 3D spatial organization of genomic DNA in driving spliceosome concentrations and controlling the efficiency of mRNA splicing.
|
14
|
Costa MO, Silva R, Anselmo DHAL. Superstatistical and DNA sequence coding of the human genome. Phys Rev E 2022; 106:064407. [PMID: 36671113 DOI: 10.1103/physreve.106.064407] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/06/2022] [Accepted: 11/16/2022] [Indexed: 12/14/2022]
Abstract
In this work, we use superstatistics to investigate the short-range correlations (SRCs) and the fluctuations in the distribution of lengths of strings of nucleotides. To this end, a stochastic model provides the distributions of the size of the exons based on the q-Gamma and inverse q-Gamma distributions. Specifically, we define a time series for exon sizes to investigate the SRC and the fluctuations through the superstatistics distributions. To test the model's viability, we use the Project Ensembl database of genes to extract the time evolution of exon sizes, calculated in terms of the number of base pairs (bp) in these biological databases. Our findings show that, depending on the chromosome, both distributions are suitable for describing the length distribution of human DNA for lengths greater than 10 bp. In addition, we used Bayesian statistics to perform a model selection approach, which revealed weak evidence for the inverse q-Gamma distribution for a considerable number of chromosomes.
Affiliation(s)
- M O Costa
- Departamento de Física, Universidade Federal do Rio Grande do Norte, Natal - RN, 59072-970, Brasil
| | - R Silva
- Departamento de Física, Universidade Federal do Rio Grande do Norte, Natal - RN, 59072-970, Brasil and Programa de Pós-Graduação em Física, Universidade do Estado do Rio Grande do Norte, Mossoró - Rio Grande do Norte, 59610-210, Brasil
| | - D H A L Anselmo
- Departamento de Física, Universidade Federal do Rio Grande do Norte, Natal - RN, 59072-970, Brasil and Programa de Pós-Graduação em Física, Universidade do Estado do Rio Grande do Norte, Mossoró - Rio Grande do Norte, 59610-210, Brasil
|
15
|
Murray KO, Clanton TL, Horowitz M. Epigenetic responses to heat: From adaptation to maladaptation. Exp Physiol 2022; 107:1144-1158. [PMID: 35413138 PMCID: PMC9529784 DOI: 10.1113/ep090143] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 02/27/2022] [Accepted: 03/25/2022] [Indexed: 11/08/2022]
Abstract
NEW FINDINGS What is the topic of this review? This review outlines the history of research on epigenetic adaptations to heat exposure. The perspective taken is that adaptations reflect properties of hormesis, whereby low, repeated doses of heat induce adaptation (acclimation/acclimatization); whereas brief, life-threatening exposures can induce maladaptive responses. What advances does it highlight? The epigenetic mechanisms underlying acclimation/acclimatization comprise specific molecular programmes on histones that regulate heat shock proteins transcriptionally and protect the organism from subsequent heat exposures, even after long delays. The epigenetic signalling underlying maladaptive responses might rely, in part, on extensive changes in DNA methylation that are sustained over time and might contribute to later health challenges. ABSTRACT Epigenetics plays a strong role in molecular adaptations to heat by producing a molecular memory of past environmental exposures. Moderate heat, over long periods of time, induces an 'adaptive' epigenetic memory, resulting in a condition of 'resilience' to future heat exposures or cross-tolerance to other forms of toxic stress. In contrast, intense, life-threatening heat exposures, such as severe heat stroke, can result in a 'maladaptive' epigenetic memory that can place an organism at risk of later health complications. These cellular memories are coded by post-translational modifications of histones on the nucleosomes and/or by changes in DNA methylation. They operate by inducing changes in the level of gene transcription and therefore phenotype. The adaptive response to heat acclimation functions, in part, by facilitating transcription of essential heat shock proteins and exhibits a biphasic short programme (maintaining DNA integrity, followed by a long-term consolidation). The latter accelerates acclimation responses after de-acclimation. 
Although less studied, the maladaptive responses to heat stroke appear to be coded in long-lasting changes in DNA methylation near the promoter region of genes involved with basic cell function. Whether these memories are also encoded in histone modifications is not yet known. There is considerable evidence that both adaptive and maladaptive epigenetic responses to heat can be inherited, although most evidence comes from lower organisms. Future challenges include understanding the signalling mechanisms responsible and discovering new ways to promote adaptive responses while suppressing maladaptive responses to heat, as all life forms adapt to life on a warming planet.
Affiliation(s)
- Kevin O. Murray
- Department of Integrative Physiology, University of Colorado Boulder, Boulder, CO, USA
| | - Thomas L. Clanton
- Department of Applied Physiology and Kinesiology, University of Florida, Gainesville, FL, USA
| | - Michal Horowitz
- Laboratory of Environmental Physiology, Faculty of Dentistry, The Hebrew University of Jerusalem, Jerusalem, Israel
|
16
|
Reixachs‐Solé M, Eyras E. Uncovering the impacts of alternative splicing on the proteome with current omics techniques. WILEY INTERDISCIPLINARY REVIEWS. RNA 2022; 13:e1707. [PMID: 34979593 PMCID: PMC9542554 DOI: 10.1002/wrna.1707] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/26/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/15/2022]
Abstract
The high-throughput sequencing of cellular RNAs has underscored a broad effect of isoform diversification through alternative splicing on the transcriptome. Moreover, the differential production of transcript isoforms from gene loci has been recognized as a critical mechanism in cell differentiation, organismal development, and disease. Yet, the extent of the impact of alternative splicing on protein production and cellular function remains a matter of debate. Multiple experimental and computational approaches have been developed in recent years to address this question. These studies have unveiled how molecular changes at different steps in the RNA processing pathway can lead to differences in protein production and have functional effects. New and emerging experimental technologies open exciting new opportunities to develop new methods to fully establish the connection between messenger RNA expression and protein production and to further investigate how RNA variation impacts the proteome and cell function. This article is categorized under: RNA Processing > Splicing Regulation/Alternative Splicing Translation > Regulation RNA Evolution and Genomics > Computational Analyses of RNA.
Affiliation(s)
- Marina Reixachs‐Solé
- The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
| | - Eduardo Eyras
- The John Curtin School of Medical Research, Australian National University, Canberra, Australian Capital Territory, Australia
- EMBL Australia Partner Laboratory Network and the Australian National University, Canberra, Australian Capital Territory, Australia
- Catalan Institution for Research and Advanced Studies, Barcelona, Spain
- Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
|
17
|
Li J, Yang Y, Sun X, Liu R, Xia W, Shi P, Zhou L, Wang Y, Wu Y, Lei X, Xiao Y. Development of Intron Polymorphism Markers and Their Association With Fatty Acid Component Variation in Oil Palm. FRONTIERS IN PLANT SCIENCE 2022; 13:885418. [PMID: 35720541 PMCID: PMC9201816 DOI: 10.3389/fpls.2022.885418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Accepted: 05/16/2022] [Indexed: 06/15/2023]
Abstract
Oil palm (Elaeis guineensis Jacq.) is a tropical woody oil crop of the palm family and is known as "the oil king of the world," but its palm oil contains about 50% palmitic acid, which is considered unhealthy for humans. Intron polymorphisms (IP) are highly efficient and easily examined molecular markers located adjacent to exon regions of functional genes, and thus may be associated with targeted trait variation. To speed up breeding for improved fatty acid composition in oil palm, the current study identified a total of 310 introns located within 52 candidate genes involved in fatty acid biosynthesis in the oil palm genome. Based on the intron sequences, 205 primer pairs were designed, 64 of which showed polymorphism among 70 oil palm individuals. Phenotypic variation of fatty acid content in the 70 oil palm individuals was also investigated. Association analysis revealed that 13 IP markers were significantly associated with fatty acid content variation, and these IP markers were located on chromosomes 2, 5, 6, 8, 9, and 10 of oil palm. The development of such IP markers may be useful for the genetic improvement of fatty acid composition in oil palm.
Affiliation(s)
- Jing Li
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Yaodong Yang
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Xiwei Sun
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Rui Liu
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Wei Xia
- Institute of Tropical Agriculture and Forestry, Hainan University, Haikou, China
| | - Peng Shi
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Lixia Zhou
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Yong Wang
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Yi Wu
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
| | - Xintao Lei
- Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, China
| | - Yong Xiao
- Hainan Key Laboratory of Tropical Oil Crops Biology/Coconut Research Institute, Chinese Academy of Tropical Agricultural Sciences, Wenchang, China
|
18
|
Foe VE. Does the Pachytene Checkpoint, a Feature of Meiosis, Filter Out Mistakes in Double-Strand DNA Break Repair and as a side-Effect Strongly Promote Adaptive Speciation? Integr Org Biol 2022; 4:obac008. [PMID: 36827645 PMCID: PMC8998493 DOI: 10.1093/iob/obac008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022] Open
Abstract
This essay aims to explain two biological puzzles: why eukaryotic transcription units are composed of short segments of coding DNA interspersed with long stretches of non-coding (intron) DNA, and the near ubiquity of sexual reproduction. As is well known, alternative splicing of its coding sequences enables one transcription unit to produce multiple variants of each encoded protein. Additionally, padding transcription units with non-coding DNA (often many thousands of base pairs long) provides a readily evolvable way to set how soon in a cell cycle the various mRNAs will begin being expressed and the total amount of mRNA that each transcription unit can make during a cell cycle. This regulation complements control via the transcriptional promoter and facilitates the creation of complex eukaryotic cell types, tissues, and organisms. However, it also makes eukaryotes exceedingly vulnerable to double-strand DNA breaks, which end-joining break repair pathways can repair incorrectly. Transcription units cover such a large fraction of the genome that any mis-repair producing a reorganized chromosome has a high probability of destroying a gene. During meiosis, the synaptonemal complex aligns homologous chromosome pairs and the pachytene checkpoint detects, selectively arrests, and in many organisms actively destroys gamete-producing cells with chromosomes that cannot adequately synapse; this creates a filter favoring transmission to the next generation of chromosomes that retain the parental organization, while selectively culling those with interrupted transcription units. This same meiotic checkpoint, reacting to accidental chromosomal reorganizations inflicted by error-prone break repair, can, as a side effect, provide a mechanism for the formation of new species in sympatry. It has been a long-standing puzzle how something as seemingly maladaptive as hybrid sterility between such new species can arise. 
I suggest that this paradox is resolved by understanding the adaptive importance of the pachytene checkpoint, as outlined above.
|
19
|
The new Haemaphysalis longicornis genome provides insights into its requisite biological traits. Genomics 2022; 114:110317. [DOI: 10.1016/j.ygeno.2022.110317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2021] [Revised: 12/21/2021] [Accepted: 02/15/2022] [Indexed: 11/20/2022]
|
20
|
Dosage sensitivity and exon shuffling shape the landscape of polymorphic duplicates in Drosophila and humans. Nat Ecol Evol 2021; 6:273-287. [PMID: 34969986 DOI: 10.1038/s41559-021-01614-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/04/2021] [Accepted: 11/10/2021] [Indexed: 11/08/2022]
Abstract
Despite polymorphic duplicate genes' importance for the early stages of duplicate gene evolution, they are less studied than old gene duplicates. Two essential questions thus remain poorly addressed: how does dosage sensitivity, imposed by stoichiometry in protein complexes or by X chromosome dosage compensation, affect the emergence of complete duplicate genes? Do introns facilitate intergenic and intragenic chimaerism as predicted by the theory of exon shuffling? Here, we analysed new data for Drosophila and public data for humans, to characterize polymorphic duplicate genes with respect to dosage, exon-intron structures and allele frequencies. We found that complete duplicate genes are under dosage constraint induced by protein stoichiometry but potentially tolerated by X chromosome dosage compensation. We also found that in the intron-rich human genome, gene fusions and intragenic duplications extensively use intronic breakpoints generating in-frame proteins, in accordance with the theory of exon shuffling. Finally, we found that only a small proportion of complete or partial duplicates are at high frequencies, indicating the deleterious nature of dosage or gene structural changes. Altogether, we demonstrate how mechanistic factors including dosage sensitivity and exon-intron structure shape the short-term functional consequences of gene duplication.
|
21
|
Paul A, Chatterjee A, Subrahmanya S, Shen G, Mishra N. NHX Gene Family in Camellia sinensis: In-silico Genome-Wide Identification, Expression Profiles, and Regulatory Network Analysis. FRONTIERS IN PLANT SCIENCE 2021; 12:777884. [PMID: 34987532 PMCID: PMC8720784 DOI: 10.3389/fpls.2021.777884] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/16/2021] [Accepted: 11/22/2021] [Indexed: 06/14/2023]
Abstract
Salt stress affects plant growth and productivity worldwide, and NHX genes are well known to improve salt tolerance in transgenic plants. The NHX family is well characterized in several plants, such as Arabidopsis thaliana and cotton; however, not much is known about NHXs in the tea plant. In the present study, NHX genes of tea were obtained through a genome-wide search using A. thaliana as the reference genome. Of the 9 NHX genes in tea, 7 were localized in the vacuole, while the remaining 2 were localized in the endoplasmic reticulum (ER; CsNHX8) and plasma membrane (PM; CsNHX9), respectively. Furthermore, phylogenetic relationships, along with structural analysis covering gene structure, location, and protein-conserved motifs and domains, were systematically examined, and the predictions were validated by expression analysis. The dN/dS values show that the majority of tea NHX genes have been subjected to strong purifying selection over the course of evolution. Also, functional interaction analysis was carried out in Camellia sinensis based on the orthologous genes in A. thaliana. The expression profiles linked to various stress treatments revealed wide involvement of NHX genes from tea in response to various abiotic factors. This study provides targets for further comprehensive identification and functional study, and contributes to a better understanding of the NHX regulatory network in C. sinensis.
Affiliation(s)
- Guoxin Shen
- Sericultural Research Institute, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Neelam Mishra
- Department of Botany, St. Joseph’s College Autonomous, Bangalore, India
|
22
|
Paul A, Srivastava AP, Subrahmanya S, Shen G, Mishra N. In-silico genome wide analysis of Mitogen activated protein kinase kinase kinase gene family in C. sinensis. PLoS One 2021; 16:e0258657. [PMID: 34735479 PMCID: PMC8568164 DOI: 10.1371/journal.pone.0258657] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/19/2021] [Accepted: 10/01/2021] [Indexed: 11/19/2022] Open
Abstract
Mitogen activated protein kinase kinase kinases (MAPKKKs) form the upstream component of the MAPK cascade. They are well characterized in several plants such as Arabidopsis and rice; however, little is known about MAPKKKs in the tea plant. In the present study, MAPKKK genes of tea were obtained through a genome wide search using Arabidopsis thaliana as the reference genome. Among 59 candidate MAPKKK genes in tea, 17 genes were MEKK-like, 31 genes were Raf-like and 11 genes were ZIK-like. Additionally, phylogenetic relationships were established along with structural analysis, in which gene structure, location, conserved motifs, cis-acting regulatory elements and functional domain signatures were systematically examined. Also, on the basis of one orthologous gene found between tea and Arabidopsis, functional interaction was carried out in C. sinensis based on an Arabidopsis association model. The expression profiles indicated major involvement of MAPKKK genes from tea in response to various abiotic stress factors. Taken together, this study provides targets for further identification and functional study, and provides comprehensive knowledge for a better understanding of the MAPKKK cascade regulatory network in C. sinensis.
Affiliation(s)
- Abhirup Paul
- Department of Biochemistry, REVA University, Bangalore, Karnataka, India
| | - Anurag P. Srivastava
- Department of Life Sciences, Garden City University, Bangalore, Karnataka, India
| | - Shreya Subrahmanya
- Department of Botany, St. Joseph’s College Autonomous, Bangalore, Karnataka, India
| | - Guoxin Shen
- Sericultural Research Institute, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Neelam Mishra
- Department of Botany, St. Joseph’s College Autonomous, Bangalore, Karnataka, India
|
23
|
Meher PK, Satpathy S. Improved recognition of splice sites in A. thaliana by incorporating secondary structure information into sequence-derived features: a computational study. 3 Biotech 2021; 11:484. [PMID: 34790508 DOI: 10.1007/s13205-021-03036-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/29/2021] [Accepted: 10/18/2021] [Indexed: 10/19/2022] Open
Abstract
Identification of splice sites is an important aspect of gene structure prediction. In most existing splice site prediction studies, machine learning algorithms coupled with sequence-derived features have been successfully employed for splice site recognition. However, splice site identification that incorporates secondary structure information is lacking, particularly in plant species. Thus, in this study we attempted to evaluate the effect of structural features on splice site prediction accuracy in Arabidopsis thaliana. Prediction accuracies were evaluated with the sequence-derived features alone as well as with the structural features incorporated into the sequence-derived features, where a support vector machine (SVM) was employed as the prediction algorithm. Both short (40 base pairs) and long (105 base pairs) sequence datasets were considered for evaluation. After incorporating the secondary structure features, improvements in accuracy were observed only for the longer sequence dataset, and the improvement was found to be higher with the sequence-derived features that accounted for nucleotide dependencies. On the other hand, little or no improvement in accuracy was found for the short sequence dataset. The performance of SVM was further compared with that of the LogitBoost, Random Forest (RF), AdaBoost and XGBoost machine learning methods. The prediction accuracies of SVM, AdaBoost and XGBoost were observed to be on par and higher than those of the RF and LogitBoost algorithms. When prediction was performed using all the sequence-derived features together with the structural features, a slight improvement in accuracy was found compared to the combination of individual sequence-based features and structural features. 
To the best of our knowledge, this is the first attempt concerning the computational prediction of splice sites using machine learning methods by incorporating the secondary structure information into the sequence-derived features. All the source codes are available at https://github.com/meher861982/SSFeature. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s13205-021-03036-8.
|
24
|
Garcia JA, Lohmueller KE. Negative linkage disequilibrium between amino acid changing variants reveals interference among deleterious mutations in the human genome. PLoS Genet 2021; 17:e1009676. [PMID: 34319975 PMCID: PMC8351996 DOI: 10.1371/journal.pgen.1009676] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/02/2020] [Revised: 08/09/2021] [Accepted: 06/22/2021] [Indexed: 11/18/2022] Open
Abstract
Evolutionary forces like Hill-Robertson interference and negative epistasis can lead to deleterious mutations being found on distinct haplotypes. However, the extent to which these forces depend on the selection and dominance coefficients of deleterious mutations and shape genome-wide patterns of linkage disequilibrium (LD) in natural populations with complex demographic histories has not been tested. In this study, we first used forward-in-time simulations to predict how negative selection impacts LD. Under models where deleterious mutations have additive effects on fitness, deleterious variants less than 10 kb apart tend to be carried on different haplotypes relative to pairs of synonymous SNPs. In contrast, for recessive mutations, there is no consistent ordering of how selection coefficients affect LD decay, due to the complex interplay of different evolutionary effects. We then examined empirical data of modern humans from the 1000 Genomes Project. LD between derived alleles at nonsynonymous SNPs is lower compared to pairs of derived synonymous variants, suggesting that nonsynonymous derived alleles tend to occur on different haplotypes more than synonymous variants. This result holds when controlling for potential confounding factors by matching SNPs for frequency in the sample (allele count), physical distance, magnitude of background selection, and genetic distance between pairs of variants. Lastly, we introduce a new statistic HR(j) which allows us to detect interference using unphased genotypes. Application of this approach to high-coverage human genome sequences confirms our finding that nonsynonymous derived alleles tend to be located on different haplotypes more often than are synonymous derived alleles. Our findings suggest that interference may play a pervasive role in shaping patterns of LD between deleterious variants in the human genome, and consequently influences genome-wide patterns of LD.
Affiliation(s)
- Jesse A. Garcia
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
| | - Kirk E. Lohmueller
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
|
25
|
PID: An integrative and comprehensive platform of plant intron. Comput Biol Chem 2021; 93:107528. [PMID: 34111777 DOI: 10.1016/j.compbiolchem.2021.107528] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2020] [Revised: 11/07/2020] [Accepted: 06/01/2021] [Indexed: 11/21/2022]
Abstract
An intron is a non-coding sequence within a split gene and participates in important biological processes, such as transcription regulation, alternative splicing, and nuclear export. With the growth of plant genome resources, a comprehensive platform for intron analysis in plants is needed. Plant Intron Database (PID), a publicly available searchable database, was developed to efficiently store, query, analyze, and integrate intron resources in plants. Intron, exon, and gene information can be searched by keyword in PID. Users can not only view intron length distribution pie charts and 5' and 3' splice site sequence feature maps in a statistical interface but can also browse intron information in a graphical visualization interface through JBrowse. ViroBlast for sequence homology searches, as well as intron detection and sequence interception tools, are also provided. PID contains annotated genes from 118 sequenced plants, 24,782,048 introns, 30,843,049 exons, and 414 visual maps. This tool will greatly accelerate research on the distribution, length characteristics, and functions of introns in plants. PID is accessible at http://biodb.sdau.edu.cn/PID/index.php.
|
26
|
Kumar S, Mutturi S. Alternative splicing regulates the α-glucosidase synthesis in Aspergillus neoniger NCIM 1400. Fungal Biol 2021; 125:658-665. [PMID: 34281659 DOI: 10.1016/j.funbio.2021.04.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/02/2020] [Revised: 02/26/2021] [Accepted: 04/07/2021] [Indexed: 10/21/2022]
Abstract
Aspergillus neoniger NCIM 1400, whose cell-free fraction was earlier established to have transglycosylation activity conferred by the α-glucosidase gene (agdA), was subjected to sequence analysis. Preliminary results revealed certain dynamics in the intron splicing mechanism, and a detailed study was carried out to ascertain these molecular events. The electrophoresis results from the cDNA portion (B-fragment) of agdA showed multiple bands, indicating the amplification of one or more fragments. Sequencing of the cloned cDNA revealed intron-retention-type alternative splicing in agdA. The splicing mechanism of agdA in NCIM 1400 was compared with that of different A. niger strains harbouring agdA orthologues, using PCR. It was observed that effective intron splicing leads to higher α-glucosidase activity in these selected Aspergillus spp. To explore the dynamics of intron retention in A. neoniger NCIM 1400, time-course analyses of intron retention, enzyme activity, and sugar consumption were carried out over 168 h of fungal growth. RT-qPCR results revealed that intron retention was not detected during the initial growth phase, when maltose and its hydrolysed product, glucose, were being consumed. Here we demonstrate that exhaustion of maltose causes an increase in the retention of introns in the mRNA transcripts of the agdA gene, and this could be the possible mode of regulating this gene.
Affiliation(s)
- Sandeep Kumar
- Microbiology & Fermentation Technology Department, CSIR-Central Food Technological Research Institute, Mysuru, Karnataka, 570020, India; AcSIR-Academy of Scientific & Innovative Research, Ghaziabad, UP, 201002, India
| | - Sarma Mutturi
- Microbiology & Fermentation Technology Department, CSIR-Central Food Technological Research Institute, Mysuru, Karnataka, 570020, India; AcSIR-Academy of Scientific & Innovative Research, Ghaziabad, UP, 201002, India.
|
27
|
Gehring NH, Roignant JY. Anything but Ordinary – Emerging Splicing Mechanisms in Eukaryotic Gene Regulation. Trends Genet 2021; 37:355-372. [DOI: 10.1016/j.tig.2020.10.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/14/2020] [Revised: 10/14/2020] [Accepted: 10/19/2020] [Indexed: 12/11/2022]
|
28
|
Královičová J, Borovská I, Pengelly R, Lee E, Abaffy P, Šindelka R, Grutzner F, Vořechovský I. Restriction of an intron size en route to endothermy. Nucleic Acids Res 2021; 49:2460-2487. [PMID: 33550394 PMCID: PMC7969005 DOI: 10.1093/nar/gkab046] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/11/2020] [Revised: 01/11/2021] [Accepted: 01/15/2021] [Indexed: 11/15/2022] Open
Abstract
Ca2+-insensitive and -sensitive E1 subunits of the 2-oxoglutarate dehydrogenase complex (OGDHC) regulate tissue-specific NADH and ATP supply by mutually exclusive OGDH exons 4a and 4b. Here we show that their splicing is enforced by distant lariat branch points (dBPs) located near the 5' splice site of the intervening intron. dBPs restrict the intron length and prevent transposon insertions, which can introduce or eliminate dBP competitors. The size restriction was imposed by a single dominant dBP in anamniotes that expanded into a conserved constellation of four dBP adenines in amniotes. The amniote clusters exhibit taxon-specific usage of individual dBPs, reflecting accessibility of their extended motifs within a stable RNA hairpin rather than U2 snRNA:dBP base-pairing. The dBP expansion took place in early terrestrial species and was followed by a uridine enrichment of large downstream polypyrimidine tracts in mammals. The dBP-protected megatracts permit reciprocal regulation of exon 4a and 4b by uridine-binding proteins, including TIA-1/TIAR and PUF60, which promote U1 and U2 snRNP recruitment to the 5' splice site and BP, respectively, but do not significantly alter the relative dBP usage. We further show that codons for residues critically contributing to protein binding sites for Ca2+ and other divalent metals confer the exon inclusion order that mirrors the Irving-Williams affinity series, linking the evolution of auxiliary splicing motifs in exons to metallome constraints. Finally, we hypothesize that the dBP-driven selection for Ca2+-dependent ATP provision by E1 facilitated evolution of endothermy by optimizing the aerobic scope in target tissues.
Affiliation(s)
- Jana Královičová
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Ivana Borovská
- Slovak Academy of Sciences, Centre for Biosciences, 840 05 Bratislava, Slovak Republic
| | - Reuben Pengelly
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
| | - Eunice Lee
- School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
| | - Pavel Abaffy
- Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
| | - Radek Šindelka
- Czech Academy of Sciences, Institute of Biotechnology, 25250 Vestec, Czech Republic
| | - Frank Grutzner
- School of Biological Sciences, University of Adelaide, Adelaide 5005, SA, Australia
| | - Igor Vořechovský
- University of Southampton, Faculty of Medicine, HDH, Southampton SO16 6YD, UK
|
29
|
Sessions SK, Wake DB. Forever young: Linking regeneration and genome size in salamanders. Dev Dyn 2020; 250:768-778. [DOI: 10.1002/dvdy.279] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/01/2020] [Revised: 10/21/2020] [Accepted: 11/11/2020] [Indexed: 11/12/2022] Open
Affiliation(s)
| | - David B. Wake
- Department of Integrative Biology and Museum of Vertebrate Zoology University of California Berkeley California USA
|
30
|
MAPK cascade gene family in Camellia sinensis: In-silico identification, expression profiles and regulatory network analysis. BMC Genomics 2020; 21:613. [PMID: 32894062 PMCID: PMC7487466 DOI: 10.1186/s12864-020-07030-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 04/06/2020] [Accepted: 08/27/2020] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND The Mitogen Activated Protein Kinase (MAPK) cascade is a fundamental pathway in organisms for signal transduction. Though it is well characterized in various plants, there is no systematic study of this cascade in tea. RESULT In this study, 5 genes of Mitogen Activated Protein Kinase Kinase (MKK) and 16 genes of Mitogen Activated Protein Kinase (MPK) in Camellia sinensis were found through a genome-wide search taking Arabidopsis thaliana as the reference genome. Phylogenetic relationships, along with structural features including gene structure, location, and conserved protein motifs and domains, were systematically examined, and the predictions were validated by the results. The plant species taken for comparative study clearly displayed segmental duplication, which was a significant candidate for MAPK cascade expansion. Functional interaction analysis was also carried out in C. sinensis based on the orthologous genes in Arabidopsis. The expression profiles linked to various stress treatments revealed wide involvement of MAPK and MAPKK genes from tea in response to various abiotic factors. In addition, the expression of these genes was analysed in various tissues. CONCLUSION This study provides targets for further comprehensive identification and functional study, and contributes to a better understanding of the MAPK cascade regulatory network in C. sinensis.
|
31
|
Moyer DC, Larue GE, Hershberger CE, Roy SW, Padgett RA. Comprehensive database and evolutionary dynamics of U12-type introns. Nucleic Acids Res 2020; 48:7066-7078. [PMID: 32484558 PMCID: PMC7367187 DOI: 10.1093/nar/gkaa464] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/07/2020] [Revised: 05/19/2020] [Accepted: 05/20/2020] [Indexed: 12/16/2022] Open
Abstract
During nuclear maturation of most eukaryotic pre-messenger RNAs and long non-coding RNAs, introns are removed through the process of RNA splicing. Different classes of introns are excised by the U2-type or the U12-type spliceosomes, large complexes of small nuclear ribonucleoprotein particles and associated proteins. We created intronIC, a program for assigning intron class to all introns in a given genome, and used it on 24 eukaryotic genomes to create the Intron Annotation and Orthology Database (IAOD). We then used the data in the IAOD to revisit several hypotheses concerning the evolution of the two classes of spliceosomal introns, finding support for the class conversion model explaining the low abundance of U12-type introns in modern genomes.
Affiliation(s)
- Devlin C Moyer
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Graham E Larue
- Department of Molecular and Cell Biology, University of California, Merced, Merced, CA 95343, USA
| | - Courtney E Hershberger
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Scott W Roy
- Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
| | - Richard A Padgett
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
|
32
|
Mao S, Pachter L, Tse D, Kannan S. RefShannon: A genome-guided transcriptome assembler using sparse flow decomposition. PLoS One 2020; 15:e0232946. [PMID: 32484809 PMCID: PMC7266320 DOI: 10.1371/journal.pone.0232946] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/18/2019] [Accepted: 04/24/2020] [Indexed: 12/12/2022] Open
Abstract
High throughput sequencing of RNA (RNA-Seq) has become a staple in modern molecular biology, with applications not only in quantifying gene expression but also in isoform-level analysis of the RNA transcripts. To enable such an isoform-level analysis, a transcriptome assembly algorithm is utilized to stitch together the observed short reads into the corresponding transcripts. This task is complicated due to the complexity of alternative splicing - a mechanism by which the same gene may generate multiple distinct RNA transcripts. We develop a novel genome-guided transcriptome assembler, RefShannon, that exploits the varying abundances of the different transcripts, in enabling an accurate reconstruction of the transcripts. Our evaluation shows RefShannon is able to improve sensitivity effectively (up to 22%) at a given specificity in comparison with other state-of-the-art assemblers. RefShannon is written in Python and is available from Github (https://github.com/shunfumao/RefShannon).
Affiliation(s)
- Shunfu Mao
- Department of Electrical and Computer Engineering, University of Washington, Seattle, WA, United States of America
| | - Lior Pachter
- Division of Biology and Biological Engineering, Caltech, Pasadena, CA, United States of America
| | - David Tse
- Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America
| | - Sreeram Kannan
- Department of Electrical and Computer Engineering, University of Washington, Seattle, WA, United States of America
|
33
|
Sharma H, Bhandawat A, Rahim MS, Kumar P, Choudhoury MP, Roy J. Novel intron length polymorphic (ILP) markers from starch biosynthesis genes reveal genetic relationships in Indian wheat varieties and related species. Mol Biol Rep 2020; 47:3485-3500. [PMID: 32281056 DOI: 10.1007/s11033-020-05434-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/08/2020] [Accepted: 04/03/2020] [Indexed: 11/28/2022]
Abstract
Introns experience less selection pressure and are thus liable to higher polymorphism. Intron Length Polymorphic (ILP) markers designed from exon-flanking introns exploit this polymorphic potential and have proved to be robust co-dominant markers in eukaryotes. Wheat is among the cereal crops most consumed by the majority of the world population. It is a rich source of calories in the form of stored starch. In the current study, starch biosynthesis genes were mined for the development of ILP markers and their subsequent utilization for genetic characterization of popular Indian wheat varieties and transferability to wild relatives. Sixty-one markers generated 122 alleles and showed 77-88.5% transferability (mean PIC: 0.36) to the related species. A subset of markers showed clear genetic distinctions (Avg. genetic dissimilarity = 0.42) among Indian wheat varieties, signifying the importance of novel ILPs. 'Kenphad25' showed maximum genetic dissimilarity with 'K 8962' (0.82), while maximum genetic similarity was observed between 'Safed Lerma' and 'RAJ 4037' (0.1). This is the first report of ILP markers in wheat and will be a useful genomic resource for future germplasm conservation and molecular breeding studies.
Affiliation(s)
- Himanshu Sharma
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India
| | - Abhishek Bhandawat
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India
| | - Mohammed Saba Rahim
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India
| | - Pankaj Kumar
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India
| | - Mohini Pal Choudhoury
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India
| | - Joy Roy
- Agri-Biotechnology Division, National Agri-Food Biotechnology Institute, Mohali, Punjab, India.
|
34
|
Ramírez-Camejo LA, Bayman P. Gene expression on the fly: A transcriptome-level view of Drosophila's immune response to the opportunistic fungal pathogen Aspergillus flavus. Infect Genet Evol 2020; 82:104308. [PMID: 32240802 DOI: 10.1016/j.meegid.2020.104308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2019] [Revised: 03/23/2020] [Accepted: 03/28/2020] [Indexed: 10/24/2022]
Abstract
Aspergilloses are opportunistic infections in animals and humans caused by several Aspergillus species, including Aspergillus flavus. Although the immune system of Drosophila melanogaster is extensively studied, little is known about the fly's specific responses to infection by A. flavus. We compared gene expression levels during induced infections in D. melanogaster by a virulent A. flavus isolate and a less virulent isolate, as well as from uninfected flies as a control. We found that 1081 of the 14,554 gene regions detected were significantly differentially expressed among treatments. Some of these up- and down- regulated genes were previously shown to be involved in defense responses against pathogens. Some are known to be involved in vitelline membrane formation in flies. Other up- and down-regulated genes are of unknown function. Understanding expression of these genes during the process of infection in flies should improve our knowledge of innate immunity in invertebrates, and by extension, in vertebrates as well.
Affiliation(s)
- Luis A Ramírez-Camejo
- Purdue University, Department of Botany and Plant Pathology, West Lafayette, IN 47901, USA; Department of Biology, University of Puerto Rico - Río Piedras, San Juan, PR, USA; Coiba Scientific Station (COIBA AIP), City of Knowledge, Clayton, Panama, Panama.
| | - Paul Bayman
- Department of Biology, University of Puerto Rico - Río Piedras, San Juan, PR, USA
|
35
|
Comprehensive genomic analyses with 115 plastomes from algae to seed plants: structure, gene contents, GC contents, and introns. Genes Genomics 2020; 42:553-570. [PMID: 32200544 DOI: 10.1007/s13258-020-00923-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/28/2020] [Accepted: 03/09/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND Chloroplasts are a common character in plants. The chloroplasts in each plant lineage have shaped their own genomes, plastomes, by structural changes and by transferring many genes to nuclear genomes during plant evolution. Some plastid genes have introns that are mostly group II introns. OBJECTIVE This study aimed to gain genomic and evolutionary insights into the plastomes from green algae to flowering plants. METHODS Plastomes of 115 species from green algae, bryophytes, pteridophytes (spore-bearing vascular plants), gymnosperms, and angiosperms were mined from the NCBI organelle genome database. Plastome structure, gene contents and GC contents were analyzed with in-house Python code. Intronic features including presence/absence, length, and intron phases were analyzed manually from the annotated information in NCBI. RESULTS The canonical quadripartite structures were retained in most plastomes, except for a few plastomes that had lost an inverted repeat (IR). Expansion, reduction or deletion of IRs resulted in the length variation of the plastomes. The number of protein coding genes ranged from 40 to 92 with an average of 79.43 ± 5.84 per plastome, and gene losses were apparent in specific lineages. The number of trn genes ranged from 13 to 33 with an average of 21.19 ± 2.42 per plastome. Ribosomal RNA genes, rrn, were located in the IRs so that they were present in duplicate, except in the species that had lost one of the IRs. GC contents varied from 24.9 to 51.0% with an average of 38.21 ± 3.27%, indicating a bias toward high AT contents. Plastid introns were present in 18 protein coding genes, six trn genes, and one rrn gene. Intron losses occurred among the orthologous genes in different plant lineages. The plastid introns were long compared with the nuclear introns, which might be related to nuclear introns being spliceosomal and plastid introns being self-splicing group II introns.
The trnK-UUU intron contained the maturase-encoding matK gene except in the chlorophyte algae and monilophyte ferns, in which trnK-UUU was lost but matK was retained. There were many annotation artefacts in the intron positions in the NCBI database. In the analysis of intron phases, phase 0 introns were more frequent than those of phase 2 and 3 introns. Phase polymorphism was observed in the introns of clpP, which was derived from nucleotide insertion. Plastid trn introns were long compared to the archaeal or eukaryotic nuclear tRNA introns. Of the six plastid trn introns, one was at the D loop and the other five were at the anticodon loop. The insertion sites were conserved among the archaeal, eukaryotic nuclear and plastid tRNA genes. CONCLUSIONS The current study revisited the previous findings of structural variations, gene contents, and GC contents of the chloroplast genomes from green algae to flowering plants. The study also included some novel findings and discussions on the plastome introns, including their length variations and phase variation. We also presented and corrected some false annotations on the introns in protein coding and tRNA genes in the genome database, which might be confirmed by chloroplast transcriptome analysis in the future.
|
36
|
Franco AL, Figueredo A, Pereira LDM, de Sousa SM, Souza G, Carvalho MA, Simon MF, Viccini LF. Low cytomolecular diversification in the genus Stylosanthes Sw. (Papilionoideae, Leguminosae). Genet Mol Biol 2020; 43:e20180250. [PMID: 31429856 PMCID: PMC7197990 DOI: 10.1590/1678-4685-gmb-2018-0250] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/28/2018] [Accepted: 03/07/2019] [Indexed: 12/02/2022] Open
Abstract
Stylosanthes (Papilionoideae, Leguminosae) is a predominantly Neotropical genus with ~48 species that include worldwide important forage species. This study presents the chromosome number and morphology of eight species of the genus Stylosanthes (S. acuminata, S. gracilis, S. grandifolia, S. guianensis, S. hippocampoides, S. pilosa, S. macrocephala, and S. ruellioides). In addition, staining with CMA and DAPI, in situ hybridization with 5S and 35S rDNA probes, and estimation of DNA content were performed. The interpretation of Stylosanthes chromosome diversification was anchored by a comparison with the sister genus Arachis and a dated molecular phylogeny based on nuclear and plastid loci. Stylosanthes species showed 2n = 20, with low cytomolecular diversification regarding 5S rDNA, 35S rDNA, and genome size. Arachis has a more ancient diversification (~7 Mya in the Pliocene) than the relatively recent Stylosanthes (~2 Mya in the Pleistocene), and it seems more diverse than its sister lineage. Our data support the idea that the cytomolecular stability of Stylosanthes in relation to Arachis could be a result of its recent origin. The recent diversification of Stylosanthes could also be related to the low morphological differentiation among species, and to the recurrent formation of allopolyploid complexes.
Affiliation(s)
- Ana Luiza Franco
- Universidade Federal de Juiz de Fora, Departamento de Biologia, Laboratório de Genética, Juiz de Fora, MG, Brazil
| | - Amanda Figueredo
- Universidade Federal de Pernambuco, Departamento de Botânica, Laboratório de Citogenética e Evolução Vegetal, CCB, Recife, PE, Brazil
| | - Lívia de Moraes Pereira
- Universidade Federal de Pernambuco, Departamento de Botânica, Laboratório de Citogenética e Evolução Vegetal, CCB, Recife, PE, Brazil
| | - Saulo Marçal de Sousa
- Universidade Federal de Juiz de Fora, Departamento de Biologia, Laboratório de Genética, Juiz de Fora, MG, Brazil
| | - Gustavo Souza
- Universidade Federal de Pernambuco, Departamento de Botânica, Laboratório de Citogenética e Evolução Vegetal, CCB, Recife, PE, Brazil
| | | | - Marcelo F. Simon
- Empresa Brasileira de Pesquisa Agropecuária, Embrapa Recursos Genéticos e Biotecnologia, PqEB, Brasília, DF, Brazil
| | - Lyderson Facio Viccini
- Universidade Federal de Juiz de Fora, Departamento de Biologia, Laboratório de Genética, Juiz de Fora, MG, Brazil
|
37
|
Niu G, Shao Z, Liu C, Chen T, Jiao Q, Hong Z. Comparative and evolutionary analyses of the divergence of plant oligosaccharyltransferase STT3 isoforms. FEBS Open Bio 2020; 10:468-483. [PMID: 32011067 PMCID: PMC7050244 DOI: 10.1002/2211-5463.12804] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/24/2019] [Revised: 01/11/2020] [Accepted: 01/30/2020] [Indexed: 11/08/2022] Open
Abstract
STT3 is a catalytic subunit of hetero-oligomeric oligosaccharyltransferase (OST), which is important for asparagine-linked glycosylation. In mammals and plants, OSTs with different STT3 isoforms exhibit distinct levels of enzymatic efficiency or different responses to stressors. Although two different STT3 isoforms have been identified in both plants and animals, it remains unclear whether these isoforms result from gene duplication in an ancestral eukaryote. Furthermore, the molecular mechanisms underlying the functional divergences between the two STT3 isoforms in plants have not been well elucidated. Here, we conducted phylogenetic analysis of the major evolutionary node species and suggested that gene duplications of STT3 may have occurred independently in animals and plants. Across land plants, the exon-intron structure differed between the two STT3 isoforms, but was highly conserved for each isoform. Most angiosperm STT3a genes had 23 exons with intron phase 0, while STT3b genes had 6 exons with intron phase 2. Characteristic motifs (motif 18 and 19) of STT3s were mapped to different structure domains in the plant STT3 proteins. These two motifs overlap with regions of high nonsynonymous-to-synonymous substitution rates, suggesting the regions may be related to the functional difference between STT3a and STT3b. In addition, promoter elements and gene expression profiles were different between the two isoforms, indicating expression pattern divergence of the two genes. Collectively, the identified differences may result in the functional divergence of plant STT3s.
Affiliation(s)
- Guanting Niu
- State Key Laboratory of Pharmaceutical Biotechnology, NJU Advanced Institute for Life Sciences (NAILS), School of Life Sciences, Nanjing University, China
| | - Zhuqing Shao
- State Key Laboratory of Pharmaceutical Biotechnology, NJU Advanced Institute for Life Sciences (NAILS), School of Life Sciences, Nanjing University, China
| | - Chuanfa Liu
- Department of Biology, Institute of Plant and Food Science, Southern University of Science and Technology, Shenzhen, China
| | - Tianshu Chen
- State Key Laboratory of Pharmaceutical Biotechnology, NJU Advanced Institute for Life Sciences (NAILS), School of Life Sciences, Nanjing University, China
| | - Qingsong Jiao
- State Key Laboratory of Pharmaceutical Biotechnology, NJU Advanced Institute for Life Sciences (NAILS), School of Life Sciences, Nanjing University, China
| | - Zhi Hong
- State Key Laboratory of Pharmaceutical Biotechnology, NJU Advanced Institute for Life Sciences (NAILS), School of Life Sciences, Nanjing University, China
|
38
|
Frey K, Pucker B. Animal, Fungi, and Plant Genome Sequences Harbor Different Non-Canonical Splice Sites. Cells 2020; 9:E458. [PMID: 32085510 PMCID: PMC7072748 DOI: 10.3390/cells9020458] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/20/2020] [Revised: 02/11/2020] [Accepted: 02/14/2020] [Indexed: 11/17/2022] Open
Abstract
Most protein-encoding genes in eukaryotes contain introns, which are interwoven with exons. Introns need to be removed from initial transcripts in order to generate the final messenger RNA (mRNA), which can be translated into an amino acid sequence. Precise excision of introns by the spliceosome requires conserved dinucleotides, which mark the splice sites. However, there are variations of the highly conserved combination of GT at the 5' end and AG at the 3' end of an intron in the genome. GC-AG and AT-AC are two major non-canonical splice site combinations, which have been known for years. Recently, various minor non-canonical splice site combinations were detected with numerous dinucleotide permutations. Here, we expand systematic investigations of non-canonical splice site combinations in plants across eukaryotes by analyzing fungal and animal genome sequences. Comparisons of splice site combinations between these three kingdoms revealed several differences, such as an apparently increased CT-AC frequency in fungal genome sequences. Canonical GT-AG splice site combinations in antisense transcripts are a likely explanation for this observation, thus indicating annotation errors. In addition, high numbers of GA-AG splice site combinations were observed in Eurytemora affinis and Oikopleura dioica. A variant in one U1 small nuclear RNA (snRNA) isoform might allow the recognition of GA as a 5' splice site. In-depth investigation of splice site usage based on RNA-Seq read mappings indicates a generally higher flexibility of the 3' splice site compared to the 5' splice site across animals, fungi, and plants.
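The splice site categories named in this abstract can be illustrated with a minimal sketch that reads the terminal dinucleotides of an intron sequence. The function names `classify_intron` and `reverse_complement` and the example sequence are hypothetical and are not taken from the authors' pipeline. The sketch also shows why a GT-AG intron annotated on the wrong strand appears as CT-AC, the explanation the abstract offers for the fungal CT-AC excess.

```python
# Complement table for unambiguous DNA bases (assumes uppercase A/C/G/T input).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def classify_intron(intron_seq):
    """Label an intron by its 5' and 3' terminal dinucleotides.

    Returns (combination, category), e.g. ("GT-AG", "canonical").
    """
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron sequence too short to hold both splice sites")
    combo = f"{seq[:2]}-{seq[-2:]}"
    if combo == "GT-AG":
        return combo, "canonical"
    if combo in ("GC-AG", "AT-AC"):
        return combo, "major non-canonical"
    return combo, "minor non-canonical"


# Hypothetical intron on the sense strand, and the same locus read antisense.
sense = "GTAAGTTTTCAG"
antisense = reverse_complement(sense)
sense_call = classify_intron(sense)          # ("GT-AG", "canonical")
antisense_call = classify_intron(antisense)  # ("CT-AC", "minor non-canonical")
```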
Affiliation(s)
- Katharina Frey
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany;
- Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University, 33615 Bielefeld, Germany
| | - Boas Pucker
- Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany;
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
|
39
|
Lu H, Cui X, Zhao Y, Magwanga RO, Li P, Cai X, Zhou Z, Wang X, Liu Y, Xu Y, Hou Y, Peng R, Wang K, Liu F. Identification of a genome-specific repetitive element in the Gossypium D genome. PeerJ 2020; 8:e8344. [PMID: 31915591 PMCID: PMC6944119 DOI: 10.7717/peerj.8344] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/14/2019] [Accepted: 12/04/2019] [Indexed: 01/15/2023] Open
Abstract
The activity of genome-specific repetitive sequences is the main cause of genome variation between Gossypium A and D genomes. Through comparative analysis of the two genomes, we retrieved a repetitive element termed ICRd motif, which appears frequently in the diploid Gossypium raimondii (D5) genome but rarely in the diploid Gossypium arboreum (A2) genome. We further explored the existence of the ICRd motif in chromosomes of G. raimondii, G. arboreum, and two tetraploid (AADD) cotton species, Gossypium hirsutum and Gossypium barbadense, by fluorescence in situ hybridization (FISH), and observed that the ICRd motif exists in the D5 and D-subgenomes but not in the A2 and A-subgenomes. The ICRd motif comprises two components, a variable tandem repeat (TR) region and a conservative sequence (CS). The two constituents each have hundreds of repeats that evenly distribute across 13 chromosomes of the D5 genome. The ICRd motif (and its repeats) was revealed as the common conservative region harbored by ancient Long Terminal Repeat Retrotransposons. Identification and investigation of the ICRd motif promotes the study of A and D genome differences, facilitates research on Gossypium genome evolution, and provides assistance to subgenome identification and genome assembling.
Affiliation(s)
- Hejun Lu
- Gembloux Agro-Bio Tech, University of Liège, Gembloux, Namur, Belgium
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xinglei Cui
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yanyan Zhao
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Richard Odongo Magwanga
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
- School of Biological and Physical Sciences (SBPS), Jaramogi Oginga Odinga University of Science and Technology (JOOUST), Bondo-Kenya, Bondo, Kenya
| | - Pengcheng Li
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xiaoyan Cai
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Zhongli Zhou
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xingxing Wang
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yuling Liu
- Anyang Institute of Technology, Anyang, Henan, China
| | - Yanchao Xu
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yuqing Hou
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Renhai Peng
- Anyang Institute of Technology, Anyang, Henan, China
| | - Kunbo Wang
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
- Tarium University, Alar, Xinjiang, China
| | - Fang Liu
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| |
|
40
|
Lu H, Cui X, Zhao Y, Magwanga RO, Li P, Cai X, Zhou Z, Wang X, Liu Y, Xu Y, Hou Y, Peng R, Wang K, Liu F. Identification of a genome-specific repetitive element in the Gossypium D genome. PeerJ 2020; 8:e8344. [PMID: 31915591 DOI: 10.7287/peerj.preprints.27806v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/14/2019] [Accepted: 12/04/2019] [Indexed: 05/23/2023]
Abstract
The activity of genome-specific repetitive sequences is the main cause of genome variation between Gossypium A and D genomes. Through comparative analysis of the two genomes, we retrieved a repetitive element termed ICRd motif, which appears frequently in the diploid Gossypium raimondii (D5) genome but rarely in the diploid Gossypium arboreum (A2) genome. We further explored the existence of the ICRd motif in chromosomes of G. raimondii, G. arboreum, and two tetraploid (AADD) cotton species, Gossypium hirsutum and Gossypium barbadense, by fluorescence in situ hybridization (FISH), and observed that the ICRd motif exists in the D5 and D-subgenomes but not in the A2 and A-subgenomes. The ICRd motif comprises two components, a variable tandem repeat (TR) region and a conservative sequence (CS). The two constituents each have hundreds of repeats that evenly distribute across 13 chromosomes of the D5 genome. The ICRd motif (and its repeats) was revealed as the common conservative region harbored by ancient Long Terminal Repeat Retrotransposons. Identification and investigation of the ICRd motif promotes the study of A and D genome differences, facilitates research on Gossypium genome evolution, and provides assistance to subgenome identification and genome assembling.
Affiliation(s)
- Hejun Lu
- Gembloux Agro-Bio Tech, University of Liège, Gembloux, Namur, Belgium
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xinglei Cui
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yanyan Zhao
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Richard Odongo Magwanga
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
- School of Biological and Physical Sciences (SBPS), Jaramogi Oginga Odinga University of Science and Technology (JOOUST), Bondo-Kenya, Bondo, Kenya
| | - Pengcheng Li
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xiaoyan Cai
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Zhongli Zhou
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Xingxing Wang
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yuling Liu
- Anyang Institute of Technology, Anyang, Henan, China
| | - Yanchao Xu
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Yuqing Hou
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| | - Renhai Peng
- Anyang Institute of Technology, Anyang, Henan, China
| | - Kunbo Wang
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
- Tarium University, Alar, Xinjiang, China
| | - Fang Liu
- Research Base of Tarium University, State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, Henan, China
| |
|
41
|
Wang D. IntronDB: a database for eukaryotic intron features. Bioinformatics 2019; 35:4400-4401. [PMID: 30949679 DOI: 10.1093/bioinformatics/btz242] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/06/2019] [Revised: 03/13/2019] [Accepted: 04/02/2019] [Indexed: 11/14/2022]
Abstract
SUMMARY The rate and extent of unbalanced eukaryotic intron changes exhibit dynamic patterns for different lineages of species or certain functional groups of genes with varied spatio-temporal expression modes affected by selective pressure. To date, only a few key conserved splicing signals or regulatory elements have been identified in introns and little is known about the remaining intronic regions. To trace the evolutionary trajectory of spliceosomal introns from available genomes under a unified framework, we present IntronDB, which catalogs ∼50 000 000 introns from over 1000 genomes spanning the major eukaryotic clades in the tree of life. Based on the position of introns relative to coding regions, it categorizes introns into three groups, such as 5'UTR, CDS and 3'UTR and subsequently divides CDS introns into three categories, such as phase 0, phase 1 and phase 2. It provides the quality evaluation for each sequence entry and characterizes the intronic parameters including number, size, sequence composition and positioning information as well as the features for exons and genes, making possible the comparisons between introns and exons. It reports the dinucleotides around the intron boundary and displays the consensus sequence features for all introns, small introns and large introns for each genome. By incorporating the taxonomic assignment of genomes, it performs high-level or genome-wide statistical analysis for single feature and coupled features both in a single genome and across multiple genomes. It offers functionalities to browse the data from representative protein-coding transcripts and download the data from all transcripts from protein-coding genes. AVAILABILITY AND IMPLEMENTATION http://www.nextgenbioinformatics.org/IntronDB. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
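The phase 0/1/2 grouping of CDS introns mentioned above follows the usual convention that an intron's phase is the number of coding bases upstream of it, modulo 3. A minimal Python sketch of that convention is shown below; it is not IntronDB's code, and the function name and the exon lengths in the example are hypothetical.

```python
# Illustrative sketch of the standard intron-phase convention;
# not taken from IntronDB. Names and example values are hypothetical.
from typing import List


def intron_phases(cds_exon_lengths: List[int]) -> List[int]:
    """Return the phase of each intron separating consecutive CDS segments.

    cds_exon_lengths lists the coding length (in nt) of each exon in
    transcript order. Phase 0 means the intron falls between two codons,
    phase 1 after the first base of a codon, phase 2 after the second.
    """
    phases = []
    coding_bases_upstream = 0
    # The last exon has no intron after it, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_bases_upstream += length
        phases.append(coding_bases_upstream % 3)
    return phases


# Hypothetical three-exon CDS: 100 nt, 50 nt, and 30 nt of coding sequence.
example_phases = intron_phases([100, 50, 30])
print(example_phases)
```

In this example the first intron follows 100 coding bases (phase 1) and the second follows 150 (phase 0).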
Affiliation(s)
- Dapeng Wang
- Department of Plant Sciences, University of Oxford, Oxford, UK
- LeedsOmics, University of Leeds, Leeds, UK
| |
|
42
|
Bai G, Yang DH, Cao P, Yao H, Zhang Y, Chen X, Xiao B, Li F, Wang ZY, Yang J, Xie H. Genome-Wide Identification, Gene Structure and Expression Analysis of the MADS-Box Gene Family Indicate Their Function in the Development of Tobacco (Nicotiana tabacum L.). Int J Mol Sci 2019; 20:E5043. [PMID: 31614589 PMCID: PMC6829366 DOI: 10.3390/ijms20205043] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/12/2019] [Revised: 10/06/2019] [Accepted: 10/09/2019] [Indexed: 12/14/2022]
Abstract
MADS-box genes play a pivotal role in various processes, including floral and seed development, controlling flowering time, regulation of fruits ripening, and respond to abiotic and biotic stressors in planta. Tobacco (Nicotiana tabacum) has been widely used as a model plant for analyzing the gene function, however, there has been less information on the regulation of flowering, and the associated genes. In the present study, a total of 168 NtMADS-box genes were identified from tobacco, and their phylogenetic relationship, chromosome locations, and gene structures were further analyzed. NtMADS-box genes can be clustered into four sub-families of Mα, Mγ, MIKC*, and MIKCC. A total of 111 NtMADS-box genes were distributed on 20 chromosomes, and 57 NtMADS-box genes were located on the unanchored scaffolds due to the complex and incomplete assembly of the tobacco genome. Expression profiles of NtMADS-box genes by microarray from 23 different tissues indicated that members in different NtMADS-box gene subfamilies might play specific roles in the growth and flower development, and the transcript levels of 24 NtMADS-box genes were confirmed by quantitative real-time PCR. Importantly, overexpressed NtSOC1/NtMADS133 could promote early flowering and dwarfism in transgenic tobacco plants. Therefore, our findings provide insights on the characterization of NtMADS-box genes to further study their functions in plant development.
Affiliation(s)
- Ge Bai
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Da-Hai Yang
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Peijian Cao
- China Tobacco Gene Research Centre, Zhengzhou Tobacco Research Institute, Zhengzhou, 450001, China.
| | - Heng Yao
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Yihan Zhang
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Xuejun Chen
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Bingguang Xiao
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| | - Feng Li
- China Tobacco Gene Research Centre, Zhengzhou Tobacco Research Institute, Zhengzhou, 450001, China.
| | - Zhen-Yu Wang
- Hainan Key Laboratory for Sustainable Utilization of Tropical Bioresource, Institute of Tropical Agriculture and Forestry, Hainan University, Haikou, Hainan 570228, China.
| | - Jun Yang
- China Tobacco Gene Research Centre, Zhengzhou Tobacco Research Institute, Zhengzhou, 450001, China.
| | - He Xie
- Tobacco Breeding and Biotechnology Research Center, Yunnan Academy of Tobacco Agricultural Sciences, Kunming 650021, China.
- Key Laboratory of Tobacco Biotechnological Breeding, Kunming, 650021, China.
- National Tobacco Genetic Engineering Research Center, Kunming, 650021, China.
| |
|
43
|
Zhou C, Zhu C, Fu H, Li X, Chen L, Lin Y, Lai Z, Guo Y. Genome-wide investigation of superoxide dismutase (SOD) gene family and their regulatory miRNAs reveal the involvement in abiotic stress and hormone response in tea plant (Camellia sinensis). PLoS One 2019; 14:e0223609. [PMID: 31600284 PMCID: PMC6786557 DOI: 10.1371/journal.pone.0223609] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 05/25/2019] [Accepted: 09/24/2019] [Indexed: 02/06/2023]
Abstract
Superoxide dismutases (SODs), as a family of metalloenzymes related to the removal of reactive oxygen species (ROS), have not previously been investigated at genome-wide level in tea plant. In this study, 10 CsSOD genes were identified in tea plant genome, including 7 Cu/Zn-SODs (CSDs), 2 Fe-SODs (FSDs) and one Mn-SOD (MSD), and phylogenetically classified in three subgroups, respectively. Physico-chemical characteristic, conserved motifs and potential protein interaction analyses about CsSOD proteins were carried out. Exon-intron structures and codon usage bias about CsSOD genes were also examined. Exon-intron structures analysis revealed that different CsSOD genes contained various number of introns. On the basis of the prediction of regulatory miRNAs of CsSODs, a modification 5’ RNA ligase-mediated (RLM)-RACE was performed and validated that csn-miR398a-3p-1 directly cleaves CsCSD4. By prediction of cis-acting elements, the expression patterns of 10 CsSOD genes and their regulatory miRNAs were detected under cold, drought, exogenous methyl jasmonate (MeJA) and gibberellin (GA3) treatments. The results showed that most of CsSODs except for CsFSD2 were induced under cold stress and CsCSDs may play primary roles under drought stress; exogenous GA3 and MeJA could also stimulated/inhibited distinct CsSODs at different stages. In addition, we found that csn-miR398a-3p-1 negatively regulated the expression of CsCSD4 may be a crucial regulatory mechanism under cold stress. This study provides a certain basis for the studies about stress resistance in tea plants, even provide insight into comprehending the classification, evolution, diverse functions and influencing factors of expression patterns for CsSOD genes.
Affiliation(s)
- Chengzhe Zhou
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Chen Zhu
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Institute of Horticultural Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Haifeng Fu
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Xiaozhen Li
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Lan Chen
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Yuling Lin
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Institute of Horticultural Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Zhongxiong Lai
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Institute of Horticultural Biotechnology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| | - Yuqiong Guo
- College of Horticulture, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Key Laboratory of Tea Science of Fujian Province, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
| |
|
44
|
A comprehensive analysis of the B3 superfamily identifies tissue-specific and stress-responsive genes in chickpea (Cicer arietinum L.). 3 Biotech 2019; 9:346. [PMID: 31497464 DOI: 10.1007/s13205-019-1875-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/21/2019] [Accepted: 08/14/2019] [Indexed: 12/31/2022]
Abstract
The aim of this study was to provide a comprehensive analysis of the plant-specific B3 domain-containing transcription factors (TFs) in chickpea. Scanning of the chickpea genome resulted in the identification of 51 B3 domain-containing TFs that were located on seven out of eight chickpea chromosomes. Based on the presence of additional domains other than the B3 domain, the candidates were classified into four subfamilies, i.e., ARF (24), REM (19), LAV (6) and RAV (2). Phylogenetic analysis classified them into four groups in which members of the same group had similar intron-exon organization and motif composition. Genome duplication analysis of the candidate B3 genes revealed an event of segmental duplication that was instrumental in the expansion of the B3 gene family. Ka/Ks analysis showed that the B3 gene family was under purifying selection. Further, chickpea B3 genes showed maximum orthology with Medicago followed by soybean and Arabidopsis. Promoter analyses of the B3 genes led to the identification of several tissue-specific and stress-responsive cis-regulatory elements. Expression profiling of the candidate B3 genes using publicly available RNA-seq data of several chickpea tissues indicated their putative role in plant development and abiotic stress response. These findings were further validated by real-time expression analysis. Overall, this study provides a comprehensive analysis of the B3 domain-containing proteins in chickpea that would aid in devising strategies for crop manipulation in chickpea.
|
45
|
Bartys N, Kierzek R, Lisowiec-Wachnicka J. The regulation properties of RNA secondary structure in alternative splicing. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1862:194401. [PMID: 31323437 DOI: 10.1016/j.bbagrm.2019.07.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/29/2019] [Accepted: 07/09/2019] [Indexed: 11/30/2022]
Abstract
The RNA secondary structure is important for many functional processes in the cell. The secondary and tertiary structures of cellular RNAs are essential for the activity of these molecules in processes such as transcription, splicing, translation, and localization. New high-throughput analytical methods, including next generation sequencing, have allowed for the in-depth characterization of the 'RNA structurome': a new term describing how the RNA structure controls the activity of RNA by itself and how it regulates the expression of genes. In this review, we present many examples of the influence of structural motifs of RNA, long range interactions and global RNA structure on the alternative splicing processes. This article is part of a Special Issue entitled: RNA structure and splicing regulation edited by Francisco Baralle, Ravindra Singh and Stefan Stamm.
Affiliation(s)
- Natalia Bartys
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Z. Noskowskiego 12/14, 61-704 Poznań, Poland
| | - Ryszard Kierzek
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Z. Noskowskiego 12/14, 61-704 Poznań, Poland
| | - Jolanta Lisowiec-Wachnicka
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Z. Noskowskiego 12/14, 61-704 Poznań, Poland.
| |
|
46
|
Shenasa H, Hertel KJ. Combinatorial regulation of alternative splicing. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2019; 1862:194392. [PMID: 31276857 DOI: 10.1016/j.bbagrm.2019.06.003] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 03/08/2019] [Revised: 06/21/2019] [Accepted: 06/24/2019] [Indexed: 12/23/2022]
Abstract
The generation of protein coding mRNAs from pre-mRNA is a fundamental biological process that is required for gene expression. Alternative pre-mRNA splicing is responsible for much of the transcriptomic and proteomic diversity observed in higher order eukaryotes. Aberrations that disrupt regular alternative splicing patterns are known to cause human diseases, including various cancers. Alternative splicing is a combinatorial process, meaning many factors affect which two splice sites are ligated together. The features that dictate exon inclusion are comprised of splice site strength, intron-exon architecture, RNA secondary structure, splicing regulatory elements, promoter use and transcription speed by RNA polymerase and the presence of post-transcriptional nucleotide modifications. A comprehensive view of all of the factors that influence alternative splicing decisions is necessary to predict splicing outcomes and to understand the molecular basis of disease. This article is part of a Special Issue entitled: RNA structure and splicing regulation edited by Francisco Baralle, Ravindra Singh and Stefan Stamm.
Affiliation(s)
- Hossein Shenasa
- Department of Microbiology and Molecular Genetics, University of California, Irvine, CA 92697, United States of America
| | - Klemens J Hertel
- Department of Microbiology and Molecular Genetics, University of California, Irvine, CA 92697, United States of America.
| |
|
47
|
Dvorak P, Leupen S, Soucek P. Functionally Significant Features in the 5' Untranslated Region of the ABCA1 Gene and Their Comparison in Vertebrates. Cells 2019; 8:cells8060623. [PMID: 31234415 PMCID: PMC6627321 DOI: 10.3390/cells8060623] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/03/2019] [Revised: 06/17/2019] [Accepted: 06/19/2019] [Indexed: 02/07/2023]
Abstract
Single nucleotide polymorphisms located in 5′ untranslated regions (5′UTRs) can regulate gene expression and have clinical impact. Recognition of functionally significant sequences within 5′UTRs is crucial in next-generation sequencing applications. Furthermore, information about the behavior of 5′UTRs during gene evolution is scarce. Using the example of the ATP-binding cassette transporter A1 (ABCA1) gene (Tangier disease), we describe our algorithm for functionally significant sequence finding. 5′UTR features (upstream start and stop codons, open reading frames (ORFs), GC content, motifs, and secondary structures) were studied using freely available bioinformatics tools in 55 vertebrate orthologous genes obtained from Ensembl and UCSC. The most conserved sequences were suggested as hot spots. Exon and intron enhancers and silencers (sc35, ighg2 cgamma2, ctnt, gh-1, and fibronectin eda exon), transcription factors (TFIIA, TATA, NFAT1, NFAT4, and HOXA13), some of them cancer related, and microRNA (hsa-miR-4474-3p) were localized to these regions. An upstream ORF, overlapping with the main ORF in primates and possibly coding for a small bioactive peptide, was also detected. Moreover, we showed several features of 5′UTRs, such as GC content variation, hairpin structure conservation or 5′UTR segmentation, which are interesting from a phylogenetic point of view and can stimulate further evolutionary oriented research.
Affiliation(s)
- Pavel Dvorak
- Department of Biology, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
| | - Sarah Leupen
- Department of Biological Sciences, University of Maryland Baltimore County, Baltimore, MD 21250, USA.
| | - Pavel Soucek
- Biomedical Center, Faculty of Medicine in Pilsen, Charles University, Alej Svobody 76, 32300 Pilsen, Czech Republic.
- Toxicogenomics Unit, National Institute of Public Health, Srobarova 48, 100 42 Prague 10, Czech Republic.
| |
|
48
|
Lu H, Cui X, Liu Z, Liu Y, Wang X, Zhou Z, Cai X, Zhang Z, Guo X, Hua J, Ma Z, Wang X, Zhang J, Zhang H, Liu F, Wang K. Discovery and annotation of a novel transposable element family in Gossypium. BMC PLANT BIOLOGY 2018; 18:307. [PMID: 30486783 PMCID: PMC6264596 DOI: 10.1186/s12870-018-1519-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/12/2018] [Accepted: 11/13/2018] [Indexed: 05/23/2023]
Abstract
BACKGROUND Fluorescence in situ hybridization (FISH) is an efficient cytogenetic technology to study chromosome structure. Transposable element (TE) is an important component in eukaryotic genomes and can provide insights in the structure and evolution of eukaryotic genomes. RESULTS A FISH probe derived from bacterial artificial chromosome (BAC) clone 299N22 generated striking signals on all 26 chromosomes of the cotton diploid A genome (AA, 2x=26) but very few on the diploid D genome (DD, 2x=26). All 26 chromosomes of the A sub genome (At) of tetraploid cotton (AADD, 2n=4x=52) also gave positive signals with this FISH probe, whereas very few signals were observed on the D sub genome (Dt). Sequencing and annotation of BAC clone 299N22, revealed a novel Ty3/gypsy transposon family, which was named as 'CICR'. This family is a significant contributor to size expansion in the A (sub) genome but not in the D (sub) genome. Further FISH analysis with the LTR of CICR as a probe revealed that CICR is lineage-specific, since massive repeats were found in A and B genomic groups, but not in C-G genomic groups within the Gossypium genus. Molecular evolutionary analysis of CICR suggested that tetraploid cottons evolved after silence of the transposon family 1-1.5 million years ago (Mya). Furthermore, A genomes are more homologous with B genomes, and the C, E, F, and G genomes likely diverged from a common ancestor prior to 3.5-4 Mya, the time when CICR appeared. The genomic variation caused by the insertion of CICR in the A (sub) genome may have played an important role in the speciation of organisms with A genomes. CONCLUSIONS The CICR family is highly repetitive in A and B genomes of Gossypium, but not amplified in the C-G genomes. 
The differential amount of CICR family in At and Dt will aid in partitioning sub genome sequences for chromosome assemblies during tetraploid genome sequencing and will act as a method for assessing the accuracy of tetraploid genomes by looking at the proportion of CICR elements in resulting pseudochromosome sequences. The timeline of the expansion of CICR family provides a new reference for cotton evolutionary analysis, while the impact on gene function caused by the insertion of CICR elements will be a target for further analysis of investigating phenotypic differences between A genome and D genome species.
Affiliation(s)
- Hejun Lu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
- Gembloux Agro-Bio Tech, University of Liège, 5030 Gembloux, Belgium
| | - Xinglei Cui
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Zhen Liu
- Anyang Institute of Technology, Anyang, 455000 Henan China
| | - Yuling Liu
- Anyang Institute of Technology, Anyang, 455000 Henan China
| | - Xingxing Wang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Zhongli Zhou
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Xiaoyan Cai
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Zhenmei Zhang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Xinlei Guo
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Jinping Hua
- Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193 China
| | - Zhiying Ma
- Key Laboratory for Crop Germplasm Resources of Hebei province, Hebei Agricultural University, Baoding, 071000 Hebei China
| | - Xiyin Wang
- Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, 063000 Hebei China
| | - Jinfa Zhang
- Department of Plant and Environmental Sciences, New Mexico State University, Las Cruces, 88003 USA
| | - Hong Zhang
- Department of Biological Sciences, Texas Tech University, Lubbock, 79409 USA
| | - Fang Liu
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| | - Kunbo Wang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Science, Anyang, 455000 Henan China
| |
|
49
|
Kim BY, Huber CD, Lohmueller KE. Deleterious variation shapes the genomic landscape of introgression. PLoS Genet 2018; 14:e1007741. [PMID: 30346959 PMCID: PMC6233928 DOI: 10.1371/journal.pgen.1007741] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 03/30/2018] [Revised: 11/13/2018] [Accepted: 10/05/2018] [Indexed: 11/19/2022]
Abstract
While it is appreciated that population size changes can impact patterns of deleterious variation in natural populations, less attention has been paid to how gene flow affects and is affected by the dynamics of deleterious variation. Here we use population genetic simulations to examine how gene flow impacts deleterious variation under a variety of demographic scenarios, mating systems, dominance coefficients, and recombination rates. Our results show that admixture between populations can temporarily reduce the genetic load of smaller populations and cause increases in the frequency of introgressed ancestry, especially if deleterious mutations are recessive. Additionally, when fitness effects of new mutations are recessive, between-population differences in the sites at which deleterious variants exist create heterosis in hybrid individuals. Together, these factors lead to an increase in introgressed ancestry, particularly when recombination rates are low. Under certain scenarios, introgressed ancestry can increase from an initial frequency of 5% to 30–75% and fix at many loci, even in the absence of beneficial mutations. Further, deleterious variation and admixture can generate correlations between the frequency of introgressed ancestry and recombination rate or exon density, even in the absence of other types of selection. The direction of these correlations is determined by the specific demography and whether mutations are additive or recessive. Therefore, it is essential that null models of admixture include both demography and deleterious variation before invoking other mechanisms to explain unusual patterns of genetic variation.
Individuals from distinct populations will sometimes produce fertile offspring and exchange genetic material in a process called hybridization. Genomes of hybrid individuals often show non-random patterns of hybrid ancestry across the genome, where some regions have a high frequency of ancestry from the second population and other regions have less. Typically, this pattern has been attributed to adaptive introgression, where beneficial genetic variants are passed from one population to the other, or to genomic incompatibilities between these distinct species. However, other mechanisms could lead to these heterogeneous patterns of ancestry in hybrids. Here we use simulations to investigate whether deleterious mutations affect the patterns of introgressed ancestry across genomes. We show that when ancestry from a larger population is added to a smaller population, the ancestry from the larger population dramatically increases in frequency because it carries fewer deleterious mutations. This occurs even in the absence of beneficial mutations in either population. Additionally, we show that differences in sex chromosome evolution relative to autosomes, or differences in mating system, can affect patterns of introgression in similar ways. Our study argues that deleterious mutations should be included in population genetic models used to identify unusual regions of the genome that appear to be under selection in hybrids.
Affiliation(s)
- Bernard Y. Kim
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
| | - Christian D. Huber
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
| |
|
50
|
Werner MS, Sieriebriennikov B, Prabh N, Loschko T, Lanz C, Sommer RJ. Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation. Genome Res 2018; 28:1675-1687. [PMID: 30232198 PMCID: PMC6211652 DOI: 10.1101/gr.234872.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 01/24/2018] [Accepted: 09/05/2018] [Indexed: 12/22/2022]
Abstract
Species-specific, new, or "orphan" genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibits chromatin states distinct from those of similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers.
Affiliation(s)
- Michael S Werner
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Bogdan Sieriebriennikov
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Neel Prabh
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Tobias Loschko
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Christa Lanz
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| | - Ralf J Sommer
- Department of Evolutionary Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany
| |
|