1
Shi K, Li D, Jiang X, Du Y, Yu M. Identification and Characterization of the miRNA Transcriptome Controlling Green Pigmentation of Chicken Eggshells. Genes (Basel) 2024; 15:811. [PMID: 38927746 PMCID: PMC11202967 DOI: 10.3390/genes15060811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 06/17/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024] Open
Abstract
Green eggs are mainly caused by the insertion of an avian endogenous retrovirus (EAV-HP) fragment into the SLCO1B3 gene. Although the genotypes for this insertion allele are consistent, eggshell color (ESC) may vary after the peak laying period; light-colored eggs are undesired by consumers and farmers and result in financial loss, so it is necessary to resolve this problem. miRNAs are small non-coding RNAs that exert essential functions in animal development and diseases. However, the regulatory miRNAs and detailed molecular mechanisms regulating eggshell greenness remain unclear. In the present study, we determined the genotype of green-eggshell hens through the detection of a homozygous allele insertion in the SLCO1B3 gene. The shell gland epithelium was obtained from green-eggshell hens that produced white and green shell eggs to perform transcriptome sequencing and investigate the important regulatory mechanisms that influence the ESC. A total of 921 miRNAs were expressed in these two groups, comprising 587 known miRNAs and 334 novel miRNAs, of which 44 were differentially expressed. Twenty-two miRNAs were significantly upregulated in each of the green and white groups; these targeted hundreds of genes, including KIT, HMOX2, and several solute carrier family genes. A Gene Ontology enrichment analysis showed that the genes targeted by the differentially expressed miRNAs mainly belonged to the functional categories of homophilic cell adhesion, gland development, the Wnt signaling pathway, and epithelial tube morphogenesis. A KEGG enrichment analysis showed that the Hedgehog signaling pathway was significantly enriched. The current study provides an overview of the miRNA expression profiles and the interaction between the miRNAs and their target genes. It provides valuable insights into the molecular mechanisms underlying green eggshell pigmentation, which may help in screening hens that produce stably green eggs and in obtaining higher economic benefits.
Affiliation(s)
- Minli Yu
  - College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, China; (K.S.); (D.L.)
2
Chen X, Ping Y, Sun J. Efficient estimation of Cox model with random change point. Stat Med 2024; 43:1213-1226. [PMID: 38247108 DOI: 10.1002/sim.9987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 01/23/2024]
Abstract
In clinical studies, the risk of a disease may change dramatically when certain biological indexes of the human body exceed certain thresholds. Furthermore, differences in the individual characteristics of patients, such as physical and psychological experience, may lead to subject-specific thresholds or change points. Although a large literature has been established for regression analysis of failure time data with change points, most of the existing methods assume the same, fixed change point for all study subjects. In this paper, we consider the situation where there exists a subject-specific change point, and two Cox-type models are presented. The proposed models also offer a framework for subgroup analysis. For inference, a sieve maximum likelihood estimation procedure is proposed and the asymptotic properties of the resulting estimators are established. An extensive simulation study is conducted to assess the empirical performance of the proposed method and indicates that it works well in practical situations. Finally, the proposed approach is applied to a set of breast cancer data.
Affiliation(s)
- Xuerong Chen
  - Centre of Statistical Research, Southwestern University of Finance and Economics, Chengdu, China
- Yalu Ping
  - Centre of Statistical Research, Southwestern University of Finance and Economics, Chengdu, China
- Jianguo Sun
  - Department of Statistics, University of Missouri, Columbia, Missouri, USA
3
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. RNA (New York, N.Y.) 2023; 29:1839-1855. [PMID: 37816550 PMCID: PMC10653393 DOI: 10.1261/rna.079849.123] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/12/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Affiliation(s)
- Sam Bryce-Smith
  - Department of Neuromuscular Diseases, UCL Queen Square Motor Neuron Disease Centre, UCL Queen Square Institute of Neurology, UCL, London WC1N 3BG, United Kingdom
- Dominik Burri
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Matthew R Gazzara
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Christina J Herrmann
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Weronika Danecka
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Christina M Fitzsimmons
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Yuk Kei Wan
  - Genome Institute of Singapore, Buona Vista, Singapore 138672
  - Yong Loo Lin School of Medicine, National University of Singapore, Kent Ridge, Singapore 119228
- Farica Zhuang
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mervin M Fansler
  - Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, New York 10065, USA
  - Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, New York 10065, USA
- José M Fernández
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Meritxell Ferret
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Asier Gonzalez-Uriarte
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Samuel Haynes
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Chelsea Herdman
  - Department of Neurobiology, University of Utah, Salt Lake City, Utah 84132, USA
- Alexander Kanitz
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maria Katsantoni
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Federico Marini
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, 55118 Mainz, Germany
- Euan McDonnel
  - Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9NL, United Kingdom
- Ben Nicolet
  - Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, 1066 CX Amsterdam, The Netherlands
  - Oncode Institute, 3521 AL Utrecht, The Netherlands
- Chi-Lam Poon
  - Graduate School of Medical Sciences, Weill Cornell Medicine, New York, New York 10065, USA
- Gregor Rot
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Leonard Schärfen
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Pin-Jou Wu
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72076 Tübingen, Germany
- Yoseop Yoon
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California 92617, USA
- Yoseph Barash
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mihaela Zavolan
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
4
Soulette CM, Hrabeta-Robinson E, Arevalo C, Felton C, Tang AD, Marin MG, Brooks AN. Full-length transcript alterations in human bronchial epithelial cells with U2AF1 S34F mutations. Life Sci Alliance 2023; 6:e202000641. [PMID: 37487637 PMCID: PMC10366530 DOI: 10.26508/lsa.202000641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2020] [Revised: 06/30/2023] [Accepted: 07/03/2023] [Indexed: 07/26/2023] Open
Abstract
U2AF1 is one of the most recurrently mutated splicing factors in lung adenocarcinoma and has been shown to cause transcriptome-wide pre-mRNA splicing alterations; however, the full-length altered mRNA isoforms associated with the mutation are largely unknown. To better understand the impact U2AF1 has on full-length isoform fate and function, we conducted high-throughput long-read cDNA sequencing from isogenic human bronchial epithelial cells with and without a U2AF1 S34F mutation. We identified 49,366 multi-exon transcript isoforms, more than half of which did not match GENCODE or short-read-assembled isoforms. We found 198 transcript isoforms with significant expression and usage changes relative to WT, only 68% of which were assembled by short reads. Expression of isoforms from immune-related genes is largely down-regulated in mutant cells and without observed splicing changes. Finally, we reveal that isoforms likely targeted by nonsense-mediated decay are down-regulated in U2AF1 S34F cells, suggesting that isoform changes may alter the translational output of those affected genes. Altogether, our work provides a resource of full-length isoforms associated with U2AF1 S34F in lung cells.
Affiliation(s)
- Cameron M Soulette
  - Department of Molecular, Cellular and Developmental Biology, University of California, Santa Cruz, CA, USA
- Eva Hrabeta-Robinson
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Carlos Arevalo
  - Department of Molecular, Cellular and Developmental Biology, University of California, Santa Cruz, CA, USA
- Colette Felton
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Alison D Tang
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Maximillian G Marin
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
- Angela N Brooks
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA, USA
5
Ju L, Glastad KM, Sheng L, Gospocic J, Kingwell CJ, Davidson SM, Kocher SD, Bonasio R, Berger SL. Hormonal gatekeeping via the blood-brain barrier governs caste-specific behavior in ants. Cell 2023; 186:4289-4309.e23. [PMID: 37683635 PMCID: PMC10807403 DOI: 10.1016/j.cell.2023.08.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/20/2022] [Revised: 05/10/2023] [Accepted: 08/01/2023] [Indexed: 09/10/2023]
Abstract
Here, we reveal an unanticipated role of the blood-brain barrier (BBB) in regulating complex social behavior in ants. Using scRNA-seq, we find localization in the BBB of a key hormone-degrading enzyme called juvenile hormone esterase (Jhe), and we show that this localization governs the level of juvenile hormone (JH3) entering the brain. Manipulation of the Jhe level reprograms the brain transcriptome between ant castes. Although ant Jhe is retained and functions intracellularly within the BBB, we show that Drosophila Jhe is naturally extracellular. Heterologous expression of ant Jhe into the Drosophila BBB alters behavior in fly to mimic what is seen in ants. Most strikingly, manipulation of Jhe levels in ants reprograms complex behavior between worker castes. Our study thus uncovers a remarkable, potentially conserved role of the BBB serving as a molecular gatekeeper for a neurohormonal pathway that regulates social behavior.
Affiliation(s)
- Linyang Ju
  - Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Karl M Glastad
  - Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
- Lihong Sheng
  - Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Janko Gospocic
  - Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Urology and Institute of Neuropathology, Medical Center-University of Freiburg, Freiburg, Germany
- Callum J Kingwell
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA
- Shawn M Davidson
  - Lewis-Sigler Institute for Genomics, Princeton University, Princeton, NJ 08544, USA
- Sarah D Kocher
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Genomics, Princeton University, Princeton, NJ 08544, USA
- Roberto Bonasio
  - Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shelley L Berger
  - Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA; Epigenetics Institute, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Genetics, Perelman School of Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
6
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. bioRxiv : the preprint server for biology 2023:2023.06.23.546284. [PMID: 37425672 PMCID: PMC10327023 DOI: 10.1101/2023.06.23.546284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, and limitations and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be seamlessly deployed and extended in the future to evaluate new methods or datasets.
Affiliation(s)
- Sam Bryce-Smith
  - UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- Dominik Burri
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Matthew R. Gazzara
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Christina J. Herrmann
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Weronika Danecka
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
- Christina M. Fitzsimmons
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
- Yuk Kei Wan
  - Genome Institute of Singapore, Buona Vista, Singapore
  - National University of Singapore, Kent Ridge, Singapore
- Farica Zhuang
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
- Mervin M. Fansler
  - Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, NY, USA
  - Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, NY, USA
- José M. Fernández
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Meritxell Ferret
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Asier Gonzalez-Uriarte
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Samuel Haynes
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
- Alexander Kanitz
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Maria Katsantoni
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Federico Marini
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, Germany
- Euan McDonnel
  - Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, United Kingdom
- Ben Nicolet
  - Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, and Oncode Institute, Amsterdam, The Netherlands
- Gregor Rot
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Institute of Molecular Life Sciences, Zurich, Switzerland
- Leonard Schärfen
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven CT, USA
- Pin-Jou Wu
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, Germany
- Yoseop Yoon
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California, USA
- Yoseph Barash
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
- Mihaela Zavolan
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
7
Vedanayagam J, Lin CJ, Papareddy R, Nodine M, Flynt AS, Wen J, Lai EC. Regulatory logic of endogenous RNAi in silencing de novo genomic conflicts. PLoS Genet 2023; 19:e1010787. [PMID: 37343034 DOI: 10.1371/journal.pgen.1010787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 05/17/2023] [Indexed: 06/23/2023] Open
Abstract
Although the biological utilities of endogenous RNAi (endo-RNAi) have been largely elusive, recent studies reveal its critical role in the non-model fruit fly Drosophila simulans to suppress selfish genes, whose unchecked activities can severely impair spermatogenesis. In particular, hairpin RNA (hpRNA) loci generate endo-siRNAs that suppress evolutionarily novel, X-linked, meiotic drive loci. The consequences of deleting even a single hpRNA (Nmy) in males are profound, as such individuals are nearly incapable of siring male progeny. Here, comparative genomic analyses of D. simulans and D. melanogaster mutants of the core RNAi factor dcr-2 reveal a substantially expanded network of recently emerged hpRNA-target interactions in the former species. The de novo hpRNA regulatory network in D. simulans provides insight into molecular strategies that underlie hpRNA emergence and their potential roles in sex chromosome conflict. In particular, our data support the existence of ongoing rapid evolution of Nmy/Dox-related networks, and recurrent targeting of testis HMG Box loci by hpRNAs. Importantly, the impact of the endo-RNAi network on gene expression flips the convention for regulatory networks, since we observe strong derepression of targets of the youngest hpRNAs, but only mild effects on the targets of the oldest hpRNAs. These data suggest that endo-RNAi is especially critical during incipient stages of intrinsic sex chromosome conflicts, and that continual cycles of distortion and resolution may contribute to speciation.
Affiliation(s)
- Jeffrey Vedanayagam
  - Developmental Biology Program, Sloan Kettering Institute, New York, New York, United States of America
- Ching-Jung Lin
  - Developmental Biology Program, Sloan Kettering Institute, New York, New York, United States of America
  - Weill Graduate School of Medical Sciences, Weill Cornell Medical College, New York, New York, United States of America
- Ranjith Papareddy
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Austria
- Michael Nodine
  - Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna Biocenter (VBC), Austria
- Alex S Flynt
  - Cellular and Molecular Biology, University of Southern Mississippi, Hattiesburg, Mississippi, United States of America
- Jiayu Wen
  - Division of Genome Sciences and Cancer, The John Curtin School of Medical Research, The Australian National University, Canberra, Australia
- Eric C Lai
  - Developmental Biology Program, Sloan Kettering Institute, New York, New York, United States of America
8
Postel MD, Culver JO, Ricker C, Craig DW. Transcriptome analysis provides critical answers to the "variants of uncertain significance" conundrum. Hum Mutat 2022; 43:1590-1608. [PMID: 35510381 PMCID: PMC9560997 DOI: 10.1002/humu.24394] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/22/2021] [Revised: 03/16/2022] [Accepted: 04/26/2022] [Indexed: 12/30/2022]
Abstract
While whole-genome and exome sequencing have transformed our collective understanding of genetics' role in disease pathogenesis, there are certain conditions and populations for whom DNA-level data fails to identify the underlying genetic etiology. Specifically, patients of non-White race and non-European ancestry are disproportionately affected by "variants of unknown/uncertain significance" (VUS), limiting the scope of precision medicine for minority patients and perpetuating health disparities. VUS often include deep intronic and splicing variants which are difficult to interpret from DNA data alone. RNA analysis can illuminate the consequences of VUS, thereby allowing for their reclassification as pathogenic versus benign. Here we review the critical role transcriptome analysis plays in clarifying VUS in both neoplastic and non-neoplastic diseases.
Affiliation(s)
- Mackenzie D. Postel
  - Department of Translational Genomics, University of Southern California, Los Angeles, California, USA
  - Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- Julie O. Culver
  - Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- Charité Ricker
  - Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- David W. Craig
  - Department of Translational Genomics, University of Southern California, Los Angeles, California, USA
  - Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
9
Meyer E, Chaung K, Dehghannasiri R, Salzman J. ReadZS detects cell type-specific and developmentally regulated RNA processing programs in single-cell RNA-seq. Genome Biol 2022; 23:226. [PMID: 36284317 PMCID: PMC9594907 DOI: 10.1186/s13059-022-02795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Accepted: 10/13/2022] [Indexed: 11/13/2022] Open
Abstract
RNA processing, including splicing and alternative polyadenylation, is crucial to gene function and regulation, but methods to detect RNA processing from single-cell RNA sequencing data are limited by reliance on pre-existing annotations, peak calling heuristics, and collapsing measurements by cell type. We introduce ReadZS, an annotation-free statistical approach to identify regulated RNA processing in single cells. ReadZS discovers cell type-specific RNA processing in human lung and conserved, developmentally regulated RNA processing in mammalian spermatogenesis, including global 3' UTR shortening in human spermatogenesis. ReadZS also discovers global 3' UTR lengthening in Arabidopsis development, highlighting the usefulness of this method in under-annotated transcriptomes.
Affiliation(s)
- Elisabeth Meyer
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Kaitlin Chaung
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Roozbeh Dehghannasiri
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Julia Salzman
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA.
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA.
  - Department of Statistics (by courtesy), Stanford University, Stanford, CA, 94305, USA.
10
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. Genomics, Proteomics & Bioinformatics 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Affiliation(s)
- Wenbin Ye
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Qiwei Lian
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Congting Ye
  - Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Xiaohui Wu
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
11
Lee S, Chen YC, Gillen AE, Taliaferro JM, Deplancke B, Li H, Lai EC. Diverse cell-specific patterns of alternative polyadenylation in Drosophila. Nat Commun 2022; 13:5372. [PMID: 36100597 PMCID: PMC9470587 DOI: 10.1038/s41467-022-32305-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/07/2022] [Accepted: 07/24/2022] [Indexed: 11/17/2022] Open
Abstract
Most genes in higher eukaryotes express isoforms with distinct 3' untranslated regions (3' UTRs), generated by alternative polyadenylation (APA). Since 3' UTRs are predominant locations of post-transcriptional regulation, APA can render such programs conditional, and can also alter protein sequences via alternative last exon (ALE) isoforms. We previously used 3'-sequencing from diverse Drosophila samples to define multiple tissue-specific APA landscapes. Here, we exploit comprehensive single nucleus RNA-sequencing data (Fly Cell Atlas) to elucidate cell-type expression of 3' UTRs across >250 adult Drosophila cell types. We reveal the cellular bases of multiple tissue-specific APA/ALE programs, such as 3' UTR lengthening in differentiated neurons and 3' UTR shortening in spermatocytes and spermatids. We trace dynamic 3' UTR patterns across cell lineages, including in the male germline, and discover new APA patterns in the intestinal stem cell lineage. Finally, we correlate expression of RNA binding proteins (RBPs), miRNAs and global levels of cleavage and polyadenylation (CPA) factors in several cell types that exhibit characteristic APA landscapes, yielding candidate regulators of transcriptome complexity. These analyses provide a comprehensive foundation for future investigations of mechanisms and biological impacts of alternative 3' isoforms across the major cell types of this widely-studied model organism.
Affiliation(s)
- Seungjae Lee
  - Developmental Biology Program, Sloan Kettering Institute, 1275 York Ave, Box 252, New York, NY, 10065, USA
- Yen-Chung Chen
  - Department of Biology, New York University, New York, NY, 10013, USA
- Austin E Gillen
  - Division of Hematology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Rocky Mountain Regional VA Medical Center, Aurora, CO, USA; RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- J Matthew Taliaferro
  - RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Bart Deplancke
  - Laboratory of Systems Biology and Genetics, Institute of Bio-engineering & Global Health Institute, School of Life Sciences, EPFL, CH-1015, Lausanne, Switzerland
- Hongjie Li
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, 77030, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric C Lai
  - Developmental Biology Program, Sloan Kettering Institute, 1275 York Ave, Box 252, New York, NY, 10065, USA
12
Wei L, Lai EC. Regulation of the Alternative Neural Transcriptome by ELAV/Hu RNA Binding Proteins. Front Genet 2022; 13:848626. [PMID: 35281806 PMCID: PMC8904962 DOI: 10.3389/fgene.2022.848626] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/04/2022] [Accepted: 02/01/2022] [Indexed: 11/30/2022] Open
Abstract
The process of alternative polyadenylation (APA) generates multiple 3' UTR isoforms for a given locus, which can alter regulatory capacity and on occasion change coding potential. APA was initially characterized for a few genes, but in the past decade, has been found to be the rule for metazoan genes. While numerous differences in APA profiles have been catalogued across genetic conditions, perturbations, and diseases, our knowledge of APA mechanisms and biology is far from complete. In this review, we highlight recent findings regarding the role of the conserved ELAV/Hu family of RNA binding proteins (RBPs) in generating the broad landscape of lengthened 3' UTRs that is characteristic of neurons. We relate this to their established roles in alternative splicing, and summarize ongoing directions that will further elucidate the molecular strategies for neural APA, the in vivo functions of ELAV/Hu RBPs, and the phenotypic consequences of these regulatory paradigms in neurons.
Affiliation(s)
- Lu Wei
  - Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Eric C. Lai
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY, United States
13
Shields EJ, Sorida M, Sheng L, Sieriebriennikov B, Ding L, Bonasio R. Genome annotation with long RNA reads reveals new patterns of gene expression and improves single-cell analyses in an ant brain. BMC Biol 2021; 19:254. [PMID: 34838024 PMCID: PMC8626913 DOI: 10.1186/s12915-021-01188-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/03/2021] [Accepted: 11/10/2021] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Functional genomic analyses rely on high-quality genome assemblies and annotations. Highly contiguous genome assemblies have become available for a variety of species, but accurate and complete annotation of gene models, inclusive of alternative splice isoforms and transcription start and termination sites, remains difficult with traditional approaches. RESULTS Here, we utilized full-length isoform sequencing (Iso-Seq), a long-read RNA sequencing technology, to obtain a comprehensive annotation of the transcriptome of the ant Harpegnathos saltator. The improved genome annotations include additional splice isoforms and extended 3' untranslated regions for more than 4000 genes. Reanalysis of RNA-seq experiments using these annotations revealed several genes with caste-specific differential expression and tissue- or caste-specific splicing patterns that were missed in previous analyses. The extended 3' untranslated regions afforded great improvements in the analysis of existing single-cell RNA-seq data, resulting in the recovery of the transcriptomes of 18% more cells. The deeper single-cell transcriptomes obtained with these new annotations allowed us to identify additional markers for several cell types in the ant brain, as well as genes differentially expressed across castes in specific cell types. CONCLUSIONS Our results demonstrate that Iso-Seq is an efficient and effective approach to improve genome annotations and maximize the amount of information that can be obtained from existing and future genomic datasets in Harpegnathos and other organisms.
Affiliation(s)
- Emily J Shields
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
  - Department of Urology and Institute of Neuropathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Masato Sorida
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Lihong Sheng
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Bogdan Sieriebriennikov
  - Department of Biology, New York University, New York, NY, USA
  - Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, USA
- Long Ding
  - Department of Biology, New York University, New York, NY, USA
- Roberto Bonasio
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
14
Gospocic J, Glastad KM, Sheng L, Shields EJ, Berger SL, Bonasio R. Kr-h1 maintains distinct caste-specific neurotranscriptomes in response to socially regulated hormones. Cell 2021; 184:5807-5823.e14. [PMID: 34739833 DOI: 10.1016/j.cell.2021.10.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 11/09/2020] [Revised: 07/13/2021] [Accepted: 10/07/2021] [Indexed: 10/19/2022]
Abstract
Behavioral plasticity is key to animal survival. Harpegnathos saltator ants can switch between worker and queen-like status (gamergate) depending on the outcome of social conflicts, providing an opportunity to study how distinct behavioral states are achieved in adult brains. Using social and molecular manipulations in live ants and ant neuronal cultures, we show that ecdysone and juvenile hormone drive molecular and functional differences in the brains of workers and gamergates and direct the transcriptional repressor Kr-h1 to different target genes. Depletion of Kr-h1 in the brain caused de-repression of "socially inappropriate" genes: gamergate genes were upregulated in workers, whereas worker genes were upregulated in gamergates. At the phenotypic level, loss of Kr-h1 resulted in the emergence of worker-specific behaviors in gamergates and gamergate-specific traits in workers. We conclude that Kr-h1 is a transcription factor that maintains distinct brain states established in response to socially regulated hormones.
Affiliation(s)
- Janko Gospocic
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Urology and Institute of Neuropathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Karl M Glastad
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Lihong Sheng
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Emily J Shields
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Urology and Institute of Neuropathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Shelley L Berger
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA; Department of Biology, University of Pennsylvania School of Arts and Sciences, Philadelphia, PA 19104, USA
- Roberto Bonasio
  - Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
15
Kim S, Bai Y, Fan Z, Diergaarde B, Tseng GC, Park HJ. The microRNA target site landscape is a novel molecular feature associating alternative polyadenylation with immune evasion activity in breast cancer. Brief Bioinform 2021; 22:bbaa191. [PMID: 32844230 PMCID: PMC8138879 DOI: 10.1093/bib/bbaa191] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/21/2020] [Revised: 07/10/2020] [Accepted: 07/28/2020] [Indexed: 12/14/2022] Open
Abstract
Alternative polyadenylation (APA) in breast tumor samples results in the removal/addition of cis-regulatory elements such as microRNA (miRNA) target sites in the 3'-untranslated region (3'-UTRs) of genes. Although previous computational APA studies focused on a subset of genes strongly affected by APA (APA genes), we identify miRNAs of which widespread APA events collectively increase or decrease the number of target sites [probabilistic inference of microRNA target site modification through APA (PRIMATA-APA)]. Using PRIMATA-APA on the cancer genome atlas (TCGA) breast cancer data, we found that the global APA events change the number of the target sites of particular microRNAs [target sites modified miRNA (tamoMiRNA)] enriched for cancer development and treatments. We also found that when knockdown (KD) of NUDT21 in HeLa cells induces a different set of widespread 3'-UTR shortening than TCGA breast cancer data, it changes the target sites of the common tamoMiRNAs. Since the NUDT21 KD experiment previously demonstrated the tumorigenic role of APA events in a miRNA dependent fashion, this result suggests that the APA-initiated tumorigenesis is attributable to the miRNA target site changes, not the APA events themselves. Further, we found that the miRNA target site changes identify tumor cell proliferation and immune cell infiltration to the tumor microenvironment better than the miRNA expression levels or the APA events themselves. Altogether, our computational analyses provide a proof-of-concept demonstration that the miRNA target site information indicates the effect of global APA events with a potential as predictive biomarker.
Affiliation(s)
- Soyeon Kim
  - Department of Pediatrics, University of Pittsburgh Medical Center and in Division of Pulmonary Medicine, Children’s Hospital of Pittsburgh of UPMC
- YuLong Bai
  - Department of Human Genetics in the Graduate School of Public Health, University of Pittsburgh
- Zhenjiang Fan
  - Department of Computer Science, University of Pittsburgh
- Brenda Diergaarde
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
- George C Tseng
  - Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh
- Hyun Jung Park
  - Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh
16
Kandhari N, Kraupner-Taylor CA, Harrison PF, Powell DR, Beilharz TH. The Detection and Bioinformatic Analysis of Alternative 3' UTR Isoforms as Potential Cancer Biomarkers. Int J Mol Sci 2021; 22:5322. [PMID: 34070203 PMCID: PMC8158509 DOI: 10.3390/ijms22105322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2021] [Revised: 05/06/2021] [Accepted: 05/06/2021] [Indexed: 12/17/2022] Open
Abstract
Alternative transcript cleavage and polyadenylation is linked to cancer cell transformation, proliferation and outcome. This has led researchers to develop methods to detect and bioinformatically analyse alternative polyadenylation as potential cancer biomarkers. If incorporated into standard prognostic measures such as gene expression and clinical parameters, these could advance cancer prognostic testing and possibly guide therapy. In this review, we focus on the existing methodologies, both experimental and computational, that have been applied to support the use of alternative polyadenylation as cancer biomarkers.
Affiliation(s)
- Nitika Kandhari
  - Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Calvin A. Kraupner-Taylor
  - Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Paul F. Harrison
  - Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
  - Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
- David R. Powell
  - Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
- Traude H. Beilharz
  - Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
17
Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence. Nat Commun 2021; 12:1652. [PMID: 33712618 PMCID: PMC7955126 DOI: 10.1038/s41467-021-21894-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/12/2020] [Accepted: 02/18/2021] [Indexed: 02/01/2023] Open
Abstract
Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3'-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model, trained using the Human Brain Reference RNA commercial standard, performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi's input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
18
Jin W, Zhu Q, Yang Y, Yang W, Wang D, Yang J, Niu X, Yu D, Gong J. Animal-APAdb: a comprehensive animal alternative polyadenylation database. Nucleic Acids Res 2021; 49:D47-D54. [PMID: 32986825 PMCID: PMC7779049 DOI: 10.1093/nar/gkaa778] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/01/2020] [Revised: 08/27/2020] [Accepted: 09/08/2020] [Indexed: 12/31/2022] Open
Abstract
Alternative polyadenylation (APA) is an important post-transcriptional regulatory mechanism that recognizes different polyadenylation signals on transcripts, resulting in transcripts with different lengths of 3′ untranslated regions and thereby influencing a series of biological processes. Recent studies have highlighted the important roles of APA in human. However, APA profiles in other animals have not been fully recognized, and there is no database that provides comprehensive APA information for other animals except human. Here, by using the RNA sequencing data collected from public databases, we systematically characterized the APA profiles in 9244 samples of 18 species. In total, we identified 342 952 APA events with a median of 17 020 per species using the DaPars2 algorithm, and 315 691 APA events with a median of 17 953 per species using the QAPA algorithm in these 18 species, respectively. In addition, we predicted the polyadenylation sites (PAS) and motifs near PAS of these species. We further developed Animal-APAdb, a user-friendly database (http://gong_lab.hzau.edu.cn/Animal-APAdb/) for data searching, browsing and downloading. With comprehensive information of APA events in different tissues of different species, Animal-APAdb may greatly facilitate the exploration of animal APA patterns and novel mechanisms, gene expression regulation and APA evolution across tissues and species.
Affiliation(s)
- Weiwei Jin
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Qizhao Zhu
  - College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, P.R. China
- Yanbo Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Wenqian Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Dongyang Wang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Jiajun Yang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xiaohui Niu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Debing Yu
  - College of Animal Science and Technology, Nanjing Agricultural University, Nanjing 210095, P.R. China
- Jing Gong
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China; College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, P.R. China
19
Tu M, Li Y. Profiling Alternative 3' Untranslated Regions in Sorghum using RNA-seq Data. Front Genet 2020; 11:556749. [PMID: 33193635 PMCID: PMC7649775 DOI: 10.3389/fgene.2020.556749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/28/2020] [Accepted: 09/30/2020] [Indexed: 12/18/2022] Open
Abstract
Sorghum is an important crop widely used for food, feed, and fuel. Transcriptome-wide studies of 3′ untranslated regions (3′UTR) using regular RNA-seq remain scarce in sorghum, while transcriptomes have been characterized extensively using Illumina short-read sequencing platforms for many sorghum varieties under various conditions or developmental contexts. 3′UTR is a critical regulatory component of genes, controlling the translation, transport, and stability of messenger RNAs. In the present study, we profiled the alternative 3′UTRs at the transcriptome level in three genetically related but phenotypically contrasting lines of sorghum: Rio, BTx406, and R9188. A total of 1,197 transcripts with alternative 3′UTRs were detected using RNA-seq data. Their categorization identified 612 high-confidence alternative 3′UTRs. Importantly, the high-confidence alternative 3′UTR genes significantly overlapped with the genesets that are associated with RNA N6-methyladenosine (m6A) modification, suggesting a clear indication between alternative 3′UTR and m6A methylation in sorghum. Moreover, taking advantage of sorghum genetics, we provided evidence of genotype specificity of alternative 3′UTR usage. In summary, our work exemplifies a transcriptome-wide profiling of alternative 3′UTRs using regular RNA-seq data in non-model crops and gains insights into alternative 3′UTRs and their genotype specificity.
Affiliation(s)
- Min Tu
  - Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
- Yin Li
  - Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
20
Carrasco J, Rauer M, Hummel B, Grzejda D, Alfonso-Gonzalez C, Lee Y, Wang Q, Puchalska M, Mittler G, Hilgers V. ELAV and FNE Determine Neuronal Transcript Signatures through EXon-Activated Rescue. Mol Cell 2020; 80:156-163.e6. [PMID: 33007255 DOI: 10.1016/j.molcel.2020.09.011] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/23/2020] [Revised: 07/03/2020] [Accepted: 08/12/2020] [Indexed: 12/22/2022]
Abstract
The production of alternative RNA variants contributes to the tissue-specific regulation of gene expression. In the animal nervous system, a systematic shift toward distal sites of transcription termination produces transcript signatures that are crucial for neuron development and function. Here, we report that, in Drosophila, the highly conserved protein ELAV globally regulates all sites of neuronal 3' end processing and directly binds to proximal polyadenylation sites of target mRNAs in vivo. We uncover an endogenous strategy of functional gene rescue that safeguards neuronal RNA signatures in an ELAV loss-of-function context. When not directly repressed by ELAV, the transcript encoding the ELAV paralog FNE acquires a mini-exon, generating a new protein able to translocate to the nucleus and rescue ELAV-mediated alternative polyadenylation and alternative splicing. We propose that exon-activated functional rescue is a more widespread mechanism that ensures robustness of processes regulated by a hierarchy, rather than redundancy, of effectors.
Affiliation(s)
- Judit Carrasco
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany; Faculty of Biology, Albert Ludwig University, 79104 Freiburg, Germany; International Max Planck Research School for Molecular and Cellular Biology (IMPRS-MCB), 79108 Freiburg, Germany
- Michael Rauer
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
- Barbara Hummel
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
- Dominika Grzejda
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany; Faculty of Biology, Albert Ludwig University, 79104 Freiburg, Germany; International Max Planck Research School for Molecular and Cellular Biology (IMPRS-MCB), 79108 Freiburg, Germany
- Carlos Alfonso-Gonzalez
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany; Faculty of Biology, Albert Ludwig University, 79104 Freiburg, Germany; International Max Planck Research School for Immunology, Epigenetics and Metabolism (IMPRS-IEM), 79108 Freiburg, Germany
- Yeon Lee
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Qingqing Wang
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Monika Puchalska
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
- Gerhard Mittler
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
- Valérie Hilgers
  - Max-Planck-Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
21
Hong W, Ruan H, Zhang Z, Ye Y, Liu Y, Li S, Jing Y, Zhang H, Diao L, Liang H, Han L. APAatlas: decoding alternative polyadenylation across human tissues. Nucleic Acids Res 2020; 48:D34-D39. [PMID: 31586392 PMCID: PMC6943053 DOI: 10.1093/nar/gkz876] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/12/2019] [Revised: 09/15/2019] [Accepted: 09/29/2019] [Indexed: 02/06/2023] Open
Abstract
Alternative polyadenylation (APA) is an RNA-processing mechanism on the 3' terminus that generates distinct isoforms of mRNAs and/or other RNA polymerase II transcripts with different 3'UTR lengths. Widespread APA affects post-transcriptional gene regulation in mRNA translation, stability, and localization, and exhibits strong tissue specificity. However, no existing database provides comprehensive information about APA events in a large number of human normal tissues. Using the RNA-seq data from the Genotype-Tissue Expression project, we systematically identified APA events from 9475 samples across 53 human tissues and examined their associations with multiple traits and gene expression across tissues. We further developed APAatlas, a user-friendly database (https://hanlab.uth.edu/apa/) for searching, browsing and downloading related information. APAatlas will help the biomedical research community elucidate the functions and mechanisms of APA events in human tissues.
Affiliation(s)
- Wei Hong
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Hang Ruan
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhao Zhang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Youqiong Ye
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yaoming Liu
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Shengli Li
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Ying Jing
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Huiwen Zhang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Lixia Diao
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Han Liang
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Leng Han
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
22
Muret K, Désert C, Lagoutte L, Boutin M, Gondret F, Zerjal T, Lagarrigue S. Long noncoding RNAs in lipid metabolism: literature review and conservation analysis across species. BMC Genomics 2019; 20:882. [PMID: 31752679 PMCID: PMC6868825 DOI: 10.1186/s12864-019-6093-3] [Citation(s) in RCA: 55] [Impact Index Per Article: 11.0] [Received: 03/11/2019] [Accepted: 09/10/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Lipids are important for cell and organism life because they are major components of membranes, serve as energy reserves, and act as signal molecules. The main organs for energy synthesis and storage are the liver and adipose tissue, both in humans and in more distant species such as chicken. Long noncoding RNAs (lncRNAs) are known to be involved in many biological processes, including lipid metabolism. RESULTS: In this context, this paper provides the most exhaustive list of lncRNAs involved in lipid metabolism, with 60 genes identified after an in-depth analysis of the bibliography, whereas all "review" type articles together list a total of 27 genes. These 60 lncRNAs are mainly described in human or mouse, and only a few of them have a precisely described mode of action. Because these genes are still named in a non-standard way, which makes such a study tedious, we propose a standard name for each gene in this list according to the rules dictated by the HUGO consortium. Moreover, we identified about 10% of lncRNAs that are conserved between mammals and chicken and 2% between mammals and fishes. Finally, we demonstrated that two genes were wrongly considered as lncRNAs in the literature since they are 3' extensions of the closest coding gene. CONCLUSIONS: Such a lncRNA catalogue can contribute to the understanding of lipid metabolism regulators; it can be useful to better understand the genetic regulation of some human diseases (obesity, hepatic steatosis) or traits of economic interest in livestock species (meat quality, carcass composition). We have no doubt that this first set will be rapidly enriched in coming years.
Affiliation(s)
- Kevin Muret
  - PEGASE, INRA, AGROCAMPUS OUEST, 35590, Saint-Gilles, France
- Colette Désert
  - PEGASE, INRA, AGROCAMPUS OUEST, 35590, Saint-Gilles, France
- Morgane Boutin
  - PEGASE, INRA, AGROCAMPUS OUEST, 35590, Saint-Gilles, France
- Tatiana Zerjal
  - GABI INRA, AgroParisTech, Université Paris-Saclay, Domaine de Vilvert, 78352, Jouy-en-Josas, France

23
Abstract
3' untranslated regions (3' UTRs) of messenger RNAs (mRNAs) are best known to regulate mRNA-based processes, such as mRNA localization, mRNA stability, and translation. In addition, 3' UTRs can establish 3' UTR-mediated protein-protein interactions (PPIs), and thus can transmit genetic information encoded in 3' UTRs to proteins. This function has been shown to regulate diverse protein features, including protein complex formation or posttranslational modifications, but is also expected to alter protein conformations. Therefore, 3' UTR-mediated information transfer can regulate protein features that are not encoded in the amino acid sequence. This review summarizes both 3' UTR functions (the regulation of mRNA-based and protein-based processes) and highlights how each 3' UTR function was discovered, with a focus on the experimental approaches used and the concepts that were learned. This review also discusses novel approaches to study 3' UTR functions in the future by taking advantage of recent advances in technology.
Affiliation(s)
- Christine Mayr
  - Department of Cancer Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, New York 10065

24
mountainClimber Identifies Alternative Transcription Start and Polyadenylation Sites in RNA-Seq. Cell Syst 2019; 9:393-400.e6. [PMID: 31542416 DOI: 10.1016/j.cels.2019.07.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/15/2019] [Revised: 06/06/2019] [Accepted: 07/24/2019] [Indexed: 12/28/2022]
Abstract
Alternative transcription start (ATS) and alternative polyadenylation (APA) create alternative RNA isoforms and modulate many aspects of RNA expression and protein production. However, ATS and APA remain difficult to detect in RNA sequencing (RNA-seq). Here, we developed mountainClimber, a de novo cumulative-sum-based approach to identify ATS and APA as change points. Unlike many existing methods, mountainClimber runs on a single sample and identifies multiple ATS or APA sites anywhere in the transcript. We analyzed 2,342 GTEx samples (36 tissues, 215 individuals) and found that tissue type is the predominant driver of transcript end variations. 75% and 65% of genes exhibited differential APA and ATS across tissues, respectively. In particular, testis displayed longer 5' untranslated regions (UTRs) and shorter 3' UTRs, often in genes related to testis-specific biology. Overall, we report the largest study of transcript ends across human tissues to our knowledge. mountainClimber is available at github.com/gxiaolab/mountainClimber.

25
Arefeen A, Liu J, Xiao X, Jiang T. TAPAS: tool for alternative polyadenylation site analysis. Bioinformatics 2019; 34:2521-2529. [PMID: 30052912 DOI: 10.1093/bioinformatics/bty110] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 04/17/2017] [Accepted: 02/22/2018] [Indexed: 01/08/2023] Open
Abstract
Motivation: The length of the 3' untranslated region (3' UTR) of an mRNA is essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, correlation between diseases and the shortening (or lengthening) of 3' UTRs has been reported in the literature. This length is largely determined by the polyadenylation cleavage site in the mRNA. As alternative polyadenylation (APA) sites are common in mammalian genes, several tools have been published recently for detecting APA sites from RNA-Seq data or performing shortening/lengthening analysis. These tools consider either up to only two APA sites in a gene or only APA sites that occur in the last exon of a gene, although a gene may generally have more than two APA sites and an APA site may sometimes occur before the last exon. Furthermore, the tools are unable to integrate the analysis of shortening/lengthening events with APA site detection. Results: We propose a new tool, called TAPAS, for detecting novel APA sites from RNA-Seq data. It can deal with more than two APA sites in a gene as well as APA sites that occur before the last exon. The tool is based on an existing method for finding change points in time series data, but some filtration techniques are also adopted to remove change points that are likely false APA sites. It is then extended to identify APA sites that are expressed differently between two biological samples and genes that contain 3' UTRs with shortening/lengthening events. Our extensive experiments on simulated and real RNA-Seq data demonstrate that TAPAS outperforms the existing tools for APA site detection or shortening/lengthening analysis significantly. Availability and implementation: https://github.com/arefeen/TAPAS. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ashraful Arefeen
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
- Juntao Liu
  - School of Mathematics, Shandong University, Jinan, Shandong, China
- Xinshu Xiao
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
- Tao Jiang
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
  - Institute of Integrative Genome Biology, University of California, Riverside, CA, USA
  - MOE Key Lab of Bioinformatics and Bioinformatics Division, TNLIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China

26
Doulazmi M, Cros C, Dusart I, Trembleau A, Dubacq C. Alternative polyadenylation produces multiple 3' untranslated regions of odorant receptor mRNAs in mouse olfactory sensory neurons. BMC Genomics 2019; 20:577. [PMID: 31299892 PMCID: PMC6624953 DOI: 10.1186/s12864-019-5927-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/20/2019] [Accepted: 06/23/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Odorant receptor genes constitute the largest gene family in mammalian genomes and this family has been extensively studied in several species, but to date far less attention has been paid to the characterization of their mRNA 3' untranslated regions (3'UTRs). Given the increasing importance of UTRs in the understanding of RNA metabolism, and the growing interest in alternative polyadenylation especially in the nervous system, we aimed at identifying the alternative isoforms of odorant receptor mRNAs generated through 3'UTR variation. RESULTS: We implemented a dedicated pipeline using IsoSCM instead of Cufflinks to analyze RNA-Seq data from whole olfactory mucosa of adult mice and obtained an extensive description of the 3'UTR isoforms of odorant receptor mRNAs. To validate our bioinformatics approach, we exhaustively analyzed the 3'UTR isoforms produced from 2 pilot genes, using molecular approaches including northern blot and RNA ligation mediated polyadenylation test. Comparison between datasets further validated the pipeline and confirmed the alternative polyadenylation patterns of odorant receptors. Qualitative and quantitative analyses of the annotated 3' regions demonstrate that 1) Odorant receptor 3'UTRs are longer than previously described in the literature; 2) More than 77% of odorant receptor mRNAs are subject to alternative polyadenylation, hence generating at least 2 detectable 3'UTR isoforms; 3) Splicing events in 3'UTRs are restricted to a limited subset of odorant receptor genes; and 4) Comparison between male and female data shows no sex-specific differences in odorant receptor 3'UTR isoforms. CONCLUSIONS: We demonstrated for the first time that odorant receptor genes are extensively subject to alternative polyadenylation. This ground-breaking change to the landscape of 3'UTR isoforms of Olfr mRNAs opens new avenues for investigating their respective functions, especially during the differentiation of olfactory sensory neurons.
Affiliation(s)
- Mohamed Doulazmi
  - CNRS, Institut de Biologie Paris Seine, Biological adaptation and ageing, B2A, Sorbonne Université, F-75005 Paris, France
- Cyril Cros
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
  - Present Address: Columbia University, New York, NY 10027 USA
- Isabelle Dusart
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
- Alain Trembleau
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
- Caroline Dubacq
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France

27
Chen M, Ji G, Fu H, Lin Q, Ye C, Ye W, Su Y, Wu X. A survey on identification and quantification of alternative polyadenylation sites from RNA-seq data. Brief Bioinform 2019; 21:1261-1276. [PMID: 31267126 DOI: 10.1093/bib/bbz068] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/10/2019] [Revised: 05/03/2019] [Accepted: 05/14/2019] [Indexed: 12/13/2022] Open
Abstract
Alternative polyadenylation (APA) has been implicated in post-transcriptional regulation through its effects on mRNA abundance, stability, localization and translation, and it contributes considerably to transcriptome diversity and gene expression regulation. RNA-seq has become a routine approach for transcriptome profiling, generating unprecedented data that could be used to identify and quantify APA site usage. A number of computational approaches for identifying APA sites and/or dynamic APA events from RNA-seq data have emerged in the literature, which provide valuable yet preliminary results that should be refined to yield credible guidelines for the scientific community. In this review, we provided a comprehensive overview of the status of currently available computational approaches. We also conducted objective benchmarking analysis using RNA-seq data sets from different species (human, mouse and Arabidopsis) and simulated data sets to present a systematic evaluation of 11 representative methods. Our benchmarking study showed that the overall performance of all tools investigated is moderate, reflecting that there is still considerable scope to improve the prediction of APA sites or dynamic APA events from RNA-seq data. In particular, prediction results from individual tools differ considerably, and only a limited number of predicted APA sites or genes are common among different tools. Accordingly, we attempted to give some advice on how to assess the reliability of the obtained results. We also proposed practical recommendations on the appropriate method applicable to diverse scenarios and discussed implications and future directions relevant to profiling APA from RNA-seq data.
Affiliation(s)
- Moliang Chen
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Guoli Ji
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Hongjuan Fu
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Qianmin Lin
  - Xiang'an Hospital of Xiamen University, Xiamen 361005, China
- Congting Ye
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Wenbin Ye
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Yaru Su
  - College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
- Xiaohui Wu
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China

28
Ye C, Long Y, Ji G, Li QQ, Wu X. APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data. Bioinformatics 2019; 34:1841-1849. [PMID: 29360928 DOI: 10.1093/bioinformatics/bty029] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 09/30/2017] [Accepted: 01/17/2018] [Indexed: 12/28/2022] Open
Abstract
Motivation: Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such an unprecedented collection of RNA-seq data with new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites. Results: We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usage between conditions. Extensive comparisons of APAtrap with two other recent methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organism with an annotated genome. Availability and implementation: Freely available for download at https://apatrap.sourceforge.io. Contact: liqq@xmu.edu.cn or xhuister@xmu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Congting Ye
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Yuqi Long
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Guoli Ji
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Qingshun Quinn Li
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
  - Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA 91766, USA
- Xiaohui Wu
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China

29
Babarinde IA, Li Y, Hutchins AP. Computational Methods for Mapping, Assembly and Quantification for Coding and Non-coding Transcripts. Comput Struct Biotechnol J 2019; 17:628-637. [PMID: 31193391 PMCID: PMC6526290 DOI: 10.1016/j.csbj.2019.04.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/25/2019] [Revised: 04/24/2019] [Accepted: 04/29/2019] [Indexed: 12/17/2022] Open
Abstract
The measurement of gene expression has long provided significant insight into biological functions. The development of high-throughput short-read sequencing technology has revealed transcriptional complexity at an unprecedented scale, and informed almost all areas of biology. However, as researchers have sought to gather more insights from the data, these new technologies have also increased the computational analysis burden. In this review, we describe typical computational pipelines for RNA-Seq analysis and discuss their strengths and weaknesses for the assembly, quantification and analysis of coding and non-coding RNAs. We also discuss the assembly of transposable elements into transcripts, and the difficulty these repetitive elements pose. In summary, RNA-Seq is a powerful technology that is likely to remain a key asset in the biologist's toolkit.
Affiliation(s)
- Andrew P. Hutchins
  - Department of Biology, Southern University of Science and Technology, 1088 Xueyuan Lu, Shenzhen, China

30
Harrison BJ, Park JW, Gomes C, Petruska JC, Sapio MR, Iadarola MJ, Chariker JH, Rouchka EC. Detection of Differentially Expressed Cleavage Site Intervals Within 3' Untranslated Regions Using CSI-UTR Reveals Regulated Interaction Motifs. Front Genet 2019; 10:182. [PMID: 30915105 PMCID: PMC6422928 DOI: 10.3389/fgene.2019.00182] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/18/2018] [Accepted: 02/19/2019] [Indexed: 01/08/2023] Open
Abstract
The length of untranslated regions at the 3' end of transcripts (3'UTRs) is regulated by alternative polyadenylation (APA). 3'UTRs contain regions that harbor binding motifs for regulatory molecules. However, the mechanisms that coordinate the 3'UTR length of specific groups of transcripts are not well understood. We therefore developed a method, CSI-UTR, that models 3'UTR structure as tandem segments between functional alternative polyadenylation sites (termed cleavage site intervals, CSIs). This approach facilitated (1) profiling of 3'UTR isoform expression changes and (2) statistical enrichment of putative regulatory motifs. CSI-UTR analysis is UTR-annotation independent and can interrogate legacy data generated from standard RNA-Seq libraries. CSI-UTR identified a set of CSIs in human and rodent transcriptomes. Analysis of RNA-Seq datasets from neural tissue identified differential expression events within 3'UTRs not detected by standard gene-based differential expression analyses. Further, in many instances 3'UTR and CDS from the same gene were regulated differently. This modulation of motifs for RNA-interacting molecules with potential condition-dependent and tissue-specific RNA binding partners near the polyA signal and CSI junction may play a mechanistic role in the specificity of alternative polyadenylation. Source code, CSI BED files and example datasets are available at: https://github.com/UofLBioinformatics/CSI-UTR.
Affiliation(s)
- Benjamin J Harrison
  - Department of Biomedical Sciences, Center for Excellence in the Neurosciences, College of Osteopathic Medicine, University of New England, Biddeford, ME, United States
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
- Juw Won Park
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
  - Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Louisville, KY, United States
- Cynthia Gomes
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
- Jeffrey C Petruska
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Spinal Cord Injury Research Center, University of Louisville, Louisville, KY, United States
  - Department of Neurological Surgery, University of Louisville, Louisville, KY, United States
- Matthew R Sapio
  - Department of Perioperative Medicine, Clinical Center, National Institutes of Health, Bethesda, MD, United States
- Michael J Iadarola
  - Department of Perioperative Medicine, Clinical Center, National Institutes of Health, Bethesda, MD, United States
- Julia H Chariker
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
- Eric C Rouchka
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
  - Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Louisville, KY, United States

31
MacDonald ML, Hamaker NK, Lee KH. Bioinformatic analysis of Chinese hamster ovary host cell protein lipases. AIChE J 2018; 64:4247-4254. [PMID: 30911190 DOI: 10.1002/aic.16378] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
Complete, accurate genome assemblies are necessary to design targets for genetic engineering strategies. Successful gene knockdowns and knockouts in Chinese hamster ovary (CHO) cells may prevent the expression of difficult-to-remove host cell proteins (HCPs). HCPs, if not removed, can cause problems in stability, safety, and efficacy of the biotherapeutic. A significantly improved Chinese hamster (CH) reference genome was used to identify new knockout targets with similar predicted functions and characteristics as the difficult-to-remove host cell lipases, LPL, PLBL2, and LPLA2. The CHO-K1 gene and protein sequences of several of these lipases were corrected using the updated CH genome. Sequence alignments were then used to identify conserved regions that may serve as possible targets for multiple simultaneous gene knockouts. Finally, comparison of the CHO-K1 lipase protein sequences to their human orthologs provided insight into which lipases, if persistent in the drug product, could possibly cause immunogenic responses in patients.
Affiliation(s)
- Madolyn L. MacDonald
  - Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE 19711
  - Dept. of Computer and Information Sciences, University of Delaware, Newark, DE 19716
- Nathaniel K. Hamaker
  - Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711
  - Dept. of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716
- Kelvin H. Lee
  - Delaware Biotechnology Institute, University of Delaware, Newark, DE 19711
  - Dept. of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716

32
Chang JW, Zhang W, Yeh HS, Park M, Yao C, Shi Y, Kuang R, Yong J. An integrative model for alternative polyadenylation, IntMAP, delineates mTOR-modulated endoplasmic reticulum stress response. Nucleic Acids Res 2018; 46:5996-6008. [PMID: 29733382 PMCID: PMC6158760 DOI: 10.1093/nar/gky340] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/20/2017] [Revised: 04/11/2018] [Accepted: 04/20/2018] [Indexed: 12/18/2022] Open
Abstract
3'-untranslated regions (UTRs) can vary through the use of alternative polyadenylation sites during pre-mRNA processing. Multiple publicly available pipelines combining high-throughput profiling technologies and bioinformatics tools have been developed to catalog changes in 3'-UTR lengths. In our recent RNA-seq experiments using cells with hyper-activated mammalian target of rapamycin (mTOR), we found that cellular mTOR activation leads to transcriptome-wide alternative polyadenylation (APA), resulting in the activation of multiple cellular pathways. Here, we developed a novel bioinformatics algorithm, IntMAP, which integrates RNA-Seq and PolyA Site (PAS)-Seq data for a comprehensive characterization of APA events. By applying IntMAP to the datasets from cells with hyper-activated mTOR, we identified novel APA events that could otherwise not be identified by either profiling method alone. Several transcription factors including Cebpg (CCAAT/enhancer binding protein gamma) were among the newly discovered APA transcripts, indicating that diverse transcriptional networks may be regulated by mTOR-coordinated APA. The prevention of APA in Cebpg using the CRISPR/Cas9-mediated genome editing tool showed that mTOR-driven 3'-UTR shortening in Cebpg is critical in protecting cells from endoplasmic reticulum (ER) stress. Taken together, we present IntMAP as a new bioinformatics algorithm for APA analysis by which we expand our understanding of the physiological role of mTOR-coordinated APA events to the ER stress response. The IntMAP toolbox is available at http://compbio.cs.umn.edu/IntMAP/.
Affiliation(s)
- Jae-Woong Chang
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Wei Zhang
  - Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Hsin-Sung Yeh
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Meeyeon Park
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Chengguo Yao
  - Department of Microbiology and Molecular Genetics, University of California School of Medicine, Irvine, CA 92697, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, University of California School of Medicine, Irvine, CA 92697, USA
- Rui Kuang
  - Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Jeongsik Yong
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA

33
El Baidouri M, Kim KD, Abernathy B, Li YH, Qiu LJ, Jackson SA. Genic C-Methylation in Soybean Is Associated with Gene Paralogs Relocated to Transposable Element-Rich Pericentromeres. MOLECULAR PLANT 2018; 11:485-495. [PMID: 29476915 DOI: 10.1016/j.molp.2018.02.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/24/2017] [Revised: 02/15/2018] [Accepted: 02/15/2018] [Indexed: 06/08/2023]
Abstract
Most plants are polyploid due to whole-genome duplications (WGD) and can thus have duplicated genes. Following a WGD, paralogs are often fractionated (lost) and few duplicate pairs remain. Little attention has been paid to the role of DNA methylation in the functional divergence of paralogous genes. Using high-resolution methylation maps of accessions of domesticated and wild soybean, we show that in soybean, a recent paleopolyploid with many paralogs, DNA methylation likely contributed to the elimination of genetic redundancy of polyploidy-derived gene paralogs. Transcriptionally silenced paralogs exhibit particular genomic features as they are often associated with proximal transposable elements (TEs) and are preferentially located in pericentromeres, likely due to gene movement during evolution. Additionally, we provide evidence that gene methylation associated with proximal TEs is implicated in the divergence of expression profiles between orthologous genes of wild and domesticated soybean, and within populations.
Affiliation(s)
- Moaine El Baidouri
  - Center for Applied Genetic Technologies, University of Georgia, 111 Riverbend Road, Athens, GA 30602, USA
- Kyung Do Kim
  - Center for Applied Genetic Technologies, University of Georgia, 111 Riverbend Road, Athens, GA 30602, USA
  - Corporate R&D, LG Chem, LG Science Park, 30 Magokjungang 10-ro, Gangseo-gu, Seoul 07796, Republic of Korea
- Brian Abernathy
  - Center for Applied Genetic Technologies, University of Georgia, 111 Riverbend Road, Athens, GA 30602, USA
- Ying-Hui Li
  - The National Key Facility for Crop Gene Resources and Genetic Improvement (NFCRI), Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Li-Juan Qiu
  - The National Key Facility for Crop Gene Resources and Genetic Improvement (NFCRI), Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Scott A Jackson
  - Center for Applied Genetic Technologies, University of Georgia, 111 Riverbend Road, Athens, GA 30602, USA

34
Cardoso TF, Quintanilla R, Castelló A, González-Prendes R, Amills M, Cánovas Á. Differential expression of mRNA isoforms in the skeletal muscle of pigs with distinct growth and fatness profiles. BMC Genomics 2018; 19:145. [PMID: 29444639 PMCID: PMC5813380 DOI: 10.1186/s12864-018-4515-2] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 12/23/2016] [Accepted: 01/31/2018] [Indexed: 01/03/2023] Open
Abstract
Background: The identification of genes differentially expressed in the skeletal muscle of pigs displaying distinct growth and fatness profiles might help identify the genetic factors that influence the phenotypic variation of such traits. So far, the majority of porcine transcriptomic studies have investigated differences in gene expression at a global scale rather than at the mRNA isoform level. In the current work, we have investigated the differential expression of mRNA isoforms in the gluteus medius (GM) muscle of 52 Duroc HIGH (increased backfat thickness, intramuscular fat and saturated and monounsaturated fatty acid contents) and LOW pigs (opposite phenotype, with an increased polyunsaturated fatty acid content). Results: Our analysis revealed that 10.9% of genes expressed in the GM muscle generate alternative mRNA isoforms, with an average of 2.9 transcripts per gene. By using two different pipelines, one based on the CLC Genomics Workbench and another on the STAR, RSEM and DESeq2 software, we have identified 10 mRNA isoforms that both pipelines categorize as differentially expressed in HIGH vs LOW pigs (P-value < 0.01 and ±0.6 log2 fold-change). Only five mRNA isoforms, produced by the ITGA5, SEMA4D, LITAF, TIMP1 and ANXA2 genes, remain significant after correction for multiple testing (q-value < 0.05 and ±0.6 log2 fold-change), being upregulated in HIGH pigs. Conclusions: The increased levels of specific ITGA5, LITAF, TIMP1 and ANXA2 mRNA isoforms in HIGH pigs are consistent with reports indicating that the overexpression of these four genes is associated with obesity and metabolic disorders in humans. A broader knowledge of the functional attributes of these mRNA variants would be fundamental to elucidating the consequences of transcript diversity on the determinism of porcine phenotypes of economic interest.
Electronic supplementary material The online version of this article (10.1186/s12864-018-4515-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Tainã Figueiredo Cardoso
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain; CAPES Foundation, Ministry of Education of Brazil, Brasilia D.F, 70.040-020, Brazil
- Raquel Quintanilla
- Animal Breeding and Genetics Programme, Institute for Research and Technology in Food and Agriculture (IRTA), Torre Marimon, 08140, Caldes de Montbui, Spain
- Anna Castelló
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain; Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain
- Rayner González-Prendes
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain
- Marcel Amills
- Department of Animal Genetics, Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain; Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, Bellaterra, 08193, Barcelona, Spain.
- Ángela Cánovas
- Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, Guelph, ON, Canada.
35
Yeh HS, Zhang W, Yong J. Analyses of alternative polyadenylation: from old school biochemistry to high-throughput technologies. BMB Rep 2017; 50:201-207. [PMID: 28148393 PMCID: PMC5437964 DOI: 10.5483/bmbrep.2017.50.4.019] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/31/2017] [Indexed: 01/08/2023] Open
Abstract
Alterations in the usage of polyadenylation sites during transcription termination yield transcript isoforms from a gene. Recent findings of transcriptome-wide alternative polyadenylation (APA) as a molecular response to changes in biology position APA not only as a molecular event of early transcriptional termination but also as a cellular regulatory step affecting various biological pathways. With the development of high-throughput profiling technologies at the single-nucleotide level and their applications targeted to the 3'-end of mRNAs, dynamics in the landscape of mRNA 3'-ends are measurable at a global scale. In this review, methods and technologies that have been adopted to study APA events are discussed. In addition, various bioinformatics algorithms for APA isoform analysis using publicly available RNA-seq datasets are introduced. [BMB Reports 2017; 50(4): 201-207].
Affiliation(s)
- Hsin-Sung Yeh
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
36
Sanfilippo P, Wen J, Lai EC. Landscape and evolution of tissue-specific alternative polyadenylation across Drosophila species. Genome Biol 2017; 18:229. [PMID: 29191225 PMCID: PMC5707805 DOI: 10.1186/s13059-017-1358-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 06/02/2017] [Accepted: 11/08/2017] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: Drosophila melanogaster has one of the best-described transcriptomes of any multicellular organism. Nevertheless, the paucity of 3'-sequencing data in this species precludes comprehensive assessment of alternative polyadenylation (APA), which is subject to broad tissue-specific control. RESULTS: Here, we generate deep 3'-sequencing data from 23 developmental stages, tissues, and cell lines of D. melanogaster, yielding a comprehensive atlas of ~62,000 polyadenylated ends. These data broadly extend the annotated transcriptome, identify ~40,000 novel 3' termini, and reveal that two-thirds of Drosophila genes are subject to APA. Furthermore, we dramatically expand the numbers of genes known to be subject to tissue-specific APA, such as 3' untranslated region (UTR) lengthening in head and 3' UTR shortening in testis, and characterize new tissue and developmental 3' UTR patterns. Our thorough 3' UTR annotations permit reassessment of post-transcriptional regulatory networks, via conserved miRNA and RNA binding protein sites. To evaluate the evolutionary conservation and divergence of APA patterns, we generate developmental and tissue-specific 3'-seq libraries from Drosophila yakuba and Drosophila virilis. We document broadly analogous tissue-specific APA trends in these species, but also observe significant alterations in 3' end usage across orthologs. We exploit the population of functionally evolving poly(A) sites to gain clear evidence that evolutionary divergence in core polyadenylation signal (PAS) and downstream sequence element (DSE) motifs drives broad alterations in 3' UTR isoform expression across the Drosophila phylogeny. CONCLUSIONS: These data provide a critical resource for the Drosophila community and offer many insights into the complex control of alternative tissue-specific 3' UTR formation and its consequences for post-transcriptional regulatory networks.
Affiliation(s)
- Piero Sanfilippo
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA
- Louis V. Gerstner, Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, New York, 10065, USA
- Jiayu Wen
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA
- Present address: Biochemistry and Biomedical Sciences, Research School of Biology, ANU College of Science, The Australian National University, Canberra, ACT 2601, Australia
- Eric C Lai
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA.
- Louis V. Gerstner, Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, New York, 10065, USA.
37
Szkop KJ, Nobeli I. Untranslated Parts of Genes Interpreted: Making Heads or Tails of High-Throughput Transcriptomic Data via Computational Methods: Computational methods to discover and quantify isoforms with alternative untranslated regions. Bioessays 2017; 39. [PMID: 29052251 DOI: 10.1002/bies.201700090] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/28/2017] [Revised: 09/12/2017] [Indexed: 01/07/2023]
Abstract
In this review we highlight the importance of defining the untranslated parts of transcripts, and present a number of computational approaches for the discovery and quantification of alternative transcription start and poly-adenylation events in high-throughput transcriptomic data. The fate of eukaryotic transcripts is closely linked to their untranslated regions, which are determined by the position at which transcription starts and ends at a genomic locus. Although the extent of alternative transcription starts and alternative poly-adenylation sites has been revealed by sequencing methods focused on the ends of transcripts, the application of these methods is not yet widely adopted by the community. We suggest that computational methods applied to standard high-throughput technologies are a useful, albeit less accurate, alternative to the expertise-demanding 5' and 3' sequencing and they are the only option for analysing legacy transcriptomic data. We review these methods here, focusing on technical challenges and arguing for the need to include better normalization of the data and more appropriate statistical models of the expected variation in the signal.
Affiliation(s)
- Krzysztof J Szkop
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
- Irene Nobeli
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
38
Genome-wide profiling of the 3' ends of polyadenylated RNAs. Methods 2017; 126:86-94. [PMID: 28602807 DOI: 10.1016/j.ymeth.2017.06.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/28/2017] [Revised: 05/29/2017] [Accepted: 06/03/2017] [Indexed: 11/24/2022] Open
Abstract
Alternative polyadenylation (APA) diversifies the 3' termini of a majority of mRNAs in most eukaryotes, and is consequently inferred to have substantial consequences for the utilization of post-transcriptional regulatory mechanisms. Since conventional RNA-sequencing methods do not accurately define mRNA termini, a number of protocols have been developed that permit sequencing of the 3' ends of polyadenylated transcripts (3'-seq). We present here our experimental protocol to generate 3'-seq libraries using a dT-priming approach, including extensive details on considerations that will enable successful library cloning. We pair this with a set of computational tools that allow the user to process the raw sequence data into a filtered set of clusters that represent high-confidence functional polyadenylation sites. The data are single-nucleotide resolution and quantitative, and can be used for downstream analyses of APA.
39
Macchiaroli N, Maldonado LL, Zarowiecki M, Cucher M, Gismondi MI, Kamenetzky L, Rosenzvit MC. Genome-wide identification of microRNA targets in the neglected disease pathogens of the genus Echinococcus. Mol Biochem Parasitol 2017; 214:91-100. [PMID: 28385564 DOI: 10.1016/j.molbiopara.2017.04.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 11/14/2016] [Revised: 03/30/2017] [Accepted: 04/01/2017] [Indexed: 01/01/2023]
Abstract
MicroRNAs (miRNAs), a class of small non-coding RNAs, are key regulators of gene expression at the post-transcriptional level and play essential roles in biological processes such as development. MiRNAs silence target mRNAs by binding to complementary sequences in the 3' untranslated regions (3'UTRs). The parasitic helminths of the genus Echinococcus are the causative agents of echinococcosis, a zoonotic neglected disease. In previous work, we performed a comprehensive identification and characterization of Echinococcus miRNAs. However, current knowledge about their targets is limited. Since target prediction algorithms rely on complementarity between 3'UTRs and miRNA sequences, a major limitation is the lack of accurate 3'UTR sequence information for most species, including parasitic helminths. We performed RNA-seq and developed a pipeline that integrates the transcriptomic data with available genomic data of this parasite in order to identify 3'UTRs of Echinococcus canadensis. The high-confidence set of 3'UTRs obtained allowed the prediction of miRNA targets in Echinococcus through a bioinformatic approach. We performed for the first time a comparative analysis of miRNA targets in Echinococcus and Taenia. We found that many evolutionarily conserved target sites in Echinococcus and Taenia may be functional and under selective pressure. Signaling pathways such as MAPK and Wnt were among the most represented pathways, indicating miRNA roles in parasite growth and development. Genome-wide identification and characterization of miRNA target genes in Echinococcus provide valuable information to guide experimental studies in order to understand miRNA functions in the parasite's biology. miRNAs involved in essential functions, especially those that are absent in the host or show sequence divergence with respect to host orthologs, might be considered as novel therapeutic targets for echinococcosis control.
Affiliation(s)
- Natalia Macchiaroli
- Instituto de Investigaciones en Microbiología y Parasitología Médicas (IMPaM), Facultad de Medicina, Universidad de Buenos Aires (UBA)-Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), Buenos Aires, Argentina
- Lucas L Maldonado
- Instituto de Investigaciones en Microbiología y Parasitología Médicas (IMPaM), Facultad de Medicina, Universidad de Buenos Aires (UBA)-Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), Buenos Aires, Argentina
- Magdalena Zarowiecki
- Parasite Genomics Group, Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Marcela Cucher
- Instituto de Investigaciones en Microbiología y Parasitología Médicas (IMPaM), Facultad de Medicina, Universidad de Buenos Aires (UBA)-Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), Buenos Aires, Argentina
- Laura Kamenetzky
- Instituto de Investigaciones en Microbiología y Parasitología Médicas (IMPaM), Facultad de Medicina, Universidad de Buenos Aires (UBA)-Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), Buenos Aires, Argentina
- Mara Cecilia Rosenzvit
- Instituto de Investigaciones en Microbiología y Parasitología Médicas (IMPaM), Facultad de Medicina, Universidad de Buenos Aires (UBA)-Consejo Nacional de Investigaciones Científicas y Tecnológicas (CONICET), Buenos Aires, Argentina.
40
Son HG, Seo M, Ham S, Hwang W, Lee D, An SWA, Artan M, Seo K, Kaletsky R, Arey RN, Ryu Y, Ha CM, Kim YK, Murphy CT, Roh TY, Nam HG, Lee SJV. RNA surveillance via nonsense-mediated mRNA decay is crucial for longevity in daf-2/insulin/IGF-1 mutant C. elegans. Nat Commun 2017; 8:14749. [PMID: 28276441 PMCID: PMC5347137 DOI: 10.1038/ncomms14749] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 01/25/2016] [Accepted: 01/30/2017] [Indexed: 12/14/2022] Open
Abstract
Long-lived organisms often feature more stringent protein and DNA quality control. However, whether RNA quality control mechanisms, such as nonsense-mediated mRNA decay (NMD), which degrades both abnormal and some normal transcripts, have a role in organismal aging remains unexplored. Here we show that NMD mediates longevity in C. elegans strains with mutations in daf-2/insulin/insulin-like growth factor 1 receptor. We find that daf-2 mutants display enhanced NMD activity and reduced levels of potentially aberrant transcripts. NMD components, including smg-2/UPF1, are required to achieve the longevity of several long-lived mutants, including daf-2 mutant worms. NMD in the nervous system of the animals is particularly important for RNA quality control to promote longevity. Furthermore, we find that downregulation of yars-2/tyrosyl-tRNA synthetase, an NMD target transcript, by daf-2 mutations contributes to longevity. We propose that NMD-mediated RNA surveillance is a crucial quality control process that contributes to longevity conferred by daf-2 mutations.
Affiliation(s)
- Heehwa G. Son
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Mihwa Seo
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Center for Plant Aging Research, Institute for Basic Science, Daegu 42988, South Korea
- Seokjin Ham
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Wooseon Hwang
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Dongyeop Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Seon Woo A. An
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Murat Artan
- Information Technology Convergence Engineering, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Keunhee Seo
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Rachel Kaletsky
- Department of Molecular Biology & LSI Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Rachel N. Arey
- Department of Molecular Biology & LSI Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Youngjae Ryu
- Research Division, Korea Brain Research Institute, Daegu 41068, South Korea
- Chang Man Ha
- Research Division, Korea Brain Research Institute, Daegu 41068, South Korea
- Yoon Ki Kim
- Creative Research Initiatives Center for Molecular Biology of Translation, Korea University, Seoul 02841, South Korea
- Division of Life Sciences, Korea University, Seoul 02841, South Korea
- Coleen T. Murphy
- Department of Molecular Biology & LSI Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Tae-Young Roh
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Division of Integrative Biosciences and Biotechnology, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Hong Gil Nam
- Center for Plant Aging Research, Institute for Basic Science, Daegu 42988, South Korea
- Department of New Biology, DGIST, Daegu 42988, South Korea
- Seung-Jae V. Lee
- Department of Life Sciences, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- School of Interdisciplinary Bioscience and Bioengineering, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
- Information Technology Convergence Engineering, Pohang University of Science and Technology, Pohang, Gyeongbuk 37673, South Korea
41
Lebedeva S, de Jesus Domingues AM, Butter F, Ketting RF. Characterization of genetic loss-of-function of Fus in zebrafish. RNA Biol 2017; 14:29-35. [PMID: 27898262 PMCID: PMC5270537 DOI: 10.1080/15476286.2016.1256532] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/15/2016] [Revised: 10/24/2016] [Accepted: 10/30/2016] [Indexed: 12/13/2022] Open
Abstract
The RNA-binding protein FUS is implicated in transcription, alternative splicing of neuronal genes and DNA repair. Mutations in FUS have been linked to human neurodegenerative diseases such as ALS (amyotrophic lateral sclerosis). We genetically disrupted fus in zebrafish (Danio rerio) using the CRISPR-Cas9 system. The fus knockout animals are fertile and do not show any distinctive phenotype. Mutation of fus induces mild changes in gene expression at the transcriptome and proteome levels in the adult brain. We observed a significant influence of genetic background on gene expression and 3'UTR usage, which could mask the effects of loss of Fus. Unlike published fus morphants, maternal zygotic fus mutants do not show motoneuronal degeneration and exhibit normal locomotor activity.
Affiliation(s)
- Falk Butter
- Institute of Molecular Biology, Mainz, Germany
42
Grassi E, Mariella E, Lembo A, Molineris I, Provero P. Roar: detecting alternative polyadenylation with standard mRNA sequencing libraries. BMC Bioinformatics 2016; 17:423. [PMID: 27756200 PMCID: PMC5069797 DOI: 10.1186/s12859-016-1254-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 03/30/2016] [Accepted: 09/08/2016] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3' UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences, for example, in regulation by microRNAs and in the cellular localization of the transcripts, thus altering their fate. RESULTS: We propose a strategy to identify the genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries, which makes it widely applicable to data originally obtained to perform classical differential expression analyses. We decided to exploit previously annotated APA sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing it to the results of other methods based on microarrays or on specific RNA-seq libraries, and show that using APA site databases results in higher sensitivity compared to a de novo site prediction approach. CONCLUSIONS: We implemented the algorithm in a Bioconductor package to facilitate its broad usage in the scientific community. The ability of this approach to detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses makes it useful for investigating whether alternative polyadenylation is relevant in a certain biological process without requiring specific experimental assays.
Affiliation(s)
- Elena Grassi
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy.
- Elisa Mariella
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Antonio Lembo
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Ivan Molineris
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Paolo Provero
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 60, Milan, 20132, Italy