51
Wyman D, Mortazavi A. TranscriptClean: variant-aware correction of indels, mismatches and splice junctions in long-read transcripts. Bioinformatics 2019; 35:340-342. [PMID: 29912287 PMCID: PMC6329999 DOI: 10.1093/bioinformatics/bty483] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 03/13/2018] [Accepted: 06/13/2018] [Indexed: 11/14/2022] [Open Access]
Abstract
Motivation: Long-read, single-molecule sequencing platforms hold great potential for isoform discovery and characterization of multi-exon transcripts. However, their high error rates are an obstacle to distinguishing novel transcript isoforms from sequencing artifacts. Therefore, we developed the package TranscriptClean to correct mismatches, microindels and noncanonical splice junctions in mapped transcripts using the reference genome while preserving known variants. Results: Our method corrects nearly all mismatches and indels present in a publicly available human PacBio Iso-seq dataset, and rescues 39% of noncanonical splice junctions. Availability and implementation: All Python and R scripts used in this paper are available at https://github.com/dewyman/TranscriptClean.
Affiliation(s)
- Dana Wyman
  - Department of Developmental and Cell Biology, UC Irvine, Irvine, CA, USA; Center for Complex Biological Systems, UC Irvine, Irvine, CA, USA
- Ali Mortazavi
  - Department of Developmental and Cell Biology, UC Irvine, Irvine, CA, USA; Center for Complex Biological Systems, UC Irvine, Irvine, CA, USA
52
BinEssa HA, Zou M, Al-Enezi AF, Alomrani B, Al-Faham MSA, Al-Rijjal RA, Meyer BF, Shi Y. Functional analysis of 22 splice-site mutations in the PHEX, the causative gene in X-linked dominant hypophosphatemic rickets. Bone 2019; 125:186-193. [PMID: 31102713 DOI: 10.1016/j.bone.2019.05.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/14/2019] [Revised: 04/24/2019] [Accepted: 05/13/2019] [Indexed: 12/19/2022]
Abstract
CONTEXT: X-linked hypophosphatemic rickets (XLH) is caused by inactivating mutations in the PHEX gene and is the most common form of hereditary rickets. Splice-site mutations account for 17% of all reported PHEX mutations. The functional consequences of these splice-site mutations have not been systematically investigated. OBJECTIVE: The current study was undertaken to functionally annotate 22 previously reported splice-site mutations in the PHEX gene. METHODS: PHEX mini-genes with different splice-site mutations were created by site-directed mutagenesis and expressed in HEK293 cells. The mRNA transcripts were analyzed by RT-PCR, cloning, and sequencing. RESULTS: These splicing mutations led to a variety of consequences, including exon skipping, intron retention, and activation of cryptic splice sites. Among the 22 splice-site mutations, exon skipping was the most common event, accounting for 73% (16/22). Non-canonical splice-site mutations, such as c.436+3G>C, c.436+4A>C, c.436+6T>C, c.437-3C>G, c.850-3C>G, c.1080-3C>A, c.1482+5G>C, c.1586+6T>C, c.1645+5G>A, c.1645+6T>C, c.1701-16T>A, c.1768+5G>A, and c.1899+5G>A, could result in splicing errors to the same extent as canonical splice-site mutations. Interestingly, non-canonical (c.436+6T>C and c.1586+6T>C) and canonical splice-site mutations (c.1769-1G>C) could generate partial splicing errors (both wild-type and mutant transcripts were detected), resulting in incomplete inactivation of the PHEX gene, which may explain the mild disease phenotype reported previously, providing evidence of genotype-phenotype correlation. c.1645C>T (p.R549*) had no impact on pre-mRNA splicing although it is located next to the canonical splice donor site GT. CONCLUSIONS: Exon skipping is the most common outcome of splice-site mutations. Both canonical and non-canonical splice-site mutations can result in either severe or mild RNA splicing defects, contributing to phenotype heterogeneity. Non-canonical splice-site mutations should not be overlooked in genetic screening, especially those located within 50 bp of a canonical splice site.
Affiliation(s)
- Huda A BinEssa
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Minjing Zou
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Anwar F Al-Enezi
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Basma Alomrani
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Manar S A Al-Faham
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Roua A Al-Rijjal
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Brian F Meyer
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
- Yufei Shi
  - Department of Genetics, King Faisal Specialist Hospital & Research Centre, Riyadh, Saudi Arabia
53
Frankiw L, Baltimore D, Li G. Alternative mRNA splicing in cancer immunotherapy. Nat Rev Immunol 2019; 19:675-687. [PMID: 31363190 DOI: 10.1038/s41577-019-0195-7] [Citation(s) in RCA: 156] [Impact Index Per Article: 31.2] [Accepted: 07/02/2019] [Indexed: 12/12/2022]
Abstract
Immunotherapies are yielding effective treatments for several previously untreatable cancers. Still, the identification of suitable antigens specific to the tumour that can be targets for cancer vaccines and T cell therapies is a challenge. Alternative processing of mRNA, a phenomenon that has been shown to alter the proteomic diversity of many cancers, may offer the potential of a broadened target space. Here, we discuss the promise of analysing mRNA processing events in cancer cells, with an emphasis on mRNA splicing, for the identification of potential new targets for cancer immunotherapy. Further, we highlight the challenges that must be overcome for this new avenue to have clinical applicability.
Affiliation(s)
- Luke Frankiw
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- David Baltimore
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Guideng Li
  - Center of Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China; Suzhou Institute of Systems Medicine, Suzhou, China
54
Lin JH, Tang XY, Boulling A, Zou WB, Masson E, Fichou Y, Raud L, Le Tertre M, Deng SJ, Berlivet I, Ka C, Mort M, Hayden M, Leman R, Houdayer C, Le Gac G, Cooper DN, Li ZS, Férec C, Liao Z, Chen JM. First estimate of the scale of canonical 5' splice site GT>GC variants capable of generating wild-type transcripts. Hum Mutat 2019; 40:1856-1873. [PMID: 31131953 DOI: 10.1002/humu.23821] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 11/29/2018] [Revised: 04/10/2019] [Accepted: 05/24/2019] [Indexed: 12/13/2022]
Abstract
It has long been known that canonical 5' splice site (5'SS) GT>GC variants may be compatible with normal splicing. However, to date, the actual scale of canonical 5'SSs capable of generating wild-type transcripts in the case of GT>GC substitutions remains unknown. Herein, combining data derived from a meta-analysis of 45 human disease-causing 5'SS GT>GC variants and a cell culture-based full-length gene splicing assay of 103 5'SS GT>GC substitutions, we estimate that ~15-18% of canonical GT 5'SSs retain their capacity to generate between 1% and 84% normal transcripts when GT is substituted by GC. We further demonstrate that the canonical 5'SSs in which substitution of GT by GC generated normal transcripts exhibit stronger complementarity to the 5' end of U1 snRNA than those sites whose substitutions of GT by GC did not lead to the generation of normal transcripts. We also observed a correlation between the generation of wild-type transcripts and a milder than expected clinical phenotype but found that none of the available splicing prediction tools were capable of reliably distinguishing 5'SS GT>GC variants that generated wild-type transcripts from those that did not. Our findings imply that 5'SS GT>GC variants in human disease genes may not invariably be pathogenic.
Affiliation(s)
- Jin-Huan Lin
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France; Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Xin-Ying Tang
  - Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Arnaud Boulling
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France
- Wen-Bin Zou
  - Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Emmanuelle Masson
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France; CHU Brest, Service de Génétique, Brest, France
- Yann Fichou
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France; Laboratory of Excellence GR-Ex, Paris, France
- Loann Raud
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France
- Shun-Jiang Deng
  - Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Chandran Ka
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France; CHU Brest, Service de Génétique, Brest, France; Laboratory of Excellence GR-Ex, Paris, France
- Matthew Mort
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Matthew Hayden
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Raphaël Leman
  - Laboratoire de Biologie et Génétique du Cancer, Centre François Baclesse, Caen, France; Department of Genetics, F76000 and Normandy University, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen University Hospital, Rouen, France
- Claude Houdayer
  - Department of Genetics, F76000 and Normandy University, UNIROUEN, Inserm U1245, Normandy Centre for Genomic and Personalized Medicine, Rouen University Hospital, Rouen, France
- Gerald Le Gac
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France; CHU Brest, Service de Génétique, Brest, France; Laboratory of Excellence GR-Ex, Paris, France
- David N Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Zhao-Shen Li
  - Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Claude Férec
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France
- Zhuan Liao
  - Department of Gastroenterology, Changhai Hospital, Second Military Medical University, Shanghai, China; Shanghai Institute of Pancreatic Diseases, Shanghai, China
- Jian-Min Chen
  - EFS, Univ Brest, Inserm, UMR 1078, GGB, F-29200, Brest, France
55
Zhang L, Vielle A, Espinosa S, Zhao R. RNAs in the spliceosome: Insight from cryoEM structures. WILEY INTERDISCIPLINARY REVIEWS-RNA 2019; 10:e1523. [PMID: 30729694 DOI: 10.1002/wrna.1523] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 09/19/2018] [Revised: 12/12/2018] [Accepted: 12/28/2018] [Indexed: 12/28/2022]
Abstract
Pre-mRNA splicing is catalyzed by the spliceosome, a multimegadalton RNA-protein complex. The spliceosome undergoes dramatic compositional and conformational changes through the splicing cycle, forming at least 10 distinct complexes. Recent high-resolution cryoEM structures of various spliceosomal complexes revealed unprecedented details of this large molecular machine. This review highlights insight into the structure and function of the spliceosomal RNA components obtained from these new structures, with a focus on the yeast spliceosome. This article is categorized under: RNA Processing > Splicing Mechanisms; RNA Structure and Dynamics > RNA Structure, Dynamics, and Chemistry; RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes.
Affiliation(s)
- Lingdi Zhang
  - Department of Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado
- Anne Vielle
  - Department of Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado
- Sara Espinosa
  - Department of Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado
- Rui Zhao
  - Department of Biochemistry and Molecular Genetics, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado
56
Samuel CE. Adenosine deaminase acting on RNA (ADAR1), a suppressor of double-stranded RNA-triggered innate immune responses. J Biol Chem 2019; 294:1710-1720. [PMID: 30710018 PMCID: PMC6364763 DOI: 10.1074/jbc.tm118.004166] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Indexed: 12/26/2022] [Open Access]
Abstract
Herbert "Herb" Tabor, who celebrated his 100th birthday this past year, served the Journal of Biological Chemistry as a member of the Editorial Board beginning in 1961, as an Associate Editor, and as Editor-in-Chief for 40 years, from 1971 until 2010. Among the many discoveries in biological chemistry during this period was the identification of RNA modification by C6 deamination of adenosine (A) to produce inosine (I) in double-stranded (ds) RNA. This posttranscriptional RNA modification by adenosine deamination, known as A-to-I RNA editing, diversifies the transcriptome and modulates the innate immune interferon response. A-to-I editing is catalyzed by a family of enzymes, adenosine deaminases acting on dsRNA (ADARs). The roles of A-to-I editing are varied and include effects on mRNA translation, pre-mRNA splicing, and micro-RNA silencing. Suppression of dsRNA-triggered induction and action of interferon, the cornerstone of innate immunity, has emerged as a key function of ADAR1 editing of self (cellular) and nonself (viral) dsRNAs. A-to-I modification of RNA is essential for the normal regulation of cellular processes. Dysregulation of A-to-I editing by ADAR1 can have profound consequences, ranging from effects on cell growth and development to autoimmune disorders.
Affiliation(s)
- Charles E Samuel
  - Department of Molecular, Cellular and Developmental Biology, University of California, Santa Barbara, California 93106
57
Pucker B, Brockington SF. Genome-wide analyses supported by RNA-Seq reveal non-canonical splice sites in plant genomes. BMC Genomics 2018; 19:980. [PMID: 30594132 PMCID: PMC6310983 DOI: 10.1186/s12864-018-5360-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 10/11/2018] [Accepted: 12/10/2018] [Indexed: 12/19/2022] [Open Access]
Abstract
BACKGROUND: Most eukaryotic genes comprise exons and introns, thus requiring the precise removal of introns from pre-mRNAs to enable protein biosynthesis. U2 and U12 spliceosomes catalyze this step by recognizing motifs on the transcript in order to remove the introns, a process that depends on the precise definition of exon-intron borders by splice sites, which are consequently highly conserved across species. Only very few combinations of terminal dinucleotides are frequently observed at intron ends, dominated by the canonical GT-AG splice sites on the DNA level. RESULTS: Here we investigate the occurrence of diverse combinations of dinucleotides at predicted splice sites. Analyzing 121 plant genome sequences based on their annotation revealed strong splice site conservation across species, annotation errors, and true biological divergence from canonical splice sites. The frequency of non-canonical splice sites clearly correlates with their divergence from canonical ones, indicating either an accumulation of probably neutral mutations, or evolution towards canonical splice sites. Strong conservation across multiple species and non-random accumulation of substitutions in splice sites indicate a functional relevance of non-canonical splice sites. The average composition of splice sites across all investigated species is 98.7% for GT-AG, 1.2% for GC-AG, 0.06% for AT-AC, and 0.09% for minor non-canonical splice sites. RNA-Seq data sets of 35 species were incorporated to validate non-canonical splice site predictions through gaps in sequencing read alignments and to demonstrate the expression of affected genes. CONCLUSION: We conclude that bona fide non-canonical splice sites are present and appear to be functionally relevant in most plant genomes, although at low abundance.
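To make the splice-site categories in this abstract concrete, the following is a minimal illustrative sketch, not the authors' pipeline. It assumes each intron is supplied as a DNA-level sequence on the coding strand, and the function names `classify_intron` and `splice_site_composition` are hypothetical. It bins introns by their terminal dinucleotides into the GT-AG, GC-AG, and AT-AC classes named above, with everything else grouped as "other".

```python
from collections import Counter

# Terminal dinucleotide pairs named in the abstract (assumption: DNA-level,
# coding-strand sequence). Any other combination is reported as "other".
MAJOR_CLASSES = {
    ("GT", "AG"): "GT-AG",
    ("GC", "AG"): "GC-AG",
    ("AT", "AC"): "AT-AC",
}


def classify_intron(intron_seq):
    """Return the splice-site class of one intron from its first and last two bases."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron must be at least 4 nt long")
    return MAJOR_CLASSES.get((seq[:2], seq[-2:]), "other")


def splice_site_composition(introns):
    """Return the percentage of introns in each splice-site class."""
    counts = Counter(classify_intron(s) for s in introns)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_introns = ["GTaagtcccAG", "GTccttAG", "GCaaattAG", "ATatcctAC"]
    print(splice_site_composition(toy_introns))
```

On a real annotation, the intron sequences would first have to be extracted from the genome and reverse-complemented for minus-strand genes; that step is omitted here.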
Affiliation(s)
- Boas Pucker
  - Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK; Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Samuel F. Brockington
  - Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK
58
Erkelenz S, Theiss S, Kaisers W, Ptok J, Walotka L, Müller L, Hillebrand F, Brillen AL, Sladek M, Schaal H. Ranking noncanonical 5' splice site usage by genome-wide RNA-seq analysis and splicing reporter assays. Genome Res 2018; 28:1826-1840. [PMID: 30355602 PMCID: PMC6280755 DOI: 10.1101/gr.235861.118] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/08/2018] [Accepted: 10/20/2018] [Indexed: 01/01/2023]
Abstract
Most human pathogenic mutations in 5' splice sites affect the canonical GT in positions +1 and +2, leading to noncanonical dinucleotides. On the other hand, noncanonical dinucleotides are observed under physiological conditions in ∼1% of all human 5'ss. It is therefore a challenging task to understand the pathogenic mutation mechanisms underlying the conditions under which noncanonical 5'ss are used. In this work, we systematically examined noncanonical 5' splice site selection, both experimentally using splicing competition reporters and by analyzing a large RNA-seq data set of 54 fibroblast samples from 27 subjects containing a total of 2.4 billion gapped reads covering 269,375 exon junctions. From both approaches, we consistently derived a noncanonical 5'ss usage ranking GC > TT > AT > GA > GG > CT. In our competition splicing reporter assay, noncanonical splicing was strictly dependent on the presence of upstream or downstream splicing regulatory elements (SREs), and changes in SREs could be compensated by variation of U1 snRNA complementarity in the competing 5'ss. In particular, we could confirm splicing at different positions (i.e., -1, +1, +5) of a splice site for all noncanonical dinucleotides "weaker" than GC. In our comprehensive RNA-seq data set analysis, noncanonical 5'ss were preferentially detected in weakly used exon junctions of highly expressed genes. Among high-confidence splice sites, they were 10-fold overrepresented in clusters with a neighboring, more frequently used 5'ss. Conversely, these more frequently used neighbors contained only the dinucleotides GT, GC, and TT, in accordance with the above ranking.
Affiliation(s)
- Steffen Erkelenz
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Stephan Theiss
  - Institute of Clinical Neuroscience and Medical Psychology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Wolfgang Kaisers
  - Center for Biological and Medical Research (BMFZ), Center of Bioinformatics and Biostatistics (CBiBs), Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Johannes Ptok
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Lara Walotka
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Lisa Müller
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Frank Hillebrand
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Anna-Lena Brillen
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Michael Sladek
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
- Heiner Schaal
  - Institute of Virology, Medical Faculty, Heinrich Heine University Düsseldorf, D-40225 Düsseldorf, Germany
59
Fux JE, Mehta A, Moffat J, Spafford JD. Eukaryotic Voltage-Gated Sodium Channels: On Their Origins, Asymmetries, Losses, Diversification and Adaptations. Front Physiol 2018; 9:1406. [PMID: 30519187 PMCID: PMC6259924 DOI: 10.3389/fphys.2018.01406] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 06/20/2018] [Accepted: 09/14/2018] [Indexed: 12/19/2022] [Open Access]
Abstract
The appearance of voltage-gated, sodium-selective channels with rapid gating kinetics was a limiting factor in the evolution of nervous systems. Two rounds of domain duplications generated a common 24 transmembrane segment (4 × 6 TM) template that is shared amongst voltage-gated sodium (Nav1 and Nav2) and calcium channels (Cav1, Cav2, and Cav3) and leak channel (NALCN) plus homologs from yeast, different single-cell protists (heterokont and unikont) and algae (green and brown). A shared architecture in 4 × 6 TM channels includes an asymmetrical arrangement of extended extracellular L5/L6 turrets containing a 4-0-2-2 pattern of cysteines, glycosylated residues, a universally short III-IV cytoplasmic linker and often a recognizable, C-terminal PDZ binding motif. Six intron splice junctions are conserved in the first domain, including a rare U12-type of the minor spliceosome, which provides support for a shared heritage for sodium and calcium channels, and a separate lineage for NALCN. The asymmetrically arranged pores of 4 × 6 TM channels allow for a changeable ion selectivity by means of a single lysine residue change in the high field strength site of the ion selectivity filter in Domains II or III. Multicellularity and the appearance of systems were an impetus for Nav1 channels to adapt to sodium ion selectivity and fast ion gating. A non-selective and slowly gating Nav2 channel homolog in single-cell eukaryotes predates the diversification of Nav1 channels from a basal homolog in a common ancestor to extant cnidarians to the nine vertebrate Nav1.x channel genes plus Nax. A close kinship between Nav2 and Nav1 homologs is evident in the sharing of most (twenty) intron splice junctions. Different metazoan groups have lost their Nav1 channel genes altogether, while vertebrates rapidly expanded their gene numbers. The expansion in vertebrate Nav1 channel genes fills unique functional niches and generates overlapping properties contributing to redundancies. Specific nervous system adaptations include cytoplasmic linkers with phosphorylation sites and tethered elements to protein assemblies in Axon Initial Segments and nodes of Ranvier. Analogous accessory beta subunits appeared alongside Nav1 channels within different animal sub-phyla. Nav1 channels contribute to pace-making as persistent or resurgent currents, the former of which is widespread across animals, while the latter is a likely vertebrate adaptation.
Affiliation(s)
- Julia E Fux
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Amrit Mehta
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- Jack Moffat
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
- J David Spafford
  - Department of Biology, University of Waterloo, Waterloo, ON, Canada
60
Sieber P, Voigt K, Kämmer P, Brunke S, Schuster S, Linde J. Comparative Study on Alternative Splicing in Human Fungal Pathogens Suggests Its Involvement During Host Invasion. Front Microbiol 2018; 9:2313. [PMID: 30333805 PMCID: PMC6176087 DOI: 10.3389/fmicb.2018.02313] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/15/2018] [Accepted: 09/11/2018] [Indexed: 11/13/2022] [Open Access]
Abstract
Alternative splicing (AS) is an important regulatory mechanism in eukaryotes but only little is known about its impact in fungi. Human fungal pathogens are of high clinical interest causing recurrent or life-threatening infections. AS can be well-investigated genome-wide and quantitatively with the powerful technology of RNA-Seq. Here, we systematically studied AS in human fungal pathogens based on RNA-Seq data. To do so, we investigated its effect in seven fungi during conditions simulating ex vivo infection processes and during in vitro stress. Genes undergoing AS are species-specific and act independently from differentially expressed genes pointing to an independent mechanism to change abundance and functionality. Candida species stand out with a low number of introns with higher and more varying lengths and more alternative splice sites. Moreover, we identified a functional difference between response to host and other stress conditions: During stress, AS affects more genes and is involved in diverse regulatory functions. In contrast, during response-to-host conditions, genes undergoing AS have membrane functionalities and might be involved in the interaction with the host. We assume that AS plays a crucial regulatory role in pathogenic fungi and is important in both response to host and stress conditions.
Affiliation(s)
- Patricia Sieber
  - Department of Bioinformatics, Faculty of Biological Sciences, Friedrich Schiller University, Jena, Germany; Research Group Systems Biology, Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
- Kerstin Voigt
  - Jena Microbial Resource Collection, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany; Institute of Microbiology, Faculty of Biological Sciences, Friedrich Schiller University, Jena, Germany
- Philipp Kämmer
  - Microbial Pathogenicity Mechanisms, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
- Sascha Brunke
  - Microbial Pathogenicity Mechanisms, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany
- Stefan Schuster
  - Department of Bioinformatics, Faculty of Biological Sciences, Friedrich Schiller University, Jena, Germany
- Jörg Linde
  - Research Group PiDOMICS, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute, Jena, Germany; Institute for Bacterial Infections and Zoonoses, Federal Research Institute for Animal Health-Friedrich-Loeffler-Institute, Jena, Germany
61
Abstract
Nucleosomal modifications have been implicated in fundamental epigenetic regulation, whereas the roles of nucleosome binding in shaping changes through evolution remain to be addressed. Here we performed a comparative study to clarify the roles of nucleosome occupancy in exon origination. By profiling a high-resolution, cross-species mononucleosome landscape for mammalian tissues, we found nucleosome occupancy profiles are conserved across tissues and species. Further, through a phylogenetic approach, we found that the feature of differential nucleosome occupancy appears prior to the origination of new exons and, presumably, facilitates the origin of new exons by increasing the splice strength of the ancestral nonexonic regions through driving a local difference in GC content, which suggests the function of nucleosome binding in exonization.
Nucleosomal modifications have been implicated in fundamental epigenetic regulation, but the roles of nucleosome occupancy in shaping changes through evolution remain to be addressed. Here we present high-resolution nucleosome occupancy profiles for multiple tissues derived from human, macaque, tree shrew, mouse, and pig. Genome-wide comparison reveals conserved nucleosome occupancy profiles across both different species and tissue types. Notably, we found significantly higher levels of nucleosome occupancy in exons than in introns, a pattern correlated with the different exon–intron GC content. We then determined whether this biased occupancy may play roles in the origination of new exons through evolution, rather than being a downstream effect of exonization, through a comparative approach to sequentially trace the order of the exonization and biased nucleosome binding. By identifying recently evolved exons in human but not in macaque using matched RNA sequencing, we found that higher exonic nucleosome occupancy also existed in macaque regions orthologous to these exons. Presumably, such biased nucleosome occupancy facilitates the origination of new exons by increasing the splice strength of the ancestral nonexonic regions through driving a local difference in GC content. These data thus support a model that sites bound by nucleosomes are more likely to evolve into exons, which we term the “nucleosome-first” model.
62
Montalban G, Fraile-Bethencourt E, López-Perolio I, Pérez-Segura P, Infante M, Durán M, Alonso-Cerezo MC, López-Fernández A, Diez O, de la Hoya M, Velasco EA, Gutiérrez-Enríquez S. Characterization of spliceogenic variants located in regions linked to high levels of alternative splicing: BRCA2 c.7976+5G > T as a case study. Hum Mutat 2018; 39:1155-1160. [PMID: 29969168 DOI: 10.1002/humu.23583] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/17/2017] [Revised: 06/04/2018] [Accepted: 06/27/2018] [Indexed: 12/21/2022]
Abstract
Many BRCA1 and BRCA2 (BRCA1/2) genetic variants have been studied at the mRNA level and linked to hereditary breast and ovarian cancer due to splicing alteration. In silico tools are reliable when assessing variants located in consensus splice sites, but we may identify variants in complex genomic contexts for which bioinformatics is not precise enough. In this study, we characterize the BRCA2 c.7976+5G>T variant, located in intron 17, which has an atypical donor site (GC). This variant was identified in three unrelated Spanish families, and we have detected exon 17 skipping as the predominant transcript occurring in carriers. We have also detected several isoforms (Δ16-18, Δ17,18, Δ18, and ▼17q224) at different expression levels among carriers and controls. This study highlights the challenge of interpreting genetic variants when multiple alternative isoforms are present, and shows that caution must be taken when using in silico tools to identify potential spliceogenic variants located in GC-AG introns.
Affiliation(s)
- Gemma Montalban
- Oncogenetics Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Eugenia Fraile-Bethencourt
- Splicing and genetic susceptibility to cancer, Instituto de Biología y Genética Molecular (CSIC-UVa), Valladolid, Spain
| | - Irene López-Perolio
- Molecular Oncology Laboratory CIBERONC, Hospital Clinico San Carlos, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Madrid, Spain
| | - Pedro Pérez-Segura
- Molecular Oncology Laboratory CIBERONC, Hospital Clinico San Carlos, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Madrid, Spain
| | - Mar Infante
- Cancer Genetics, Instituto de Biología y Genética Molecular (CSIC-UVa), Valladolid, Spain
| | - Mercedes Durán
- Cancer Genetics, Instituto de Biología y Genética Molecular (CSIC-UVa), Valladolid, Spain
| | - María Concepción Alonso-Cerezo
- Genética Clínica. Servicio Análisis Clínicos. Hospital Universitario de la Princesa, Instituto de Investigación Sanitaria Hospital Universitario de la Princesa, Madrid, Spain
| | - Adrià López-Fernández
- High Risk and Cancer Prevention Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
| | - Orland Diez
- Oncogenetics Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain.,Area of Clinical and Molecular Genetics, University Hospital of Vall d'Hebron, Barcelona, Spain
| | - Miguel de la Hoya
- Molecular Oncology Laboratory CIBERONC, Hospital Clinico San Carlos, IdISSC (Instituto de Investigación Sanitaria del Hospital Clínico San Carlos), Madrid, Spain
| | - Eladio A Velasco
- Splicing and genetic susceptibility to cancer, Instituto de Biología y Genética Molecular (CSIC-UVa), Valladolid, Spain
| | | |
|
63
|
Cetacea are natural knockouts for IL20. Immunogenetics 2018; 70:681-687. [DOI: 10.1007/s00251-018-1071-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 04/28/2018] [Accepted: 07/01/2018] [Indexed: 10/28/2022]
|
64
|
Yang XQ, Jing XY, Zhang CX, Song YF, Liu D. Isolation and characterization of porcine PILRB gene and its alternative splicing variants. Gene 2018; 672:8-15. [PMID: 29879501 DOI: 10.1016/j.gene.2018.06.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/19/2017] [Revised: 06/01/2018] [Accepted: 06/04/2018] [Indexed: 01/15/2023]
Abstract
Paired immunoglobulin-like type 2 receptor (PILR)β regulates inflammatory responses to pathogen infection, and therefore plays an important role in host disease resistance/susceptibility. However, porcine PILRβ remains poorly characterized. In this study, we obtained the cDNA (V1) of its encoding gene, PILRB, and three alternative splicing (AS) variants (V2-4). The complete coding sequence of V1 was 621 bp long, encoding a polypeptide of 206 aa. Compared with V1, V2 and V3 were formed by exon skipping in the 3'-untranslated region (UTR), while V4 was formed by use of an alternative 3' splice site of exon 3, resulting in a premature termination codon, combined with exon skipping in the 3'-UTR. Expression profile analysis showed that all the isoforms were most abundant in the spleen, and V1 was strongly induced by poly(I:C). Furthermore, transcription of V1 changed with increasing age and differed between species. Exon skipping in the 3'-UTR of V2 and V3 down-regulated expression of the luciferase reporter gene, and hence presumably of the PILRB gene, while V4 was subjected to nonsense-mediated mRNA decay. Additionally, five novel splicing patterns were detected using the minigene approach, indicating complex AS of porcine PILRB. These results will help to reveal the role of PILRβ in the host immune response using pig models, and will facilitate the breeding of pigs resistant to viral diseases through molecular breeding methods.
Affiliation(s)
- Xiu-Qin Yang
- College of Animal Science and Technology, Northeast Agricultural University, Harbin 150030, PR China.
| | - Xiao-Yan Jing
- College of Animal Science and Technology, Northeast Agricultural University, Harbin 150030, PR China
| | - Cai-Xia Zhang
- College of Animal Science and Technology, Northeast Agricultural University, Harbin 150030, PR China
| | - Yan-Fang Song
- College of Animal Science and Technology, Northeast Agricultural University, Harbin 150030, PR China
| | - Di Liu
- Institute of Animal Husbandry, Heilongjiang Academy of Agricultural Sciences, Harbin, 150086, PR China.
| |
|
65
|
Anna A, Monika G. Splicing mutations in human genetic disorders: examples, detection, and confirmation. J Appl Genet 2018; 59:253-268. [PMID: 29680930 PMCID: PMC6060985 DOI: 10.1007/s13353-018-0444-7] [Citation(s) in RCA: 391] [Impact Index Per Article: 65.2] [Received: 02/18/2018] [Revised: 04/08/2018] [Accepted: 04/10/2018] [Indexed: 01/02/2023]
Abstract
Precise pre-mRNA splicing, essential for appropriate protein translation, depends on the presence of consensus "cis" sequences that define exon-intron boundaries and regulatory sequences recognized by the splicing machinery. Point mutations at these consensus sequences can cause improper exon and intron recognition and may result in the formation of an aberrant transcript of the mutated gene. Splicing mutations may occur in both introns and exons and disrupt existing splice sites or splicing regulatory sequences (intronic and exonic splicing silencers and enhancers), create new ones, or activate cryptic ones. Usually such mutations result in errors during the splicing process and may lead to improper intron removal and thus cause alterations of the open reading frame. Recent research has underlined the abundance and importance of splicing mutations in the etiology of inherited diseases. The application of modern techniques has allowed the identification of synonymous and nonsynonymous variants as well as deep intronic mutations that affect pre-mRNA splicing. Bioinformatic algorithms can be applied as a tool to assess the possible effect of the identified changes. However, it should be underlined that the results of such tests are only predictive, and the exact effect of a specific mutation should be verified in functional studies. This article summarizes the current knowledge about "splicing mutations" and methods that help to identify such changes in clinical diagnosis.
Affiliation(s)
- Abramowicz Anna
- Department of Medical Genetics, Institute of Mother and Child, Kasprzaka 17a, 01-211, Warsaw, Poland
| | - Gos Monika
- Department of Medical Genetics, Institute of Mother and Child, Kasprzaka 17a, 01-211, Warsaw, Poland.
| |
|
66
|
Lokits AD, Indrischek H, Meiler J, Hamm HE, Stadler PF. Tracing the evolution of the heterotrimeric G protein α subunit in Metazoa. BMC Evol Biol 2018; 18:51. [PMID: 29642851 PMCID: PMC5896119 DOI: 10.1186/s12862-018-1147-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/14/2017] [Accepted: 03/06/2018] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Heterotrimeric G proteins are fundamental signaling proteins composed of three subunits, Gα and a Gβγ dimer. The role of Gα as a molecular switch is critical for transmitting and amplifying intracellular signaling cascades initiated by an activated G protein-coupled receptor (GPCR). Despite their biochemical and therapeutic importance, the study of G protein evolution has been limited to the scope of a few model organisms. Furthermore, of the five primary Gα subfamilies, the underlying gene structure of only two families has been thoroughly investigated outside of Mammalia evolution. Therefore, our understanding of Gα emergence and evolution across phylogeny remains incomplete. RESULTS We have computationally identified the presence and absence of every Gα gene (GNA-) across all major branches of Deuterostomia and evaluated the conservation of the underlying exon-intron structures across these phylogenetic groups. We provide evidence of mutually exclusive exon inclusion through alternative splicing in specific lineages. Variations of splice site conservation and isoforms were found for several paralogs which coincide with conserved, putative motifs of DNA-/RNA-binding proteins. In addition to our curated gene annotations, within Primates, we identified 15 retrotranspositions, many of which have undergone pseudogenization. Most importantly, we find numerous deviations from previous findings regarding the presence and absence of individual GNA- genes, nuanced differences in phyla-specific gene copy numbers, novel paralog duplications and subsequent intron gain and loss events. CONCLUSIONS Our curated annotations allow us to draw more accurate inferences regarding the emergence of all Gα family members across Metazoa and to present a new, updated theory of Gα evolution. Leveraging this, our results are critical for gaining new insights into the co-evolution of the Gα subunit and its many protein binding partners, especially therapeutically relevant G protein - GPCR signaling pathways which radiated in Vertebrata evolution.
Affiliation(s)
- A. D. Lokits
- Neuroscience Program, Vanderbilt University, Nashville, TN USA.,Center for Structural Biology, Vanderbilt University, Nashville, TN USA
| | - H. Indrischek
- Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany.,Computational EvoDevo Group, Bioinformatics Department, Leipzig University, Leipzig, Germany
| | - J. Meiler
- Center for Structural Biology, Vanderbilt University, Nashville, TN USA.,Chemistry Department, Vanderbilt University, Nashville, TN USA
| | - H. E. Hamm
- Pharmacology Department, Vanderbilt University Medical Center, Nashville, TN USA
| | - P. F. Stadler
- Bioinformatics Group, Department of Computer Science, Leipzig University, Leipzig, Germany.,Center for non-coding RNA in Technology and Health, University of Copenhagen, Frederiksberg C, Denmark.,Institute for Theoretical Chemistry, University of Vienna, Wien, Austria.,IZBI-Interdisciplinary Center for Bioinformatics and LIFE-Leipzig Research Center for Civilization Diseases and Competence Center for Scalable Data Services and Solutions, University Leipzig, Leipzig, Germany.,Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.,Santa Fe Institute, Santa Fe, NM USA
| |
|
67
|
Tardaguila M, de la Fuente L, Marti C, Pereira C, Pardo-Palacios FJ, Del Risco H, Ferrell M, Mellado M, Macchietto M, Verheggen K, Edelmann M, Ezkurdia I, Vazquez J, Tress M, Mortazavi A, Martens L, Rodriguez-Navarro S, Moreno-Manzano V, Conesa A. SQANTI: extensive characterization of long-read transcript sequences for quality control in full-length transcriptome identification and quantification. Genome Res 2018; 28:396-411. [PMID: 29440222 PMCID: PMC5848618 DOI: 10.1101/gr.222976.117] [Citation(s) in RCA: 224] [Impact Index Per Article: 37.3] [Received: 03/17/2017] [Accepted: 01/08/2018] [Indexed: 01/15/2023]
Abstract
High-throughput sequencing of full-length transcripts using long reads has paved the way for the discovery of thousands of novel transcripts, even in well-annotated mammalian species. The advances in sequencing technology have created a need for studies and tools that can characterize these novel variants. Here, we present SQANTI, an automated pipeline for the classification of long-read transcripts that can assess the quality of data and the preprocessing pipeline using 47 unique descriptors. We apply SQANTI to a neuronal mouse transcriptome using Pacific Biosciences (PacBio) long reads and illustrate how the tool is effective in characterizing and describing the composition of the full-length transcriptome. We perform extensive evaluation of ToFU PacBio transcripts by PCR to reveal that an important number of the novel transcripts are technical artifacts of the sequencing approach and that SQANTI quality descriptors can be used to engineer a filtering strategy to remove them. Most novel transcripts in this curated transcriptome are novel combinations of existing splice sites, resulting more frequently in novel ORFs than novel UTRs, and are enriched in both general metabolic and neural-specific functions. We show that these new transcripts have a major impact in the correct quantification of transcript levels by state-of-the-art short-read-based quantification algorithms. By comparing our iso-transcriptome with public proteomics databases, we find that alternative isoforms are elusive to proteogenomics detection. SQANTI allows the user to maximize the analytical outcome of long-read technologies by providing the tools to deliver quality-evaluated and curated full-length transcriptomes.
Affiliation(s)
- Manuel Tardaguila
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Lorena de la Fuente
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| | - Cristina Marti
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| | - Cécile Pereira
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | | | - Hector Del Risco
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Marc Ferrell
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | | | - Marissa Macchietto
- Department of Developmental and Cell Biology, University of California, Irvine, California 92617, USA
| | - Kenneth Verheggen
- VIB-UGent Center for Medical Biotechnology, VIB, B-9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Mariola Edelmann
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
| | - Iakes Ezkurdia
- Centro Nacional de Investigaciones Cardiovasculares CNIC, 28029 Madrid, Spain
| | - Jesus Vazquez
- Centro Nacional de Investigaciones Cardiovasculares CNIC, 28029 Madrid, Spain
| | - Michael Tress
- Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain
| | - Ali Mortazavi
- Department of Developmental and Cell Biology, University of California, Irvine, California 92617, USA
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, B-9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, B-9000 Ghent, Belgium
| | - Susana Rodriguez-Navarro
- Gene Expression and mRNA Metabolism Laboratory, CSIC, IBV, 46010 Valencia, Spain
- Gene Expression and mRNA Metabolism Laboratory, CIPF, 46012 Valencia, Spain
| | | | - Ana Conesa
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, Genetics Institute, University of Florida, Gainesville, Florida 32611, USA
- Genomics of Gene Expression Laboratory, Centro de Investigaciones Principe Felipe (CIPF), 46012 Valencia, Spain
| |
|
68
|
Zhou C, Liu S, Song W, Luo S, Meng G, Yang C, Yang H, Ma J, Wang L, Gao S, Wang J, Yang H, Zhao Y, Wang H, Zhou X. Characterization of viral RNA splicing using whole-transcriptome datasets from host species. Sci Rep 2018; 8:3273. [PMID: 29459752 PMCID: PMC5818608 DOI: 10.1038/s41598-018-21190-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/25/2017] [Accepted: 01/31/2018] [Indexed: 01/16/2023] Open
Abstract
RNA alternative splicing (AS) is an important post-transcriptional mechanism enabling single genes to produce multiple proteins. It has been well demonstrated that viruses deploy host AS machinery for viral protein production. However, knowledge on viral AS is limited to a few disease-causing viruses in model species. Here we report a novel approach to characterizing viral AS using whole-transcriptome datasets from host species. Two insect transcriptomes (Acheta domesticus and Planococcus citri) generated in the 1,000 Insect Transcriptome Evolution (1KITE) project were used as a proof of concept for the new pipeline. Two closely related densoviruses (Acheta domesticus densovirus, AdDNV, and Planococcus citri densovirus, PcDNV, Ambidensovirus, Densovirinae, Parvoviridae) were detected and analyzed for AS patterns. The results suggested that although the two viruses shared major AS features, dramatic AS divergences were observed. Detailed analysis of the splicing junctions showed that clusters of AS events occurred in two regions of the virus genome, demonstrating that transcriptome analysis could yield valuable insights into viral splicing. When applied to large-scale transcriptomics projects with diverse taxonomic sampling, our new method is expected to rapidly expand our knowledge on RNA splicing mechanisms for a wide range of viruses.
Affiliation(s)
- Chengran Zhou
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China.,BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
| | - Shanlin Liu
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China.,Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen, Denmark
| | - Wenhui Song
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
| | - Shiqi Luo
- Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Plant Protection, China Agricultural University, Beijing, 100193, China
| | - Guanliang Meng
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
| | - Chentao Yang
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
| | - Hua Yang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China
| | - Jinmin Ma
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China
| | - Liang Wang
- CAS Key Laboratory of Biomedical & Diagnostic Technology, CAS/Suzhou Institute of Biomedical Engineering and Technology, Suzhou, 215163, China
| | - Shan Gao
- CAS Key Laboratory of Biomedical & Diagnostic Technology, CAS/Suzhou Institute of Biomedical Engineering and Technology, Suzhou, 215163, China
| | - Jian Wang
- BGI-Shenzhen, Shenzhen, 518083, China.,James D. Watson Institute of Genome Sciences, Hangzhou, 310058, China
| | - Huanming Yang
- BGI-Shenzhen, Shenzhen, 518083, China.,James D. Watson Institute of Genome Sciences, Hangzhou, 310058, China
| | - Yun Zhao
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China.
| | - Hui Wang
- BGI-Shenzhen, Shenzhen, 518083, China.,China National GeneBank, BGI-Shenzhen, Shenzhen, 518120, China.,The Institute of Biomedical Engineering, University of Oxford, Oxford, OX3 7DQ, UK.
| | - Xin Zhou
- Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Plant Protection, China Agricultural University, Beijing, 100193, China.,National Engineering Research Center for Fruit and Vegetable Processing, China Agricultural University, Beijing, 100193, China.
| |
|
69
|
Pashaei E, Aydin N. Markovian encoding models in human splice site recognition using SVM. Comput Biol Chem 2018; 73:159-170. [PMID: 29486390 DOI: 10.1016/j.compbiolchem.2018.02.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/27/2017] [Revised: 02/04/2018] [Accepted: 02/05/2018] [Indexed: 11/26/2022]
Abstract
Splice site recognition is among the most significant and challenging tasks in bioinformatics due to its key role in gene annotation. Effective prediction of splice sites requires nucleotide encoding methods that reveal the characteristics of DNA sequences and provide appropriate features to serve as input to machine learning classifiers. Markovian models are the most influential encoding methods that are widely used for pattern recognition in biological data. However, a direct performance comparison of these methods in the splice site domain has not yet been carried out. This study compares various Markovian encoding models for splice site prediction utilizing the support vector machine, the most outstanding learning method in the domain, and conducts a new, precise evaluation of Markovian approaches that addresses this gap. Moreover, a novel sequence encoding approach based on a third-order Markov model (MM3) is proposed. The experimental results show that the proposed method, namely MM3-SVM, performs significantly better than thirteen of the best-known state-of-the-art algorithms when tested on the HS3D dataset considering several performance criteria. Further, it achieved higher prediction accuracy than several well-known tools like NNsplice, MEM, MM1, WMM, and GeneID, using an independent test set of 50 genes. We also developed MMSVM, a web tool to predict splice sites in any human sequence using the proposed approach. The MMSVM web server can be accessed at https://pashaei.shinyapps.io/mmsvm.
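The abstract does not spell out how the Markovian encodings are built, so the following is only a minimal sketch of the general idea, not the authors' MM3-SVM implementation. It assumes a position-independent k-th order Markov model estimated from example sequences with a pseudocount, and it encodes each sequence as a vector of conditional log-probabilities that could then be passed to a classifier such as an SVM. The function names `train_markov` and `encode`, the pseudocount, and the toy donor sequences are all hypothetical.

```python
from collections import defaultdict
from itertools import product
import math

ALPHABET = "ACGT"


def train_markov(sequences, order=1, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences.

    Assumption: one position-independent model; sequences contain only A/C/G/T.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0
    model = {}
    # Build a smoothed distribution for every possible context of length `order`.
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {b: (counts[ctx][b] + pseudocount) / total for b in ALPHABET}
    return model


def encode(seq, model, order=1):
    """Encode a sequence as one conditional log-probability per position."""
    return [math.log(model[seq[i - order:i]][seq[i]]) for i in range(order, len(seq))]


# Toy usage with made-up donor-like windows (illustrative data only).
donors = ["AGGTAAGT", "AGGTGAGT", "AGGTAAGA"]
model = train_markov(donors, order=1)
features = encode("AGGTAAGT", model, order=1)
print(len(features), round(sum(features), 3))
```

Setting `order=3` gives a third-order variant of the same sketch; whether that matches the paper's MM3 encoding in detail is not something the abstract confirms.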
Affiliation(s)
- Elham Pashaei
- Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey.
| | - Nizamettin Aydin
- Department of Computer Engineering, Yildiz Technical University, Istanbul, Turkey.
| |
|
70
|
Singh NN, Del Rio-Malewski JB, Luo D, Ottesen EW, Howell MD, Singh RN. Activation of a cryptic 5' splice site reverses the impact of pathogenic splice site mutations in the spinal muscular atrophy gene. Nucleic Acids Res 2017; 45:12214-12240. [PMID: 28981879 PMCID: PMC5716214 DOI: 10.1093/nar/gkx824] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 05/29/2017] [Accepted: 09/06/2017] [Indexed: 01/08/2023] Open
Abstract
Spinal muscular atrophy (SMA) is caused by deletions or mutations of the Survival Motor Neuron 1 (SMN1) gene coupled with predominant skipping of SMN2 exon 7. The only approved SMA treatment is an antisense oligonucleotide that targets the intronic splicing silencer N1 (ISS-N1), located downstream of the 5' splice site (5'ss) of exon 7. Here, we describe a novel approach to exon 7 splicing modulation through activation of a cryptic 5'ss (Cr1). We discovered the activation of Cr1 in transcripts derived from SMN1 that carries a pathogenic G-to-C mutation at the first position (G1C) of intron 7. We show that Cr1-activating engineered U1 snRNAs (eU1s) have the unique ability to reprogram pre-mRNA splicing and restore exon 7 inclusion in SMN1 carrying a broad spectrum of pathogenic mutations at both the 3'ss and 5'ss of exon 7. Employing a splicing-coupled translation reporter, we demonstrate that mRNAs generated by an eU1-induced activation of Cr1 produce full-length SMN. Our findings underscore a wider role for U1 snRNP in splicing regulation and reveal a novel approach for the restoration of SMN exon 7 inclusion for a potential therapy of SMA.
Affiliation(s)
- Natalia N Singh
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA
| | - José Bruno Del Rio-Malewski
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA.,Interdepartmental Genetics and Genomics Program, Iowa State University, Ames, IA 50011, USA
| | - Diou Luo
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA
| | - Eric W Ottesen
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA
| | - Matthew D Howell
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA
| | - Ravindra N Singh
- Department of Biomedical Sciences, Iowa State University, Ames, IA 50011, USA.,Interdepartmental Genetics and Genomics Program, Iowa State University, Ames, IA 50011, USA
| |
|
71
|
Noncanonical GA and GG 5' Intron Donor Splice Sites Are Common in the Copepod Eurytemora affinis. G3-GENES GENOMES GENETICS 2017; 7:3967-3969. [PMID: 29079681 PMCID: PMC5714493 DOI: 10.1534/g3.117.300189] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/26/2023]
Abstract
The noncanonical 5′ intron donor splice sites GA and GG are exceedingly rare in described eukaryotic genomes; however, they are present in ∼12% of introns in the genome of the copepod Eurytemora affinis. Failure to recognize the high frequency of these donor sites compromised the modeling of genes in this newly sequenced genome, including 10 conserved ionotropic glutamate receptor (GluR) family genes curated herein. These introns appear to have been acquired recently, along with many additional idiosyncratic introns. Their high frequency implies the evolution of modified intron donor splice site recognition in this copepod.
|
72
|
Kaisers W, Ptok J, Schwender H, Schaal H. Validation of Splicing Events in Transcriptome Sequencing Data. Int J Mol Sci 2017; 18:ijms18061110. [PMID: 28545234 PMCID: PMC5485934 DOI: 10.3390/ijms18061110] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/05/2017] [Revised: 04/26/2017] [Accepted: 04/28/2017] [Indexed: 11/16/2022] Open
Abstract
Genomic alignments of sequenced cellular messenger RNA contain gapped alignments which are interpreted as a consequence of intron removal. The resulting gap-sites, genomic locations of alignment gaps, are landmarks representing potential splice-sites. As alignment algorithms report gap-sites with a considerable false discovery rate, validations are required. We describe two quality scores, gap quality score (gqs) and weighted gap information score (wgis), developed for validation of putative splicing events: while gqs relies solely on alignment data, wgis additionally considers information from the genomic sequence. FASTQ files obtained from 54 human dermal fibroblast samples were aligned against the human genome (GRCh38) using the TopHat and STAR aligners. Statistical properties of gap-sites validated by gqs and wgis were evaluated by their sequence similarity to known exon-intron borders. Within the 54 samples, TopHat identifies 1,000,380 and STAR reports 6,487,577 gap-sites. Due to the lack of strand information, however, the percentage of identified GT-AG gap-sites is rather low. While gap-sites from TopHat contain ≈89% GT-AG, gap-sites from STAR only contain ≈42% GT-AG dinucleotide pairs in merged data from 54 fibroblast samples. Validation with gqs yields 156,251 gap-sites from TopHat alignments and 166,294 from STAR alignments. Validation with wgis yields 770,327 gap-sites from TopHat alignments and 1,065,596 from STAR alignments. Both alignment algorithms, TopHat and STAR, report gap-sites with a considerable false discovery rate, which can be drastically reduced by validation with gqs and wgis.
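To illustrate why missing strand information affects the GT-AG percentage, here is a small sketch that labels a gap-site by the dinucleotides at its two ends. It is not the gqs or wgis score from the paper, whose definitions the abstract does not give. The function names and the 0-based, end-exclusive coordinates are assumptions made for the example. An intron on the minus strand appears as CT...AC on the plus-strand reference, so a check that only looks for GT...AG on the plus strand would miss it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_gap_site(genome, gap_start, gap_end):
    """Label an alignment gap by its terminal dinucleotides.

    `genome` is the plus-strand reference; the gap covers
    genome[gap_start:gap_end] (0-based, end-exclusive; assumed convention).
    """
    left = genome[gap_start:gap_start + 2].upper()
    right = genome[gap_end - 2:gap_end].upper()
    if (left, right) == ("GT", "AG"):
        return "GT-AG (+ strand)"
    # A minus-strand intron reads GT...AG on its own strand,
    # which shows up as CT...AC on the plus-strand reference.
    if (reverse_complement(right), reverse_complement(left)) == ("GT", "AG"):
        return "GT-AG (- strand)"
    return "other"


# Toy sequences (illustrative only): the gap covers positions 3..10.
for toy in ("AAAGTCCCCAGTTT", "AAACTCCCCACTTT", "AAAGCCCCCAGTTT"):
    print(toy, classify_gap_site(toy, 3, 11))
```

The third toy sequence has a GC donor, which this sketch simply reports as "other"; a fuller classifier would enumerate further dinucleotide pairs.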
Affiliation(s)
- Wolfgang Kaisers
- Department for Anaesthesiology, University Hospital Düsseldorf, Heinrich Heine University, 40225 Düsseldorf, Germany.
- BMFZ (Biologisch-Medizinisches Forschungszentrum), Heinrich Heine University, 40225 Düsseldorf, Germany.
| | - Johannes Ptok
- Institute of Virology, University Hospital Düsseldorf, Heinrich Heine University, 40225 Düsseldorf, Germany.
| | - Holger Schwender
- BMFZ (Biologisch-Medizinisches Forschungszentrum), Heinrich Heine University, 40225 Düsseldorf, Germany.
- Mathematical Institute, Heinrich Heine University, 40225 Düsseldorf, Germany.
| | - Heiner Schaal
- BMFZ (Biologisch-Medizinisches Forschungszentrum), Heinrich Heine University, 40225 Düsseldorf, Germany.
- Institute of Virology, University Hospital Düsseldorf, Heinrich Heine University, 40225 Düsseldorf, Germany.
| |
|
73
|
Abstract
Inosine is one of the most common modifications found in human RNAs and the Adenosine Deaminases that act on RNA (ADARs) are the main enzymes responsible for its production. ADARs were first discovered in the 1980s and since then our understanding of ADARs has advanced tremendously. For instance, it is now known that defective ADAR function can cause human diseases. Furthermore, recently solved crystal structures of the human ADAR2 deaminase bound to RNA have provided insights regarding the catalytic and substrate recognition mechanisms. In this chapter, we describe the occurrence of inosine in human RNAs and the newest perspective on the ADAR family of enzymes, including their substrate recognition, catalytic mechanism, regulation as well as the consequences of A-to-I editing, and their relation to human diseases.
|
74
|
Acfs: accurate circRNA identification and quantification from RNA-Seq data. Sci Rep 2016; 6:38820. [PMID: 27929140 PMCID: PMC5144000 DOI: 10.1038/srep38820] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 09/05/2016] [Accepted: 11/14/2016] [Indexed: 12/22/2022] Open
Abstract
Circular RNAs (circRNAs) are a group of single-stranded RNAs in closed circular form. They are splicing-generated, widely expressed in various tissues and have functional implications in development and diseases. To facilitate genome-wide characterization of circRNAs using RNA-Seq data, we present a freely available software package named acfs. Acfs allows de novo, accurate and fast identification and abundance quantification of circRNAs from single- and paired-end RNA-Seq data. On simulated datasets, acfs achieved the highest F1 accuracy and lowest false discovery rate among current state-of-the-art tools. On real-world datasets, acfs efficiently identified more bona fide circRNAs. Furthermore, we demonstrated the power of circRNA analysis on two leukemia datasets. We identified a set of circRNAs that are differentially expressed between AML and APL samples, which might shed light on the potential molecular classification of complex diseases using circRNA profiles. Moreover, chromosomal translocation, as manifested in numerous diseases, could produce not only fusion transcripts but also fusion circRNAs of clinical relevance. Given its high accuracy, low FDR and ability to identify fusion circRNAs, we believe that acfs is well suited for a wide spectrum of applications in characterizing the landscape of circRNAs from non-model organisms to cancer biology.
|
75
|
Abstract
We examine exon junctions near apparent amino acid insertions and deletions in alignments of orthologous protein-coding genes. In 1,917 ortholog families across nine oomycete genomes, 10–20% of introns are near an alignment gap, indicating at first sight that splice-site displacements are frequent. We designed a robust algorithmic procedure for the delineation of intron-containing homologous regions, and combined it with a parsimony-based reconstruction of intron loss, gain, and splice-site shift events on a phylogeny. The reconstruction implies that 12% of introns underwent an acceptor-site shift, and 10% underwent a donor-site shift. In order to offset gene annotation problems, we amended the procedure with the reannotation of intron boundaries using alignment evidence. The corresponding reconstruction involves far fewer intron gain and splice-site shift events. The frequency of acceptor- and donor-side shifts drops to 4% and 3%, respectively, which are not much different from what one would expect by random codon insertions and deletions. In other words, gaps near exon junctions are mostly artifacts of gene annotation rather than evidence of sliding intron boundaries. Our study underscores the importance of using well-supported gene structure annotations in comparative studies. When transcription evidence is not available, we propose a robust ancestral reconstruction procedure that corrects misannotated intron boundaries using sequence alignments. The results corroborate the view that boundary shifts and complete intron sliding are only accidental in eukaryotic genome evolution and have a negligible impact on protein diversity.
Affiliation(s)
- Steven Sêton Bocco
- Department of Biochemistry and Molecular Medicine, University of Montréal, Montréal, Canada
| | - Miklós Csűrös
- Department of Computer Science and Operations Research, University of Montréal, Montréal, Canada
- Institute of Genetics, Biological Research Centre, Hungarian Academy of Sciences, Szeged, Hungary
| |
|
76
|
Abstract
Recent improvements in experimental and computational techniques that are used to study the transcriptome have enabled an unprecedented view of RNA processing, revealing many previously unknown non-canonical splicing events. This includes cryptic events located far from the currently annotated exons and unconventional splicing mechanisms that have important roles in regulating gene expression. These non-canonical splicing events are a major source of newly emerging transcripts during evolution, especially when they involve sequences derived from transposable elements. They are therefore under precise regulation and quality control, which minimizes their potential to disrupt gene expression. We explain how non-canonical splicing can lead to aberrant transcripts that cause many diseases, and also how it can be exploited for new therapeutic strategies.
|
77
|
van Bon BW, Coe BP, Bernier R, Green C, Gerdts J, Witherspoon K, Kleefstra T, Willemsen MH, Kumar R, Bosco P, Fichera M, Li D, Amaral D, Cristofoli F, Peeters H, Haan E, Romano C, Mefford HC, Scheffer I, Gecz J, de Vries BB, Eichler EE. Disruptive de novo mutations of DYRK1A lead to a syndromic form of autism and ID. Mol Psychiatry 2016; 21:126-32. [PMID: 25707398 PMCID: PMC4547916 DOI: 10.1038/mp.2015.5] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 09/17/2014] [Revised: 11/20/2014] [Accepted: 12/19/2014] [Indexed: 12/13/2022]
Abstract
Dual-specificity tyrosine-(Y)-phosphorylation-regulated kinase 1A (DYRK1A) maps to the Down syndrome critical region; copy number increase of this gene is thought to have a major role in the neurocognitive deficits associated with Trisomy 21. Truncation of DYRK1A in patients with developmental delay (DD) and autism spectrum disorder (ASD) suggests a different pathology associated with loss-of-function mutations. To understand the phenotypic spectrum associated with DYRK1A mutations, we resequenced the gene in 7162 ASD/DD patients (2446 previously reported) and 2169 unaffected siblings and performed a detailed phenotypic assessment on nine patients. Comparison of our data and published cases with 8696 controls identified a significant enrichment of DYRK1A truncating mutations (P = 0.00851) and an excess of de novo mutations (P = 2.53 × 10⁻¹⁰) among ASD/intellectual disability (ID) patients. Phenotypic comparison of all novel (n=5) and recontacted (n=3) cases with previous case reports, including larger CNV and translocation events (n=7), identified a syndromal disorder among the 15 patients. It was characterized by ID, ASD, microcephaly, intrauterine growth retardation, febrile seizures in infancy, impaired speech, stereotypic behavior, hypertonia and a specific facial gestalt. We conclude that mutations in DYRK1A define a syndromic form of ASD and ID with neurodevelopmental defects consistent with murine and Drosophila knockout models.
Affiliation(s)
- Bregje W.M. van Bon
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- School of Paediatrics and Reproductive Health, University of Adelaide, Adelaide, SA, Australia
| | - Bradley P. Coe
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Raphael Bernier
- Department of Psychiatry, University of Washington, Seattle, WA 98195, USA
| | - Cherie Green
- Florey Institute, University of Melbourne, Austin Health and Royal Children’s Hospital, Melbourne 3010, Australia
| | - Jennifer Gerdts
- Department of Psychiatry, University of Washington, Seattle, WA 98195, USA
| | - Kali Witherspoon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Tjitske Kleefstra
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
| | - Marjolein H. Willemsen
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
| | - Raman Kumar
- School of Paediatrics and Reproductive Health, University of Adelaide, Adelaide, SA, Australia
| | - Paolo Bosco
- I.R.C.C.S. Associazione Oasi Maria Santissima, Troina 94018, Italy
| | - Marco Fichera
- I.R.C.C.S. Associazione Oasi Maria Santissima, Troina 94018, Italy
- Medical Genetics, University of Catania, Catania 95123, Italy
| | - Deana Li
- Representing the Autism Phenome Project, MIND Institute, University of California-Davis, Sacramento, CA 95817, USA
| | - David Amaral
- Representing the Autism Phenome Project, MIND Institute, University of California-Davis, Sacramento, CA 95817, USA
| | - Francesca Cristofoli
- Center for Human Genetics, University Hospitals Leuven, KU Leuven, Leuven 3000, Belgium
| | - Hilde Peeters
- Center for Human Genetics, University Hospitals Leuven, KU Leuven, Leuven 3000, Belgium
- Leuven Autism Research (LAuRes), Leuven 3000, Belgium
| | - Eric Haan
- School of Paediatrics and Reproductive Health, University of Adelaide, Adelaide, SA, Australia
- South Australian Clinical Genetics Service, SA Pathology, Adelaide, Australia
| | - Corrado Romano
- I.R.C.C.S. Associazione Oasi Maria Santissima, Troina 94018, Italy
| | - Heather C. Mefford
- Department of Psychiatry, University of Washington, Seattle, WA 98195, USA
| | - Ingrid Scheffer
- Florey Institute, University of Melbourne, Austin Health and Royal Children’s Hospital, Melbourne 3010, Australia
| | - Jozef Gecz
- School of Paediatrics and Reproductive Health, University of Adelaide, Adelaide, SA, Australia
- South Australian Clinical Genetics Service, SA Pathology, Adelaide, Australia
- Robinson Institute, University of Adelaide, Adelaide, SA 5005, Australia
| | - Bert B.A. de Vries
- Department of Human Genetics, Radboud university medical center, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud university medical center, Nijmegen, The Netherlands
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| |
|
78
|
Gazzoli I, Pulyakhina I, Verwey NE, Ariyurek Y, Laros JFJ, 't Hoen PAC, Aartsma-Rus A. Non-sequential and multi-step splicing of the dystrophin transcript. RNA Biol 2015; 13:290-305. [PMID: 26670121 PMCID: PMC4829307 DOI: 10.1080/15476286.2015.1125074] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Indexed: 11/23/2022] Open
Abstract
The dystrophin protein-encoding DMD gene is the longest human gene. The 2.2 Mb long human dystrophin transcript takes 16 hours to be transcribed and is co-transcriptionally spliced. It contains long introns (24 over 10 kb long, 5 over 100 kb long) and the heterogeneity in intron size makes it an ideal transcript to study different aspects of the human splicing process. Splicing is a complex process and much is unknown regarding the splicing of long introns in human genes. Here, we used ultra-deep transcript sequencing to characterize splicing of the dystrophin transcripts in 3 different human skeletal muscle cell lines, and explored the order of intron removal and multi-step splicing. Coverage and read pair analyses showed that around 40% of the introns were not always removed sequentially. Additionally, for the first time, we report that non-consecutive intron removal resulted in 3 or more joined exons which are flanked by unspliced introns, and we defined these joined exons as an exon block. Lastly, computational and experimental data revealed that, for the majority of dystrophin introns, multi-step splicing events are used to splice out a single intron. Overall, our data show, for the first time in a human transcript, that multi-step intron removal is a general feature of mRNA splicing.
Affiliation(s)
- Isabella Gazzoli
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Irina Pulyakhina
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Nisha E Verwey
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Yavuz Ariyurek
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | - Jeroen F J Laros
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
- Leiden Genome Technology Center, Leiden University Medical Center, Leiden, The Netherlands
| | - Peter A C 't Hoen
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| | - Annemieke Aartsma-Rus
- Department of Human Genetics, Leiden University Medical Center, Leiden, the Netherlands
| |
|
79
|
Piovesan A, Caracausi M, Ricci M, Strippoli P, Vitale L, Pelleri MC. Identification of minimal eukaryotic introns through GeneBase, a user-friendly tool for parsing the NCBI Gene databank. DNA Res 2015; 22:495-503. [PMID: 26581719 PMCID: PMC4675715 DOI: 10.1093/dnares/dsv028] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 06/15/2015] [Accepted: 10/07/2015] [Indexed: 01/26/2023] Open
Abstract
We have developed GeneBase, a full parser of the National Center for Biotechnology Information (NCBI) Gene database, which generates a fully structured local database with an intuitive user-friendly graphic interface for personal computers. Features of all the annotated eukaryotic genes are accessible through three main software tables, including for each entry details such as the gene summary, the gene exon/intron structure and the specific Gene Ontology attributions. The structuring of the data, the creation of additional calculation fields and the integration with nucleotide sequences allow users to make many types of comparisons and calculations that are useful for data retrieval and analysis. We provide an original example analysis of the existing introns across all the available species, through which the classic biological problem of the ‘minimal intron’ may find a solution using available data. Based on all currently available data, we can define the shortest known eukaryotic GT-AG intron length, setting the physical limit at the 30 base pair intron belonging to the human MST1L gene. This ‘model intron’ will shed light on the minimal requirement elements of recognition used for conventional splicing functioning. Remarkably, this size is indeed consistent with the sum of the splicing consensus sequence lengths.
Affiliation(s)
- Allison Piovesan
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
| | - Maria Caracausi
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
| | - Marco Ricci
- Department of Biological, Geological and Environmental Sciences (BIGeA), University of Bologna, Bologna, BO 40126, Italy
| | - Pierluigi Strippoli
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
| | - Lorenza Vitale
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
| | - Maria Chiara Pelleri
- Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Bologna, BO 40126, Italy
| |
|
80
|
Chiba M, Ariga H, Maita H. A Splicing Reporter Tuned to Non-AG Acceptor Sites Reveals that Luteolin Enhances the Recognition of Non-canonical Acceptor Sites. Chem Biol Drug Des 2015; 87:275-82. [PMID: 26348996 DOI: 10.1111/cbdd.12656] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/17/2015] [Revised: 07/31/2015] [Accepted: 08/27/2015] [Indexed: 12/20/2022]
Abstract
Removal of an intron requires precise recognition of the splice donor and acceptor sites located at the 5' and 3' termini of introns. Although the roles of these sequences differ, mutations in both sites easily block normal splicing and produce an aberrant mRNA. For example, many splice-site mutations occur in patients with inherited diseases. Several approaches have been evaluated to restore expression of a functional protein; however, because of the strict requirement for an AG dinucleotide at the 3' terminus of a U2-type intron, no method is available to correct splicing at a mutated sequence. To identify compounds that allow splicing at the non-AG acceptor site, in the present study we constructed a reporter gene with a modified polypyrimidine tract. However, the modified polypyrimidine tract mediated splicing at adjacent non-canonical acceptor sites, including the original mutated site. Further, we show that certain flavones such as luteolin and apigenin enhanced aberrant splicing at the non-canonical acceptor site of the reporter gene. These results suggest that the reporter gene and luteolin may be useful for further screening to identify molecules that correct aberrant splicing caused by a disease-associated splice acceptor site mutation.
Affiliation(s)
- Masanori Chiba
- Graduate School of Life Science, Hokkaido University, Sapporo, 060-0812, Japan
| | - Hiroyoshi Ariga
- Faculty of Pharmaceutical Sciences, Hokkaido University, Sapporo, 060-0812, Japan
| | - Hiroshi Maita
- Faculty of Pharmaceutical Sciences, Hokkaido University, Sapporo, 060-0812, Japan
| |
|
81
|
Sveen A, Kilpinen S, Ruusulehto A, Lothe RA, Skotheim RI. Aberrant RNA splicing in cancer; expression changes and driver mutations of splicing factor genes. Oncogene 2015; 35:2413-27. [PMID: 26300000 DOI: 10.1038/onc.2015.318] [Citation(s) in RCA: 333] [Impact Index Per Article: 37.0] [Received: 03/31/2015] [Revised: 07/22/2015] [Accepted: 07/22/2015] [Indexed: 02/07/2023]
Abstract
Alternative splicing is a widespread process contributing to structural transcript variation and proteome diversity. In cancer, the splicing process is commonly disrupted, resulting in both functional and non-functional end-products. Cancer-specific splicing events are known to contribute to disease progression; however, the dysregulated splicing patterns found on a genome-wide scale have until recently been less well-studied. In this review, we provide an overview of aberrant RNA splicing and its regulation in cancer. We then focus on the executors of the splicing process. Based on a comprehensive catalog of splicing factor encoding genes and analyses of available gene expression and somatic mutation data, we identify cancer-associated patterns of dysregulation. Splicing factor genes are shown to be significantly differentially expressed between cancer and corresponding normal samples, and to have reduced inter-individual expression variation in cancer. Furthermore, we identify enrichment of predicted cancer-critical genes among the splicing factors. In addition to previously described oncogenic splicing factor genes, we propose 24 novel cancer-critical splicing factors predicted from somatic mutations.
Affiliation(s)
- A Sveen
- Department of Molecular Oncology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
- K.G. Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
- Centre for Cancer Biomedicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| | | | | | - R A Lothe
- Department of Molecular Oncology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
- K.G. Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
- Centre for Cancer Biomedicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - R I Skotheim
- Department of Molecular Oncology, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, Oslo, Norway
- K.G. Jebsen Colorectal Cancer Research Centre, Oslo University Hospital, Oslo, Norway
- Centre for Cancer Biomedicine, Faculty of Medicine, University of Oslo, Oslo, Norway
| |
|
82
|
D'Alton S, Altshuler M, Lewis J. Studies of alternative isoforms provide insight into TDP-43 autoregulation and pathogenesis. RNA (NEW YORK, N.Y.) 2015; 21:1419-1432. [PMID: 26089325 PMCID: PMC4509932 DOI: 10.1261/rna.047647.114] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 08/07/2014] [Accepted: 04/20/2015] [Indexed: 06/04/2023]
Abstract
TDP-43 is a soluble, nuclear protein that undergoes cytoplasmic redistribution and aggregation in the majority of cases of amyotrophic lateral sclerosis and frontotemporal lobar degeneration. TDP-43 autoregulates the abundance of its own transcript TARDBP by binding to an intron in the 3' untranslated region, although the mechanisms underlying this activity have been debated. Herein, we provide the most extensive analysis of TARDBP transcript yet undertaken. We detail the existence of a plethora of complex splicing events and alternative poly(A) use and provide data that explain the discrepancies reported to date regarding the autoregulatory capacity of TDP-43. Additionally, although many splice isoforms emanating from the TARDBP locus contain the regulated intron in the 3' UTR, we find only evidence for autoregulation of the transcript encoding full-length TDP-43. Finally, we use a novel cytoplasmic isoform of TDP to induce disease-like loss of soluble, nuclear TDP-43, which results in aberrant splicing and up-regulation of endogenous TARDBP. These results reveal a previously underappreciated complexity to TDP-43 regulated splicing and suggest that loss of TDP-43 autoregulatory capacity may contribute to the pathogenesis of ALS.
Affiliation(s)
- Simon D'Alton
- Center for Translational Research in Neurodegenerative Disease, Department of Neuroscience, University of Florida, Gainesville, Florida 32610, USA
| | - Marcelle Altshuler
- Center for Translational Research in Neurodegenerative Disease, Department of Neuroscience, University of Florida, Gainesville, Florida 32610, USA
| | - Jada Lewis
- Center for Translational Research in Neurodegenerative Disease, Department of Neuroscience, University of Florida, Gainesville, Florida 32610, USA
| |
|
83
|
Abstract
In the context of the FlyBase annotated gene models in Drosophila melanogaster, we describe the many exceptional cases we have curated from the literature or identified in the course of FlyBase analysis. These range from atypical but common examples such as dicistronic and polycistronic transcripts, noncanonical splices, trans-spliced transcripts, noncanonical translation starts, and stop-codon readthroughs, to single exceptional cases such as ribosomal frameshifting and HAC1-type intron processing. In FlyBase, exceptional genes and transcripts are flagged with Sequence Ontology terms and/or standardized comments. Because some of the rule-benders create problems for handlers of high-throughput data, we discuss plans for flagging these cases in bulk data downloads.
Collapse
|
84
|
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res 2014; 43:D470-8. [PMID: 25428363 PMCID: PMC4383984 DOI: 10.1093/nar/gku1204] [Citation(s) in RCA: 648] [Impact Index Per Article: 64.8] [Indexed: 12/18/2022] Open
Abstract
The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access database that houses genetic and protein interactions curated from the primary biomedical literature for all major model organism species and humans. As of September 2014, the BioGRID contains 749 912 interactions as drawn from 43 149 publications that represent 30 model organisms. This interaction count represents a 50% increase compared to our previous 2013 BioGRID update. BioGRID data are freely distributed through partner model organism databases and meta-databases and are directly downloadable in a variety of formats. In addition to general curation of the published literature for the major model species, BioGRID undertakes themed curation projects in areas of particular relevance for biomedical sciences, such as the ubiquitin-proteasome system and various human disease-associated interaction networks. BioGRID curation is coordinated through an Interaction Management System (IMS) that facilitates the compilation of interaction records through structured evidence codes, phenotype ontologies, and gene annotation. The BioGRID architecture has been improved in order to support a broader range of interaction and post-translational modification types, to allow the representation of more complex multi-gene/protein interactions, to account for cellular phenotypes through structured ontologies, to expedite curation through semi-automated text-mining approaches, and to enhance curation quality control.
Affiliation(s)
- Andrew Chatr-Aryamontri
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
| | - Bobby-Joe Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Lorrie Boucher
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Sven Heinicke
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Daici Chen
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
| | - Chris Stark
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Ashton Breitkreutz
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Nadine Kolas
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Lara O'Donnell
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Teresa Reguly
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
| | - Julie Nixon
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Lindsay Ramage
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Andrew Winter
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| | - Adnane Sellam
- Centre Hospitalier de l'Université Laval (CHUL), Québec, Québec G1V 4G2, Canada
| | - Christie Chang
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Jodi Hirschman
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Chandra Theesfeld
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Jennifer Rust
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Michael S Livstone
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
| | - Mike Tyers
- Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Quebec H3C 3J7, Canada
- The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JR, UK
| |
|