1
Gulledge AA, Roberts AD, Vora H, Patel K, Loraine AE. Mining Arabidopsis thaliana RNA-seq data with Integrated Genome Browser reveals stress-induced alternative splicing of the putative splicing regulator SR45a. American Journal of Botany 2012; 99:219-31. [PMID: 22291167 DOI: 10.3732/ajb.1100355] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 05/24/2023]
Abstract
PREMISE OF THE STUDY: High-throughput sequencing of cDNA libraries prepared from diverse samples (RNA-seq) can reveal genome-wide changes in alternative splicing. Using RNA-seq data to assess splicing at the level of individual genes requires the ability to visualize read alignments alongside genomic annotations. To meet this need, we added RNA-seq visualization capability to Integrated Genome Browser (IGB), a free desktop genome visualization tool. To illustrate this capability, we present an in-depth analysis of abiotic stresses and their effects on alternative splicing of SR45a (AT1G07350), a putative splicing regulator from Arabidopsis thaliana.
METHODS: cDNA libraries prepared from Arabidopsis plants that were subjected to heat and dehydration stresses were sequenced on an Illumina GAIIx sequencer, yielding more than 511 million high-quality 75-base, single-end sequence reads. Reads were aligned onto the reference genome and visualized in IGB.
KEY RESULTS: Using IGB, we confirmed exon-skipping alternative splicing in SR45a. Exon-skipped variant AT1G07350.1 encodes full-length SR45a protein with intact RS and RNA recognition motifs, while nonskipped variant AT1G07350.2 lacks the C-terminal RS region due to a frameshift in the alternative exon. Heat and drought stresses increased both transcript abundance and the proportion of exon-skipped transcripts encoding the full-length protein. We identified new splice sites and observed frequent intron retention flanking the alternative exon.
CONCLUSIONS: This study underlines the importance of visual inspection of RNA-seq alignments when investigating alternatively spliced genes. We showed that heat and dehydration stresses increase overall abundance of SR45a mRNA while also increasing production of transcripts encoding the full-length SR45a protein relative to other splice variants.
Affiliation(s)
- Alyssa A Gulledge
- Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, 600 Laureate Way, Kannapolis, North Carolina 28081, USA
2
English AC, Patel KS, Loraine AE. Prevalence of alternative splicing choices in Arabidopsis thaliana. BMC Plant Biology 2010; 10:102. [PMID: 20525311 PMCID: PMC3017808 DOI: 10.1186/1471-2229-10-102] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 10/20/2009] [Accepted: 06/04/2010] [Indexed: 05/04/2023]
Abstract
BACKGROUND: Around 14% of Arabidopsis thaliana protein-coding genes from the TAIR9 genome release are annotated as producing multiple transcript variants through alternative splicing. However, for most alternatively spliced genes in Arabidopsis, the relative expression level of individual splicing variants is unknown.
RESULTS: We investigated the prevalence of alternative splicing (AS) events in Arabidopsis thaliana using ESTs. We found that for most AS events with ample EST coverage, the majority of overlapping ESTs strongly supported one major splicing choice, with less than 10% of ESTs supporting the minor form. Analysis of ESTs also revealed a small but noteworthy subset of genes for which alternative choices appeared with about equal prevalence, suggesting that for these genes the variant splicing forms co-occur in the same cell types. Of the AS events in which both forms were about equally prevalent, more than 80% affected untranslated regions or involved small changes to the encoded protein sequence.
CONCLUSIONS: Currently available evidence from ESTs indicates that alternative splicing in Arabidopsis occurs and affects many genes, but for most genes with documented alternative splicing, one AS choice predominates. To aid investigation of the role AS may play in modulating function of Arabidopsis genes, we provide an on-line resource (ArabiTag) that supports searching AS events by gene, by EST library keyword search, and by relative prevalence of minor and major forms.
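The prevalence comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline or the ArabiTag implementation: the `ASEvent` record, the `classify` function, the gene labels, the coverage threshold of 20 ESTs, and the 0.40 cutoff for "about equally prevalent" are all assumptions chosen for illustration. Only the 10% minor-form cutoff comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class ASEvent:
    """One alternative splicing event with EST support for its two choices."""
    gene: str          # hypothetical gene label
    est_form_a: int    # number of ESTs supporting splicing choice A
    est_form_b: int    # number of ESTs supporting splicing choice B


def minor_form_fraction(event: ASEvent) -> float:
    """Fraction of overlapping ESTs that support the less common choice."""
    total = event.est_form_a + event.est_form_b
    if total == 0:
        raise ValueError("event has no EST support")
    return min(event.est_form_a, event.est_form_b) / total


def classify(event: ASEvent, min_ests: int = 20,
             minor_cutoff: float = 0.10,
             balanced_cutoff: float = 0.40) -> str:
    """Label an event by relative prevalence of its minor form.

    min_ests and balanced_cutoff are illustrative assumptions;
    minor_cutoff mirrors the 10% figure quoted in the abstract.
    """
    total = event.est_form_a + event.est_form_b
    if total < min_ests:
        return "insufficient coverage"
    fraction = minor_form_fraction(event)
    if fraction < minor_cutoff:
        return "one form predominates"
    if fraction >= balanced_cutoff:
        return "about equally prevalent"
    return "intermediate"


if __name__ == "__main__":
    for ev in (ASEvent("geneA", 95, 5), ASEvent("geneB", 12, 10)):
        print(ev.gene, round(minor_form_fraction(ev), 3), classify(ev))
```

Taking the minimum of the two counts keeps the calculation symmetric, so it does not matter which choice is recorded as A or B.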
Affiliation(s)
- Adam C English
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
- Ketan S Patel
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
- Ann E Loraine
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, 600 Laureate Way, Kannapolis, NC 28081, USA
3
Kanapin AA, Mulder N, Kuznetsov VA. Projection of gene-protein networks to the functional space of the proteome and its application to analysis of organism complexity. BMC Genomics 2010; 11 Suppl 1:S4. [PMID: 20158875 PMCID: PMC2822532 DOI: 10.1186/1471-2164-11-s1-s4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
We consider the problem of biological complexity via a projection of the protein-coding genes of complex organisms onto the functional space of the proteome, defined as the set of all functions performed by an organism's proteins. Alternative splicing (AS) allows an organism to generate diverse mature RNA transcripts from a single pre-mRNA, and thus could be one of the key mechanisms for increasing the functional complexity of the organism's proteome and a driving force of biological evolution. Projecting transcription units (TU) and alternative splice-variant (SV) forms onto proteome functional space could therefore generate new types of relational networks (e.g., SV-protein function networks, SFN) and lead to the discovery of novel evolutionarily conserved functional modules. Such networks might provide new, reliable characteristics of organism complexity and a better understanding of the evolutionary integration and plasticity of the interconnections among genome, transcriptome, and proteome functions.
4
Floris M, Orsini M, Thanaraj TA. Splice-mediated Variants of Proteins (SpliVaP) - data and characterization of changes in signatures among protein isoforms due to alternative splicing. BMC Genomics 2008; 9:453. [PMID: 18831736 PMCID: PMC2573899 DOI: 10.1186/1471-2164-9-453] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/11/2008] [Accepted: 10/02/2008] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Mammalian genes are often alternatively spliced, and the resulting alternate transcripts often encode protein isoforms that differ in amino acid sequence. Changes among the protein isoforms can alter the cellular properties of proteins. The effect can range from a subtle modulation to a complete loss of function.
RESULTS: (i) We examined human splice-mediated protein isoforms (extracted from a manually curated data set and from a computationally predicted data set) for differences in the annotation of protein signatures (Pfam domains and PRINTS fingerprints), and we characterized the differences and their effects on protein functionality. An important question addressed is the extent to which protein isoforms may lack any known function in the cell. (ii) We present a database that reports differences in protein signatures among human splice-mediated protein isoform sequences.
CONCLUSION: (i) Characterization: The work points to distinct sets of alternatively spliced genes with varying degrees of annotation for the splice-mediated protein isoforms. The protein molecular functions most often affected relate to binding, catalytic, transcription regulation, structural molecule, transporter, motor, and antioxidant activities; the processes most often affected are nucleic acid binding, signal transduction, and protein-protein interactions. Signatures are often included/excluded or truncated in length among protein isoforms; truncation is the predominant type of change. The analysis points to the following novel aspects: (a) Data from the manually curated Vega set indicate that one in 8.9 genes can lead to a protein isoform of no "known" function, and one in 18 expressed protein isoforms can be such an "orphan" isoform; the corresponding numbers for the computationally predicted ASD data set are one in 4.9 genes and one in 9.8 isoforms. (b) When swapping of signatures occurs, it is often between signatures of the same functional classification. (c) Pfam domains can occur in varying lengths, and PRINTS fingerprints can occur with varying numbers of constituent motifs among isoforms; since such variation is seen in a large number of genes, it could be a general mechanism for modulating protein function. (ii) Data: The reported resource (at http://www.bioinformatica.crs4.org/tools/dbs/splivap/) gives the community access to data on splice-mediated protein isoforms (with value-added annotation such as association with diseases) through changes in protein signatures.
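The kinds of signature change named in this abstract (included, excluded, truncated) can be sketched as a comparison between two isoforms. The Python below is an illustrative assumption, not the SpliVaP method or its database schema: the function names, the representation of an isoform as a mapping from signature identifier to matched length in residues, the "extended" label, and the example lengths are all hypothetical.

```python
def compare_signatures(reference: dict[str, int],
                       isoform: dict[str, int]) -> dict[str, str]:
    """Classify each signature as changed or unchanged between two isoforms.

    Both arguments map a signature identifier (e.g. a Pfam accession) to the
    length, in residues, of the match on that isoform. This representation
    is an assumption made for the sketch.
    """
    changes = {}
    for signature in sorted(set(reference) | set(isoform)):
        if signature not in isoform:
            changes[signature] = "excluded"
        elif signature not in reference:
            changes[signature] = "included"
        elif isoform[signature] < reference[signature]:
            changes[signature] = "truncated"
        elif isoform[signature] > reference[signature]:
            changes[signature] = "extended"
        else:
            changes[signature] = "unchanged"
    return changes


def is_orphan(isoform: dict[str, int]) -> bool:
    """An isoform with no annotated signature at all (an 'orphan' here)."""
    return len(isoform) == 0


if __name__ == "__main__":
    # Hypothetical match lengths for two isoforms of one gene.
    full_length = {"PF00076": 70, "PF00069": 250}
    variant = {"PF00076": 45, "PF07714": 240}
    print(compare_signatures(full_length, variant))
    print(is_orphan({}))
```

Comparing by identifier and length is the simplest choice that distinguishes truncation from outright loss; a fuller treatment would also need match coordinates to detect swaps between signatures of the same functional class.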
Affiliation(s)
- Matteo Floris
- CRS4-Bioinformatica, Parco Scientifico e Technologico, POLARIS, Edificio 3, 09010 PULA (CA), Sardinia, Italy.
6
Romero PR, Zaidi S, Fang YY, Uversky VN, Radivojac P, Oldfield CJ, Cortese MS, Sickmeier M, LeGall T, Obradovic Z, Dunker AK. Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proc Natl Acad Sci U S A 2006; 103:8390-5. [PMID: 16717195 PMCID: PMC1482503 DOI: 10.1073/pnas.0507916103] [Citation(s) in RCA: 345] [Impact Index Per Article: 19.2] [Indexed: 11/18/2022]
Abstract
Alternative splicing of pre-mRNA generates two or more protein isoforms from a single gene, thereby contributing to protein diversity. Despite intensive efforts, an understanding of the protein structure-function implications of alternative splicing is still lacking. Intrinsic disorder, which is a lack of equilibrium 3D structure under physiological conditions, may provide this understanding. Intrinsic disorder is a common phenomenon, particularly in multicellular eukaryotes, and is responsible for important protein functions including regulation and signaling. We hypothesize that polypeptide segments affected by alternative splicing are most often intrinsically disordered such that alternative splicing enables functional and regulatory diversity while avoiding structural complications. We analyzed a set of 46 differentially spliced genes encoding experimentally characterized human proteins containing both structured and intrinsically disordered amino acid segments. We show that 81% of 75 alternatively spliced fragments in these proteins were associated with fully (57%) or partially (24%) disordered protein regions. Regions affected by alternative splicing were significantly biased toward encoding disordered residues, with a vanishingly small P value. A larger data set composed of 558 SwissProt proteins with known isoforms produced by 1,266 alternatively spliced fragments was characterized by applying the PONDR VSL1 disorder predictor. Results from prediction data are consistent with those obtained from experimental data, further supporting the proposed hypothesis. Associating alternative splicing with protein disorder enables the time- and tissue-specific modulation of protein function needed for cell differentiation and the evolution of multicellular organisms.
Affiliation(s)
- Pedro R. Romero
- School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Saima Zaidi
- School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Ya Yin Fang
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Vladimir N. Uversky
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Predrag Radivojac
- School of Informatics, Indiana University, Eigenmann Hall 1005, 1900 East 10th Street, Bloomington, IN 47406
- Christopher J. Oldfield
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Marc S. Cortese
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Megan Sickmeier
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Tanguy LeGall
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- Zoran Obradovic
- Center for Information Science and Technology, Temple University, 303 Wachman Hall (038-24), 1805 North Broad Street, Philadelphia, PA 19122
- A. Keith Dunker
- School of Informatics, Indiana University–Purdue University Indianapolis, 535 West Michigan Street, IT475, Indianapolis, IN 46202
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 714 North Senate Avenue, Suite 250, Indianapolis, IN 46202
- To whom correspondence should be addressed. E-mail: